CN115526758A - Hadamard transform screen-shot-resistant watermarking method based on deep learning - Google Patents
Hadamard transform screen-shot-resistant watermarking method based on deep learning
- Publication number
- CN115526758A (application number CN202211210100.8A)
- Authority
- CN
- China
- Prior art keywords
- watermark
- image
- attack
- module
- screen shot
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
Abstract
The invention discloses a deep-learning-based Hadamard transform watermarking method that resists screen-shot attacks, belonging to the technical field of digital watermarking. The invention combines a convolutional neural network (CNN) with residual blocks to realize an end-to-end process of embedding and extracting the watermark in the Hadamard domain. By simulating the screen-shot attack and inserting it between the embedding layer and the extraction layer, the invention ensures that the network embeds the watermark robustly. The method can efficiently extract the watermark information from surreptitiously captured photographs and thereby provide copyright protection for digital media. Because the invention embeds the watermark in a transform domain, the watermark is diffused over a wider region of the image, a property that improves the robustness of the watermarking algorithm. Comparison with related work demonstrates the superiority of the invention in imperceptibility and robustness.
Description
Technical Field
The invention belongs to the technical field of watermarks, and particularly relates to a deep learning-based Hadamard transform screen-shot-resistant watermarking method.
Background
As a new medium for carrying knowledge and spreading information, the electronic screen has become an indispensable part of daily life. Electronic screens have greatly changed the way people read: modern readers increasingly choose screens over paper.
Meanwhile, with the widespread use of smartphones, photographing has become simple and easy. In the field of information security, stealing sensitive information inside an organization by photographing or video-recording a screen has gradually become a main channel for leaking important information.
Screen photographing is thus a pressing problem in leak forensics. To address leaks, many departments and companies need to deploy a screen-shot traceability system that can produce evidence of the leaking behavior, thereby forming a deterrent and reducing the occurrence of leak events. Such systems fall into two categories: proactive deterrence and after-the-fact forensics. In the former, traceability information is hidden in advance on the controlled terminal and the client is shown a warning; deployment of the system creates a strong deterrent and effectively reduces the willingness to leak. In the latter, after sensitive information has been leaked, the tracing system identifies the leakage source and the responsible person, making repudiation impossible, allowing responsibility to be assigned, and safeguarding both the copyright of the owning organization and overall information security. At present, the application of deep learning in the field of watermarking algorithms is growing rapidly, because it can effectively handle the watermark embedding and extraction process. Existing digital watermarking techniques can effectively solve problems such as multimedia copyright protection, but designing a digital watermarking algorithm that can resist screen-shot attacks remains difficult.
Traditional digital watermarking algorithms can effectively resist common attacks such as image processing and geometric transformation. A screen-shot attack, however, is a complex process: when an image displayed on a screen is photographed, the image and the watermark undergo a series of analog-to-digital and digital-to-analog conversions, which together act as a powerful combined attack. Research on digital watermarking algorithms that resist screen-shot attacks is still scarce, so improving the screen-shot robustness of digital watermarking algorithms is an urgent technical problem.
Owing to the rapid development of machine learning tools and deep networks across computer vision and image processing, the use of convolutional neural networks for watermarking has recently emerged.
Disclosure of Invention
The invention aims to solve the problem that prior-art digital watermarking algorithms have difficulty resisting screen-shot attacks, and provides a deep-learning-based Hadamard transform watermarking method that resists screen-shot attacks. The invention combines a convolutional neural network (CNN) with residual blocks and realizes an end-to-end process of embedding and extracting the watermark in the Hadamard domain. In addition, by adding a module that simulates the screen-shot attack between the watermark embedding layer and the watermark extraction layer, the invention ensures that the network embeds the watermark robustly and can extract the watermark information from leaked photographs, thereby realizing copyright protection.
The technical scheme adopted by the invention is as follows:
A deep-learning-based Hadamard transform screen-shot-resistant watermarking method comprises the following steps:
s1, constructing a watermark model framework consisting of a watermark embedding module, an attack simulation module and a watermark extraction module;
the watermark embedding module is formed by cascading a first Hadamard transform layer, a first convolution module and an inverse Hadamard transform layer, and its inputs are the watermark to be embedded and the original image into which the watermark is to be embedded; the single-channel original image is pre-divided into a series of first image blocks of equal size; each first image block is input into the first Hadamard transform layer and transformed from the spatial domain to the frequency domain by the Hadamard transform; the two-dimensional transform results of the first image blocks are concatenated along the channel dimension to obtain a first transform feature map; the watermark to be embedded is embedded into the first transform feature map to obtain a second transform feature map; the second transform feature map is input into the first convolution module for convolution, yielding a third transform feature map with the same number of channels as the first transform feature map; the third transform feature map is input channel by channel into the inverse Hadamard transform layer and transformed from the frequency domain back to the spatial domain by the inverse Hadamard transform; the two-dimensional transform results of the channels are reassembled in the original splitting order to obtain an intermediate image of the same size as the original image; and the intermediate image, superimposed on the original image, is output as the single-channel watermarked image;
the attack simulation module contains a plurality of attack operations including screen-shot attacks; its input is the watermarked image, and each attack operation attacks the watermarked image and generates a single-channel attacked watermarked image; the moiré attack among the screen-shot attacks is realized by a moiré attack network obtained by training a U-Net, whose input is the watermarked image and whose output is the watermarked image with moiré noise added;
the watermark extraction module is formed by cascading a second Hadamard transform layer and a second convolution module, and the single-channel attacked watermarked image is input into the watermark extraction module; the attacked watermarked image is pre-divided into a series of second image blocks of equal size; each second image block is input into the second Hadamard transform layer and transformed from the spatial domain to the frequency domain by the Hadamard transform; the two-dimensional transform results of the second image blocks are concatenated along the channel dimension to obtain a fourth transform feature map, which is input into the second convolution module for convolution to obtain the watermark extraction result;
s2, iteratively training the watermark model framework on training data by minimizing a total loss function, wherein in different training rounds a different attack operation is selected to attack the watermarked image output by the watermark embedding module; one attack operation is selected per round, and the training rounds together cover all attack operations; the total loss function is a weighted sum of the offset reciprocals of the normalized cross-correlation loss and the structural similarity index loss;
and s3, after the watermark model framework is trained, using the watermark embedding module to embed the watermark and output the watermarked image, and directly inputting any image requiring watermark extraction into the watermark extraction module to extract the watermark.
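The forward and inverse Hadamard transforms that s1 applies per image block can be sketched as follows. This is a minimal numpy illustration assuming the Sylvester construction with orthonormal scaling; it is not the patent's actual layer implementation.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def hadamard_2d(block: np.ndarray) -> np.ndarray:
    """2D Hadamard transform of a square block, with orthonormal scaling."""
    n = block.shape[0]
    H = hadamard(n) / np.sqrt(n)      # H @ H.T == I under this scaling
    return H @ block @ H.T

def ihadamard_2d(coeffs: np.ndarray) -> np.ndarray:
    """Inverse 2D Hadamard transform (the scaled matrix is its own inverse)."""
    n = coeffs.shape[0]
    H = hadamard(n) / np.sqrt(n)
    return H.T @ coeffs @ H

block = np.random.default_rng(0).random((32, 32))   # one 32 x 32 image block
restored = ihadamard_2d(hadamard_2d(block))
```

Because every entry of the Hadamard matrix is plus or minus one, the round trip spatial domain → frequency domain → spatial domain is exact up to floating-point error, which is what lets the embedding module modify coefficients and still return a valid image.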
Preferably, the total loss function is expressed as:

L = α/(L_w + C_3) + β/(L_I + C_4)

wherein: L_w represents the normalized cross-correlation coefficient between the original embedded watermark and the watermark extraction result, L_I represents the structural similarity index between the original image and the watermarked image, and C_3 and C_4 are two small constants for stabilizing the denominators.
Preferably, α and β are each a fraction greater than 0 and less than 1, and satisfy α + β = 1.
Preferably, the values of C_1, C_2, C_3 and C_4 are 10^-4, 9×10^-4, 10^-2 and 3×10^-2, respectively.
Preferably, the watermark to be embedded is embedded into the first transform feature map by concatenating the watermark with the first transform feature map along the channel dimension.
Preferably, the first convolution module and the second convolution module each include 5 convolution layers.
Preferably, the screen-shot attack comprises a plurality of attack operations: perspective transformation, light distortion, JPEG distortion and moiré patterns.
Preferably, the attack simulation module further includes non-screen-shot attacks, specifically attack operations of blurring, cropping, Gaussian noise, mosaic noise, scaling, rotation, sharpening, overlay watermarking, display distortion, and brightness and contrast changes.
Preferably, during the training of the moiré attack network, a series of watermarked image samples x_i and the corresponding watermarked images y_i with moiré noise added are used as input sample pairs, and the U-Net is trained by minimizing a loss function L_m of the form:

L_m = (1/m) · Σ_{i=1}^{m} || f(x_i) − y_i ||^2

wherein m is the total number of training samples and f(x_i) is the prediction output of the U-Net for the input watermarked image sample x_i.
Preferably, in s3, the image requiring watermark extraction is a photograph of the confidential watermarked image, generated by the watermark embedding module, that was leaked by photographing the screen.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a deep end-to-end robust screen shot watermark resisting method which can learn a new watermark algorithm in a Hadamard transformation space. The algorithm framework consists of two full convolution neural networks with residual blocks, processing the embedding and extraction operations in real time. And performing end-to-end training on the whole deep network, and performing blind security watermarking. The screen shot watermark resisting method provided by the invention takes the simulated screen shot attack as a micro network layer to facilitate end-to-end training, and simultaneously enhances the safety and robustness of the algorithm by diffusing the watermark data to a wider area in the image through a transform domain. Comparison with the latest research results shows that the algorithm framework provided by the invention has advantages in concealment, robustness and speed.
Drawings
Fig. 1 is a general flow diagram of a watermark model framework.
Fig. 2 shows the PSNR of images produced by the embedding algorithm in the watermark embedding module.
Fig. 3 shows the difference before and after embedding the watermark into the original input image, where the first row is the original image, the second row is the watermarked image, and the third row is the difference image before and after watermark embedding.
Fig. 4 shows part of the Hadamard transform coefficients.
Fig. 5 is a demonstration of the screen-shot attack.
Fig. 6 is a schematic diagram of the U-Net structure trained on a moiré data set.
Fig. 7 shows moiré network test results, where the first row contains the clean input images and the second row the outputs with an additive moiré pattern.
Detailed Description
The invention will be further elucidated and described with reference to the drawings and the detailed description.
The development of multimedia technology is closely tied to the need to locate leakage sources. The development of audio, image and video technologies poses new challenges for locating a leakage source, and digital watermarking has therefore received wide attention as an important means of leak tracing. For the different media, audio watermarking schemes, image watermarking schemes and video watermarking schemes have been proposed in the prior art. However, with the development of digital technology, the way multimedia information is transmitted has changed greatly, placing new demands on leak tracing. For conventional ways of stealing information, such as scanning, mailing or copying electronic documents, the leakage source can be tracked with conventional robust image watermarking schemes designed for image-processing attacks. With the popularization of smartphones, however, photography has become the simplest and most effective way to transfer information, which brings new challenges to leak tracking: anyone with access to a file can leak its content by taking a picture without leaving any record. Moreover, the act of photographing is not easily monitored or prevented, so designing a watermarking scheme that resists screen-shot attacks is essential, and such a scheme provides a strong guarantee for leak tracking. The invention embeds watermark information in the original image and can extract the corresponding information from a surreptitiously taken photograph of the file, thereby identifying the leaking device or employee, narrowing the scope of investigation, and enabling accountability.
The following describes in detail a specific implementation of the deep-learning-based Hadamard transform screen-shot-resistant watermarking method.
S1, constructing a watermark model framework consisting of a watermark embedding module, an attack simulation module and a watermark extraction module.
The watermark model framework adopted by the invention consists of three parts: a watermark embedding module, an attack simulation module and a watermark extraction module. The embedding component of the watermark embedding module embeds the watermark into the Hadamard coefficients of the image, modifying the image's Hadamard-domain features. The attack simulation module simulates the series of distortions produced during screen photographing as well as traditional attacks, such as perspective transformation, light distortion, JPEG distortion and moiré patterns. In particular, a moiré attack network is designed within the attack simulation module to simulate the moiré phenomenon, the most common screen-shot distortion, improving the ability of the watermarked image to survive distortion in real screen-shot scenarios. The watermark extraction module extracts the watermark from the captured photograph. The overall flow of the framework is shown in fig. 1; the data processing in the three modules is described in detail below.
The watermark embedding module embeds the watermark into the original image while minimizing the perceptual difference between the original and watermarked images, improving the imperceptibility and security of the watermarked image. As shown in fig. 1, the watermark embedding module is formed by cascading a first Hadamard transform layer (Hadamard Transform), a first convolution module and an inverse Hadamard transform layer (Inverse Hadamard Transform); its inputs are the watermark to be embedded (Watermark) W_o and the original image I_o into which the watermark is to be embedded. The single-channel original image I_o is pre-sliced into a series of first image blocks I_p of equal size. Each first image block I_p is input into the first Hadamard transform layer and transformed from the spatial domain to the frequency domain by the Hadamard transform; the two-dimensional transform results I'_p are concatenated along the channel dimension to obtain a first transform feature map H_o. The watermark W_o is then embedded into H_o to obtain a second transform feature map H_1, which is input into the first convolution module for convolution to produce a third transform feature map H_2 with the same number of channels as H_o. H_2 is input channel by channel into the inverse Hadamard transform layer and transformed from the frequency domain back to the spatial domain by the inverse Hadamard transform; the per-channel two-dimensional results H'_2i are reassembled in the original splitting order to obtain an intermediate image I' of the same size as the original image, and I', superimposed on the original image, is output as the single-channel watermarked image I_w.
It should be noted that after the inverse Hadamard transform is applied to each channel of the third transform feature map H_2, the per-channel two-dimensional results H'_2i must be stitched back together in the same order in which the original image I_o was split into the first image blocks I_p, so that the image is correctly restored.
In the invention, the watermark to be embedded is embedded into the first transform feature map by concatenating the two along the channel dimension. Because the watermark image W_o must be concatenated with the first transform feature map, the size of the first image blocks I_p cut from the original image I_o must match the size of W_o. If the original image I_o has size X × Y and the watermark image W_o has size H × G, then the X × Y original image must be divided into a series of first image blocks I_p of size H × G. Considering the practical characteristics of images and the requirements of watermark embedding, the invention sets the length and width of the images equal, i.e. X = Y = M, and likewise for the watermark image, i.e. H = G = N. The specific values of M and N can be adjusted to the situation, but M should be divisible by N; for example, M = 512 and N = 32.
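The splitting of the M × M image into N × N blocks and the order-preserving reassembly described above can be sketched in numpy; the raster (row-major) block order used here is an assumption.

```python
import numpy as np

def split_blocks(img: np.ndarray, n: int) -> np.ndarray:
    """Split an M x M image into (M//n)**2 blocks of size n x n,
    stacked along a leading channel axis in raster (row-major) order."""
    m = img.shape[0]
    return img.reshape(m // n, n, m // n, n).transpose(0, 2, 1, 3).reshape(-1, n, n)

def merge_blocks(blocks: np.ndarray, m: int) -> np.ndarray:
    """Inverse of split_blocks: reassemble the blocks in the original order."""
    n = blocks.shape[1]
    g = m // n
    return blocks.reshape(g, g, n, n).transpose(0, 2, 1, 3).reshape(m, m)

img = np.arange(512 * 512, dtype=float).reshape(512, 512)  # M = 512
blocks = split_blocks(img, 32)                             # N = 32 -> 256 blocks
```

With M = 512 and N = 32 this yields 16 × 16 = 256 blocks, and `merge_blocks` restores the original image exactly, which is the property the inverse-transform side of the embedding module relies on.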
As shown in fig. 1, the attack simulation module sits between the watermark embedding module and the watermark extraction module and simulates attack operations during the training of those two modules, introducing noise into the watermarked image I_w. The module contains a variety of attack operations, including screen-shot attacks; its input is the watermarked image I_w, and each attack operation attacks I_w and generates a single-channel attacked watermarked image. Because the moiré attack is the most common attack type, the moiré attack among the screen-shot attacks is realized by a moiré attack network trained from a U-Net, whose input is the watermarked image I_w and whose output is the watermarked image with moiré noise added.
The specific structure of the U-Net and its training procedure belong to the prior art. In the invention, the moiré attack network is trained using a series of watermarked image samples x_i and the corresponding moiré-noised watermarked images y_i as input pairs, by minimizing a loss function L_m of the form:

L_m = (1/m) · Σ_{i=1}^{m} || f(x_i) − y_i ||^2

where m is the total number of training samples and f(x_i) is the prediction output of the U-Net for the input watermarked image sample x_i.
It should be noted that the specific set of screen-shot attacks can be adjusted to the situation; in the invention it can include perspective transformation, light distortion, JPEG distortion and moiré patterns. In addition, to enhance robustness, the attack simulation module includes non-screen-shot attacks as well, specifically blurring, cropping, Gaussian noise, mosaic noise, scaling, rotation, sharpening, overlay watermarking, display distortion, and brightness and contrast changes. Apart from the moiré attack, these attack operations can be implemented by directly calling image processing functions or operations.
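The per-round attack selection used during training can be sketched as follows; the attack functions are hypothetical identity placeholders, and cycling through the list is one simple way (an assumption, not the patent's stated schedule) to guarantee that all attacks are covered across rounds:

```python
# Hypothetical placeholders for the attack operations named above; each
# maps a watermarked image to its attacked version (identity here).
def perspective(img):      return img
def jpeg_distortion(img):  return img
def gaussian_noise(img):   return img
def moire_attack(img):     return img   # the trained U-Net in practice

ATTACKS = [perspective, jpeg_distortion, gaussian_noise, moire_attack]

def pick_attack(training_round: int):
    """Select exactly one attack per training round; cycling through the
    list ensures every attack is applied as the rounds accumulate."""
    return ATTACKS[training_round % len(ATTACKS)]
```

During each round, the selected function is applied to the watermarked image before it reaches the extraction module, so the extractor sees every distortion type over the course of training.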
As shown in fig. 1, the watermark extraction module is formed by cascading a second Hadamard transform layer and a second convolution module; its input is the single-channel attacked watermarked image. The attacked watermarked image is pre-divided into a series of second image blocks of equal size; each second image block is input into the second Hadamard transform layer and transformed from the spatial domain to the frequency domain by the Hadamard transform, and the two-dimensional transform results of the second image blocks are concatenated along the channel dimension to obtain a fourth transform feature map H_3, which is then input into the second convolution module for convolution to obtain the watermark extraction result w_e.
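The tensor shapes flowing through the extraction module can be sketched in numpy; the learned second convolution module is replaced here by a single random 1 × 1 channel-mixing layer purely to illustrate the shapes, not the trained weights:

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Orthonormally scaled Sylvester-type Hadamard matrix of order n."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

rng = np.random.default_rng(1)
attacked = rng.random((512, 512))        # single-channel attacked watermarked image
H = hadamard(32)

# Split into 32 x 32 blocks and transform each to the Hadamard domain,
# stacking the per-block coefficient maps along a channel axis (H_3).
blocks = attacked.reshape(16, 32, 16, 32).transpose(0, 2, 1, 3).reshape(-1, 32, 32)
feature = np.stack([H @ b @ H.T for b in blocks])

# Stand-in for the second convolution module: one random 1 x 1 layer
# mixing 256 channels down to a single 32 x 32 watermark map.
w = rng.random(256)
extracted = np.tensordot(w, feature, axes=(0, 0))
```

The key point the shapes make explicit: a 512 × 512 image yields a 256-channel, 32 × 32 feature map H_3, and the convolution module's job is to collapse those channels into the 32 × 32 extraction result w_e.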
It should be noted that the first and second convolution modules each contain 5 convolutional layers; of course, the exact number of layers and the kernel parameters of each layer can be optimized to actual requirements.
S2, once the watermark model framework shown in fig. 1 is constructed, it can be trained iteratively on training data by minimizing a total loss function. To ensure that the trained model resists different attacks, a different attack operation is selected in different training rounds to attack the watermarked image output by the watermark embedding module; one attack operation is used per training round, and the training rounds together cover all attack operations.
The specific total loss function used for iterative training of the watermark model framework can be chosen in practice. In the present invention, the total loss is a weighted sum of the offset reciprocals of the normalized cross-correlation loss and the structural similarity index loss:

L = α/(L_w + C_3) + β/(L_I + C_4)

wherein L_w is the normalized cross-correlation loss, computed as:

L_w = Σ_{h=1}^{H} Σ_{g=1}^{G} w_o(h,g)·w_e(h,g) / sqrt( Σ_{h,g} w_o(h,g)^2 · Σ_{h,g} w_e(h,g)^2 )

and L_I is the structural similarity index loss:

L_I = (2·μ_{I_o}·μ_{I_w} + C_1)(2·σ_{I_o I_w} + C_2) / ( (μ_{I_o}^2 + μ_{I_w}^2 + C_1)(σ_{I_o}^2 + σ_{I_w}^2 + C_2) )

In these formulas: w_o(h,g) is the pixel value at coordinate (h,g) in the original embedded watermark of size H × G, and w_e(h,g) is the pixel value at (h,g) in the watermark extraction result; I_o(x,y) is the pixel value at coordinate (x,y) in the original image I_o of size X × Y, and I_w(x,y) is the pixel value at (x,y) in the watermarked image I_w; μ_{I_o} and μ_{I_w} are the means of all I_o(x,y) and all I_w(x,y) respectively, σ_{I_o}^2 and σ_{I_w}^2 are their variances, and σ_{I_o I_w} is their covariance; C_1, C_2, C_3 and C_4 are four small stabilizing hyperparameters, and α and β are two weight hyperparameters.
In the present invention, α and β may each be a fraction greater than 0 and less than 1 with α + β = 1; preferably α = β = 0.5. In addition, C_1, C_2, C_3 and C_4 can be chosen as 10^-4, 9×10^-4, 10^-2 and 3×10^-2, respectively.
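Under the offset-reciprocal reading of the total loss described above, and with the preferred hyperparameter values, the loss can be sketched in numpy; the single-window (global) SSIM and the exact loss form are assumptions:

```python
import numpy as np

C1, C2, C3, C4 = 1e-4, 9e-4, 1e-2, 3e-2   # preferred stabilizing constants
ALPHA = BETA = 0.5                         # preferred weights

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation between watermark and extraction result."""
    return float(np.sum(a * b) / (np.sqrt(np.sum(a * a) * np.sum(b * b)) + 1e-12))

def ssim(x: np.ndarray, y: np.ndarray) -> float:
    """Single-window structural similarity between original and watermarked image."""
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + C1) * (2 * cov + C2))
                 / ((mx ** 2 + my ** 2 + C1) * (x.var() + y.var() + C2)))

def total_loss(w_o, w_e, i_o, i_w) -> float:
    """L = alpha/(L_w + C3) + beta/(L_I + C4): minimizing the offset
    reciprocals pushes both similarity scores toward their maximum of 1."""
    return ALPHA / (ncc(w_o, w_e) + C3) + BETA / (ssim(i_o, i_w) + C4)

w = np.ones((32, 32))                # watermark and a perfect extraction
img = np.full((512, 512), 0.5)       # original == watermarked image
```

When extraction and embedding are both perfect, L_w = L_I = 1 and the loss reaches its floor of α/(1 + C_3) + β/(1 + C_4); any degradation in either similarity raises the loss.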
S3, after the watermark model framework is trained, the watermark embedding module can be used to embed the watermark and output the watermarked image, and any image requiring watermark extraction is input directly into the watermark extraction module for watermark extraction.
However, it should be noted that in S3 the attack simulation module of FIG. 1 is no longer needed, because in a real application scenario the attack noise is introduced by the actual screen-shot leakage process itself. Therefore, in S3, the image input to the watermark extraction module for watermark extraction is the photograph produced when the watermarked image generated by the watermark embedding module is leaked by screen shooting. Even when a confidential image is leaked by a surreptitious screen photograph, the embedded information can still be extracted from the photograph, so that the leaking device or employee can be identified.
The method of S1 to S3 above is applied to a specific example below to demonstrate the technical effects that can be achieved.
Examples
The method in this embodiment follows S1 to S3 as described above and is not repeated; the specific implementation details and technical effects are presented below.
First, a watermark model framework is constructed. The watermark model framework adopted in this embodiment is composed of three parts: a watermark embedding module, an attack simulation module and a watermark extraction module. The specific data processing in the three modules is as described above and is not repeated here.
(1) Watermark embedding module
The watermark embedding module is responsible for embedding the watermark into the transform coefficients of the image. This embodiment uses a CNN architecture that takes a single-channel 512 × 512 grayscale original image as input and outputs a single-channel watermarked image. The input watermark is a two-dimensional image of size 32 × 32, so the original image is split into 16 × 16 = 256 image blocks of size 32 × 32. The Hadamard-transformed image blocks and the watermark image are concatenated along the channel direction into a (16 × 16) + 1 = 257-channel tensor, which is input to the first convolution module for convolution; the convolution result is restored channel by channel through the inverse Hadamard transform to obtain the watermarked image. In this embodiment, the Hadamard transform layer is implemented with convolution kernels of size 1 × 1, and the first convolution module comprises 5 convolution layers, with convolution kernel sizes of 1 × 1, 2 × 2 and 2 × 2 in sequence. FIG. 2 evaluates the imperceptibility of the embedding algorithm in the watermark embedding module with PSNR; as can be seen from FIG. 2, the imperceptibility is satisfactory, with an average PSNR of 36.62. FIG. 3 shows the difference between the input image before and after watermark embedding; from a subjective point of view, the watermark in the embedded image is well hidden. Since embedding is performed in the Hadamard transform domain, FIG. 4 shows part of the Hadamard transform coefficients; it can be seen that the absolute values of the Hadamard transform coefficients are equal.
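As an illustrative sketch (not the patent's actual 1 × 1-convolution layer) of the block-wise two-dimensional Hadamard transform underlying the embedding step, assuming the Sylvester construction and the standard H·B·H form; a 512 × 512 image split into 32 × 32 blocks would use n = 32, while a toy 2 × 2 block is shown here:

```python
def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        # H_{2n} = [[H_n, H_n], [H_n, -H_n]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def hadamard_2d(block):
    """2-D transform H·B·H; since H·H = n·I, applying it twice scales B by n^2."""
    h = hadamard(len(block))
    return matmul(matmul(h, block), h)
```

Because H·H = n·I, the inverse Hadamard transform used after the first convolution module is simply the same operation followed by division by n², which is why both directions can share the same 1 × 1-convolution implementation.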
(2) Attack simulation module
Theoretical analysis of the screen-shot attack shows that the attacks introduced during screen shooting can be divided into: perspective transformation, optical distortion, brightness variation, JPEG compression and moiré patterns; visualizations of the different attack results are shown in FIG. 5. The moiré attack is the most common attack type, so a U-Net network is constructed to train and test the moiré attack; its structure is shown in FIG. 6. The loss function of the moiré training network is as described above. FIG. 7 shows test results obtained from a test image after the U-Net network is trained; it is easy to observe from FIG. 7 that a moiré pattern is automatically added to a clean input image after it passes through the U-Net network, realizing the moiré simulation attack produced during screen shooting. The trained network is inserted between the watermark embedding module and the watermark extraction module as a noise layer, improving the robustness of the algorithm against screen shooting and the diversity of attacks. In addition, mixed attacks, containing both traditional attacks and screen-shot attacks, frequently occur during screen shooting; all of these attacks are simulated in the attack simulation layer of this embodiment, as shown in Table 1. The screen-shot attacks include perspective transformation (Perspective), light distortion (Light), JPEG distortion (JPEG) and moiré patterns (Moiré); the non-screen-shot attacks include blurring (Blurring), cropping (Cropping), Gaussian noise (Gaussian noise), mosaic noise (Block noise), scaling (Scaling), rotation (Rotation), sharpening (Sharpening), visible watermarking (Visible watermark), display distortion (Display distortion), and brightness and contrast adjustment.
TABLE 1 different attack parameters of the attack simulation layer
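A toy sketch of how an attack-simulation layer can expose several attack operations behind a single interface. The three attacks below are crude one-dimensional stand-ins for the image-level operations listed above (real JPEG compression, Gaussian noise and brightness change operate on 2-D images), and all names and parameters are illustrative assumptions:

```python
import random

def jpeg_like(img):          # crude quantization stand-in for JPEG compression
    return [round(p / 8) * 8 for p in img]

def gaussian_noise(img):     # additive-noise stand-in
    return [p + random.gauss(0.0, 2.0) for p in img]

def brightness(img):         # global brightness-change stand-in
    return [p * 1.1 for p in img]

ATTACKS = {"JPEG": jpeg_like,
           "Gaussian noise": gaussian_noise,
           "Brightness": brightness}

def simulate(img, name):
    """Apply one named attack operation from the simulation layer."""
    return ATTACKS[name](img)
```

Mixed attacks can then be simulated by chaining several `simulate` calls on the same image.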
(3) Watermark extraction module
The watermark extraction module extracts the watermark from the watermarked image after the screen-shot attack. Since the watermark is embedded in the Hadamard domain, the watermarked image is first transformed to the Hadamard domain before extraction: it passes through a Hadamard transform layer, and the watermark is then extracted by a series of convolution layers in the second convolution module. In this embodiment, the Hadamard transform layer is implemented with convolution kernels of size 1 × 1, and the second convolution module comprises 5 convolution layers, with convolution kernel sizes of 1 × 1, 2 × 2 and 1 × 1 in sequence.
The algorithm needs to maximize the quality of the watermarked image while minimizing the watermark extraction error rate, balancing imperceptibility against robustness. Therefore, this embodiment introduces the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) to measure image imperceptibility, and the normalized cross-correlation (NC) and bit error rate (BER) to describe the robustness of the algorithm. To balance robustness and imperceptibility, a loss function coupling NC and SSIM is used, described as follows:
equation (1) describes the mean square error, I o (X, Y) is the original image of size X Y, I w (x, y) is the watermarked image after embedding the watermark. Equation (2) is an expression of PSNR. Equation (3) is an expression of NC, w o (H, G) is the original embedded watermark map of size H G, w e (h, g) is the extracted watermark pattern. Equation (4) describes the expression form of SSIM,andis the average value of the values of the average,andare each I o (x, y) and I w The variance of (x, y), C1 and C2, is two weak variables used to stabilize the denominator. Equation (5) is the overall loss function, consisting of two parts, NC and SSIM, α = β =0.5, and C 1 And C 2 Respectively has a value of 10 -4 ,9*10 -4, C 3 、C 4 Respectively has a value of 10 -2 And 3X 10 -2 。
The watermark model framework is then iteratively trained on the training data set by minimizing the total loss function L_t. In different training rounds, the watermark extraction module selects different attack operations to attack the watermarked image output by the watermark embedding module; one attack operation is selected per round, and all rounds together cover all attack operations. Training finishes when the maximum number of iteration rounds is reached.
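The training schedule described above can be sketched as the following skeleton, where `embed`, `extract` and the attack functions are placeholder callables rather than the actual networks, and cycling through the attack list is one simple way (an assumption, not necessarily the patent's) to guarantee that all rounds cover all attacks:

```python
def train(embed, extract, attacks, images, watermark, rounds):
    """One attack per round; cycling ensures every attack is eventually covered."""
    history = []
    for r in range(rounds):
        attack = attacks[r % len(attacks)]            # select this round's attack
        for img in images:
            marked = embed(img, watermark)            # watermark embedding module
            extracted = extract(attack(marked))       # attack, then extraction
            history.append((r, attack.__name__, extracted))
            # A real loop would compute L_t here and back-propagate.
    return history
```

Running `rounds` equal to a multiple of `len(attacks)` gives each attack the same share of training iterations.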
After the watermark model framework is trained, it is tested. During testing, the watermark embedding module embeds the watermark and outputs the watermarked image; the attack simulation module then introduces noise of different attack forms into the watermarked image generated by the watermark embedding module, and the result is input directly into the watermark extraction module so that the robustness of extraction against attacks can be tested. In this embodiment, multiple attack types are tested, and the BER metric is used to demonstrate the network's watermark extraction capability; a smaller BER indicates a better extraction effect. The test results are shown in Table 2:
table 2 watermark extraction capability presentation
It can therefore be seen that the algorithm framework provided by the invention has advantages in concealment, robustness and speed against various screen-shot and non-screen-shot attacks.
The above-described embodiments are merely preferred embodiments of the present invention, and are not intended to limit the present invention. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, the technical scheme obtained by adopting the mode of equivalent replacement or equivalent transformation is within the protection scope of the invention.
Claims (10)
1. A deep-learning-based Hadamard-transform screen-shot-resistant watermarking method, characterized by comprising the following steps:
s1, constructing a watermark model framework consisting of a watermark embedding module, an attack simulation module and a watermark extraction module;
the watermark embedding module is formed by connecting a first Hadamard transform layer, a first convolution module and an inverse Hadamard transform layer, and its inputs are a watermark to be embedded and an original image into which the watermark is to be embedded; the single-channel original image is pre-split into a series of first image blocks of the same size, each first image block is input into the first Hadamard transform layer and transformed from the spatial domain to the frequency domain by the Hadamard transform, the two-dimensional transform results of the first image blocks are concatenated along the channel dimension to obtain a first transform feature map, the watermark to be embedded is embedded into the first transform feature map to obtain a second transform feature map, the second transform feature map is input into the first convolution module for convolution to obtain a third transform feature map with the same number of channels as the first transform feature map, the third transform feature map is input channel by channel into the inverse Hadamard transform layer and transformed back from the frequency domain to the spatial domain by the inverse Hadamard transform, the two-dimensional transform results of the channels are re-assembled in the original splitting order to obtain an intermediate image of the same size as the original image, and the intermediate image is superposed on the original image and output as a single-channel watermarked image;
the attack simulation module is internally provided with a plurality of attack operations including screen-shot attacks; its input is the watermarked image, and each attack operation can attack the watermarked image and generate a single-channel attacked watermarked image; the moiré attack among the screen-shot attacks is realized by a moiré attack network obtained by training a U-Net network, whose input is a watermarked image and whose output is the watermarked image with moiré attack noise added;
the watermark extraction module is formed by cascading a second Hadamard transform layer and a second convolution module, and its input is the single-channel attacked watermarked image; the attacked watermarked image is pre-split into a series of second image blocks of the same size, each second image block is input into the second Hadamard transform layer and transformed from the spatial domain to the frequency domain by the Hadamard transform, the two-dimensional transform results of the second image blocks are concatenated along the channel dimension to obtain a fourth transform feature map, and the fourth transform feature map is input into the second convolution module for convolution to obtain the watermark extraction result;
S2, performing iterative training on the watermark model framework with training data by minimizing a total loss function, wherein the watermark extraction module selects different attack operations in different training rounds to attack the watermarked image output by the watermark embedding module, one attack operation is selected per training round, and all training rounds cover all attack operations; the total loss function is a weighted sum of the offset reciprocals of the normalized cross-correlation loss and the structural similarity index loss;
and S3, after the watermark model framework is trained, using the watermark embedding module to embed the watermark and output the watermarked image, and inputting the image requiring watermark extraction directly into the watermark extraction module for watermark extraction.
2. The deep-learning-based Hadamard-transform screen-shot-resistant watermarking method according to claim 1, wherein the total loss function is expressed as:

L_t = α / (L_w + C_3) + β / (L_I + C_4)

wherein L_w represents the normalized cross-correlation between the original embedded watermark and the watermark extraction result, L_I represents the structural similarity index between the original image and the watermarked image, α and β are two weight hyperparameters, and C_3 and C_4 are two small constants for stabilizing the denominators.
3. The method for resisting screen shot watermark by Hadamard transform based on deep learning as claimed in claim 2, wherein α and β are both decimal numbers greater than 0 and smaller than 1, and α + β =1 is satisfied.
4. The deep-learning-based Hadamard-transform screen-shot-resistant watermarking method according to claim 2, wherein C_3 and C_4 take the values 10^-2 and 3×10^-2, respectively.
5. The method for resisting screen shot watermark based on Hadamard transform of deep learning as claimed in claim 1, wherein the watermark to be embedded is embedded in the first transform feature map by splicing the watermark to be embedded and the first transform feature map along the channel dimension.
6. The method for resisting the screen shot watermark by the Hadamard transform based on the deep learning as claimed in claim 1, wherein the first convolution module and the second convolution module respectively comprise 5 convolution layers.
7. The method for resisting the screen shot watermark by the Hadamard transform based on the deep learning as claimed in claim 1, wherein the screen-shot attacks comprise a plurality of attack operations among perspective transformation, light distortion, JPEG distortion and moiré patterns.
8. The method for resisting screen shot watermark based on Hadamard transform of deep learning as claimed in claim 1, wherein the attack simulation module further comprises non-screen-shot attacks, specifically comprising a plurality of attack operations among blurring, cropping, Gaussian noise, mosaic noise, scaling, rotation, sharpening, visible watermarking, display distortion, and brightness and contrast adjustment of the image.
9. The method for resisting the screen shot watermark based on the Hadamard transform of the deep learning as claimed in claim 1, wherein during training of the moiré attack network a series of watermarked image samples x_i and the corresponding watermarked images y_i with moiré attack noise added are obtained as input samples, and the U-Net network is trained by minimizing a loss function L_m of the form:

L_m = (1/M) · Σ_{i=1}^{M} ( f(x_i) − y_i )^2

wherein M is the total number of samples used for training, and f(x_i) is the prediction result output by the U-Net network for the input watermarked image sample x_i.
10. The method for resisting screen shot watermark based on Hadamard transform of deep learning as claimed in claim 1, wherein in S3 the image requiring watermark extraction is a confidential photograph produced when the watermarked image generated by the watermark embedding module is leaked by screen shooting.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211210100.8A CN115526758A (en) | 2022-09-30 | 2022-09-30 | Hadamard transform screen-shot-resistant watermarking method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115526758A true CN115526758A (en) | 2022-12-27 |
Family
ID=84701105
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211210100.8A Pending CN115526758A (en) | 2022-09-30 | 2022-09-30 | Hadamard transform screen-shot-resistant watermarking method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115526758A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116308986A (en) * | 2023-05-24 | 2023-06-23 | 齐鲁工业大学(山东省科学院) | Hidden watermark attack algorithm based on wavelet transformation and attention mechanism |
CN116308986B (en) * | 2023-05-24 | 2023-08-04 | 齐鲁工业大学(山东省科学院) | Hidden watermark attack algorithm based on wavelet transformation and attention mechanism |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Korus | Digital image integrity–a survey of protection and verification techniques | |
Guan et al. | DeepMIH: Deep invertible network for multiple image hiding | |
Moosazadeh et al. | A new DCT-based robust image watermarking method using teaching-learning-based optimization | |
Kundur et al. | Toward robust logo watermarking using multiresolution image fusion principles | |
Sadek et al. | Video steganography: a comprehensive review | |
Cedillo-Hernández et al. | Robust hybrid color image watermarking method based on DFT domain and 2D histogram modification | |
CN112529758B (en) | Color image steganography method based on convolutional neural network | |
Swaminathan et al. | Digital image forensics via intrinsic fingerprints | |
US8792675B2 (en) | Color image or video processing | |
Roy et al. | A hybrid domain color image watermarking based on DWT–SVD | |
Gourrame et al. | A zero-bit Fourier image watermarking for print-cam process | |
CN110796586B (en) | Blind watermarking method and system based on digital dot matrix and readable storage medium | |
Mhala et al. | Contrast enhancement of progressive visual secret sharing (pvss) scheme for gray-scale and color images using super-resolution | |
CN115526758A (en) | Hadamard transform screen-shot-resistant watermarking method based on deep learning | |
Cao et al. | Screen-shooting resistant image watermarking based on lightweight neural network in frequency domain | |
Juarez-Sandoval et al. | Digital image ownership authentication via camouflaged unseen-visible watermarking | |
Fang et al. | DeNoL: A Few-Shot-Sample-Based Decoupling Noise Layer for Cross-channel Watermarking Robustness | |
CN113628090B (en) | Anti-interference message steganography and extraction method, system, computer equipment and terminal | |
Reed et al. | Closed form non-iterative watermark embedding | |
Sharma et al. | Image watermarking in frequency domain using Hu’s invariant moments and firefly algorithm | |
CN117014625A (en) | Video watermarking method | |
Zhang et al. | A convolutional neural network-based blind robust image watermarking approach exploiting the frequency domain | |
Sandoval Orozco et al. | Image source acquisition identification of mobile devices based on the use of features | |
CN114648436A (en) | Screen shot resistant text image watermark embedding and extracting method based on deep learning | |
Bonanomi et al. | I3D: a new dataset for testing denoising and demosaicing algorithms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |