CN113436050A - Visible watermark detection and erasure method based on double-input convolution fusion - Google Patents
- Publication number
- CN113436050A CN113436050A CN202110569103.XA CN202110569103A CN113436050A CN 113436050 A CN113436050 A CN 113436050A CN 202110569103 A CN202110569103 A CN 202110569103A CN 113436050 A CN113436050 A CN 113436050A
- Authority
- CN
- China
- Prior art keywords
- convolution
- watermark
- layer
- sampling
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2201/00—General purpose image data processing
- G06T2201/005—Image watermarking
- G06T2201/0203—Image watermarking whereby the image with embedded watermark is reverted to the original condition before embedding, e.g. lossless, distortion-free or invertible watermarking
Abstract
The invention discloses a visible watermark detection and erasure method based on double-input convolution fusion, which comprises the following steps: constructing a visible watermark detection and erasure network comprising a multitask full convolution watermark segmentation network, a watermark removal network based on partial convolution, and a double-input branch watermark removal network based on common convolution; and inputting a watermark image to be processed and processing it through these three networks in turn to obtain a final watermark-free image. The method is suited to removing visible watermarks of various embedding strengths and, compared with traditional schemes or CNN schemes using a single type of convolution, can remove watermarks of different colors and different embedding strengths.
Description
Technical Field
The invention belongs to the technical field of digital watermark removal, and particularly relates to a visible watermark detection and erasure method based on double-input convolution fusion.
Background
The visible watermarking technology is widely applied to intellectual property protection of digital images and videos. It superimposes a visible watermark pattern on the original image so that any user can perceive the watermark from the suspect content. Furthermore, visible watermark patterns are generally considered to be robust and resistant to watermark removal attacks, because they often contain complex structures such as thin lines and shadows, which are difficult to remove completely without destroying the visual quality of the original image.
Conventional watermark removal methods require complex human interaction and a priori knowledge dependent on experience and expertise, which limits their application. With the rapid development of deep convolutional neural networks (CNNs), methods that learn CNNs to automatically remove visible watermarks have appeared. However, these schemes still have their own limitations: many CNN-based networks handle opaque watermarks with high embedding strength poorly, and watermark traces remain in the covered area after the watermark is removed.
Disclosure of Invention
The invention mainly aims to overcome the defects and shortcomings of the prior art by providing a visible watermark detection and erasure method based on double-input convolution fusion. The method is suited to removing visible watermarks of various embedding strengths and, compared with traditional schemes or CNN schemes using a single type of convolution, can remove watermarks of different colors and different embedding strengths.
In order to achieve the purpose, the invention adopts the following technical scheme:
a visible watermark detection and erasure method based on double-input convolution fusion comprises the following steps:
s1, constructing a visible watermark detection and erasure network, which comprises a multitask full convolution watermark segmentation network, a watermark removal network based on partial convolution and a double-input branch watermark removal network based on common convolution;
and S2, inputting the watermark image to be processed, and processing the watermark image through a multitask full convolution watermark segmentation network, a partial convolution-based watermark removal network and a common convolution-based double-input branch watermark removal network to obtain a final watermark-free image.
Further, the structure of the multitask fully-convolution watermark segmentation network is specifically as follows:
the backbone network comprises 7 convolution groups, and each convolution group comprises a plurality of convolution operations;
the first 5 convolution groups each comprise 2 or 3 convolution layers with kernel size 3; the numbers of convolution kernels in the five groups are 64, 128, 256, 512 and 512 respectively; each convolution layer is followed by a BatchNormalization layer and a ReLU activation layer, and each convolution group is followed by a max-pooling layer with stride 2;
the 6th and 7th convolution groups each comprise 1 convolution layer followed by a ReLU activation layer and a 50% Dropout layer; their kernel sizes are 7 and 1 respectively, and each has 4096 convolution kernels.
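As a rough check of the backbone arithmetic above, the following sketch walks an input through the five pooled convolution groups. The 224x224 input size, the size-preserving padding of 1 for the 3x3 convolutions, and the 2x2 pooling window are assumptions (the patent states kernel size 3 and pooling stride 2 but not the padding or the pooling window):

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Standard convolution output-size formula."""
    return (size + 2 * pad - kernel) // stride + 1

def backbone_sizes(size):
    """Spatial size after each of the 5 pooled convolution groups.

    One size-preserving 3x3 convolution stands in for the whole group,
    since the group's convolutions do not change the spatial size.
    """
    sizes = []
    for _ in range(5):
        size = conv_out(size, kernel=3, stride=1, pad=1)  # conv keeps size
        size = conv_out(size, kernel=2, stride=2, pad=0)  # 2x2 max pool, stride 2
        sizes.append(size)
    return sizes

print(backbone_sizes(224))  # -> [112, 56, 28, 14, 7]
```

Each group halves the resolution, so a 224x224 input reaches 7x7 before the kernel-7 convolution of the 6th group.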
Further, the multitask full-convolution watermark segmentation network further comprises two output branches, the two output branches have the same structure, and the specific structure is as follows:
the backbone network is followed by a convolution layer with kernel size 1 and 2 kernels and an upsampling layer with factor 2; this output is called x1;
the 4th max-pooling layer is followed by a convolution layer with kernel size 1 and 2 kernels and a Crop layer; this output is called x2;
x1 and x2 are added element-wise and the sum is passed through an upsampling layer with factor 2; the 3rd max-pooling layer is followed by a convolution layer with kernel size 1 and 2 kernels and a Crop layer, whose output is called x3;
the upsampled sum and x3 are added element-wise, followed by an upsampling layer with factor 8 and a Crop layer.
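The two FCN-style fusion steps above can be sketched as follows; nearest-neighbour upsampling and the toy 2-D grids are illustrative assumptions, since the patent does not specify the upsampling method:

```python
def upsample(grid, factor):
    """Nearest-neighbour upsampling of a 2-D grid by an integer factor."""
    return [[v for v in row for _ in range(factor)]
            for row in grid for _ in range(factor)]

def fuse(a, b):
    """Element-wise addition of two equally sized 2-D grids."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

x1 = upsample([[1]], 2)         # deep prediction, upsampled by 2
x2 = [[1, 0], [0, 1]]           # shallower prediction after 1x1 conv + crop
x3 = upsample(fuse(x1, x2), 2)  # fused, then upsampled again
print(x3)
```

The deep, coarse prediction is repeatedly refined with shallower, higher-resolution score maps before the final factor-8 upsampling, exactly as in the branch description.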
Further, the watermark removing network based on partial convolution specifically has the following structure:
the partial-convolution-based watermark removal network comprises 7 downsampling layers and 7 corresponding upsampling layers, with skip connections between the downsampling and upsampling layers for feature transfer;
except for the first downsampling layer, which omits BatchNormalization, each downsampling layer comprises a partial convolution layer, a BatchNormalization layer and a ReLU activation layer; the kernel sizes are 7, 5, 3 and 3, the numbers of kernels are 64, 128, 256, 512 and 512, all convolution strides are 1, and the paddings are 3, 2, 1 and 1 respectively;
except for the last upsampling layer, each upsampling layer comprises a partial convolution layer, a BatchNormalization layer and a LeakyReLU activation layer, with each partial convolution preceded by an upsampling operation with factor 2; the kernel sizes of the upsampling layers are all 3, the numbers of kernels are 512, 256, 128, 64 and 3 respectively, and both stride and padding are 1;
the processing of the partial-convolution-based watermark removal network comprises a downsampling encoding stage and an upsampling decoding stage, each using 7 convolution layers.
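A minimal sketch of the partial-convolution operation this network is built on: the response is computed only over mask-valid pixels, rescaled by the ratio of window size to valid-pixel count, and the mask is updated so that any window containing a valid pixel becomes valid. The single-channel 3x3 unpadded form and the rescaling rule follow the common partial-convolution formulation and are assumptions, not text from the patent:

```python
def partial_conv3x3(image, mask, weights, bias=0.0):
    """Single-channel partial convolution over 2-D lists, no padding."""
    h, w = len(image), len(image[0])
    out, new_mask = [], []
    for i in range(h - 2):
        orow, mrow = [], []
        for j in range(w - 2):
            acc, valid = 0.0, 0
            for di in range(3):
                for dj in range(3):
                    m = mask[i + di][j + dj]
                    acc += weights[di][dj] * image[i + di][j + dj] * m
                    valid += m
            if valid > 0:
                orow.append(acc * 9 / valid + bias)  # rescale by |window| / |valid|
                mrow.append(1)                       # window saw a valid pixel
            else:
                orow.append(0.0)                     # fully inside the hole
                mrow.append(0)
        out.append(orow)
        new_mask.append(mrow)
    return out, new_mask
```

Stacking such layers shrinks the invalid (watermarked) region of the mask at every step, which is why partial convolution suits hole-style restoration.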
Further, the skip connections between the downsampling and upsampling layers are specifically arranged as follows:
skip connections exist between upsampling layer 6 and downsampling layer 1, upsampling layer 5 and downsampling layer 2, upsampling layer 4 and downsampling layer 3, upsampling layer 3 and downsampling layer 4, upsampling layer 2 and downsampling layer 5, and upsampling layer 1 and downsampling layer 6.
Further, the structure of the dual-input branch watermark removing network based on the common convolution is specifically as follows:
the double-input branch watermark removal network based on common convolution comprises two input branches of identical structure, each consisting of 7 downsampling layers; except for the first downsampling layer, which omits BatchNormalization, each downsampling layer comprises a convolution layer, a BatchNormalization layer and a LeakyReLU activation layer; the numbers of convolution kernels are 64, 128, 256, 512 and 512.
Furthermore, the output structure of the double-input branch watermark removal network based on common convolution consists of 1 convolution module and 9 upsampling layers. The convolution module comprises a convolution layer with 1024 kernels and a ReLU activation layer. Except for the first and last upsampling layers, each upsampling layer comprises a deconvolution (transposed convolution) layer, a BatchNormalization layer and a ReLU activation layer; the first upsampling layer has no normalization layer, and the last has neither a normalization nor an activation layer;
except for the last upsampling layer, the kernel sizes in the common-convolution-based double-input branch watermark removal network are all 4, the strides all 2 and the paddings all 1; the last upsampling layer has kernel size 3, stride 1 and padding 1.
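The layer sizes above can be checked with the transposed-convolution output-size formula. As a sketch (assuming the usual deconvolution arithmetic), kernel 4 / stride 2 / padding 1 exactly doubles the spatial size, while kernel 3 / stride 1 / padding 1 preserves it:

```python
def deconv_out(size, kernel, stride, pad):
    """Transposed-convolution output-size formula (no output padding)."""
    return (size - 1) * stride - 2 * pad + kernel

# kernel 4, stride 2, padding 1: doubles the spatial size
assert deconv_out(7, kernel=4, stride=2, pad=1) == 14
# final layer: kernel 3, stride 1, padding 1 keeps the size
assert deconv_out(14, kernel=3, stride=1, pad=1) == 14
```

This is why the decoder's kernel-4 layers undo the stride-2 encoder halvings one for one, with the kernel-3 layer producing the final map at full resolution.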
Further, the input watermark image is processed in the multitask full convolution watermark segmentation network through the following steps:
selecting a watermark image I_w to be processed;
inputting the watermark image I_w into the multitask full convolution watermark segmentation network to obtain a pixel-level binary watermark pattern mask I_m of the watermark image, specifically:
the multitask full convolution watermark segmentation network outputs a surface mask I_m,s of the watermark pattern from the input watermark image I_w;
the network likewise outputs a boundary mask I_m,e of the watermark pattern from the input watermark image I_w;
the network performs hole filling on the boundary mask, filling the boundary mask I_m,e into a surface mask I'_m,s;
the network performs threshold processing on the two surface masks I_m,s and I'_m,s to finally generate the pixel-level binary watermark pattern mask I_m of the watermark image.
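The hole-filling and mask-combination steps can be sketched as follows. Flood-filling the background from the image border and combining the two surface masks with a pixel-wise OR are illustrative assumptions, since the patent does not spell out the filling algorithm or the exact threshold rule:

```python
from collections import deque

def fill_boundary(edge):
    """Turn a closed boundary mask into a filled surface mask.

    Background pixels reachable from the image border stay 0; everything
    else (the boundary and its enclosed hole) becomes 1.
    """
    h, w = len(edge), len(edge[0])
    outside = [[False] * w for _ in range(h)]
    q = deque((i, j) for i in range(h) for j in range(w)
              if (i in (0, h - 1) or j in (0, w - 1)) and edge[i][j] == 0)
    for i, j in q:
        outside[i][j] = True
    while q:
        i, j = q.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and not outside[ni][nj] and edge[ni][nj] == 0:
                outside[ni][nj] = True
                q.append((ni, nj))
    return [[0 if outside[i][j] else 1 for j in range(w)] for i in range(h)]

def combine(a, b):
    """Pixel-wise OR of the two surface masks."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

A pixel ends up in the final mask if either the direct surface prediction or the filled boundary prediction marks it, which makes the combined mask robust to gaps in either branch.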
Further, the output of the multitask full convolution watermark segmentation network is processed in the partial-convolution-based watermark removal network through the following steps:
selecting the watermark image I_w to be processed and the corresponding pixel-level binary watermark pattern mask I_m;
inputting the watermark image I_w and the mask I_m into the partial-convolution-based watermark removal network to obtain a preliminarily watermark-removed image.
Further, the output of the partial-convolution-based watermark removal network is processed in the double-input branch watermark removal network based on common convolution through the following steps:
inputting the preliminarily watermark-removed image into one input branch of the common-convolution-based double-input branch watermark removal network and the watermark image I_w into the other input branch, obtaining the watermark image region with the watermark finally removed,
which is the watermark-free image finally recovered by the visible watermark detection and erasure network.
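As an illustrative sketch only (the compositing rule below is an assumption, not stated verbatim in the patent): a restored watermark region is typically pasted back into the host image under the binary mask I_m, leaving unwatermarked pixels untouched:

```python
def composite(original, restored, mask):
    """final = mask * restored + (1 - mask) * original, per pixel."""
    return [[r if m else o for o, r, m in zip(ro, rr, rm)]
            for ro, rr, rm in zip(original, restored, mask)]

host = [[10, 10], [10, 10]]
restored = [[7, 7], [7, 7]]
mask = [[1, 0], [0, 0]]                 # only the top-left pixel was watermarked
print(composite(host, restored, mask))  # -> [[7, 10], [10, 10]]
```

Restricting the replacement to masked pixels is what preserves the original host image outside the detected watermark region.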
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The method applies partial convolution to the field of watermark removal, treating watermark removal as an inpainting-style task. Because ordinary convolution offers no advantage for removing the deep watermark traces left after this processing, a double-input branch Y-net is further provided: the output of the partial-convolution network is fed to one branch, while the other branch is implemented as a network based on common convolution. The kernel sizes used in the partial-convolution-based watermark removal network are 7, 5 and 3, while the kernel size in the double-input branch Y-net is 4; different kernels extract features at different scales in the image, and the features extracted by the 2 branches are fused in the upsampling stage. With these technical features, a visible watermark detection and removal network suitable for multiple colors and multiple embedding strengths is finally obtained. For watermarks of any color, any embedding strength and any embedding position, the network can accurately detect the watermark position and remove the watermark without damaging the original host image, and its removal performance is superior to that of current mainstream watermark removal networks.
Drawings
FIG. 1 is an overall flow diagram of the present invention;
fig. 2 is a process flow diagram of a two-input branch watermark removal network based on normal convolution.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
The invention discloses a visible watermark detection and erasure method based on double-input convolution fusion, which comprises the following steps:
s1, constructing a visible watermark detection and erasure network, which comprises a multitask full convolution watermark segmentation network, a watermark removal network based on partial convolution and a double-input branch watermark removal network based on common convolution;
the structure of the multitask full-convolution watermark segmentation network is specifically as follows:
the backbone network comprises 7 convolution groups, and each convolution group comprises a plurality of convolution operations;
the first 5 convolution groups each comprise 2 or 3 convolution layers with kernel size 3; the numbers of convolution kernels in the five groups are 64, 128, 256, 512 and 512 respectively, and each convolution layer is followed by a BatchNormalization layer and a ReLU activation layer; each convolution group is followed by a max-pooling layer with stride 2;
the 6th and 7th convolution groups each comprise 1 convolution layer followed by a ReLU activation layer and a 50% Dropout layer; their kernel sizes are 7 and 1 respectively, and each has 4096 convolution kernels.
The multitask full-convolution watermark segmentation network further comprises two output branches, the two output branches are the same in structure, and the specific structure is as follows:
the backbone network is followed by a convolution layer with kernel size 1 and 2 kernels and an upsampling layer with factor 2; this output is called x1;
the 4th max-pooling layer is followed by a convolution layer with kernel size 1 and 2 kernels and a Crop layer; this output is called x2;
x1 and x2 are added element-wise and the sum is passed through an upsampling layer with factor 2; the 3rd max-pooling layer is followed by a convolution layer with kernel size 1 and 2 kernels and a Crop layer, whose output is called x3;
the upsampled sum and x3 are added element-wise, followed by an upsampling layer with factor 8 and a Crop layer.
In this embodiment, as shown in table 1 below, the structure of the multitask full-convolution watermark segmentation network is specifically:
| Layer | k | s | p | channels | N | AF | input |
|---|---|---|---|---|---|---|---|
| con1_1 | 3 | 1 | 1 | 3/64 | BN | ReLU | I_w |
| con1_2 | 3 | 1 | 1 | 64/64 | BN | ReLU | con1_1 |
| MaxPool1 | 2 | 2 | | | | | con1_2 |
| con2_1 | 3 | 1 | 1 | 64/128 | BN | ReLU | MaxPool1 |
| con2_2 | 3 | 1 | 1 | 128/128 | BN | ReLU | con2_1 |
| MaxPool2 | 2 | 2 | | | | | con2_2 |
| con3_1 | 3 | 1 | 1 | 128/256 | BN | ReLU | MaxPool2 |
| con3_2 | 3 | 1 | 1 | 256/256 | BN | ReLU | con3_1 |
| con3_3 | 3 | 1 | 1 | 256/256 | BN | ReLU | con3_2 |
| MaxPool3 | 2 | 2 | | | | | con3_3 |
| con4_1 | 3 | 1 | 1 | 256/512 | BN | ReLU | MaxPool3 |
| con4_2 | 3 | 1 | 1 | 512/512 | BN | ReLU | con4_1 |
| con4_3 | 3 | 1 | 1 | 512/512 | BN | ReLU | con4_2 |
| MaxPool4 | 2 | 2 | | | | | con4_3 |
| con5_1 | 3 | 1 | 1 | 512/512 | BN | ReLU | MaxPool4 |
| con5_2 | 3 | 1 | 1 | 512/512 | BN | ReLU | con5_1 |
| con5_3 | 3 | 1 | 1 | 512/512 | BN | ReLU | con5_2 |
| MaxPool5 | 2 | 2 | | | | | con5_3 |
| con6 | 7 | 1 | 1 | 512/4096 | | ReLU | MaxPool5 |
| Dropout1 | | | | | | | con6 |
| con7 | 1 | 1 | 1 | 4096/4096 | | ReLU | Dropout1 |
| Dropout2 | | | | | | | con7 |
| S con8 | 1 | 1 | 0 | 4096/3 | | | Dropout2 |
| S conT8 | 4 | 2 | 0 | 3/3 | | | S con8 |
| S con9 | 1 | 1 | 0 | 512/3 | | | MaxPool4 |
| S conT9 | 4 | 2 | 0 | 3/3 | | | S conT8+S con9 |
| S con10 | 1 | 1 | 0 | 256/3 | | | MaxPool3 |
| S conT10 | 16 | 8 | 0 | 3/3 | | | S conT9+S con10 |
| E con8 | 1 | 1 | 0 | 4096/3 | | | Dropout2 |
| E conT8 | 4 | 2 | 0 | 3/3 | | | E con8 |
| E con9 | 1 | 1 | 0 | 512/3 | | | MaxPool4 |
| E conT9 | 4 | 2 | 0 | 3/3 | | | E conT8+E con9 |
| E con10 | 1 | 1 | 0 | 256/3 | | | MaxPool3 |
| E conT10 | 16 | 8 | 0 | 3/3 | | | E conT9+E con10 |
TABLE 1
Where Layer denotes a network layer, k the convolution or pooling kernel size, s the convolution or pooling stride, p the padding, channels the input/output channel numbers of the layer, N the normalization operation used, AF the activation function used, and input the input of the current network layer;
con denotes a convolution layer and conT a deconvolution layer, with the trailing digits numbering the layers; BN denotes the batch normalization operation "BatchNorm"; ReLU denotes the ReLU activation function; I_w denotes the watermark image; MaxPool denotes a max-pooling operation, with trailing digits as labels; Dropout denotes a random dropout operation; "+" denotes a pixel-level stitching operation; S denotes the surface-mask output branch of the multitask full convolution watermark segmentation network, and E its boundary-mask output branch.
The watermark removing network based on partial convolution has the following structure:
the partial-convolution-based watermark removal network comprises 7 downsampling layers and 7 corresponding upsampling layers, with skip connections between the downsampling and upsampling layers for feature transfer;
the skip connections are specifically arranged as follows:
skip connections exist between upsampling layer 6 and downsampling layer 1, upsampling layer 5 and downsampling layer 2, upsampling layer 4 and downsampling layer 3, upsampling layer 3 and downsampling layer 4, upsampling layer 2 and downsampling layer 5, and upsampling layer 1 and downsampling layer 6.
Except for the first downsampling layer, which omits BatchNormalization, each downsampling layer comprises a partial convolution layer, a BatchNormalization layer and a ReLU activation layer; the kernel sizes are 7, 5, 3 and 3, the numbers of kernels are 64, 128, 256, 512 and 512, all convolution strides are 1, and the paddings are 3, 2, 1 and 1 respectively.
Except for the last upsampling layer, each upsampling layer comprises a partial convolution layer, a BatchNormalization layer and a LeakyReLU activation layer, with each partial convolution preceded by an upsampling operation with factor 2; the kernel sizes of the upsampling layers are all 3, the numbers of kernels are 512, 256, 128, 64 and 3 respectively, and both stride and padding are 1.
The processing of the partial-convolution-based watermark removal network comprises a downsampling encoding stage and an upsampling decoding stage, each using 7 convolution layers.
In this embodiment, as shown in table 2 below, the structure of the watermark removal network based on partial convolution is specifically:
TABLE 2
Where k denotes the convolution kernel size, BN the batch normalization operation "BatchNorm", ReLU the ReLU activation function, LkReLU the LeakyReLU activation function, and Tanh the Tanh activation function; I_w denotes the watermark image and I_m the pixel-level binary watermark pattern mask of the watermark image; "+" denotes a pixel-level stitching operation; en denotes the downsampling encoding stage and de the upsampling decoding stage of the partial-convolution-based watermark removal network.
The structure of the double-input branch watermark removing network based on the common convolution specifically comprises the following steps:
the double-input branch watermark removal network based on common convolution comprises two input branches of identical structure, each consisting of 7 downsampling layers; except for the first downsampling layer, which omits BatchNormalization, each downsampling layer comprises a convolution layer, a BatchNormalization layer and a LeakyReLU activation layer; the numbers of convolution kernels are 64, 128, 256, 512 and 512.
The output structure of the double-input branch watermark removal network based on common convolution consists of 1 convolution module and 9 upsampling layers. The convolution module comprises a convolution layer with 1024 kernels and a ReLU activation layer. Except for the first and last upsampling layers, each upsampling layer comprises a deconvolution (transposed convolution) layer, a BatchNormalization layer and a ReLU activation layer; the first upsampling layer has no normalization layer, and the last has neither a normalization nor an activation layer.
Except for the last upsampling layer, the kernel sizes in the common-convolution-based double-input branch watermark removal network are all 4, the strides all 2 and the paddings all 1; the last upsampling layer has kernel size 3, stride 1 and padding 1.
In this embodiment, as shown in table 3 below, the structure of the two-input branch watermark removal network based on the ordinary convolution is specifically:
TABLE 3
Where I_w denotes the cropped region of the watermark image, and the second input is the preliminarily watermark-removed region generated by the partial-convolution-based watermark removal network; "+" denotes a pixel-level stitching operation; en denotes the downsampling encoding stage of the common-convolution-based double-input branch watermark removal network, with the digit after en labelling the input branch (en1 the first input branch, en2 the second); de denotes the upsampling decoding stage of this network.
And S2, inputting the watermark image to be processed, and processing the watermark image through a multitask full convolution watermark segmentation network, a partial convolution-based watermark removal network and a common convolution-based double-input branch watermark removal network to obtain a final watermark-free image.
As shown in fig. 1, specifically:
s21, multitask full convolution watermark segmentation network processing, including the following steps:
S211, selecting a watermark image I_w to be processed;
S212, inputting the watermark image I_w into the multitask full convolution watermark segmentation network to obtain a pixel-level binary watermark pattern mask I_m of the watermark image, specifically:
S2121, the multitask full convolution watermark segmentation network outputs a surface mask I_m,s of the watermark pattern from the input watermark image I_w;
S2122, the network outputs a boundary mask I_m,e of the watermark pattern from the input watermark image I_w;
S2123, the network performs hole filling on the boundary mask, filling the boundary mask I_m,e into a surface mask I'_m,s;
S2124, the network performs threshold processing on the two surface masks I_m,s and I'_m,s to finally generate the pixel-level binary watermark pattern mask I_m of the watermark image.
S22, partial convolution-based watermark removal network processing, comprising the following steps:
S221, selecting the watermark image I_w to be processed and the corresponding pixel-level binary watermark pattern mask I_m;
S222, inputting the watermark image I_w and the mask I_m into the partial-convolution-based watermark removal network to obtain a preliminarily watermark-removed image.
S23, processing by the double-input branch watermark removal network based on common convolution, as shown in fig. 2, comprises the following steps:
S231, inputting the preliminarily dewatermarked image into one input branch of the double-input branch watermark removal network based on common convolution, and inputting the watermark image I_w into the other input branch, to obtain the final dewatermarked image;
wherein this final output is the watermark-free image recovered by the visible watermark detection and erasure network.
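A common convention for producing a final dewatermarked result is to composite the network prediction back into the original image so that only pixels inside the watermark mask are changed. This is an assumption about typical post-processing, not a step stated in the patent:

```python
def composite(pred, original, mask):
    """Blend the network prediction into the original image.

    out = pred * mask + original * (1 - mask): original pixels are kept
    outside the binary watermark mask, and the network output is used
    inside it.  (An illustrative convention, not the patent's method.)
    """
    return [[p * m + o * (1 - m) for p, o, m in zip(prow, orow, mrow)]
            for prow, orow, mrow in zip(pred, original, mask)]
```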
It should also be noted that, in this specification, terms such as "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A visible watermark detection and erasure method based on double-input convolution fusion is characterized by comprising the following steps:
S1, constructing a visible watermark detection and erasure network, which comprises a multitask full convolution watermark segmentation network, a watermark removal network based on partial convolution and a double-input branch watermark removal network based on common convolution;
and S2, inputting the watermark image to be processed, and processing the watermark image through a multitask full convolution watermark segmentation network, a partial convolution-based watermark removal network and a common convolution-based double-input branch watermark removal network to obtain a final watermark-free image.
2. The method for detecting and erasing the visible watermark based on the double-input convolution fusion as claimed in claim 1, wherein the structure of the multitask full-convolution watermark segmentation network is specifically as follows:
the backbone network comprises 7 convolution groups, and each convolution group comprises a plurality of convolution operations;
the first 5 convolution groups contain 2, 2, 3, 3 and 3 convolution layers respectively; each convolution layer has kernel size 3, and the numbers of convolution kernels in the 5 groups are 64, 128, 256, 512 and 512 respectively; each convolution layer is followed by a BatchNormalization layer and a ReLU activation layer, and each convolution group is followed by a max-pooling layer with stride 2;
the 6th and 7th convolution groups each contain 1 convolution layer, followed by a ReLU activation layer and a 50% Dropout layer; their kernel sizes are 7 and 1 respectively, and each has 4096 convolution kernels.
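Assuming the 3x3 convolutions are stride-1 and 'same'-padded (the claim does not spell out the padding), only the five stride-2 max-poolings change the spatial size, giving an overall downsampling factor of 32. A quick trace under that assumption:

```python
def backbone_sizes(h, num_groups=5, pool_stride=2):
    """Trace the spatial size through the first 5 convolution groups.

    Each group's convolutions are assumed size-preserving, so only the
    stride-2 max-pooling after each group halves the resolution.
    """
    sizes = [h]
    for _ in range(num_groups):
        h //= pool_stride
        sizes.append(h)
    return sizes
```

For a 224x224 input this yields 224 → 112 → 56 → 28 → 14 → 7, the familiar VGG-style progression.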
3. The method for detecting and erasing the visible watermark based on the double-input convolution fusion as claimed in claim 2, wherein the multitask full-convolution watermark segmentation network further comprises two output branches, and the two output branches have the same structure, and the specific structure is as follows:
after the backbone network, a convolution layer with kernel size 1 and 2 convolution kernels is attached, followed by an upsampling layer with upsampling factor 2; this layer is called x1;
after the 4th max-pooling layer, a convolution layer with kernel size 1 and 2 convolution kernels and a Crop layer are attached; this layer is called x2;
x1 and x2 are added element-wise, followed by an upsampling layer with upsampling factor 2; after the 3rd max-pooling layer, a convolution layer with kernel size 1 and 2 convolution kernels and a Crop layer are attached; this layer is called x3;
x2 and x3 are added element-wise, followed by an upsampling layer with upsampling factor 8 and a Crop layer.
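A sanity check on the branch's resolution arithmetic: the backbone's five stride-2 poolings downsample by 32, and the branch's upsampling factors 2, 2 and 8 multiply back to 32, as in FCN-8s. The Crop layers, which trim padding effects, are ignored in this sketch:

```python
def fcn_output_size(input_size, pool_strides=(2, 2, 2, 2, 2),
                    up_factors=(2, 2, 8)):
    """Verify that the output branch restores the input resolution.

    Downsample by the product of the pooling strides (2^5 = 32), then
    upsample by 2 * 2 * 8 = 32; Crop-layer trimming is not modeled.
    """
    size = input_size
    for s in pool_strides:
        size //= s
    for f in up_factors:
        size *= f
    return size
```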
4. The method for detecting and erasing the visible watermark based on the fusion of the double-input convolution as claimed in claim 1, wherein the structure of the watermark removing network based on the partial convolution is specifically as follows:
the watermark removal network based on partial convolution comprises 7 down-sampling layers and 7 corresponding up-sampling layers, with skip connections between the down-sampling and up-sampling layers for feature transfer;
except for the first down-sampling layer, each down-sampling layer comprises a partial convolution layer, a BatchNormalization layer and a ReLU activation layer; the first down-sampling layer omits the BatchNormalization layer; the kernel sizes are 7, 5, 3 and 3 respectively, the numbers of convolution kernels are 64, 128, 256, 512 and 512 respectively, the convolution stride is 1 for each layer, and the paddings are 3, 2, 1 and 1 respectively;
except for the last up-sampling layer, each up-sampling layer comprises a partial convolution layer, a BatchNormalization layer and a LeakyReLU activation layer, the partial convolution being preceded by an upsampling operation with factor 2; the kernel sizes of the up-sampling layers are all 3, the numbers of convolution kernels are 512, 256, 128, 64 and 3 respectively, and the stride and padding are both 1;
the processing of the watermark removal network based on partial convolution comprises a down-sampling encoding stage and an up-sampling decoding stage, each using 7 convolution layers.
5. The method for detecting and erasing the visible watermark based on the double-input convolution fusion as claimed in claim 4, wherein the jump connection between the down-sampling layer and the up-sampling layer is specifically:
the up-sampling layer 6 and the down-sampling layer 1, the up-sampling layer 5 and the down-sampling layer 2, the up-sampling layer 4 and the down-sampling layer 3, the up-sampling layer 3 and the down-sampling layer 4, the up-sampling layer 2 and the down-sampling layer 5, and the up-sampling layer 1 and the down-sampling layer 6 have jump connection.
6. The method for detecting and erasing the visible watermark based on the fusion of the double-input convolution as claimed in claim 1, wherein the structure of the double-input branch watermark removing network based on the common convolution is specifically as follows:
the double-input branch watermark removal network based on common convolution comprises two input branches with identical structure; each input branch consists of 7 down-sampling layers; except for the first down-sampling layer, each down-sampling layer comprises a convolution layer, a BatchNormalization layer and a LeakyReLU activation layer (the first down-sampling layer omits the BatchNormalization layer), and the numbers of convolution kernels are 64, 128, 256, 512 and 512.
7. The method for detecting and erasing the visible watermark based on double-input convolution fusion as claimed in claim 6, wherein the output structure of the double-input branch watermark removal network based on common convolution consists of 1 convolution module and 9 up-sampling layers; the convolution module comprises a convolution layer with 1024 convolution kernels and a ReLU activation layer; except for the first and last up-sampling layers, each up-sampling layer comprises a deconvolution layer, a BatchNormalization layer and a ReLU activation layer, the first up-sampling layer having no normalization layer and the last up-sampling layer having neither a normalization layer nor an activation layer;
except for the last up-sampling layer, the convolution kernels in the double-input branch watermark removal network based on common convolution all have size 4, stride 2 and padding 1; the last up-sampling layer has kernel size 3, stride 1 and padding 1.
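The sizes in claim 7 follow the standard transposed-convolution size formula out = (in - 1) * stride - 2 * padding + kernel, so each kernel-4/stride-2/padding-1 layer doubles the resolution, while the final kernel-3/stride-1/padding-1 layer preserves it:

```python
def deconv_out(size, kernel, stride, padding):
    """Spatial output size of a transposed (de)convolution layer:
    out = (in - 1) * stride - 2 * padding + kernel."""
    return (size - 1) * stride - 2 * padding + kernel
```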
8. The method for detecting and erasing the visible watermark based on the double-input convolution fusion as claimed in claim 1, wherein the input watermark image is processed by the following processing steps in a multitask full convolution watermark segmentation network:
selecting a watermark image I_w to be processed;
inputting the watermark image I_w into the multitask full convolution watermark segmentation network to obtain a pixel-level binary watermark pattern mask I_m of the watermark image area, specifically comprising the following steps:
the multitask full convolution watermark segmentation network outputs a surface mask I_m,s of the watermark pattern from the input watermark image I_w;
the multitask full convolution watermark segmentation network outputs a boundary mask I_m,e of the watermark pattern from the input watermark image I_w;
the multitask full convolution watermark segmentation network performs hole filling on the boundary mask, filling the boundary mask I_m,e into a second surface mask I'_m,s;
the multitask full convolution watermark segmentation network performs thresholded intersection processing on the two surface masks I_m,s and I'_m,s, finally generating the pixel-level binary watermark pattern mask I_m of the watermark image.
9. The method for detecting and erasing the visible watermark based on double-input convolution fusion as claimed in claim 8, wherein the output of the multitask full convolution watermark segmentation network is processed in the watermark removal network based on partial convolution by the following steps:
selecting the watermark image I_w to be processed and the corresponding pixel-level binary watermark pattern mask I_m;
10. The method for detecting and erasing the visible watermark based on double-input convolution fusion as claimed in claim 9, wherein the output of the watermark removal network based on partial convolution is processed in the double-input branch watermark removal network based on common convolution by the following steps:
inputting the preliminarily dewatermarked image into one input branch of the double-input branch watermark removal network based on common convolution, and inputting the watermark image I_w into the other input branch, to obtain the final dewatermarked image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110569103.XA CN113436050A (en) | 2021-05-25 | 2021-05-25 | Visible watermark detection and erasure method based on double-input convolution fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113436050A true CN113436050A (en) | 2021-09-24 |
Family
ID=77802860
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110569103.XA Pending CN113436050A (en) | 2021-05-25 | 2021-05-25 | Visible watermark detection and erasure method based on double-input convolution fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113436050A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111932431A (en) * | 2020-07-07 | 2020-11-13 | 华中科技大学 | Visible watermark removing method based on watermark decomposition model and electronic equipment |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111932431A (en) * | 2020-07-07 | 2020-11-13 | 华中科技大学 | Visible watermark removing method based on watermark decomposition model and electronic equipment |
Non-Patent Citations (4)
Title |
---|
GUILIN LIU et al.: "Image Inpainting for Irregular Holes Using Partial Convolutions", COMPUTER VISION – ECCV 2018, 6 October 2018 (2018-10-06), pages 1 - 16 *
RONALD SALLOUM et al.: "Image Splicing Localization using a Multi-task Fully Convolutional Network", JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, vol. 51, 28 February 2018 (2018-02-28), pages 201 - 209, XP085348482, DOI: 10.1016/j.jvcir.2018.01.010 *
SOO-CHANG PEI et al.: "A Novel Image Recovery Algorithm for Visible Watermarked Images", IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, vol. 1, no. 4, 31 December 2006 (2006-12-31), pages 543 - 550, XP011150458, DOI: 10.1109/TIFS.2006.885034 *
YANG LIU et al.: "WDNet: Watermark-Decomposition Network for Visible Watermark Removal", ARXIV:2012.07616V2, 15 December 2020 (2020-12-15), pages 1 - 9 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110059768B (en) | Semantic segmentation method and system for fusion point and region feature for street view understanding | |
CN111223034B (en) | High-capacity anti-printing/shooting blind watermarking system and method based on deep learning | |
CN108055493B (en) | Method and device for embedding watermark in video image | |
CN110599387A (en) | Method and device for automatically removing image watermark | |
CN108564549A (en) | A kind of image defogging method based on multiple dimensioned dense connection network | |
CN111768340B (en) | Super-resolution image reconstruction method and system based on dense multipath network | |
CN113870283B (en) | Portrait matting method, device, computer equipment and readable storage medium | |
CN111680690A (en) | Character recognition method and device | |
CN112906794A (en) | Target detection method, device, storage medium and terminal | |
CN111681155A (en) | GIF dynamic image watermarking method based on deep learning | |
CN113807334A (en) | Residual error network-based multi-scale feature fusion crowd density estimation method | |
CN109949234A (en) | Video restoration model training method and video restoration method based on depth network | |
CN100498834C (en) | Digital water mark embedding and extracting method and device | |
WO2023019682A1 (en) | Watermark removal method and apparatus, terminal device and readable storage medium | |
Niu et al. | Deep robust image deblurring via blur distilling and information comparison in latent space | |
Peng et al. | Efficient image resolution enhancement using edge-directed unsharp masking sharpening for real-time ASIC applications | |
CN113436050A (en) | Visible watermark detection and erasure method based on double-input convolution fusion | |
CN113628129A (en) | Method for removing shadow of single image by edge attention based on semi-supervised learning | |
Park et al. | Unpaired screen-shot image demoiréing with cyclic moiré learning | |
Lim et al. | LAU-Net: A low light image enhancer with attention and resizing mechanisms | |
CN113409181A (en) | Watermark processing method and device for watermark carrier | |
Tellache et al. | Thinning algorithms for Arabic OCR | |
CN114493970A (en) | Image robust watermarking method, system and medium for image instance level local area | |
Gou et al. | Improving Embedding Payload in Binary Imageswith" Super-Pixels" | |
CN111462006B (en) | Multi-target image complement method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||