CN115170378A - Video digital watermark embedding and extracting method and system based on deep learning


Info

Publication number
CN115170378A
CN115170378A
Authority
CN
China
Prior art keywords
watermark
video
extracting
img
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210661160.5A
Other languages
Chinese (zh)
Inventor
王晗
张志伟
胡海
崔凯元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Forestry University
Original Assignee
Beijing Forestry University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Forestry University
Priority to CN202210661160.5A
Publication of CN115170378A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 - General purpose image data processing
    • G06T 1/0021 - Image watermarking
    • G06T 1/005 - Robust watermarking, e.g. average attack or collusion attack resistant
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/10 - Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F 21/16 - Program or content traceability, e.g. by watermarking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/30 - Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 - Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/30 - Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/37 - Determination of transform parameters for the alignment of images, i.e. image registration using transform domain methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/40 - Scenes; Scene-specific elements in video content
    • G06V 20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/40 - Scenes; Scene-specific elements in video content
    • G06V 20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10024 - Color image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20048 - Transform domain processing
    • G06T 2207/20064 - Wavelet transform [DWT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20084 - Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

The invention relates to a video digital watermark embedding and extracting method and system based on deep learning. The method comprises the following steps: S1: constructing a training set of images using public videos; S2: inputting the training set into a video digital watermark embedding and extracting network for training to obtain a trained model, wherein the video digital watermark embedding and extracting network comprises a watermark embedding network, an image transformation module and a watermark extraction network; S3: extracting key frames of the video to be watermarked, inputting the key frames together with the watermark into the trained video digital watermark embedding network, outputting the watermarked key frames, and putting them back into the video; S4: extracting frames to be detected from the video containing the digital watermark, correcting them, inputting them into the trained video digital watermark extraction network, and extracting the watermark. The invention provides a traceable video digital watermark embedding and extracting method with strong robustness, contributing to digital video leakage tracing and video intellectual property protection in the new media environment.

Description

Video digital watermark embedding and extracting method and system based on deep learning
Technical Field
The invention relates to the field of digital watermarks, in particular to a video digital watermark embedding and extracting method and system based on deep learning.
Background
With the development of computer and network technologies, multimedia products are becoming digital, and digital audio-visual products have entered people's lives. Although digitization makes multimedia information easier to edit, produce, store and transmit, and improves the quality of audio-visual products, it also brings new copyright problems. For example, unlimited copying of a highly valued work without the consent of its owner results in considerable economic loss to the producer and content provider. Moreover, because digital content is so easy to manipulate, video information is extremely easy to tamper with, which seriously threatens the integrity of the original work. Information with special significance, such as information related to judicial litigation or government agencies, is particularly subject to malicious attack and falsification. The negative effects of these characteristics of digital technology have become a major obstacle to the healthy and sustained development of the information industry.
Therefore, copyright protection for digital products is becoming increasingly important. It is often assumed that copyright protection can be achieved by encryption: a multimedia data file is first encrypted into a ciphertext and then distributed, so that an attacker intercepting it during network transmission cannot obtain confidential information from the ciphertext, achieving the goals of copyright protection and information security. On the one hand, however, the unintelligibility of the encrypted file hinders the dissemination of the multimedia information; on the other hand, encrypted multimedia information easily attracts the curiosity and attention of attackers and may be cracked, and once the encrypted file is cracked its content becomes completely transparent. Cryptography has long been regarded as the primary means of information security in communications research and applications, and this has only begun to change in recent years. Most existing copyright protection systems adopt cryptographic authentication techniques (such as the security password of a DVD optical disc), but encryption alone cannot completely solve the copyright protection problem: it only protects the data during transmission from sender to receiver. Once the information is received and decrypted, the document is the same as any ordinary document and is no longer protected, so encryption cannot prevent piracy. How to protect the copyright of digital products and maintain data security has therefore become an urgent problem to be solved.
Disclosure of Invention
In order to solve the technical problem, the invention provides a video digital watermark embedding and extracting method and system based on deep learning.
The technical solution of the invention is as follows: a video digital watermark embedding and extracting method based on deep learning comprises the following steps:
step S1: extracting a preset number of video frames from the public video and cutting to obtain an input image; generating a random binary string as watermark information data, and constructing a training set by the input image and the watermark information data;
step S2: inputting the images and the watermark information data in the training set together into a video digital watermark embedding and extracting network for training, to obtain a trained video digital watermark embedding and extracting network; wherein the video digital watermark embedding and extracting network comprises: a video digital watermark embedding network, used for embedding the watermark W into the input image Img to obtain the watermarked image Img_encoded; an image transformation module, used for applying attack transformations to Img_encoded to obtain Img'_encoded; and a video digital watermark extraction network, used for extracting the watermark W' from Img'_encoded; loss functions are constructed between Img_encoded and Img and between W' and W, and the network parameters are updated until the trained video digital watermark embedding and extracting network is obtained;
step S3: extracting key frames of the video to be watermarked, inputting the key frames together with the watermark containing user information into the trained video digital watermark embedding network, outputting the watermarked key frames, and putting the watermarked key frames back into the video to be watermarked;
step S4: extracting frames to be detected from the watermarked video, inputting them into the trained video digital watermark extraction network, extracting the watermark and obtaining the user information in the watermark.
Compared with the prior art, the invention has the following advantages:
the invention discloses a video digital watermark embedding and extracting method based on deep learning, which is realized by a video digital watermark algorithm based on DWT (Discrete Wavelet Transform) domain and deep learning and an image registration method based on SIFT (scale invariant feature Transform) characteristics.
Drawings
Fig. 1 is a flowchart of a video digital watermark embedding and extracting method based on deep learning in an embodiment of the present invention;
FIG. 2 is a schematic diagram of a video digital watermark embedding network according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a network structure for extracting video digital watermarks in an embodiment of the present invention;
fig. 4 is a block diagram of a video digital watermark embedding and extracting system based on deep learning according to an embodiment of the present invention.
Detailed Description
The invention provides a video digital watermark embedding and extracting method based on deep learning, which is used for embedding and extracting watermarks of traceable sources for videos and making contributions to digital video leakage tracing and video intellectual property protection in a new media environment.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings.
For a better understanding of the embodiments of the present invention, the Discrete Wavelet Transform (DWT) used in the embodiments of the present invention is explained:
discrete Wavelet Transform (DWT) can represent information of an image in a frequency domain and a time domain at the same time, and is a convenient image processing mode. The wavelet transform is a breakthrough to the fourier transform and the short-time fourier transform, and the change is that finite length wavelet which can be attenuated is used to replace infinite length trigonometric function base, and in numerical analysis and functional analysis, the discrete wavelet transform is any wavelet transform which performs discrete sampling on the wavelet. One key advantage of discrete wavelet transforms over fourier transforms, as with other wavelet transforms, is the ability to time resolve, i.e., the discrete wavelet transform captures frequency and location information based on time. In computer vision systems, discretization of the input variables is required for ease of storage and computation, which requires discretization of the wavelet, i.e., DWT. The DWT is obtained by discretizing the scale factor and the shifting factor in the wavelet transform.
The DWT is defined by the following transform formulas:

W_φ(j_0, k) = (1/√M) · Σ_n f(n) · φ_{j_0,k}(n)

W_ψ(j, k) = (1/√M) · Σ_n f(n) · ψ_{j,k}(n),  j ≥ j_0

where W_φ are the approximation coefficients and W_ψ are the wavelet detail coefficients; φ(t) is the scale function, ψ(t) is the wavelet function, and the summation range is n = 0, 1, 2, ..., M - 1. The scale and wavelet bases are

φ_{j,k}(t) = 2^{j/2} · φ(2^j · t - k)

ψ_{j,k}(t) = 2^{j/2} · ψ(2^j · t - k),  j, k ≥ 0

where j represents the scale (dilation) of the wavelet function in the frequency domain and k represents the translation of the function in the time domain.
A two-dimensional image subjected to the DWT is decomposed into four different components: the horizontal component LH (low frequency, high frequency), the vertical component HL (high frequency, low frequency), the diagonal component HH (high frequency, high frequency) and the low-frequency component LL (low frequency, low frequency), with the low-frequency component concentrated in the upper-left corner; the DWT can be applied to these components again in a recursive manner. The image information contained in the low-frequency component is the most important, and the quality of the image suffers greatly if it is disturbed too much; the high-frequency components carry the details of the image, and image quality is not affected much if they are discarded during image compression.
The inverse transform (IDWT) is given by:

f(n) = (1/√M) · Σ_k W_φ(j_0, k) · φ_{j_0,k}(n) + (1/√M) · Σ_{j=j_0}^{∞} Σ_k W_ψ(j, k) · ψ_{j,k}(n)

where j_0 is usually set to 0, M is a power of 2, and the summation range is k = 0, 1, 2, ..., 2^j - 1.
Analysis of DWT-transformed images shows that when an image is attacked, the low-frequency region is generally not affected much and retains almost complete information, so DWT-based watermarks usually write their content into the low-frequency region to withstand possible attacks. The drawback is that the quality of the image may be affected considerably.
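For illustration only, a minimal sketch of this decomposition using the PyWavelets library follows; the Haar wavelet and the 8 × 8 block size are assumptions, not requirements of the patent:

```python
# Illustrative only: single-level 2-D DWT and its inverse with PyWavelets.
import numpy as np
import pywt

block = np.random.rand(8, 8)                    # stand-in for an 8x8 Y-channel block

# LL is the low-frequency (approximation) band; LH, HL, HH are the detail bands.
# For an 8x8 input each band is 4x4, matching the sub-block sizes used below.
LL, (LH, HL, HH) = pywt.dwt2(block, 'haar')

# Perfect reconstruction via the inverse DWT (IDWT).
reconstructed = pywt.idwt2((LL, (LH, HL, HH)), 'haar')
assert np.allclose(block, reconstructed)
```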
Example one
As shown in fig. 1, a video digital watermark embedding and extracting method based on deep learning according to an embodiment of the present invention includes the following steps:
step S1: extracting a preset number of video frames from the public video and cutting to obtain an input image; generating a random binary string as watermark information data, and constructing a training set by the input image and the watermark information data;
step S2: inputting the images and the watermark information data in the training set together into a video digital watermark embedding and extracting network for training, to obtain a trained video digital watermark embedding and extracting network. The video digital watermark embedding and extracting network comprises: a video digital watermark embedding network, used for embedding the watermark W into the input image Img to obtain the watermarked image Img_encoded; an image transformation module, used for applying attack transformations to Img_encoded to obtain Img'_encoded; and a video digital watermark extraction network, used for extracting the watermark W' from Img'_encoded. Loss functions are constructed between Img_encoded and Img and between W' and W, and the network parameters are updated until the trained video digital watermark embedding and extracting network is obtained.
step S3: extracting key frames of the video to be watermarked, inputting the key frames together with the watermark containing user information into the trained video digital watermark embedding network, outputting the watermarked key frames, and putting the watermarked key frames back into the video to be watermarked.
step S4: extracting frames to be detected from the watermarked video, inputting them into the trained video digital watermark extraction network, extracting the watermark and obtaining the user information in the watermark.
In one embodiment, step S1, extracting a preset number of video frames from public videos and cropping them to obtain input images and construct the training set, specifically comprises:
extracting a certain number of video frames from public videos as key frames, cropping the key frames to a preset size and normalizing them to obtain the preprocessed key frames, i.e., the input images; meanwhile, generating random binary strings containing user information as watermark information data, and constructing the training set from the input images and the watermark information data.
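A minimal sketch of constructing one training sample is given below; the 128 × 128 crop size and the watermark length of 32 bits are assumptions (the patent leaves both as presets), and make_sample is an illustrative helper name:

```python
# Illustrative construction of one (input image, watermark) training pair.
import numpy as np

def make_sample(frame_bgr, crop=128, n_bits=32, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    h, w = frame_bgr.shape[:2]
    y = int(rng.integers(0, h - crop + 1))
    x = int(rng.integers(0, w - crop + 1))
    img = frame_bgr[y:y + crop, x:x + crop].astype(np.float32) / 255.0  # cropped, normalised frame
    bits = rng.integers(0, 2, size=n_bits).astype(np.float32)           # random binary watermark W
    return img, bits
```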
In one embodiment, the video digital watermark embedding network in step S2 embeds the watermark W into the input image Img to obtain the watermarked image Img_encoded, which specifically comprises the following steps:
step S201: acquiring the watermark information data and converting it into a watermark array W ∈ {0,1}^N, where N is the length of W; if the watermark array is shorter than the preset length N, it is padded with zeros; the first bit of W is a flag bit used to identify whether the extracted watermark is correct;
step S202: acquiring an image Img from the training set, selecting a region of preset size from the center of Img, converting the region from the RGB color space to the YCbCr color space, and extracting the Y-component matrix H of the region, whose size is h × w; partitioning the component matrix H into a set of 8 × 8 sub-blocks B_y(i), i = 1, ..., N, N = (h × w)/(8 × 8); performing the DWT on B_y(i) to obtain the transformed sub-block set B_dwt(i);
As shown in fig. 2, W(i) denotes the i-th binary watermark bit in the watermark array W; after expansion, W(i) becomes a 1 × 4 × 4 data block. B_y(i) denotes a Y-channel data block of the YCbCr color space, whose size is 1 × 8 × 8; after the DWT, B_y(i) becomes a 4 × 4 × 4 data block;
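A minimal sketch of step S202 is shown below; the OpenCV colour conversion, the Haar wavelet and the 128 × 128 central region are assumptions (the patent does not fix them), and blocks_to_dwt is an illustrative helper name:

```python
# Illustrative sketch of step S202: centre crop, Y extraction, 8x8 blocking, DWT.
import cv2
import numpy as np
import pywt

def blocks_to_dwt(img_bgr, region=128):
    h0, w0 = img_bgr.shape[:2]
    y0, x0 = (h0 - region) // 2, (w0 - region) // 2
    crop = img_bgr[y0:y0 + region, x0:x0 + region]       # central region of preset size
    ycbcr = cv2.cvtColor(crop, cv2.COLOR_BGR2YCrCb)       # OpenCV stores channels as Y, Cr, Cb
    Y = ycbcr[:, :, 0].astype(np.float32)                 # Y-component matrix H (h x w)
    subblocks = []
    for r in range(0, region, 8):
        for c in range(0, region, 8):
            block = Y[r:r + 8, c:c + 8]                   # B_y(i), 1x8x8
            LL, (LH, HL, HH) = pywt.dwt2(block, 'haar')   # four 4x4 bands
            subblocks.append(np.stack([LL, LH, HL, HH]))  # B_dwt(i), 4x4x4
    return np.stack(subblocks)                            # N x 4 x 4 x 4
```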
step S203: the two-dimensional watermark array W and the transformed sub-block set B_dwt(i) are input into the video digital watermark embedding network, and after the operation of the watermark embedding convolution module, the watermarked transformed sub-block set B'_dwt(i) is output. As shown in fig. 2, the watermark embedding convolution module comprises 5 two-dimensional convolutional layers, with a batch normalization layer and a ReLU activation layer used between the two-dimensional convolutional layers; the output channel depths of the 5 two-dimensional convolutional layers are 16 and 4, respectively;
The 1 × 4 × 4 data block and the 4 × 4 × 4 data block obtained in step S202 are input into the video digital watermark embedding network together and spliced along the first dimension into a 5 × 4 × 4 data block, which then passes through the 5 two-dimensional convolutional layers to generate a 4 × 4 × 4 data block B'_dwt(i).
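A minimal PyTorch sketch of this embedding module follows; the 3 × 3 kernel size and the use of depth 16 for all intermediate layers are assumptions (the text only states the depths 16 and 4):

```python
# Minimal sketch of the watermark-embedding convolution module (assumed details).
import torch
import torch.nn as nn

class WatermarkEmbedder(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        chans = [5, hidden, hidden, hidden, hidden, 4]       # 5 conv layers in total
        layers = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            layers.append(nn.Conv2d(cin, cout, kernel_size=3, padding=1))
            if cout != 4:                                    # BN + ReLU between conv layers
                layers += [nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]
        self.net = nn.Sequential(*layers)

    def forward(self, b_dwt, w_bit):
        # b_dwt: (N, 4, 4, 4) transformed sub-blocks; w_bit: (N,) watermark bits
        w_plane = w_bit.view(-1, 1, 1, 1).expand(-1, 1, 4, 4)   # W(i) expanded to 1x4x4
        x = torch.cat([w_plane, b_dwt], dim=1)                  # spliced into 5x4x4
        return self.net(x)                                      # B'_dwt(i): 4x4x4
```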
Step S204: for the coded transform subblocksCollection B' dwt (i) Each sub-block in the color space is respectively subjected to inverse DWT conversion to obtain a Y 'component containing the watermark, and the Y' component is combined into a YCbCr color space and then converted into an RGB color space; obtaining an image Img containing a watermark encoded
Step S205: calculating Img according to equation (1) encoded And Loss value of Img Loss img
Loss img =LPIPS(Img,Img encoded ) (1)。
In one embodiment, the image transformation module in step S2, which applies transformation enhancement to Img_encoded to obtain Img'_encoded, specifically comprises:
inputting Img_encoded into the image transformation module, which adds random noise, Gaussian blur, JPEG image compression or brightness changes to Img_encoded, obtaining the transformation-enhanced watermarked image Img'_encoded.
In order to improve the robustness of the video digital watermark extraction network, attacks such as random noise, Gaussian blur, JPEG image compression or brightness changes are applied to the watermarked image during training. When the trained video digital watermark embedding and extracting network is used, the image transformation module is not needed.
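A minimal sketch of such an attack layer is given below; the noise level, blur kernel, JPEG quality and brightness range are assumptions, as the patent names the attack types but not their strengths:

```python
# Illustrative training-time attack layer (assumed parameter ranges).
import io
import random
import torch
from PIL import Image
from torchvision import transforms

def random_attack(img_pil):
    choice = random.choice(['noise', 'blur', 'jpeg', 'brightness'])
    if choice == 'noise':
        t = transforms.ToTensor()(img_pil)
        t = (t + 0.02 * torch.randn_like(t)).clamp(0, 1)       # additive random noise
        return transforms.ToPILImage()(t)
    if choice == 'blur':
        return transforms.GaussianBlur(kernel_size=5, sigma=1.0)(img_pil)
    if choice == 'jpeg':
        buf = io.BytesIO()
        img_pil.save(buf, format='JPEG', quality=50)           # JPEG re-compression
        buf.seek(0)
        return Image.open(buf).convert('RGB')
    return transforms.ColorJitter(brightness=0.3)(img_pil)     # brightness change
```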
In one embodiment, the video digital watermark extraction network in step S2, used for extracting the watermark W' from Img'_encoded, specifically comprises:
step S211: converting Img'_encoded from the RGB color space to the YCbCr color space and extracting its Y-component matrix H, whose size is h × w; partitioning the component matrix H into a set of 8 × 8 sub-blocks B_y(i), i = 1, ..., N, N = (h × w)/(8 × 8); performing the DWT on B_y(i) to obtain the transformed sub-block set B_dwt(i);
As shown in fig. 3, B_y(i) denotes a Y-channel data block of the YCbCr color space, whose size is 1 × 8 × 8; after the DWT, B_y(i) becomes a 4 × 4 × 4 data block B_dwt(i);
step S212: the transformed sub-block set B_dwt(i) is fed into the video digital watermark extraction network, and after the operation of the watermark extraction convolution module, the watermark array W' is output. As shown in fig. 3, the watermark extraction convolution module comprises 4 two-dimensional convolutional layers, 1 average pooling layer and 1 fully connected layer, with a batch normalization layer and a ReLU activation layer used between the 4 two-dimensional convolutional layers; the output channel depths of the 4 two-dimensional convolutional layers are 16, 16 and 1, respectively;
The 4 × 4 × 4 data block B_dwt(i) obtained in step S211 is input into the video digital watermark extraction network, where the 4 two-dimensional convolutions generate a 1 × 4 × 4 data block B'_dwt(i); W'(i), of size 1, is then generated by global average pooling (mean pooling) and a fully connected layer;
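A minimal PyTorch sketch of the extraction module follows; the 3 × 3 kernels, the use of depth 16 for all intermediate layers and the sigmoid on the output are assumptions not fixed by the text:

```python
# Minimal sketch of the watermark-extraction convolution module (assumed details).
import torch
import torch.nn as nn

class WatermarkExtractor(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        chans = [4, hidden, hidden, hidden, 1]           # 4 conv layers in total
        layers = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            layers.append(nn.Conv2d(cin, cout, kernel_size=3, padding=1))
            if cout != 1:                                # BN + ReLU between conv layers
                layers += [nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]
        self.net = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)              # global average pooling
        self.fc = nn.Linear(1, 1)                        # fully connected layer

    def forward(self, b_dwt):
        # b_dwt: (N, 4, 4, 4) sub-blocks from the possibly attacked frame
        x = self.net(b_dwt)                              # B'_dwt(i): (N, 1, 4, 4)
        x = self.pool(x).flatten(1)                      # (N, 1)
        return torch.sigmoid(self.fc(x)).squeeze(1)      # W'(i) in (0, 1)
```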
step S213: the mean square error loss Loss_msg between W' and W is calculated according to equation (2):
Loss_msg = (1/N) · Σ_{i=1}^{N} (W'(i) - W(i))^2    (2)
step S214: the total loss function shown in equation (3) is constructed, and the parameters of the video digital watermark embedding network and the video digital watermark extraction network are updated by back-propagation:
Loss_total = γ_img · Loss_img + γ_msg · Loss_msg    (3)
where γ_img and γ_msg are the weights of Loss_img and Loss_msg, respectively.
The parameters of the video digital watermark embedding network and the video digital watermark extraction network are continuously updated and optimized according to the total loss function until the trained video digital watermark embedding and extracting network is obtained.
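A minimal sketch of how the loss terms of equations (1)-(3) can be wired together in PyTorch; the lpips package (as the LPIPS distance of equation (1)) and the equal weights γ_img = γ_msg = 1 are assumptions:

```python
# Illustrative combination of the three loss terms during training.
import lpips
import torch.nn.functional as F

lpips_fn = lpips.LPIPS(net='alex')   # perceptual distance; expects inputs scaled to [-1, 1]
gamma_img, gamma_msg = 1.0, 1.0      # assumed loss weights

def total_loss(img, img_encoded, w, w_pred):
    loss_img = lpips_fn(img, img_encoded).mean()         # equation (1)
    loss_msg = F.mse_loss(w_pred, w)                     # equation (2)
    return gamma_img * loss_img + gamma_msg * loss_msg   # equation (3)

# During training, total_loss(...).backward() followed by an optimizer step
# updates the embedding and extraction networks jointly.
```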
In one embodiment, step S3, extracting key frames of the video to be watermarked, inputting the key frames and the watermark into the trained video digital watermark embedding network, outputting the watermarked key frames and putting them back into the video, specifically comprises:
step S301: according to the frame extraction rule, one frame is extracted from the video to be watermarked every predetermined number of frames as a key frame F_origin, and the watermark character string to be embedded is converted into a binary watermark array W;
step S302: F_origin and W are input into the trained video digital watermark embedding network to obtain the watermarked video frame F_water;
step S303: according to the frame extraction rule, the watermarked video frame F_water is put back into the video to be watermarked.
In one embodiment, step S4, extracting frames to be detected from the watermarked video, inputting them into the trained video digital watermark extraction network and extracting the watermark, specifically comprises:
step S401: acquiring the first 200 frames of the watermarked video as the frames F to be detected;
step S402: inputting F into the trained video digital watermark extraction network; when the flag bit of an extracted watermark sequence matches the embedded watermark flag bit, the watermark extraction operation ends and the process goes to step S406; if no matching watermark is extracted after all of F has been processed, the process goes to step S403 for deep video watermark extraction;
step S403: each video frame to be detected is compared with and corrected against all key frames F_origin;
step S404: image similarity is compared on the basis of SIFT features, matching is performed with the K-nearest-neighbour algorithm, and the frame to be detected is aligned with the correction frame through rotation and transformation using a homography matrix; the homography relation is shown in equation (4):
[x_1 y_1 1]^T = H · [x_2 y_2 1]^T,  H = [[h_00, h_01, h_02], [h_10, h_11, h_12], [h_20, h_21, h_22]]    (4)
where [x_1 y_1 1]^T and [x_2 y_2 1]^T are the homogeneous coordinates of the video frame to be detected and of the correction frame, respectively. With h_22 set to 1, the homography matrix has 8 unknown parameters; each pair of corresponding pixel points yields 2 equations (one for x and one for y), so four point pairs are needed to solve for the homography matrix H. Qualified pixel points are selected as inlier points by the random sample consensus (RANSAC) algorithm;
step S405: each video frame to be detected is compared with the correction frames F_origin; if the number of inlier points is less than or equal to 25% of all feature points, the current correction frame is skipped; if it is greater than 25%, the matched correction frame is stored temporarily. After the video frame to be detected has been compared with all correction frames, the frames before and after it are aligned with the correction frames in turn according to an inside-out matching strategy, and the watermark is extracted through steps S401 and S402; if no watermark is extracted, the process moves to the next video frame to be detected and the above steps are repeated until the watermark is extracted or all frames to be detected have been examined;
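A minimal OpenCV sketch of this SIFT-based correction is given below; the Lowe ratio of 0.75 and the RANSAC reprojection threshold of 5.0 are assumptions, the 25% inlier threshold follows the text (applied here to the good matches rather than all feature points), and align_to_key_frame is an illustrative helper name:

```python
# Illustrative SIFT + KNN matching + RANSAC homography correction (steps S404/S405).
import cv2
import numpy as np

def align_to_key_frame(frame, key_frame, inlier_ratio=0.25):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(frame, None)
    k2, d2 = sift.detectAndCompute(key_frame, None)
    matcher = cv2.BFMatcher()
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2) if m.distance < 0.75 * n.distance]
    if len(good) < 4:                                           # four point pairs needed for H
        return None
    src = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)     # RANSAC selects inliers
    if H is None or mask.sum() <= inlier_ratio * len(good):     # skip weak matches
        return None
    h, w = key_frame.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h))                # corrected (aligned) frame
```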
step S406: the user information contained in the watermark is extracted.
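A minimal sketch of the frame-by-frame extraction loop of steps S401 and S402 follows; extract_bits is a hypothetical helper wrapping steps S211 and S212, and the flag-bit value is an assumption:

```python
# Illustrative extraction loop over the first frames of the watermarked video.
import cv2

FLAG_BIT = 1  # assumed value of the embedded flag bit

def extract_from_video(path, extract_bits, max_frames=200):
    cap = cv2.VideoCapture(path)
    for _ in range(max_frames):
        ok, frame = cap.read()
        if not ok:
            break
        bits = extract_bits(frame)           # W' for this frame, thresholded to 0/1
        if bits and bits[0] == FLAG_BIT:     # flag bit matches: watermark found
            cap.release()
            return bits
    cap.release()
    return None                              # fall back to SIFT correction (steps S403-S405)
```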
The invention discloses a video digital watermark embedding and extracting method based on deep learning, realized by a video digital watermarking algorithm that combines the DWT (Discrete Wavelet Transform) domain with deep learning, together with an image registration method based on SIFT (Scale-Invariant Feature Transform) features.
Example two
As shown in fig. 4, an embodiment of the present invention provides a deep learning-based video digital watermark embedding and extracting system, which includes the following modules:
Training set construction module 1: extracting a preset number of video frames from public videos and cropping them to obtain input images; generating random binary strings as watermark information data, and constructing the training set from the input images and the watermark information data;
Network training module 2: inputting the images and the watermark information data in the training set into the video digital watermark embedding and extracting network for training, to obtain the trained video digital watermark embedding and extracting network. The video digital watermark embedding and extracting network comprises: a video digital watermark embedding network, used for embedding the watermark W into the input image Img to obtain the watermarked image Img_encoded; an image transformation module, used for applying attack transformations to Img_encoded to obtain Img'_encoded; and a video digital watermark extraction network, used for extracting the watermark W' from Img'_encoded. Loss functions are constructed between Img_encoded and Img and between W' and W, and the network parameters are updated until the trained video digital watermark embedding and extracting network is obtained;
video digital watermark embedding module 3: extracting key frames of a video to be embedded with the watermark, inputting the key frames and the watermark containing user information into a trained video digital watermark embedding network, outputting the key frames embedded with the watermark, and putting the key frames embedded with the watermark back into the video to be embedded with the watermark;
the video digital watermark extraction module 4: and extracting the frame to be detected of the video containing the watermark, inputting the trained video digital watermark extraction network, extracting the watermark and acquiring the user information in the watermark.
The above examples are provided only for the purpose of describing the present invention, and are not intended to limit the scope of the present invention. The scope of the invention is defined by the appended claims. Various equivalent substitutions and modifications can be made without departing from the spirit and principles of the invention, and are intended to be within the scope of the invention.

Claims (7)

1. A video digital watermark embedding and extracting method based on deep learning, characterized by comprising the following steps:
step S1: extracting a preset number of video frames from public videos and cropping them to obtain input images; generating random binary strings containing user information as watermark information data, and constructing a training set from the input images and the watermark information data;
step S2: inputting the images and the watermark information data in the training set together into a video digital watermark embedding and extracting network for training, to obtain a trained video digital watermark embedding and extracting network; wherein the video digital watermark embedding and extracting network comprises: a video digital watermark embedding network, used for embedding the watermark W into the input image Img to obtain the watermarked image Img_encoded; an image transformation module, used for applying attack transformations to Img_encoded to obtain Img'_encoded; and a video digital watermark extraction network, used for extracting the watermark W' from Img'_encoded; loss functions are constructed between Img_encoded and Img and between W' and W, and the network parameters are updated until the trained video digital watermark embedding and extracting network is obtained;
step S3: extracting key frames of the video to be watermarked, inputting the key frames and the watermark data containing user information into the trained video digital watermark embedding network, outputting the watermarked key frames, and putting the watermarked key frames back into the video to be watermarked;
step S4: extracting frames to be detected from the watermarked video, inputting them into the trained video digital watermark extraction network, extracting the watermark and acquiring the user information in the watermark.
2. The deep learning-based video digital watermark embedding and extracting method of claim 1, wherein in step S2 the video digital watermark embedding network embeds the watermark W into the input image Img to obtain the watermarked image Img_encoded, which specifically comprises:
step S201: acquiring the watermark information data and converting it into a watermark array W ∈ {0,1}^N, where N is the length of W; if the watermark array is shorter than the preset length N, it is padded with zeros; the first bit of W is a flag bit used to identify whether the extracted watermark is correct;
step S202: acquiring an image Img from the training set, selecting a region of preset size from the center of Img, converting the region from the RGB color space to the YCbCr color space, and extracting the Y-component matrix H of the region, whose size is h × w; partitioning the component matrix H into a set of 8 × 8 sub-blocks B(i), i = 1, ..., N, N = (h × w)/(8 × 8); performing the DWT on B(i) to obtain the transformed sub-block set B_dwt(i);
step S203: inputting the two-dimensional watermark array W and the transformed sub-block set B_dwt(i) into the video digital watermark embedding network, and outputting the watermarked transformed sub-block set B'_dwt(i) after the operation of the watermark embedding convolution module, wherein the watermark embedding convolution module comprises 5 two-dimensional convolutional layers, with a batch normalization layer and a ReLU activation layer used between the two-dimensional convolutional layers;
step S204: performing the inverse DWT on each sub-block of the encoded transformed sub-block set B'_dwt(i) to obtain the watermarked Y' component, recombining the Y' component into the YCbCr color space and then converting it back into the RGB color space, obtaining the watermarked image Img_encoded;
step S205: calculating the loss value Loss_img between Img_encoded and Img according to equation (1):
Loss_img = LPIPS(Img, Img_encoded)    (1).
3. The deep learning-based video digital watermark embedding and extracting method of claim 2, wherein the image transformation module in step S2, used for applying transformation attacks to Img_encoded to obtain Img'_encoded, specifically comprises:
inputting Img_encoded into the image transformation module, which adds random noise, Gaussian blur, JPEG image compression or brightness changes to Img_encoded, obtaining the transformation-enhanced watermarked image Img'_encoded.
4. The deep learning-based video digital watermark embedding and extracting method of claim 3, wherein the video digital watermark extraction network in step S2, used for extracting the watermark W' from Img'_encoded, specifically comprises:
step S211: converting Img'_encoded from the RGB color space to the YCbCr color space and extracting its Y-component matrix H, whose size is h × w; partitioning the component matrix H into a set of 8 × 8 sub-blocks B(i), i = 1, ..., N, N = (h × w)/(8 × 8); performing the DWT on B(i) to obtain the transformed sub-block set B_dwt(i);
step S212: feeding the transformed sub-block set B_dwt(i) into the video digital watermark extraction network and outputting the watermark array W' after the operation of the watermark extraction convolution module, wherein the watermark extraction convolution module comprises 4 two-dimensional convolutional layers, 1 average pooling layer and 1 fully connected layer, with a batch normalization layer and a ReLU activation layer used between the two-dimensional convolutional layers;
step S213: calculating the mean square error loss Loss_msg between W' and W according to equation (2):
Loss_msg = (1/N) · Σ_{i=1}^{N} (W'(i) - W(i))^2    (2)
step S214: constructing the total loss function shown in equation (3) and updating the parameters of the video digital watermark embedding network and the video digital watermark extraction network by back-propagation:
Loss_total = γ_img · Loss_img + γ_msg · Loss_msg    (3)
where γ_img and γ_msg are the weights of Loss_img and Loss_msg, respectively.
5. The deep learning-based video digital watermark embedding and extracting method of claim 4, wherein step S3, extracting key frames of the video to be watermarked, inputting the key frames and the watermark into the trained video digital watermark embedding network, outputting the watermarked key frames and putting them back into the video to be watermarked, specifically comprises:
step S301: according to the frame extraction rule, extracting one frame from the video to be watermarked every predetermined number of frames as a key frame F_origin, and converting the watermark character string to be embedded into a binary watermark array W;
specifically, one frame is extracted as a key frame F_origin for every 120 frames of the video to be watermarked, and the watermark character string to be embedded, containing the user information, is converted into a binary watermark array W;
step S302: inputting F_origin and W into the trained video digital watermark embedding network to obtain the watermarked video frame F_water;
step S303: putting the watermarked video frame F_water back into the video to be watermarked according to the frame extraction rule;
specifically, the watermarked video frame F_water is put back once every 120 frames of the video to be watermarked.
6. The deep learning-based video digital watermark embedding and extracting method of claim 4, wherein step S4, extracting frames to be detected from the watermarked video, inputting them into the trained video digital watermark extraction network and extracting the watermark, specifically comprises:
step S401: acquiring the first 200 frames of the watermarked video as the frames F to be detected;
step S402: inputting F into the trained video digital watermark extraction network; when the flag bit of an extracted watermark sequence matches the embedded watermark flag bit, the watermark extraction operation ends and the process goes to step S406; if no matching watermark is extracted after all of F has been processed, the process goes to step S403 for deep video watermark extraction;
step S403: comparing and correcting each video frame to be detected against all key frames F_origin;
step S404: comparing image similarity on the basis of SIFT features, matching with the K-nearest-neighbour algorithm, and aligning the frame to be detected with the correction frame through rotation and transformation using a homography matrix; the homography relation is shown in equation (4):
[x_1 y_1 1]^T = H · [x_2 y_2 1]^T,  H = [[h_00, h_01, h_02], [h_10, h_11, h_12], [h_20, h_21, h_22]]    (4)
where [x_1 y_1 1]^T and [x_2 y_2 1]^T are the homogeneous coordinates of the video frame to be detected and of the correction frame, respectively; with h_22 set to 1, the homography matrix has 8 unknown parameters, and each pair of corresponding pixel points yields 2 equations (one for x and one for y), so four point pairs are needed to solve for the homography matrix H; qualified pixel points are selected as inlier points by the random sample consensus algorithm;
step S405: comparing each video frame to be detected with the correction frames F_origin; if the number of inlier points is less than or equal to 25% of all feature points, skipping the current correction frame; if it is greater than 25%, temporarily storing the matched correction frame; after the video frame to be detected has been compared with all correction frames, aligning the frames before and after it with the correction frames in turn according to an inside-out matching strategy and extracting the watermark through steps S401 and S402; if no watermark is extracted, moving to the next video frame to be detected and repeating the above steps until the watermark is extracted or all frames to be detected have been examined;
step S406: extracting the user information contained in the watermark.
7. A video digital watermark embedding and extracting system based on deep learning, characterized by comprising:
a training set construction module: extracting a preset number of video frames from public videos and cropping them to obtain input images; generating random binary strings as watermark information data, and constructing a training set from the input images and the watermark information data;
a network training module: inputting the images and the watermark information data in the training set together into a video digital watermark embedding and extracting network for training, to obtain a trained video digital watermark embedding and extracting network; wherein the video digital watermark embedding and extracting network comprises: a video digital watermark embedding network, used for embedding the watermark W into the input image Img to obtain the watermarked image Img_encoded; an image transformation attack module, used for applying transformation attacks to Img_encoded to obtain Img'_encoded; and a video digital watermark extraction network, used for extracting the watermark W' from Img'_encoded; loss functions are constructed between Img_encoded and Img and between W' and W, and the network parameters are updated until the trained video digital watermark embedding and extracting network is obtained;
a video digital watermark embedding module: extracting key frames of the video to be watermarked, inputting the key frames and the watermark binary string containing user information into the trained video digital watermark embedding network, outputting the watermarked key frames, and putting the watermarked key frames back into the video to be watermarked;
a video digital watermark extraction module: extracting frames to be detected from the watermarked video, inputting them into the trained video digital watermark extraction network, extracting the watermark and acquiring the user information in the watermark.
CN202210661160.5A 2022-06-13 2022-06-13 Video digital watermark embedding and extracting method and system based on deep learning Pending CN115170378A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210661160.5A CN115170378A (en) 2022-06-13 2022-06-13 Video digital watermark embedding and extracting method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210661160.5A CN115170378A (en) 2022-06-13 2022-06-13 Video digital watermark embedding and extracting method and system based on deep learning

Publications (1)

Publication Number Publication Date
CN115170378A true CN115170378A (en) 2022-10-11

Family

ID=83485030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210661160.5A Pending CN115170378A (en) 2022-06-13 2022-06-13 Video digital watermark embedding and extracting method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN115170378A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311472A (en) * 2020-01-15 2020-06-19 中国科学技术大学 Property right protection method for image processing model and image processing algorithm
CN111445378A (en) * 2020-04-21 2020-07-24 焦点科技股份有限公司 Neural network-based image blind watermark embedding and detecting method and system
CN114255151A (en) * 2020-09-25 2022-03-29 浙江工商大学 High-resolution image robust digital watermarking method based on key point detection and deep learning
CN113613073A (en) * 2021-08-04 2021-11-05 北京林业大学 End-to-end video digital watermarking system and method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117094872A (en) * 2023-10-20 2023-11-21 中科亿海微电子科技(苏州)有限公司 FPGA processing system and method for digital watermarking
CN117094872B (en) * 2023-10-20 2023-12-26 中科亿海微电子科技(苏州)有限公司 FPGA processing system and method for digital watermarking
CN117395474A (en) * 2023-12-12 2024-01-12 法序(厦门)信息科技有限公司 Locally stored tamper-resistant video evidence obtaining and storing method and system
CN117395474B (en) * 2023-12-12 2024-02-27 法序(厦门)信息科技有限公司 Locally stored tamper-resistant video evidence obtaining and storing method and system
CN117974414A (en) * 2024-03-28 2024-05-03 中国人民解放军国防科技大学 Digital watermark signature verification method, device and equipment based on converged news material
CN117974414B (en) * 2024-03-28 2024-06-07 中国人民解放军国防科技大学 Digital watermark signature verification method, device and equipment based on converged news material

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20221011)