CN112866820B - Robust HDR video watermark embedding and extracting method and system based on JND model and T-QR and storage medium - Google Patents


Info

Publication number
CN112866820B
CN112866820B
Authority
CN
China
Prior art keywords
watermark
information
frame image
matrix
embedding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011644779.2A
Other languages
Chinese (zh)
Other versions
CN112866820A (en)
Inventor
骆挺
杜萌
宋洋
高巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
College of Science and Technology of Ningbo University
Original Assignee
College of Science and Technology of Ningbo University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by College of Science and Technology of Ningbo University filed Critical College of Science and Technology of Ningbo University
Priority to CN202011644779.2A priority Critical patent/CN112866820B/en
Publication of CN112866820A publication Critical patent/CN112866820A/en
Application granted granted Critical
Publication of CN112866820B publication Critical patent/CN112866820B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/835Generation of protective data, e.g. certificates
    • H04N21/8358Generation of protective data, e.g. certificates involving watermark
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • H04N19/467Embedding additional information in the video signal during the compression process characterised by the embedded information being invisible, e.g. watermarking

Abstract

The application relates to a robust HDR video watermark embedding and extracting method, system, and storage medium based on a JND model and T-QR decomposition, which address the poor robustness that results when only the visual characteristics of the spatial domain are considered. The method comprises: extracting key frame image information from the HDR video information; decomposing the key frame image information to extract matrices to be embedded; constructing a JND model from luminance perception information, contrast perception information, and temporal perception information; partitioning the JND model to form embedding reference matrices; embedding a preset watermark matrix into each matrix to be embedded according to the corresponding embedding reference matrix, until watermark embedding is completed for all matrices to be embedded of the key frame image information; inversely synthesizing the embedded matrices to form watermark-embedded key frame image information; and synthesizing the key frame image information to form watermark-embedded HDR video information. Because the JND model jointly considers the spatial domain and the temporal domain, the robustness of the video is improved.

Description

Robust HDR video watermark embedding and extracting method and system based on JND model and T-QR and storage medium
Technical Field
The application relates to the technical field of video processing, in particular to a robust HDR video watermark embedding and extracting method and system based on a JND model and T-QR and a storage medium.
Background
With the development of internet technology, traditional low dynamic range (LDR) video no longer satisfies viewers' demand for visual quality, and high dynamic range (HDR) video, an important development and breakthrough in the field of digital images, has attracted increasing attention in digital photography, ultra-high-definition film and television, video games, remote sensing, and medical imaging. Unlike a traditional low dynamic range image, a high dynamic range image records pixel information using floating-point data, can more accurately record all the color-range values of a real scene, and can present rich color detail and bright-dark levels. How to protect the copyright of high dynamic range video has therefore become an urgent problem.
In the related art, during watermark embedding, a high dynamic range host image is represented as a third-order tensor and processed with Tucker decomposition; the first feature map of the resulting core tensor serves as the embedding carrier for the watermark information, and the watermark is embedded according to a luminance mask.
However, this high dynamic range image watermarking method considers only the visual features of the spatial domain, so the robustness of the corresponding HDR image/video watermark embedding and extraction method is poor.
Disclosure of Invention
In order to improve the robustness of a high dynamic range image after watermark embedding and extraction, the present application provides a robust HDR video watermark embedding and extracting method and system based on a JND model and T-QR, and a storage medium.
In a first aspect, the present application provides a robust HDR video watermark embedding and extracting method based on a JND model and T-QR, including a watermark embedding method, specifically as follows:
acquiring HDR video information;
extracting key frame image information according to the HDR video information;
defining RGB data of the key frame image information as a third-order tensor; dividing RGB data of the key frame image information into non-overlapping first sub tensors; decomposing the first sub tensor to extract a matrix to be embedded;
analyzing and forming brightness perception information, contrast perception information and time domain perception information according to the key frame image information; constructing a JND model by using the brightness perception information, the contrast perception information and the time domain perception information;
carrying out non-overlapping partitioning on data corresponding to the JND model to form embedded reference matrixes in one-to-one correspondence with the first sub-tensors;
the matrix to be embedded corresponds to the embedded reference matrix one by one;
embedding a preset watermark matrix into a matrix to be embedded according to the embedding reference matrix to form an embedding matrix until all the matrixes to be embedded corresponding to the key frame image information complete the embedding of the watermark matrix;
all the embedded matrixes are reversely synthesized to form key frame image information of which the watermark is embedded;
and synthesizing all the watermark embedding completed key frame image information to form watermark embedding completed HDR video information.
By adopting this technical scheme, the key frame image information is extracted from the HDR video information, and the selection method for the key frames can be known only to the designers, which preserves the secrecy of the watermark embedding, reduces the risk of others locating and cracking the watermarked key frames, and improves security. In constructing the JND model, not only spatial-domain factors but also temporal-domain factors are considered: the luminance, the contrast, and the difference between the preceding and following frames are weighed together, and a corresponding embedding strength is formed. If a region carries important information, the embedding strength is reduced to avoid image distortion; if it carries non-important information, the embedding strength is increased. In this way the imperceptibility of the watermark and the robustness of the image are both improved, and the model provides better guidance.
Preferably, the specific method for extracting the key frame image information according to the HDR video information is as follows:
acquiring all frame images in HDR video information;
judging the difference between two frame images by a histogram difference method to form an image difference value;
and comparing the image difference value with a preset comparison threshold value, selecting a corresponding frame image according to a comparison result, and taking the selected frame image as key frame image information.
Preferably, the method for forming the image difference value is as follows:
$$D(f_i, f_{i+1}) = \sum \frac{\left| h(f_i) - h(f_{i+1}) \right|}{\max\big(h(f_i),\, h(f_{i+1})\big)}, \qquad i = 1, 2, \dots, l-1$$

where l is the number of frame images contained in the HDR video information; i denotes the i-th frame; h(f_i) and h(f_{i+1}) are the histograms of the i-th and (i+1)-th frame images; and max(h(f_i), h(f_{i+1})) selects the larger of the two, bin by bin.
By adopting this technical scheme, a scene-change detection method is used in judging the key frame image information: motion scenes in the video are identified with the histogram difference method, the image difference value between frame images is compared with a predefined comparison threshold, and the frame images with a high degree of difference are selected as key frame image information according to the comparison result. This makes it convenient to select frame images whose contents largely differ, to cover as many frame images as possible, to facilitate watermark embedding, and to improve security.
Preferably, the method for forming the luminance perception information is as follows:
acquiring a first luminance value of the key frame image information;
converting the first luminance value into the second luminance value required for constructing the JND model (the conversion formula is rendered only as an image in the source), where $\log_{10}(L_a)$ denotes the second luminance value and $\log_{10}(L)$ denotes the first luminance value;
the method for forming the contrast perception information is as follows:
converting the second luminance values of two adjacent pixels into a contrast value (the formula is rendered only as an image in the source), where $c^{k}_{ij}$ is the contrast value between the i-th and j-th pixels, $L_i$ and $L_j$ are the luminances of adjacent pixels, and k is the Gaussian-pyramid level;
converting the contrast value into the visual contrast response value T required for constructing the JND model (the formula is likewise rendered only as an image in the source);
the method for forming the time-domain perceptual information is as follows:
defining a pixel point in the key frame image information as (x, y);
acquiring the gray level I(x, y) of the pixel point (x, y) at time t; defining the horizontal motion component of the optical flow W = (u, v) at this point as u(x, y) and the vertical motion component as v(x, y), where
$$u = \frac{dx}{dt}, \qquad v = \frac{dy}{dt};$$
obtaining the minimization term E from the optical-flow constraint term $E_c$ and the global smoothness constraint term $E_s$;
the optical-flow constraint term $E_c$ is:
$$E_c = \iint (I_x u + I_y v + I_t)^2\,dx\,dy$$
where
$$I_x = \frac{\partial I}{\partial x}, \qquad I_y = \frac{\partial I}{\partial y}, \qquad I_t = \frac{\partial I}{\partial t};$$
the global smoothness constraint term $E_s$ is:
$$E_s = \iint \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) dx\,dy$$
where $\|\nabla u\|^2 + \|\nabla v\|^2$ is the sum of the squares of the optical-flow gradient magnitudes, and $\bar{u}$ and $\bar{v}$ are the local mean values of u and v, respectively;
the minimization term E is:
$$E = \iint \left[ (I_x u + I_y v + I_t)^2 + \lambda \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] dx\,dy$$
where λ is a weight preset according to the noise level in the image;
differentiating E with respect to u and v respectively yields the optical flow W = (u, v):
$$u^{n+1} = \bar{u}^{\,n} - \frac{I_x \left( I_x \bar{u}^{\,n} + I_y \bar{v}^{\,n} + I_t \right)}{\lambda + I_x^2 + I_y^2}, \qquad v^{n+1} = \bar{v}^{\,n} - \frac{I_y \left( I_x \bar{u}^{\,n} + I_y \bar{v}^{\,n} + I_t \right)}{\lambda + I_x^2 + I_y^2};$$
the JND model is then constructed as:
$$\mathrm{JND} = L_a + T + W.$$
by adopting this technical scheme, the JND model characterizes the maximum image distortion that the human eye cannot perceive and reflects the eye's tolerance to image change; a JND model constructed from both spatial-domain and temporal-domain factors can therefore weigh multiple factors when setting the embedding strength, reducing the possibility of image distortion as far as possible. In constructing the JND model, the luminance, the contrast, and the difference between the preceding and following frames are each formed into a matrix, and the several dimensions are superposed through matrix operations to guide the embedding strength.
Preferably, after selecting the corresponding frame image according to the comparison result, the selected frame image is further screened, and the specific method is as follows:
acquiring an RGB value of each pixel point in a frame image;
performing logical operation on RGB values of all pixel points of the frame image to form an information quantity judgment value;
comparing the information quantity judgment value with a preset information quantity judgment threshold value, and if the information quantity judgment value is greater than the information quantity judgment threshold value, defining the frame image as a high information quantity frame image; if the information quantity judgment value is less than or equal to the information quantity judgment threshold value, defining the frame image as a low information quantity frame image; and taking the high information content frame image and the low information content frame image as key frame image information.
By adopting this technical scheme, the key frame image information is classified during selection: the information amount of the current frame image is judged from the RGB values of its pixels, a frame with a large information amount is defined as a high-information-content frame image, and a frame with a small information amount is defined as a low-information-content frame image. This allows matched embedding in the subsequent watermark-embedding process, reduces the distortion after embedding, and improves the robustness of the video.
Preferably, the method for acquiring the watermark matrix specifically includes:
acquiring an initial watermark image;
symmetrically partitioning the initial watermark image to form a first pre-processed watermark image and a second pre-processed watermark image respectively;
respectively acquiring RGB data corresponding to the first pre-processed watermark image and the second pre-processed watermark image;
performing logical operation on all RGB values of the first pre-processed watermark image to form a first watermark judgment value; performing logical operation on all RGB values of the second pre-processed watermark image to form a second watermark judgment value;
comparing the first watermark decision value and the second watermark decision value with each other; if the first watermark decision value is greater than the second watermark decision value, defining the first pre-processed watermark image as a high information content watermark image, and defining the second pre-processed watermark image as a low information content watermark image; if the second watermark decision value is greater than the first watermark decision value, defining the second pre-processed watermark image as a high information content watermark image, and defining the first pre-processed watermark image as a low information content watermark image;
when the high information content frame image is subjected to watermark embedding, a matrix corresponding to the low information content watermark image is used as a watermark matrix; the low information content frame image and the high information content watermark image correspond to each other, and when the watermark is embedded into the low information content frame image, a matrix corresponding to the high information content watermark image is used as a watermark matrix.
By adopting this technical scheme, the initial watermark image is pre-processed into a first pre-processed watermark image and a second pre-processed watermark image, and their information amounts are judged and compared to form a high-information-content watermark image and a low-information-content watermark image. The high-information-content frame image is then paired with the low-information-content watermark image, and the low-information-content frame image with the high-information-content watermark image. This further reduces the distortion of the embedded image and improves the robustness of the video.
Preferably, the robust HDR video watermark embedding and extracting method based on the JND model and the T-QR comprises a watermark extracting method, which specifically comprises the following steps:
acquiring HDR video information with embedded watermark;
extracting key frame image information of which the watermark embedding is finished according to the HDR video information of which the watermark embedding is finished;
defining RGB data of the key frame image information with the embedded watermark as a third-order tensor; dividing RGB data of the key frame image information subjected to watermark embedding into non-overlapping second sub tensors; decomposing the second sub-tensor to extract a matrix to be extracted; the first sub tensor and the second sub tensor have the same size;
extracting a watermark matrix from the matrix to be extracted to form an initial matrix to be embedded until all the matrixes to be extracted corresponding to the key frame image information of which the watermark is embedded are extracted;
performing reverse synthesis on all the matrixes to be embedded after watermark extraction to form initial key frame image information;
and synthesizing all the watermark extraction-completed key frame image information to form initial HDR video information.
By adopting this technical scheme, for HDR video information in which watermark embedding has been completed, the original HDR video information can be restored through the inverse of the watermark embedding method, thereby realizing both watermark embedding and extraction.
In a second aspect, the present application provides a robust HDR video watermark embedding and extracting system based on a JND model and T-QR, which adopts the following technical solution: the system comprises a watermark embedding apparatus; the watermark embedding apparatus includes:
the first acquisition module, for a user to acquire HDR video information;
the key frame image extraction module is used for extracting key frame image information according to the HDR video information;
the to-be-embedded matrix analysis module defines RGB data of the key frame image information as a third-order tensor; dividing RGB data of the key frame image information into non-overlapping first sub tensors; decomposing the first sub tensor to extract a matrix to be embedded;
the JND model construction module is used for analyzing and forming brightness perception information, contrast perception information and time domain perception information according to the key frame image information; constructing a JND model by using the brightness perception information, the contrast perception information and the time domain perception information; carrying out non-overlapping partitioning on data corresponding to the JND model to form embedded reference matrixes in one-to-one correspondence with the first sub-tensors; the matrix to be embedded corresponds to the embedded reference matrix one by one;
the watermark embedding module is used for embedding a preset watermark matrix into a matrix to be embedded according to the embedding reference matrix to form an embedding matrix until all the matrixes to be embedded corresponding to the key frame image information complete the embedding of the watermark matrix;
the first image synthesis module is used for carrying out reverse synthesis on all the embedded matrixes to form key frame image information which is embedded with the watermark; and
and the first video synthesis module is used for synthesizing all the watermark embedding completed key frame image information to form watermark embedding completed HDR video information.
Preferably, the system further comprises a watermark extraction apparatus; the watermark extraction apparatus includes:
the second acquisition module is used for acquiring HDR video information of which the watermark embedding is finished;
the key frame image extraction module is used for extracting the key frame image information of which the watermark embedding is finished according to the HDR video information of which the watermark embedding is finished;
the to-be-extracted matrix analysis module defines RGB data of the key frame image information in which the watermark embedding is completed as a third-order tensor; dividing RGB data of the key frame image information subjected to watermark embedding into non-overlapping second sub tensors; decomposing the second sub-tensor to extract a matrix to be extracted; the first sub tensor and the second sub tensor have the same size;
the watermark extraction module is used for extracting a watermark matrix from the matrix to be extracted to form an initial matrix to be embedded until all the matrixes to be extracted corresponding to the key frame image information of which the watermark is embedded are extracted;
the second image synthesis module is used for reversely synthesizing all the matrixes to be embedded, which are subjected to watermark extraction, so as to form initial key frame image information; and
and the second video synthesis module is used for synthesizing all the key frame image information subjected to watermark extraction to form initial HDR video information.
In a third aspect, the present application provides a computer-readable storage medium, which adopts the following technical solutions:
a computer readable storage medium comprising a program stored thereon which when loaded and executed by a processor implements a robust HDR video watermark embedding and extraction method based on a JND model and T-QR as described above.
To sum up, the application provides the following beneficial technical effects: the JND model jointly considers the spatial domain and the temporal domain, i.e., the luminance, the contrast, and the difference between the preceding and following frames; once these are known, a corresponding embedding strength is formed, which improves the imperceptibility of the watermark and the robustness of the image and gives better guidance for improving the robustness of the video.
Drawings
Fig. 1 is a schematic flow chart of a watermark embedding method according to the present application.
Fig. 2 is a flow chart diagram of a specific method of extracting key frame image information from HDR video information according to the present application.
Fig. 3 is a flow chart illustrating a method for further screening a selected frame image according to the present application.
Fig. 4 is a flowchart illustrating the method for acquiring a watermark matrix according to the present application.
Fig. 5 is a flowchart of a watermark extraction method according to the present application.
Fig. 6 is a system block diagram of the present application relating to a watermark embedding apparatus.
Fig. 7 is a system block diagram of the present application relating to a watermark extraction apparatus.
Detailed Description
The present application is described in further detail below with reference to the attached drawings.
This embodiment serves only to explain the present application and does not limit it. After reading this specification, those skilled in the art may make modifications to this embodiment without inventive contribution as needed, and all such modifications are protected by patent law within the scope of the claims of the present application.
With the development of internet technology, traditional low dynamic range (LDR) video no longer satisfies viewers' demand for visual quality, and high dynamic range (HDR) video is increasingly popular. HDR video provides a wider luminance range than traditional LDR video and can accurately describe real scenes. However, because of this wide luminance range, HDR video cannot be played directly on current LDR devices: it is usually converted into LDR video by a tone mapping operator (TMO) without losing too much detail. How to protect the copyright of HDR video after it has passed through a TMO has therefore become an urgent problem. The copyright of HDR video is protected by embedding a watermark, and watermark embedding methods can be divided into spatial-domain and transform-domain approaches.
Spatial-domain watermarking embeds the watermark by directly modifying pixel values. This type of method is widely used in image content authentication because it is sensitive to any modification. However, the pixels of HDR video are floating-point values, and since different HDR videos have different luminance ranges, it is difficult to modify the pixels of HDR video directly. Some methods convert the floating-point format of an HDR image to an integer format, but this may lose image detail.
Compared with spatial-domain watermarking, transform-domain watermark embedding yields better robustness of image quality and can resist various image and video attacks. A transform-domain watermarking method transforms the original carrier image from the spatial domain to the frequency domain through a mathematical transform and embeds the watermark information in the transformed frequency domain; however, increasing the embedding strength causes visual distortion of the watermarked image. To balance the imperceptibility and robustness of the watermark, a JND model is constructed and used to adjust the change amplitude of each coefficient during embedding; this achieves a good visual effect at high embedding strength.
The embodiment of the application provides a robust HDR video watermark embedding and extracting method based on a JND model and T-QR, which comprises a watermark embedding method and a watermark extracting method, wherein in the watermark embedding process, key frame image information is extracted according to HDR video information; partitioning the key frame image information, and performing T-QR decomposition on each partition to form a matrix to be embedded; constructing a JND model according to the brightness perception information, the contrast perception information and the time domain perception information; carrying out non-overlapping partitioning on data corresponding to the JND model to form embedded reference matrixes in one-to-one correspondence with the first sub-tensors; the matrix to be embedded corresponds to the embedded reference matrix one by one; embedding a preset watermark matrix into a matrix to be embedded according to the embedding reference matrix to form an embedding matrix until all the matrixes to be embedded corresponding to the key frame image information complete the embedding of the watermark matrix; all the embedded matrixes are reversely synthesized to form key frame image information of which the watermark is embedded; and synthesizing all the watermark embedding completed key frame image information to form watermark embedding completed HDR video information.
In the embodiment of the application, the key frame image information is extracted from the HDR video information, and the selection method for the key frames can be known only to the designers, which preserves the secrecy of the watermark embedding, reduces the risk of others locating and cracking the watermarked key frames, and improves security. In constructing the JND model, not only spatial-domain factors but also temporal-domain factors are considered: the luminance, the contrast, and the difference between the preceding and following frames are weighed together, and a corresponding embedding strength is formed. If a region carries important information, the embedding strength is reduced to avoid image distortion; if it carries non-important information, the embedding strength is increased. In this way the imperceptibility of the watermark and the robustness of the image are both improved, and the model provides better guidance.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship, unless otherwise specified.
The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.
The embodiment of the application provides a robust HDR video watermark embedding and extracting method based on a JND model and T-QR, and the main flow of the method is described as follows.
As shown in fig. 1, the watermark embedding method is specifically as follows:
step 1100: HDR video information is acquired.
The HDR video information refers to high dynamic range (HDR) video. During acquisition, the HDR video information may be video data read directly from a storage medium such as a hard disk or USB drive, or video data obtained by wireless transmission. Acquisition can be triggered by a mechanical key or a virtual key: with a mechanical key, pressing the trigger key acquires the HDR video information; with a virtual key, pressing the corresponding virtual trigger key in the interface of the relevant software acquires it.
Step 1200: key frame image information is extracted from the HDR video information.
The key frame image information refers to the frame images in the HDR video information into which a watermark needs to be embedded; which frames are selected is determined according to the actual situation. In the present application, as shown in fig. 2, the specific method for extracting key frame image information from the HDR video information is as follows:
step 1210: all frame images in the HDR video information are acquired.
The HDR video information is read frame by frame to acquire all of its frame images; once all frames have been acquired, subsequent analysis and processing is straightforward, and the required frames can be selected.
Step 1220: the difference between two frame images is judged by a histogram difference method to form an image difference value.
In determining the key frame image information, a scene-change detection method is adopted: motion scenes in the video are identified using the histogram difference method, and the image difference value is formed as:

$$D(f_i, f_{i+1}) = \sum \frac{\left| h(f_i) - h(f_{i+1}) \right|}{\max\big(h(f_i),\, h(f_{i+1})\big)}, \qquad i = 1, 2, \dots, l-1$$

where l is the number of frame images contained in the HDR video information; i denotes the i-th frame; h(f_i) and h(f_{i+1}) are the histograms of the i-th and (i+1)-th frame images; and max(h(f_i), h(f_{i+1})) selects the larger of the two, bin by bin.
Step 1230: and comparing the image difference value with a preset comparison threshold value, selecting a corresponding frame image according to a comparison result, and taking the selected frame image as key frame image information.
The comparison threshold can be set according to the actual situation and need not be specifically limited. The image difference value between frame images is compared with the predefined comparison threshold, and frames with a relatively high degree of difference are selected as key frame image information according to the comparison result. This makes it easy to select frame images with largely distinct content and to cover as many frames as possible, which facilitates watermark embedding and improves security.
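As an illustration of the selection step above, the following is a minimal sketch of histogram-difference key-frame selection; the function names, the number of bins, the [0, 1] luminance normalization, and the hand-chosen threshold are illustrative assumptions, not values from the patent.

```python
# Minimal sketch of key-frame selection by histogram difference.
# Assumes each frame is a 2-D float array of luminance normalized to [0, 1].
import numpy as np

def frame_histogram(frame, bins=256):
    hist, _ = np.histogram(frame, bins=bins, range=(0.0, 1.0))
    return hist.astype(np.float64)

def histogram_difference(frame_a, frame_b):
    ha, hb = frame_histogram(frame_a), frame_histogram(frame_b)
    denom = np.maximum(np.maximum(ha, hb), 1.0)  # guard against empty bins
    return float(np.sum(np.abs(ha - hb) / denom))

def select_key_frames(frames, threshold):
    # Keep frame i whenever it differs strongly from frame i + 1.
    return [i for i in range(len(frames) - 1)
            if histogram_difference(frames[i], frames[i + 1]) > threshold]
```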
Step 1300: defining RGB data of the key frame image information as a third-order tensor; dividing RGB data of the key frame image information into non-overlapping first sub tensors; the first sub-tensor is decomposed to extract a matrix to be embedded.
A tensor is a form of multidimensional data that can store a large amount of information. A tensor of order P can be written as:

$$\mathcal{A} \in \mathbb{R}^{l_1 \times l_2 \times \dots \times l_P}$$

where $l_1, l_2, \dots, l_P \in \mathbb{Z}$ denote the number of elements in each dimension.
Thus a vector can be treated as a first-order tensor and a matrix as a second-order tensor, while a higher-order tensor can be represented by a set of matrices. For example, a third-order tensor can be divided into horizontal, lateral, and frontal slices, expressed respectively as:

$$\mathcal{A}(i,:,:), \qquad \mathcal{A}(:,j,:), \qquad \mathcal{A}(:,:,k)$$

where, for the frontal slices of an M×N×3 RGB tensor, $k \in \{1, 2, 3\}$.
The RGB data of the key frame image information is defined as a third-order tensor $\mathcal{A} \in \mathbb{R}^{M \times N \times 3}$ of size M×N×3. This tensor is divided into non-overlapping blocks, each defined as a first sub-tensor $\mathcal{A}_s$ of size 4×4×3, where s is the index of each block.
Each first sub-tensor $\mathcal{A}_s$ is decomposed, and T-QR decomposition is adopted in the decomposition process; the specific decomposition proceeds as follows.

QR decomposition is an important decomposition in linear algebra and applies to any matrix. A matrix B of size m×n can be QR-decomposed as B = Q × R, where Q is an orthogonal matrix of size m×n and R is an upper triangular matrix of size n×n.
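A small numpy check of the matrix-level property just stated; this is only standard QR, not yet the tensor version:

```python
# B = Q x R: Q has orthonormal columns, R is upper triangular.
import numpy as np

B = np.random.rand(4, 4)
Q, R = np.linalg.qr(B)

assert np.allclose(Q @ R, B)             # B = Q * R
assert np.allclose(Q.T @ Q, np.eye(4))   # orthogonality of Q
assert np.allclose(R, np.triu(R))        # R is upper triangular
```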
Higher-order QR decomposition can be realized with T-QR decomposition, whose principle is as follows: a third-order tensor $\mathcal{A} \in \mathbb{R}^{l_1 \times l_2 \times l_3}$ can be expressed as

$$\mathcal{A} = \mathcal{Q} * \mathcal{R}$$

where $*$ denotes the t-product, $\mathcal{Q}$ is an orthogonal tensor, and $\mathcal{R}$ is an upper triangular tensor.
Performing T-QR decomposition on the first sub-tensor $\mathcal{A}_s$, it can therefore be expressed as

$$\mathcal{A}_s = \mathcal{Q}_s * \mathcal{R}_s$$

where the size of the first sub-tensor is 4×4×3, i.e., $l_1 = 4$, $l_2 = 4$, and $l_3 = 3$.
Thus, after the T-QR decomposition, an orthogonal tensor and an upper triangular tensor are obtained. The orthogonal tensor comprises three orthogonal matrices, named the first orthogonal matrix $Q^{(1)}_s$, the second orthogonal matrix $Q^{(2)}_s$, and the third orthogonal matrix $Q^{(3)}_s$. According to actual test data, adopting the second orthogonal matrix $Q^{(2)}_s$ as the matrix to be embedded further improves watermark imperceptibility and image robustness: in testing, the bit error rates of the three orthogonal matrices were measured after tone-mapping attacks, and since a smaller error-rate value means better robustness, this application adopts the second orthogonal matrix.
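For illustration, the following sketch extracts a per-block embedding carrier. Note that it applies an ordinary QR to each frontal slice as a simplified stand-in for the full t-product-based T-QR; the variable names are illustrative.

```python
# Simplified stand-in for T-QR on one 4x4x3 block: QR per frontal slice.
# A true T-QR operates through the t-product (FFT along the third mode);
# the per-slice version below only illustrates where the three orthogonal
# matrices come from and which one is kept as the embedding carrier.
import numpy as np

def block_carrier(block):
    """block: 4x4x3 sub-tensor; returns (Q list, R list, second Q)."""
    qs, rs = [], []
    for k in range(block.shape[2]):        # frontal slices (R, G, B)
        Q, R = np.linalg.qr(block[:, :, k])
        qs.append(Q)
        rs.append(R)
    return qs, rs, qs[1]                   # qs[1]: the "second orthogonal matrix"

block = np.random.rand(4, 4, 3)
qs, rs, Q2 = block_carrier(block)          # Q2 is the matrix to be embedded
```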
Step 1400: analyzing and forming brightness perception information, contrast perception information and time domain perception information according to the key frame image information; and constructing a JND model by using the brightness perception information, the contrast perception information and the time domain perception information.
The JND model characterizes the maximum image distortion that cannot be perceived by the human eye and reflects the eye's tolerance to image change. The model comprises two aspects: the spatial domain and the temporal domain; the spatial domain specifically covers background luminance and contrast perception.

Factors regarding the spatial domain:

To determine the difference apparent to the human eye at a given adaptation luminance level, a psychophysical study was conducted in which the viewer adapts to the background illumination for a sufficiently long time and the illumination is then raised to a level at which the viewer can perceive the change. Experiments show that the luminance range of HDR video content covers the entire range visible to the human visual system (HVS). The response of the human eye over this range is neither strictly linear nor logarithmic, so the perceived change at different luminance levels must be modeled. The background luminance of the input key frame image information is converted into the luminance required by the JND model.
The method for forming the luminance perception information is as follows:
acquiring a first brightness value of the key frame image information;
converting the first luminance value into the second luminance value required for constructing the JND model (the conversion formula is rendered only as an image in the source), where $\log_{10}(L_a)$ denotes the second luminance value and $\log_{10}(L)$ denotes the first luminance value.
The method for forming contrast perception information is as follows:
converting the second luminance values of two adjacent pixels into a contrast value (the formula is rendered only as an image in the source), where $c^{k}_{ij}$ is the contrast value between the i-th and j-th pixels, $L_i$ and $L_j$ are the luminances of adjacent pixels, and k is the Gaussian-pyramid level; and converting the contrast value into the visual contrast response value T required for constructing the JND model (the formula is likewise rendered only as an image in the source).

Factors regarding the time domain:

The correspondence between the previous frame image and the current frame image is found from the change of pixels in the image sequence over time and the correlation between adjacent frames, from which the motion information of objects between adjacent frames is computed. This expresses the change of the image, and since that change contains information about the motion of the target, an observer can use it to determine the target's motion. The method rests on two assumptions. The first is constant brightness: the brightness of a small region remains constant despite changes in position. The second is spatial smoothness: neighboring points on an object have similar velocities, and the velocity field of the object is smooth.
The method for forming the time-domain perceptual information is as follows.

According to the principle of visual perception, objects generally move relatively continuously in space, and during the movement the image projected onto the sensor plane also changes continuously; this is the assumption of gray-scale invariance. From this basic assumption, the following basic equation can be derived.

Define a pixel point in the key frame image information as (x, y), and let I(x, y) be the gray level of pixel (x, y) at time t. Define the horizontal motion component of the optical flow W = (u, v) at this point as u(x, y) and the vertical motion component as v(x, y), where

$$u = \frac{dx}{dt}, \qquad v = \frac{dy}{dt}.$$

The gray level of the corresponding point after a time interval dt is I(x + dx, y + dy, t + dt). As dt → 0 the gray level I remains unchanged, so I(x, y, t) = I(x + dx, y + dy, t + dt). Expanding this with a Taylor series and simplifying yields the basic constraint equation

$$I_x u + I_y v + I_t = 0$$

where

$$I_x = \frac{\partial I}{\partial x}, \qquad I_y = \frac{\partial I}{\partial y}, \qquad I_t = \frac{\partial I}{\partial t}.$$

The minimization term E is obtained from the optical-flow constraint term $E_c$ and the global smoothness constraint term $E_s$. The optical-flow constraint term $E_c$ is:

$$E_c = \iint (I_x u + I_y v + I_t)^2\,dx\,dy.$$
To obtain definite values of u and v, additional constraints must be introduced; introducing them from different angles leads to different optical-flow computation methods: differential methods, frequency-domain methods, energy-based methods, and region-matching methods. The method proposed by Horn and Schunck, abbreviated the HS algorithm, is adopted here.
The HS algorithm introduces a global smoothness constraint, expressed through the sum of the squares of the optical-flow gradient magnitudes, $\|\nabla u\|^2 + \|\nabla v\|^2$: the smaller this sum, the more slowly the optical flow changes and the smoother the flow field. The global smoothness constraint term $E_s$ is defined as:

$$E_s = \iint \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) dx\,dy = \iint \left( u_x^2 + u_y^2 + v_x^2 + v_y^2 \right) dx\,dy$$

where $\bar{u}$ and $\bar{v}$, used below, denote the local mean values of u and v, respectively.
The optical flow W = (u, v) should minimize the combination E of the optical-flow constraint term $E_c$ and the global smoothness constraint term $E_s$.
The minimization term E is:

$$E = \iint \left[ (I_x u + I_y v + I_t)^2 + \lambda \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] dx\,dy$$

where λ is a weight preset according to the noise level in the image: when the noise is strong the data term is less reliable and more weight must be placed on the smoothness constraint, so λ takes a larger value; conversely, λ takes a smaller value.
Differentiating E with respect to u and v respectively yields the optical flow W = (u, v), computed iteratively as:

$$u^{n+1} = \bar{u}^{\,n} - \frac{I_x \left( I_x \bar{u}^{\,n} + I_y \bar{v}^{\,n} + I_t \right)}{\lambda + I_x^2 + I_y^2}$$

$$v^{n+1} = \bar{v}^{\,n} - \frac{I_y \left( I_x \bar{u}^{\,n} + I_y \bar{v}^{\,n} + I_t \right)}{\lambda + I_x^2 + I_y^2}$$
the JND model in the construction form is as follows:
$$\mathrm{JND} = L_a + T + W.$$
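A minimal sketch of assembling this model elementwise follows; treating the temporal term W as the optical-flow magnitude and assuming the three maps share the frame's shape are illustrative simplifications.

```python
# JND = La + T + W, assembled elementwise from same-shaped maps.
import numpy as np

def build_jnd(La, T, u, v):
    W = np.sqrt(u**2 + v**2)   # temporal term taken as the flow magnitude
    return La + T + W
```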
the JND model constructed by the spatial domain factors and the temporal domain factors can integrate a plurality of factors to consider the embedding strength, and the possibility of image distortion is reduced as much as possible; in the process of constructing the JND model, the brightness, the contrast and the difference of the front frame and the rear frame are constructed to form a matrix, and a plurality of dimensions are superposed through the logical operation of the matrix so as to guide the embedding strength.
Step 1500: carrying out non-overlapping partitioning on data corresponding to the JND model to form embedded reference matrixes in one-to-one correspondence with the first sub-tensors; the matrix to be embedded corresponds to the embedded reference matrix one to one.
After the JND model JND = $L_a$ + T + W has been constructed, it is itself presented in the form of a matrix whose size corresponds to that of the first sub-tensors. The data corresponding to the JND model are therefore partitioned into non-overlapping blocks of size M×N, where M = 4 and N = 4, forming the embedding reference matrices that correspond to the matrices to be embedded and facilitate watermark embedding. Each block is defined as $F_s$, where s is the index of each block.
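A minimal sketch of this non-overlapping partition, assuming the JND map is a 2-D array whose dimensions are multiples of 4 (any remainder is dropped):

```python
# Partition the JND map into non-overlapping 4x4 reference blocks F_s.
import numpy as np

def jnd_blocks(jnd_map, bs=4):
    H, W = jnd_map.shape
    return [jnd_map[r:r + bs, c:c + bs]
            for r in range(0, H - H % bs, bs)
            for c in range(0, W - W % bs, bs)]
```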
Step 1600: and embedding the preset watermark matrix into the matrix to be embedded according to the embedding reference matrix to form an embedding matrix until all the matrixes to be embedded corresponding to the key frame image information complete the embedding of the watermark matrix.
The preset watermark matrix is set according to actual conditions and corresponds to the matrix to be embedded, and the watermark embedding method specifically comprises the following steps:
(the four embedding equations are rendered only as images in the source)

where $\tilde{Q}^{(2)}_s$ is the second orthogonal matrix after watermark embedding; $z = \mathrm{sum}(F_s)$ and $v = \mathrm{sum}(z) / \big((M/4) \times (N/4)\big)$, with sum(·) returning the sum over the matrix; and avg is computed from the entries of the second orthogonal matrix at coordinates (2,1) and (3,1).
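Because the four embedding equations survive only as images, the sketch below shows one plausible reconstruction under an assumed convention: the (2,1) and (3,1) entries of the second orthogonal matrix are reordered around their mean avg with a margin derived from the JND reference block. It is a hedged illustration, not the patent's verified rule.

```python
# Assumed coefficient-ordering rule: NOT the patent's verified equations.
import numpy as np

def embed_bit(Q2, bit, F_s, M=4, N=4):
    strength = np.sum(F_s) / ((M / 4) * (N / 4))   # the quantity v above
    q21, q31 = Q2[1, 0], Q2[2, 0]                  # entries (2,1), (3,1), 0-indexed
    avg = (q21 + q31) / 2.0
    Q2 = Q2.copy()
    if bit == 1:                                   # enforce q21 > q31 by the margin
        Q2[1, 0], Q2[2, 0] = avg + strength / 2, avg - strength / 2
    else:                                          # enforce q21 < q31
        Q2[1, 0], Q2[2, 0] = avg - strength / 2, avg + strength / 2
    return Q2
```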
Step 1700: all completed embedding matrices are inversely synthesized to form the watermark-embedded key frame image information.

All completed embedding matrices are inversely synthesized by the inverse T-QR transform, i.e., each modified block is reassembled as $\tilde{\mathcal{A}}_s = \tilde{\mathcal{Q}}_s * \mathcal{R}_s$.
Step 1800: and synthesizing all the watermark embedding completed key frame image information to form watermark embedding completed HDR video information.
As shown in steps 1210 to 1230, the HDR video information includes key frame image information and other frame image information; after watermark embedding is completed, the key frame image information is combined with the other frame image information for synthesis, thereby embedding the watermark into the HDR video information.
In one embodiment, the key frame image information is classified into high-information-content frame images and low-information-content frame images, and the initial watermark image is classified into a high-information-content watermark image and a low-information-content watermark image. The high-information-content frame image corresponds to the low-information-content watermark image, and the low-information-content frame image to the high-information-content watermark image; this further reduces the distortion of the embedded image and improves the robustness of the video. Whether the following method is needed can be chosen according to the actual situation.
As shown in fig. 3, after selecting the corresponding frame image according to the comparison result, the selected frame image is further screened, and the specific method is as follows:
step 1911: and acquiring the RGB value of each pixel point in the frame image.
Acquisition can be triggered by a mechanical key or a virtual key: with a mechanical key, pressing the trigger key obtains the RGB value of each pixel in the frame image; with a virtual key, pressing the corresponding virtual trigger key in the interface of the relevant software does the same.
Step 1912: and performing logical operation on RGB values of all pixel points of the frame image to form an information quantity judgment value.
The logical operation in this embodiment is described taking addition as an example: the RGB values of all pixels of the frame image are accumulated, and the accumulated value forms the information amount determination value.
Step 1913: comparing the information quantity judgment value with a preset information quantity judgment threshold value, and if the information quantity judgment value is greater than the information quantity judgment threshold value, defining the frame image as a high information quantity frame image; if the information quantity judgment value is less than or equal to the information quantity judgment threshold value, defining the frame image as a low information quantity frame image; and taking the high information content frame image and the low information content frame image as key frame image information.
The preset information amount determination threshold may be set according to the actual situation and is not specifically limited in this embodiment. During selection of key frame image information, the key frames are classified: the information amount of the current frame image is judged from the RGB values of its pixels, a frame with a large information amount is defined as a high-information-content frame image, and a frame with a small information amount as a low-information-content frame image. This allows matched embedding in the subsequent watermark-embedding process, reduces the distortion of the embedded image, and improves the robustness of the video.
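A minimal sketch of this test, using summation as the logical operation described above; the threshold is an assumed parameter.

```python
# Classify a frame by the accumulated sum of its RGB values.
import numpy as np

def classify_frame(frame_rgb, info_threshold):
    info_value = float(np.sum(frame_rgb))   # information amount determination value
    return "high" if info_value > info_threshold else "low"
```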
Preferably, as shown in fig. 4, the method for acquiring the watermark matrix specifically includes the following steps:
step 1921: and acquiring an initial watermark image.
Acquisition can be triggered by a mechanical key or a virtual key: with a mechanical key, pressing the trigger key obtains the initial watermark image; with a virtual key, pressing the corresponding virtual trigger key in the interface of the relevant software does the same. The initial watermark image is selected according to the size of the matrix to be embedded; since the matrix to be embedded in this application is of size M×N with M = 4 and N = 4, the initial watermark image may, for example, be chosen as 8×4 or 4×8, which facilitates the subsequent steps.
Step 1922: and symmetrically partitioning the initial watermark image to form a first pre-processed watermark image and a second pre-processed watermark image respectively.
The initial watermark image, of size 8×4 or 4×8, is symmetrically partitioned, forming two 4×4 images defined respectively as the first pre-processed watermark image and the second pre-processed watermark image.
Step 1923: and respectively acquiring RGB data corresponding to the first pre-processing watermark image and the second pre-processing watermark image.
The process of obtaining is the same as the method of obtaining RGB data in step 1911, and therefore, the description is omitted.
Step 1924: performing logical operation on all RGB values of the first pre-processed watermark image to form a first watermark judgment value; and performing logical operation on all RGB values of the second pre-processed watermark image to form a second watermark judgment value.
The logical operation in this step is likewise addition.
Step 1925: comparing the first watermark decision value and the second watermark decision value with each other; if the first watermark decision value is greater than the second watermark decision value, defining the first pre-processed watermark image as a high information content watermark image, and defining the second pre-processed watermark image as a low information content watermark image; and if the second watermark judgment value is larger than the first watermark judgment value, defining the second pre-processed watermark image as a high information content watermark image, and defining the first pre-processed watermark image as a low information content watermark image.
The initial watermark image is preprocessed to be divided into a first preprocessed watermark image and a second preprocessed watermark image, and the first preprocessed watermark image and the second preprocessed watermark image are judged and analyzed to analyze the size of information content, so that a high-information-content watermark image and a low-information-content watermark image are formed.
When the high information content frame image is subjected to watermark embedding, a matrix corresponding to the low information content watermark image is used as a watermark matrix; the low information content frame image and the high information content watermark image correspond to each other, and when the watermark is embedded into the low information content frame image, a matrix corresponding to the high information content watermark image is used as a watermark matrix.
The low-information-content watermark image and the high-information-content watermark image correspond respectively to the high-information-content frame image and the low-information-content frame image: the high-information-content frame image is paired with the low-information-content watermark image, and the low-information-content frame image with the high-information-content watermark image. This further reduces the distortion of the embedded image and improves the robustness of the video.
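A minimal sketch of preparing and pairing the watermark halves, assuming an 8×4 binary watermark split into two 4×4 halves:

```python
# Split the 8x4 watermark into two 4x4 halves, rank them by information
# amount, and pair each frame with the half of the opposite level.
import numpy as np

def split_watermark(wm):                    # wm: 8x4 binary array
    first, second = wm[:4, :], wm[4:, :]
    if first.sum() > second.sum():          # compare the two decision values
        return first, second                # (high-info half, low-info half)
    return second, first

def watermark_for_frame(frame_class, high_half, low_half):
    # High-information frames get the low-information half, and vice versa.
    return low_half if frame_class == "high" else high_half
```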
In one embodiment, the robust HDR video watermark embedding and extracting method based on the JND model and the T-QR includes a watermark extracting method, as shown in fig. 5, specifically as follows:
Step 2100: acquire the HDR video information in which watermark embedding has been completed.
Step 2200: extract the watermark-embedded key frame image information from the watermark-embedded HDR video information.
Step 2300: define the RGB data of the watermark-embedded key frame image information as a third-order tensor; divide the RGB data into non-overlapping second sub-tensors; decompose each second sub-tensor to extract a matrix to be extracted; the first sub-tensor and the second sub-tensor have the same size.
Step 2400: extract the watermark matrix from each matrix to be extracted to recover the initial matrix to be embedded, until extraction is completed for all matrices to be extracted corresponding to the watermark-embedded key frame image information.
Step 2500: reversely synthesize all matrices from which the watermark has been extracted to form the initial key frame image information.
Step 2600: synthesize all key frame image information from which the watermark has been extracted to form the initial HDR video information.
The watermark extraction method is the inverse of the watermark embedding method, and its implementation mirrors the embedding process, so it is not described in detail; applying this inverse process to the watermark-embedded HDR video information restores the original HDR video information, completing the watermark embedding and extraction functions.
An embodiment of the application provides a computer-readable storage medium storing a program which, when loaded and executed by a processor, implements the individual steps described in the flows of FIG. 1 to FIG. 5.
The computer-readable storage medium includes, for example, various media capable of storing program code: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Based on the same inventive concept, an embodiment of the present application provides a robust HDR video watermark embedding and extracting system based on the JND model and T-QR, comprising a memory, a processor, and a program stored in the memory and executable on the processor; the program can be loaded and executed by the processor to implement the robust HDR video watermark embedding and extracting method based on the JND model and T-QR described in the flows of FIG. 1 to FIG. 5.
The system specifically comprises a watermark embedding device and a watermark extracting device.
The watermark embedding apparatus includes:
the first acquisition module, used by a user to acquire HDR video information;
the key frame image extraction module is used for extracting key frame image information according to the HDR video information;
the to-be-embedded matrix analysis module defines RGB data of the key frame image information as a third-order tensor; dividing RGB data of the key frame image information into non-overlapping first sub tensors; decomposing the first sub tensor to extract a matrix to be embedded;
the JND model construction module is used for analyzing and forming brightness perception information, contrast perception information and time domain perception information according to the key frame image information; constructing a JND model by using the brightness perception information, the contrast perception information and the time domain perception information; carrying out non-overlapping partitioning on data corresponding to the JND model to form embedded reference matrixes in one-to-one correspondence with the first sub-tensors; the matrix to be embedded corresponds to the embedded reference matrix one by one;
the watermark embedding module is used for embedding a preset watermark matrix into a matrix to be embedded according to the embedding reference matrix to form an embedding matrix until all the matrixes to be embedded corresponding to the key frame image information complete the embedding of the watermark matrix;
the first image synthesis module is used for carrying out reverse synthesis on all the embedded matrixes to form key frame image information which is embedded with the watermark; and
and the first video synthesis module is used for synthesizing all the watermark embedding completed key frame image information to form watermark embedding completed HDR video information.
The system further comprises a watermark extraction apparatus; the watermark extraction apparatus includes:
the second acquisition module is used for acquiring HDR video information of which the watermark embedding is finished;
the key frame image extraction module is used for extracting the key frame image information of which the watermark embedding is finished according to the HDR video information of which the watermark embedding is finished;
the to-be-extracted matrix analysis module defines RGB data of the key frame image information in which the watermark embedding is completed as a third-order tensor; dividing RGB data of the key frame image information subjected to watermark embedding into non-overlapping second sub tensors; decomposing the second sub-tensor to extract a matrix to be extracted; the first sub tensor and the second sub tensor have the same size;
the watermark extraction module is used for extracting a watermark matrix from the matrix to be extracted to form an initial matrix to be embedded until all the matrixes to be extracted corresponding to the key frame image information of which the watermark is embedded are extracted;
the second image synthesis module is used for reversely synthesizing all the matrixes to be embedded, which are subjected to watermark extraction, so as to form initial key frame image information; and
and the second video synthesis module is used for synthesizing all the key frame image information subjected to watermark extraction to form initial HDR video information.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working processes of the system, the apparatus and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disk.
The above embodiments are only intended to describe the technical solutions of the present application in detail and to help in understanding the method and its core idea; they should not be construed as limiting the present application. Those skilled in the art should also appreciate that various modifications and substitutions can be made without departing from the scope of the present disclosure.

Claims (9)

1. A robust HDR video watermark embedding and extracting method based on a JND model and T-QR is characterized by comprising a watermark embedding method, and specifically comprises the following steps:
acquiring HDR video information;
extracting key frame image information according to the HDR video information;
defining RGB data of the key frame image information as a third-order tensor; dividing RGB data of the key frame image information into non-overlapping first sub tensors; decomposing the first sub tensor to extract a matrix to be embedded;
analyzing and forming brightness perception information, contrast perception information and time domain perception information according to the key frame image information; constructing a JND model by using the brightness perception information, the contrast perception information and the time domain perception information;
carrying out non-overlapping partitioning on data corresponding to the JND model to form embedded reference matrixes in one-to-one correspondence with the first sub-tensors; the matrix to be embedded corresponds to the embedded reference matrix one by one;
embedding a preset watermark matrix into a matrix to be embedded according to the embedding reference matrix to form an embedding matrix until all the matrixes to be embedded corresponding to the key frame image information complete the embedding of the watermark matrix;
all the embedded matrixes are reversely synthesized to form key frame image information of which the watermark is embedded;
synthesizing all the key frame image information subjected to watermark embedding to form HDR video information subjected to watermark embedding;
the method for forming the luminance perception information is as follows:
acquiring a first brightness value of the key frame image information;
converting the first brightness value into a second brightness value required by constructing a JND model, wherein a specific formula is as follows:
[formula image: the second luminance value as a function of the first luminance value]
wherein $\log_{10}(L_a)$ represents the second luminance value; $\log_{10}(L)$ represents the first luminance value;
the method for forming contrast perception information is as follows:
converting the second luminance value of two adjacent pixels into a contrast value; the specific formula is as follows:
[formula image: the contrast value between the $i$-th pixel and the $j$-th pixel]
wherein $L_i$ and $L_j$ are adjacent luminance pixels and $k$ is the Gaussian pyramid level;
converting the contrast value into the visual contrast response value required for constructing the JND model; the specific formula is as follows:
[formula image: the visual contrast response value as a function of the contrast value]
the method for forming the time domain perception information is as follows:
defining a pixel point in the key frame image information as $(x, y)$;
acquiring the gray level $I(x, y)$ of the pixel point $(x, y)$ at time $t$; defining the horizontal motion component of the optical flow $W = (u, v)$ at this point as $u(x, y)$ and the vertical motion component as $v(x, y)$, wherein
$$u = \frac{dx}{dt}, \qquad v = \frac{dy}{dt};$$
obtaining the minimization term $E$ from the optical flow constraint term $E_c$ and the global smoothing constraint term $E_s$;
the optical flow constraint term $E_c$ is:
$$E_c = \iint \left( I_x u + I_y v + I_t \right)^2 dx\, dy,$$
wherein
$$I_x = \frac{\partial I}{\partial x}, \qquad I_y = \frac{\partial I}{\partial y}, \qquad I_t = \frac{\partial I}{\partial t};$$
the global smoothing constraint term $E_s$ is:
$$E_s = \iint \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) dx\, dy,$$
wherein $\|\nabla u\|^2 + \|\nabla v\|^2 = u_x^2 + u_y^2 + v_x^2 + v_y^2$ is the sum of squares of the optical-flow gradient moduli, and $\bar{u}$ and $\bar{v}$ are the local mean values of $u$ and $v$, respectively;
the minimization term $E$ is:
$$E = \iint \left[ \left( I_x u + I_y v + I_t \right)^2 + \lambda \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] dx\, dy,$$
wherein $\lambda$ is a preset value chosen according to the noise condition in the image;
differentiating $E$ with respect to $u$ and $v$, respectively, yields the optical flow $W = (u, v)$:
$$u = \bar{u} - \frac{I_x \left( I_x \bar{u} + I_y \bar{v} + I_t \right)}{\lambda + I_x^2 + I_y^2}, \qquad v = \bar{v} - \frac{I_y \left( I_x \bar{u} + I_y \bar{v} + I_t \right)}{\lambda + I_x^2 + I_y^2};$$
the constructed JND model is as follows:
JND=La+T+W。
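To make the temporal-perception equations reconstructed above concrete, here is a minimal Python sketch of the fixed-point iteration they describe (a Horn-Schunck-style scheme). The derivative stencils, the 4-neighbour averaging used for the local means $\bar{u}$ and $\bar{v}$, the iteration count, and the function name are assumptions of this sketch, not details fixed by the claim.

```python
import numpy as np

def optical_flow(I1, I2, lam=100.0, iters=100):
    """Minimize E = Ec + lam*Es for two consecutive gray-level frames
    I1, I2 (2-D float arrays) and return the optical flow W = (u, v)."""
    I1 = I1.astype(np.float64)
    I2 = I2.astype(np.float64)
    Ix = np.gradient(I1, axis=1)          # Ix = dI/dx
    Iy = np.gradient(I1, axis=0)          # Iy = dI/dy
    It = I2 - I1                          # It = dI/dt as a frame difference
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(iters):
        # local means u_bar, v_bar over the 4-neighbourhood
        u_bar = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                        + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        v_bar = 0.25 * (np.roll(v, 1, 0) + np.roll(v, -1, 0)
                        + np.roll(v, 1, 1) + np.roll(v, -1, 1))
        # update equations as reconstructed in the claim:
        #   u = u_bar - Ix*(Ix*u_bar + Iy*v_bar + It) / (lam + Ix^2 + Iy^2)
        #   v = v_bar - Iy*(Ix*u_bar + Iy*v_bar + It) / (lam + Ix^2 + Iy^2)
        common = (Ix * u_bar + Iy * v_bar + It) / (lam + Ix**2 + Iy**2)
        u = u_bar - Ix * common
        v = v_bar - Iy * common
    return u, v
```

The resulting flow field supplies the temporal term $W$ that enters JND = La + T + W.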
2. The JND model and T-QR based robust HDR video watermark embedding and extraction method according to claim 1, wherein the specific method for extracting key frame image information from the HDR video information is as follows:
acquiring all frame images in HDR video information;
judging the difference between two frame images by a histogram difference method to form an image difference value;
and comparing the image difference value with a preset comparison threshold value, selecting a corresponding frame image according to a comparison result, and taking the selected frame image as key frame image information.
3. The JND model and T-QR based robust HDR video watermark embedding and extraction method according to claim 2, wherein: the method for forming the image difference value is as follows:
$$d(f_i, f_{i+1}) = \frac{\left| h(f_i) - h(f_{i+1}) \right|}{\max\left( h(f_i),\, h(f_{i+1}) \right)}, \qquad i = 1, 2, \dots, l-1,$$
wherein $l$ is the number of all frame images contained in the HDR video information; $i$ represents the frame image of the $i$-th frame; $h(f_i)$ represents the histogram of the frame image of the $i$-th frame; $h(f_{i+1})$ represents the histogram of the frame image of the $(i+1)$-th frame; and $\max(h(f_i), h(f_{i+1}))$ denotes choosing the larger of $h(f_i)$ and $h(f_{i+1})$.
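A minimal Python sketch of the key-frame selection of claims 2 and 3, under two stated assumptions: frames are pre-quantized to integer gray levels for histogramming, and the per-bin ratios are averaged into a single difference value (the claim fixes only the difference-over-maximum form).

```python
import numpy as np

def histogram_difference(fa, fb, bins=256):
    """Normalized histogram difference between consecutive frames.
    Assumes fa, fb hold integer gray levels in [0, bins)."""
    ha, _ = np.histogram(fa, bins=bins, range=(0, bins))
    hb, _ = np.histogram(fb, bins=bins, range=(0, bins))
    num = np.abs(ha - hb).astype(np.float64)
    den = np.maximum(ha, hb).astype(np.float64)
    # guard empty bins; averaging the per-bin ratios is an assumption
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return ratio.mean()

def select_key_frames(frames, threshold):
    """Claim 2: keep frame i when its difference to frame i+1 exceeds a
    preset comparison threshold (the threshold value is an input here)."""
    return [frames[i] for i in range(len(frames) - 1)
            if histogram_difference(frames[i], frames[i + 1]) > threshold]
```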
4. The JND model and T-QR based robust HDR video watermark embedding and extraction method according to claim 2, wherein: after selecting the corresponding frame image according to the comparison result, further screening the selected frame image, wherein the specific method comprises the following steps:
acquiring an RGB value of each pixel point in a frame image;
performing logical operation on RGB values of all pixel points of the frame image to form an information quantity judgment value;
comparing the information quantity judgment value with a preset information quantity judgment threshold value, and if the information quantity judgment value is greater than the information quantity judgment threshold value, defining the frame image as a high information quantity frame image; if the information quantity judgment value is less than or equal to the information quantity judgment threshold value, defining the frame image as a low information quantity frame image; and taking the high information content frame image and the low information content frame image as key frame image information.
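Claim 4's further screening reduces to a sum-and-threshold test; taking the logical operation to be addition (as with the watermark decision values) and treating the information-quantity decision threshold as an input, a sketch might read:

```python
import numpy as np

def classify_frame(frame: np.ndarray, info_threshold: float) -> str:
    """Claim 4 sketch: the information-quantity decision value is the sum
    of the RGB values of all pixels; the preset threshold is an assumed
    input, since the claim does not fix its value."""
    decision_value = float(frame.astype(np.int64).sum())
    return "high" if decision_value > info_threshold else "low"
```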
5. The JND model and T-QR based robust HDR video watermark embedding and extracting method as claimed in claim 4, wherein the watermark matrix obtaining method is as follows:
acquiring an initial watermark image;
symmetrically partitioning the initial watermark image to form a first pre-processed watermark image and a second pre-processed watermark image respectively;
respectively acquiring RGB data corresponding to the first pre-processed watermark image and the second pre-processed watermark image;
performing a logical operation on all RGB values of the first pre-processed watermark image to form a first watermark decision value; performing a logical operation on all RGB values of the second pre-processed watermark image to form a second watermark decision value;
comparing the first watermark decision value and the second watermark decision value with each other; if the first watermark decision value is greater than the second watermark decision value, defining the first pre-processed watermark image as a high information content watermark image and the second pre-processed watermark image as a low information content watermark image; if the second watermark decision value is greater than the first watermark decision value, defining the second pre-processed watermark image as a high information content watermark image and the first pre-processed watermark image as a low information content watermark image;
when the high information content frame image is subjected to watermark embedding, a matrix corresponding to the low information content watermark image is used as a watermark matrix; the low information content frame image and the high information content watermark image correspond to each other, and when the watermark is embedded into the low information content frame image, a matrix corresponding to the high information content watermark image is used as a watermark matrix.
6. The JND model and T-QR based robust HDR video watermark embedding and extracting method as claimed in claim 1, comprising a watermark extracting method, specifically as follows:
acquiring HDR video information with embedded watermark;
extracting key frame image information of which the watermark embedding is finished according to the HDR video information of which the watermark embedding is finished;
defining RGB data of the key frame image information with the embedded watermark as a third-order tensor; dividing RGB data of the key frame image information subjected to watermark embedding into non-overlapping second sub tensors; decomposing the second sub-tensor to extract a matrix to be extracted; the first sub tensor and the second sub tensor have the same size;
extracting a watermark matrix from the matrix to be extracted to form an initial matrix to be embedded until all the matrixes to be extracted corresponding to the key frame image information of which the watermark is embedded are extracted;
performing reverse synthesis on all the matrixes to be embedded after watermark extraction to form initial key frame image information;
and synthesizing all the watermark extraction-completed key frame image information to form initial HDR video information.
7. A robust HDR video watermark embedding and extracting system based on a JND model and T-QR is characterized by comprising a watermark embedding device; the watermark embedding apparatus includes:
the first acquisition module, used by a user to acquire HDR video information;
the key frame image extraction module is used for extracting key frame image information according to the HDR video information;
the to-be-embedded matrix analysis module defines RGB data of the key frame image information as a third-order tensor; dividing RGB data of the key frame image information into non-overlapping first sub tensors; decomposing the first sub tensor to extract a matrix to be embedded;
the JND model construction module is used for analyzing and forming brightness perception information, contrast perception information and time domain perception information according to the key frame image information; constructing a JND model by using the brightness perception information, the contrast perception information and the time domain perception information; carrying out non-overlapping partitioning on data corresponding to the JND model to form embedded reference matrixes in one-to-one correspondence with the first sub-tensors; the matrix to be embedded corresponds to the embedded reference matrix one by one;
the method for forming the luminance perception information is as follows:
acquiring a first brightness value of the key frame image information;
converting the first brightness value into a second brightness value required by constructing a JND model, wherein a specific formula is as follows:
[formula image: the second luminance value as a function of the first luminance value]
wherein $\log_{10}(L_a)$ represents the second luminance value; $\log_{10}(L)$ represents the first luminance value;
the method for forming contrast perception information is as follows:
converting the second luminance value of two adjacent pixels into a contrast value; the specific formula is as follows:
[formula image: the contrast value between the $i$-th pixel and the $j$-th pixel]
wherein $L_i$ and $L_j$ are adjacent luminance pixels and $k$ is the Gaussian pyramid level;
converting the contrast value into the visual contrast response value required for constructing the JND model; the specific formula is as follows:
[formula image: the visual contrast response value as a function of the contrast value]
the method for forming the time domain perception information is as follows:
defining a pixel point in the key frame image information as $(x, y)$;
acquiring the gray level $I(x, y)$ of the pixel point $(x, y)$ at time $t$; defining the horizontal motion component of the optical flow $W = (u, v)$ at this point as $u(x, y)$ and the vertical motion component as $v(x, y)$, wherein
$$u = \frac{dx}{dt}, \qquad v = \frac{dy}{dt};$$
obtaining the minimization term $E$ from the optical flow constraint term $E_c$ and the global smoothing constraint term $E_s$;
the optical flow constraint term $E_c$ is:
$$E_c = \iint \left( I_x u + I_y v + I_t \right)^2 dx\, dy,$$
wherein
$$I_x = \frac{\partial I}{\partial x}, \qquad I_y = \frac{\partial I}{\partial y}, \qquad I_t = \frac{\partial I}{\partial t};$$
the global smoothing constraint term $E_s$ is:
$$E_s = \iint \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) dx\, dy,$$
wherein $\|\nabla u\|^2 + \|\nabla v\|^2 = u_x^2 + u_y^2 + v_x^2 + v_y^2$ is the sum of squares of the optical-flow gradient moduli, and $\bar{u}$ and $\bar{v}$ are the local mean values of $u$ and $v$, respectively;
the minimization term $E$ is:
$$E = \iint \left[ \left( I_x u + I_y v + I_t \right)^2 + \lambda \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] dx\, dy,$$
wherein $\lambda$ is a preset value chosen according to the noise condition in the image;
differentiating $E$ with respect to $u$ and $v$, respectively, yields the optical flow $W = (u, v)$:
$$u = \bar{u} - \frac{I_x \left( I_x \bar{u} + I_y \bar{v} + I_t \right)}{\lambda + I_x^2 + I_y^2}, \qquad v = \bar{v} - \frac{I_y \left( I_x \bar{u} + I_y \bar{v} + I_t \right)}{\lambda + I_x^2 + I_y^2};$$
the constructed JND model is as follows:
JND=La+T+W;
the watermark embedding module is used for embedding a preset watermark matrix into a matrix to be embedded according to the embedding reference matrix to form an embedding matrix until all the matrixes to be embedded corresponding to the key frame image information complete the embedding of the watermark matrix;
the first image synthesis module is used for carrying out reverse synthesis on all the embedded matrixes to form key frame image information which is embedded with the watermark; and
and the first video synthesis module is used for synthesizing all the watermark embedding completed key frame image information to form watermark embedding completed HDR video information.
8. The JND model and T-QR based robust HDR video watermark embedding and extraction system according to claim 7, comprising a watermark extraction device; the watermark extraction apparatus includes:
the second acquisition module is used for acquiring HDR video information of which the watermark embedding is finished;
the key frame image extraction module is used for extracting the key frame image information of which the watermark embedding is finished according to the HDR video information of which the watermark embedding is finished;
the to-be-extracted matrix analysis module defines RGB data of the key frame image information in which the watermark embedding is completed as a third-order tensor; dividing RGB data of the key frame image information subjected to watermark embedding into non-overlapping second sub tensors; decomposing the second sub-tensor to extract a matrix to be extracted; the first sub tensor and the second sub tensor have the same size;
the watermark extraction module is used for extracting a watermark matrix from the matrix to be extracted to form an initial matrix to be embedded until all the matrixes to be extracted corresponding to the key frame image information of which the watermark is embedded are extracted;
the second image synthesis module is used for reversely synthesizing all the matrixes to be embedded, which are subjected to watermark extraction, so as to form initial key frame image information; and
and the second video synthesis module is used for synthesizing all the key frame image information subjected to watermark extraction to form initial HDR video information.
9. A computer-readable storage medium storing a program which, when being loaded and executed by a processor, implements the JND model and T-QR based robust HDR video watermark embedding and extraction method according to any one of claims 1 to 6.
CN202011644779.2A 2020-12-31 2020-12-31 Robust HDR video watermark embedding and extracting method and system based on JND model and T-QR and storage medium Active CN112866820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011644779.2A CN112866820B (en) 2020-12-31 2020-12-31 Robust HDR video watermark embedding and extracting method and system based on JND model and T-QR and storage medium


Publications (2)

Publication Number Publication Date
CN112866820A CN112866820A (en) 2021-05-28
CN112866820B true CN112866820B (en) 2022-03-08

Family

ID=76001023

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011644779.2A Active CN112866820B (en) 2020-12-31 2020-12-31 Robust HDR video watermark embedding and extracting method and system based on JND model and T-QR and storage medium

Country Status (1)

Country Link
CN (1) CN112866820B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113706359A (en) * 2021-08-04 2021-11-26 武汉理工大学 Thangka digital watermarking method based on visual perception model
CN115512276B (en) * 2022-10-25 2023-07-25 湖南三湘银行股份有限公司 Video anti-counterfeiting identification method and system based on artificial intelligence

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002013138A1 (en) * 2000-08-03 2002-02-14 Digital Copyright Technologies Ag Method for adaptive digital watermarking robust against geometric transforms
CN102905130A (en) * 2012-09-29 2013-01-30 浙江大学 Multi-resolution JND (Just Noticeable Difference) model building method based on visual perception
CN109525847A (en) * 2018-11-13 2019-03-26 华侨大学 A kind of just discernable distortion model threshold value calculation method
CN110246076A (en) * 2019-05-21 2019-09-17 宁波大学 The high dynamic range images water mark method decomposed based on Tucker
CN110798749A (en) * 2019-10-18 2020-02-14 宁波大学科学技术学院 Robust video watermarking method based on tensor singular value decomposition
CN111325651A (en) * 2018-12-14 2020-06-23 中国移动通信集团山东有限公司 Quantitative watermarking method and device based on perception JND model
CN111641879A (en) * 2020-06-08 2020-09-08 北京永新视博数字电视技术有限公司 Video watermark adding method, device, storage medium and equipment
CN112070650A (en) * 2020-09-15 2020-12-11 中国科学院信息工程研究所 Watermark embedding and detecting method for panoramic image

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7280689B2 (en) * 2002-07-05 2007-10-09 Qdesign U.S.A., Inc. Anti-compression techniques for visual images
US20040064702A1 (en) * 2002-09-27 2004-04-01 Yu Hong Heather Methods and apparatus for digital watermarking and watermark decoding
US9245310B2 (en) * 2013-03-15 2016-01-26 Qumu Corporation Content watermarking
CN109300078B (en) * 2018-08-31 2022-12-30 太原理工大学 Image spread spectrum watermark embedding method with self-adaptive embedding strength


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on HDR Image Information Hiding Technology; Wang Yang; China Master's Theses Full-text Database, Information Science and Technology; 2020-06-15; full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant