CN111861846A - Electronic document digital watermark processing method and system

Info

Publication number: CN111861846A
Application number: CN202010660167.6A
Authority: CN
Inventors: 曾国坤
Original assignee: Shenzhen Graduate School Harbin Institute of Technology
Current assignee: Shenzhen Graduate School Harbin Institute of Technology
Priority date: 2020-07-10
Filing date: 2020-07-10
Publication date: 2020-10-30

Abstract

The invention relates to a digital watermark processing method and a system for electronic documents, wherein the method comprises a watermark embedding step and a watermark extracting step, and the watermark embedding step comprises the following steps: acquiring user authority; acquiring watermark information, wherein the watermark information comprises user information; uploading the watermark information to a server; receiving the watermark information and storing the watermark information in a database; extracting a plurality of features in the watermark information according to a preset algorithm, and performing feature fusion to generate a digital watermark; after converting the electronic document into an image, embedding the digital watermark into the image, and restoring the image into the electronic document; the watermark extraction step is the reverse operation of the watermark embedding step. The invention not only can solve the problems of low embedding capacity, poor robustness and the like in the prior art, but also has the advantages of better encryption effect, large key space, strong sensitivity, capability of resisting exhaustive attack and statistical attack and the like.

Description

Electronic document digital watermark processing method and system

Technical Field

The invention relates to the technical field of digital watermark processing, in particular to a digital watermark processing method and a digital watermark processing system for electronic documents.

Background

The rapid development of computer networks and information technology has accelerated the development of office automation and electronic commerce. A great deal of digital information, such as videos, images, e-mails and documents, is now spread through networks, and the problems of infringement and copyright protection have become increasingly prominent; digital watermarking technology has therefore attracted much attention and research. Digital watermarking of multimedia data relies mainly on the large amount of redundant information present in the carrier. Compared with other multimedia data, however, the data redundancy of an electronic document image is very limited, which has restricted the development of digital watermarking technology based on text images and leads to problems such as low embedding capacity and poor robustness.

Disclosure of Invention

Based on this, there is a need to provide a digital watermarking method and system for electronic documents, which not only can solve the above problems, but also has the advantages of better encryption effect, large key space, strong sensitivity, and resistance to exhaustive attack and statistical attack.

In order to achieve the above purpose, the invention adopts the following technical scheme.

The invention firstly provides a digital watermark processing method of an electronic document, which comprises a watermark embedding step, wherein the watermark embedding step specifically comprises the following steps:

Acquiring user authority;

acquiring watermark information, wherein the watermark information comprises user information;

uploading the watermark information to a server;

receiving the watermark information and storing the watermark information in a database;

extracting a plurality of features in the watermark information according to a preset algorithm, and performing feature fusion to generate a digital watermark;

and after the electronic document is converted into an image, the digital watermark is embedded into the image, and then the image is restored into the electronic document.

The electronic document digital watermark processing method further comprises a watermark extraction step, wherein the watermark extraction step specifically comprises the following steps:

converting the electronic document embedded with the digital watermark into an image;

processing the image according to a preset algorithm, and extracting the digital watermark;

and converting the digital watermark into a watermark image by adopting inverse operation.

In the above electronic document digital watermark processing method, the watermark information is biometric information of the user, including one or more of facial features, hand features, fingerprint features and voice features of the user.

The invention further provides a digital watermark processing system for electronic documents, which comprises a watermark embedding unit, wherein the watermark embedding unit specifically comprises:

The authorization module is used for acquiring user permission;

the information acquisition module is used for acquiring watermark information, and the watermark information comprises user information;

the information uploading module is used for uploading the watermark information to a server;

the database access module is used for receiving the watermark information and storing the watermark information into a database of the server;

the watermark processing module is used for extracting a plurality of characteristics in the watermark information according to a preset algorithm, and performing characteristic fusion to generate a digital watermark;

and the watermark embedding module is used for embedding the digital watermark into the image after converting the electronic document into the image and restoring the image into the electronic document.

The above digital watermarking processing system for electronic documents further comprises a watermark extracting unit, wherein the watermark extracting unit specifically comprises:

the conversion module is used for converting the electronic document embedded with the digital watermark into an image;

the extraction module is used for processing the image according to a preset algorithm and extracting the digital watermark;

and the restoring module is used for converting the digital watermark into a watermark image by adopting inverse operation.

In the above digital watermark processing system for electronic documents, the digital watermark processing system for electronic documents is composed of a mobile client and a server, wherein the mobile client comprises the authorization module, an information acquisition module and an information uploading module, and the server comprises the database, a database access module, a watermark processing module, a watermark embedding module, a watermark decryption module, a search module and an extraction module.

In the above digital watermarking processing system for electronic documents, the watermark information is biological characteristic information of a user, including one or more of facial characteristics, hand characteristics, fingerprint characteristics and voice characteristics of the user;

the mobile client is installed on the mobile phone, and the information acquisition module comprises a camera, a microphone, a fingerprint recognizer and/or a face recognizer of the mobile phone.

The invention also provides an electronic document digital watermark processing method, which comprises a watermark embedding step of embedding the watermark image into the electronic document, wherein the watermark embedding step specifically comprises the following steps:

converting the electronic document into a carrier image with the bit depth of 8;

selecting white pixel points to reconstruct a carrier image, and decomposing the reconstructed carrier image into four frequency sub-bands, LL, HL, LH and HH, by using a discrete wavelet transform, the HH sub-band being selected as the watermark embedding area;

performing singular value decomposition on the watermark image;

embedding the watermark image into the watermark embedding area, namely, replacing singular values of the watermark embedding area with singular values of the watermark image, and obtaining a reconstructed carrier sub-image embedded with the watermark by adopting inverse operation;

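and restoring the reconstructed carrier sub-image to an electronic document image according to the original pixel position.

The method further comprises a watermark extraction step, wherein the watermark extraction step specifically comprises the following steps: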

extracting pixel points of the electronic document image embedded with the watermark to obtain a carrier image embedded with the watermark;

carrying out discrete wavelet transform on the carrier image embedded with the watermark, and selecting an HH frequency domain part as a watermark extraction area;

performing singular value decomposition on the watermark extraction area, and extracting the singular value of the watermark extraction area;

and restoring the watermark image by using the obtained singular value by adopting inverse operation.

In the above electronic document digital watermark processing method, the watermark image contains the biometric information of the user, including one or more of facial features, hand features, fingerprint features and voice features of the user.

Compared with the traditional digital watermarking technology, the method has the following outstanding advantages:

1. The invention provides an electronic document digital watermark processing method and system tailored to the pixel distribution of electronic document images. An electronic document contains a large number of white pixel points interspersed among the lines of characters, and such a large number of identical pixel values is a property that natural images do not have. A new carrier image is reconstructed from these pixel points, the watermark is embedded into that carrier image, and the carrier image is then restored to the original pixel positions. This reconstruction and restoration also serves as a scrambling operation on the embedded watermark.

2. By analyzing the capacity of the white pixel points in the electronic document, the invention can embed 2 to 3 biometric features as watermark images. Experiments show that embedding multiple watermarks does not greatly affect the quality of the carrier image, and the watermark algorithm still has good invisibility and robustness.

Drawings

FIG. 1 is a schematic flow chart illustrating a watermark embedding step according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart illustrating a watermark extraction step according to an embodiment of the present invention;

FIG. 3 is a schematic block diagram of an electronic document digital watermarking system according to a second embodiment of the present invention;

FIG. 4 is a schematic flow chart illustrating a watermark embedding step according to a third embodiment of the present invention;

FIG. 5 is a schematic flow chart illustrating the watermark extraction step in the third embodiment of the present invention;

FIG. 6a is a diagram of the carrier image used in a simulation experiment in the third embodiment of the present invention;

FIG. 6b is a diagram of the watermark image used in the simulation experiment in the third embodiment of the present invention;

FIG. 6c is a diagram of the image after the watermark is embedded in the simulation experiment in the third embodiment of the present invention;

FIG. 6d is a diagram of the watermark image extracted in the simulation experiment in the third embodiment of the present invention;

FIG. 7 is a diagram of an electronic document image after embedding a watermark in a third embodiment of the present invention;

FIG. 8a is a diagram of a carrier image of a Chinese document used in a watermark processing test of an electronic document based on multiple biometric features according to the third embodiment of the present invention;

FIGS. 8b, 8c and 8d are diagrams of the watermark images to be embedded in the image shown in FIG. 8a;

FIG. 8e is a diagram of the image of FIG. 8a after the watermark images of FIGS. 8b, 8c and 8d are embedded;

FIG. 9a is a diagram of a carrier image of an English document used in a watermark processing test of an electronic document based on multiple biometric features according to the third embodiment of the present invention;

FIGS. 9b, 9c and 9d are diagrams of the watermark images to be embedded in the image shown in FIG. 9a;

FIG. 9e is a diagram of the image of FIG. 9a after the watermark images of FIGS. 9b, 9c and 9d are embedded.

The implementation of the objects of the present invention and their functions and principles will be further explained in the detailed description with reference to the attached drawings.

Detailed Description

The following further description is made with reference to the drawings and specific embodiments.

The first embodiment is as follows:

As shown in FIG. 1, the present embodiment provides a digital watermark processing method for electronic documents, which mainly includes a watermark embedding step and a watermark extraction step. The watermark embedding step embeds a watermark image into an electronic document; the watermark extraction step extracts the original watermark image from the electronic document in which the digital watermark is embedded, and is mainly used for purposes such as identity authentication.

Wherein, the watermark embedding step mainly comprises the following steps:

S11: acquiring user authority;

S12: acquiring watermark information, wherein the watermark information comprises user information;

S13: uploading the watermark information to a server;

S14: receiving the watermark information and storing the watermark information in a database;

S15: extracting a plurality of features in the watermark information according to a preset algorithm, and performing feature fusion to generate a digital watermark;

S16: after the electronic document is converted into an image, embedding the digital watermark into the image, and then restoring the image into the electronic document.

Specifically, step S11 of acquiring the user authority refers to obtaining the user's operation authority after the user has been authenticated by logging in with the account and password of a registered user. The user may be the author of the electronic document, a copyright owner of the electronic document, or another stakeholder.

Step S12 is to collect watermark information, where the watermark information in this embodiment is original authentication or mark information that is intended to be embedded in the electronic document as a digital watermark, and may be characteristic information of the user, including one or more of facial characteristics, hand characteristics, fingerprint characteristics, and voice characteristics of the user.

And after the acquisition is finished, uploading the watermark information to a server.

The server then receives the watermark information and stores it in a database.

Meanwhile, for the watermark information, a plurality of features in the watermark information are extracted according to a preset algorithm and are fused to generate the digital watermark.

And finally, after the electronic document is converted into an image, the digital watermark is embedded into the image, and the image is restored into the electronic document.

The electronic documents referred to in this embodiment include, but are not limited to, office documents, pdf documents, and bitmap documents. Thus, after the watermark embedding operation of the electronic document is completed, the safety and the authenticity of the electronic document in transmission can be ensured.

Referring to fig. 2, the watermark extracting step mainly includes the following steps:

S21: converting the electronic document embedded with the digital watermark into an image;

S22: processing the image according to a preset algorithm, and extracting the digital watermark;

S23: converting the digital watermark into a watermark image by adopting inverse operation.

The watermark extraction step is mainly used for judging whether the digital watermark is complete or is attacked, and basically belongs to the reverse operation of the watermark embedding step, and is not detailed here.

Digital images are inevitably affected during transmission, for example by compression or scanning; such an attack is called an unintentional attack. Some pirates intentionally destroy digital products, falsify image contents or even forge watermark information; such an attack is called a malicious attack. Either kind of attack affects the detection of the extracted watermark, and the watermark embedding and extraction technique of this embodiment can effectively protect the electronic document against damage or tampering.

The second embodiment is as follows:

As shown in FIG. 3, the present embodiment provides an electronic document digital watermark processing system 100, which mainly consists of two parts. One is a mobile client, which can be installed on a mobile intelligent device such as a mobile phone and is mainly used for registering a user and uploading biometric images of the user as the feature information of the watermark to be embedded later. The other is a server, which can be deployed on a remote Internet server or a cloud server and is mainly responsible for storing and managing the data and documents uploaded by the mobile client and for processing electronic documents. The electronic document digital watermark processing system 100 can be divided into a watermark embedding unit 10 and a watermark extraction unit 20, wherein the watermark embedding unit 10 comprises:

The authorization module 110 is located at the mobile client and used for acquiring the user right;

the information acquisition module 120 is located at the mobile client and is used for acquiring watermark information, wherein the watermark information comprises user information;

an information uploading module 130, located at the mobile client, for uploading the watermark information to the server;

the database access module 140 is located at the server end and used for receiving the watermark information and storing the watermark information into a database of the server;

the watermark processing module 150 is located at the server end and is used for extracting a plurality of features in the watermark information according to a predetermined algorithm, and performing feature fusion to generate a digital watermark;

and the watermark embedding module 160 is located at the server side and is used for embedding the digital watermark into the image after converting the electronic document into the image and restoring the image into the electronic document.

The watermark extraction unit 20 is located at the server side, and specifically includes:

a conversion module 210, configured to convert the digital watermark embedded electronic document into an image;

an extracting module 220, configured to process the image according to a predetermined algorithm, and extract the digital watermark;

and the restoring module 230 is configured to convert the digital watermark into a watermark image by using an inverse operation.

The extracted watermark image can be fed back to the mobile client for the user to check, store and authenticate.

The present embodiment uses biometric information of the user as watermark information, including but not limited to one or more of facial features, hand features, fingerprint features, and voice features of the user.

The mobile client is installed on an intelligent mobile device such as a mobile phone, and the information acquisition module comprises the camera, microphone, fingerprint recognizer and/or face recognizer of the device, which are used for acquiring facial features, hand features, voice features, fingerprint features and the like.

The embodiment combines multiple biological characteristics with digital watermarks, can improve the security of a document system, and realizes the protection of the copyright of a digital product, the certification of the authenticity of document contents, the tracking of piracy and the judgment of whether a carrier is tampered. If the user needs to judge the authenticity of the content of the electronic document or track the copyright, the watermark can be extracted from the electronic document for further tampering detection and biometric authentication.

The third embodiment is as follows:

As shown in FIG. 4, the present embodiment provides a more specific digital watermark processing method for electronic documents. Statistics show that a large number of white pixel points exist in electronic document images. Based on this finding, the method takes the electronic document as the carrier image and embeds and extracts the watermark with suitable techniques. It comprises two steps: a watermark embedding step, and a watermark extraction step that is the reverse of the watermark embedding step.

The watermark embedding step specifically includes:

S31: converting the electronic document into a carrier image with the bit depth of 8;

S32: selecting white pixel points to reconstruct a carrier image, and decomposing the reconstructed carrier image into four frequency sub-bands, LL, HL, LH and HH, by using a discrete wavelet transform, the HH sub-band being selected as the watermark embedding area;

S33: performing singular value decomposition on the watermark image;

S34: embedding the watermark image into the watermark embedding area, namely, replacing singular values of the watermark embedding area with singular values of the watermark image, and obtaining a reconstructed carrier sub-image embedded with the watermark by adopting inverse operation;

S35: restoring the reconstructed carrier sub-image to an electronic document image according to the original pixel position.

The pixel values of an image with a bit depth of 8 are generally concentrated in the middle and upper range of the gray scale. Statistics over a large number of electronic documents show that their pixel values are concentrated between 80 and 255 and that a large number of white pixel points are present, a situation that occurs only in documents.

The watermark image contains the biological characteristic information of the user, and comprises one or more of facial features, hand features, fingerprint features and sound features of the user.

In step S32, the HH sub-band is selected as the watermark embedding area because it mainly represents the edge and texture information of the original carrier image and contains little of the main information, so embedding the watermark image in this sub-band has little influence on the quality of the carrier image. The watermark embedding step of this embodiment adopts the following algorithm:

Let the input electronic document image be $D$ and the watermark image be $W_{p \times q}$. Pixel points whose value is 255, namely white pixel points, are extracted from the document image and reconstructed into a new carrier image $A_{m \times n}$, together with a matrix $P = \{p_{ij}\}$ that records the specific position of each extracted pixel point in the original document image.
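The reconstruction of the carrier image from white pixel points, and its later restoration, can be sketched as follows. This is only an illustration under stated assumptions: the function names `extract_white_carrier` and `restore_carrier` are hypothetical, the white pixels are collected in row-major order, and the carrier is taken to be square with a caller-chosen side length, none of which is specified in the text above.

```python
import numpy as np

def extract_white_carrier(doc, side):
    """Collect the first side*side white pixels (value 255) of an 8-bit
    document image into a square carrier image and record where each one
    came from (the role played by the position matrix P)."""
    rows, cols = np.nonzero(doc == 255)          # row-major scan of white pixels
    if rows.size < side * side:
        raise ValueError("not enough white pixels for the requested carrier size")
    rows, cols = rows[: side * side], cols[: side * side]
    carrier = doc[rows, cols].reshape(side, side).copy()
    positions = np.stack([rows, cols], axis=-1).reshape(side, side, 2)
    return carrier, positions

def restore_carrier(doc, carrier, positions):
    """Write the (possibly modified) carrier pixels back to their original positions."""
    out = doc.copy()
    out[positions[..., 0], positions[..., 1]] = carrier
    return out

# Toy 6x6 "document": white background with a few dark text pixels.
doc = np.full((6, 6), 255, dtype=np.uint8)
doc[1, 1:5] = 0
doc[3, 2:4] = 90

carrier, positions = extract_white_carrier(doc, side=4)
carrier[0, 0] = 254                      # stand-in for a watermark-induced change
marked = restore_carrier(doc, carrier, positions)
```

Because only white pixel positions are rewritten, the text pixels of the toy document are left untouched, which mirrors the scrambling-by-reconstruction idea described for the method.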

A one-level discrete wavelet decomposition is performed on the reconstructed carrier image $A_{m \times n}$ to generate four sub-images: LL, HL, LH and HH.

Singular value decomposition of the HH frequency domain sub-band is performed using the following equation:

$$HH = U_h \times S_h \times V_h^{T}$$

where $U_h$ is an $m \times m$ orthogonal matrix, $V_h$ is an $n \times n$ orthogonal matrix, $S_h = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_n)$, and the $\sigma_i$ are the singular values of the image.

Then, singular value decomposition is performed on the watermark image W:

$$W = U_s \times S_s \times V_s^{T}$$

Then, the singular values $S_s$ of the watermark image are used to replace the singular values $S_h$ of the reconstructed carrier image.

Then, obtaining a modified subgraph by using singular value decomposition inverse operation as follows:

$$HH_{new} = U_h \times S_s \times V_h^{T}$$

and finally, generating an image embedded with the watermark by adopting inverse discrete wavelet transform, and restoring the image according to the pixel position of the carrier image in the original document image to obtain the electronic document image embedded with the watermark.
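The singular value substitution at the heart of the embedding step can be sketched with NumPy as below. The sketch assumes that the singular vectors of the HH sub-band are kept on both sides and only the singular values are swapped, which is what "replacing the singular values" implies; the function name `substitute_singular_values` is hypothetical, random matrices stand in for the sub-band and the watermark, and the two matrices are assumed to yield the same number of singular values.

```python
import numpy as np

def substitute_singular_values(hh, watermark):
    """Build a modified sub-band from the sub-band's singular vectors and the
    watermark's singular values. The watermark's singular vectors are also
    returned, since an extraction step would need them as side information."""
    U_h, S_h, Vt_h = np.linalg.svd(hh, full_matrices=False)
    U_s, S_s, Vt_s = np.linalg.svd(watermark, full_matrices=False)
    if S_h.shape != S_s.shape:
        raise ValueError("sub-band and watermark must yield the same number of singular values")
    hh_new = U_h @ np.diag(S_s) @ Vt_h
    return hh_new, U_s, Vt_s

rng = np.random.default_rng(0)
hh = rng.normal(size=(8, 8))                    # stand-in for the HH sub-band
watermark = rng.uniform(0, 255, size=(8, 8))    # stand-in for the watermark image W
hh_new, U_s, Vt_s = substitute_singular_values(hh, watermark)
```

After the substitution, the singular values of `hh_new` coincide with those of the watermark, which is what the extraction step later relies on.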

As is well known, the concept of the wavelet transform was proposed by the French engineer Morlet in 1974, and the discrete wavelet is obtained by discretizing a continuous wavelet. Let the wavelet transform of an arbitrary function $x(t)$ be $W(a, b)$, where $a$ and $b$ are the scale and translation factors, respectively. The values of $a$ and $b$ in the wavelet basis function are restricted to a set of discrete points, that is, they are sampled discretely as shown in the following formula:

Then the wavelet $\Psi_{a,b}(t)$ is calculated as follows:

The discrete wavelet transform (DWT) is defined as:

$$\mathrm{DWT}(m, n) = \int_{\mathbb{R}} f(t)\, \Psi_{m,n}(t)\, dt$$
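The discrete sampling formula and the wavelet formula announced above are not reproduced in this text. For orientation only, the conventional discretization found in standard wavelet references is given below; it is an assumption about the intended form, not a quotation from the original filing. Here $a_0 > 1$ and $b_0 > 0$ are fixed constants and $m$, $n$ are integers; the common dyadic case corresponds to $a_0 = 2$, $b_0 = 1$.

```latex
a = a_0^{m}, \qquad b = n\, b_0\, a_0^{m}, \qquad m, n \in \mathbb{Z}

\Psi_{m,n}(t) = a_0^{-m/2}\, \Psi\!\left(a_0^{-m}\, t - n\, b_0\right)
```

The second line follows from substituting the sampled $a$ and $b$ into the continuous wavelet $\Psi_{a,b}(t) = |a|^{-1/2}\, \Psi\big((t - b)/a\big)$.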

The discrete wavelet transform is a frequency domain technique, namely a wavelet transform based on a finite time domain and a varying frequency. An image is stored as a two-dimensional matrix, so the wavelet transform of an image is a two-dimensional discrete wavelet transform, in which the one-dimensional wavelet transform is applied to the rows and to the columns of the two-dimensional signal in turn, giving an effective time-frequency decomposition of the image. In watermarking based on the discrete wavelet transform, the carrier image is first transformed to the frequency domain, and the frequency coefficients of the original carrier image are then modified according to the frequency coefficients of the transformed watermark image, so as to obtain a robust watermarked image.

The discrete wavelet transform decomposes an image hierarchically and provides a description of the image in both space and frequency. It decomposes the image in three basic spatial directions, namely horizontal, vertical and diagonal, resulting in four different components: the LL band, LH band, HL band and HH band. The first letter denotes low-pass or high-pass filtering applied to the rows of the carrier image, and the second letter denotes the filter applied to the columns of the carrier image. The LL band holds the approximation information obtained after low-pass filtering in both the horizontal and vertical directions; the LH band holds the detail information obtained after low-pass filtering in the horizontal direction and high-pass filtering in the vertical direction; the HL band holds the detail information obtained after high-pass filtering in the horizontal direction and low-pass filtering in the vertical direction; and the HH band holds the detail information obtained after high-pass filtering in both directions. The LL band is the lowest resolution level; it consists of the approximation of the carrier image, gathers most of the information and depicts the main features, while the other three bands retain the edge detail information of the carrier image in their respective directions.

For the second level of decomposition, any sub-band can again be decomposed into four sub-bands, and the higher the decomposition level, the more robust the watermarked image. At each decomposition level, the wavelet coefficients of the LL band have a greater magnitude than those of the other three bands. The human visual system is more sensitive to the low-frequency part, i.e. the LL band, so the watermark information is preferably embedded in the other three bands to better preserve the quality of the original image.

After the two-dimensional wavelet transform, the image is decomposed into 4 sub-block regions, each 1/4 of the original size. Each sub-block region contains the wavelet coefficients of the corresponding frequency band, which is equivalent to sampling every other point in the horizontal and vertical directions. The algorithm of the present embodiment selects the HH sub-band (the high-frequency sub-band) as the watermark embedding region, because this band represents the edge and texture information of the image and embedding the watermark there has little influence on the quality of the original image.
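A one-level two-dimensional decomposition of this kind can be sketched with the Haar wavelet. The choice of Haar is an assumption made here for brevity (the text does not name a wavelet), and the helper names `haar_dwt2` and `haar_idwt2` are hypothetical. The sketch follows the naming convention described above: the first letter refers to filtering along the rows (horizontal direction) and the second to filtering along the columns (vertical direction).

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(img):
    """One-level 2-D Haar DWT of an image with even height and width."""
    x = np.asarray(img, dtype=float)
    # 1-D transform along each row (horizontal direction)
    lo = (x[:, 0::2] + x[:, 1::2]) / SQRT2
    hi = (x[:, 0::2] - x[:, 1::2]) / SQRT2
    # 1-D transform along each column (vertical direction)
    LL = (lo[0::2] + lo[1::2]) / SQRT2
    LH = (lo[0::2] - lo[1::2]) / SQRT2
    HL = (hi[0::2] + hi[1::2]) / SQRT2
    HH = (hi[0::2] - hi[1::2]) / SQRT2
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2: undo the column step, then the row step."""
    h, w = LL.shape
    lo = np.empty((2 * h, w))
    hi = np.empty((2 * h, w))
    lo[0::2], lo[1::2] = (LL + LH) / SQRT2, (LL - LH) / SQRT2
    hi[0::2], hi[1::2] = (HL + HH) / SQRT2, (HL - HH) / SQRT2
    x = np.empty((2 * h, 2 * w))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x

image = np.full((8, 8), 255.0)
image[2, 1:7] = 0.0                      # one dark "text" stroke on a white page
LL, LH, HL, HH = haar_dwt2(image)
restored = haar_idwt2(LL, LH, HL, HH)
```

Each of the four sub-bands of the 8 × 8 toy image is 4 × 4, i.e. one quarter of the original size, and the inverse transform returns the original image.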

Singular value decomposition is a technique from linear algebra. It is an asymmetric orthogonal transformation based on eigenvectors and is widely applied in digital image processing. The singular values do not change significantly when the image is disturbed and are therefore very stable; they represent the intrinsic characteristics of an image, so that after the watermark is embedded, a simple attack on the image changes the singular values only slightly. Because singular values do not express the visual characteristics of the image, singular value decomposition offers good concealment and resistance to geometric attacks, and compensates well for the weakness of the discrete wavelet transform against geometric attacks. Singular value decomposition and the discrete wavelet transform of the image can therefore be combined to improve the robustness of the digital watermark and the execution speed of the digital watermark algorithm, and to provide more options for the embedding coefficients.

The basic idea of image singular value decomposition is as follows. If a digital image is represented by a matrix $A$, with $A \in \mathbb{R}^{m \times n}$ and $\mathrm{rank}(A) = r$, $r \le n$, then the singular value decomposition of the matrix $A$ is defined as:

$$A = U D V^{T}$$

where $U = [u_1, u_2, \dots, u_m] \in \mathbb{R}^{m \times m}$ and $V = [v_1, v_2, \dots, v_n] \in \mathbb{R}^{n \times n}$ are orthogonal matrices with column vectors $u_i$ and $v_i$; $U$ and $V$ are called the left singular matrix and the right singular matrix of $A$, respectively; $D = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_m)$ is a diagonal matrix; and $\sigma_i$ ($i = 1, 2, \dots, m$) are called the singular values of $A$, which are the positive square roots of the eigenvalues of $AA^{T}$ or $A^{T}A$ and satisfy $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > \sigma_{r+1} = \dots = \sigma_m = 0$. The geometric meaning of singular value decomposition is that, for a matrix of any size, one seeks a set of pairwise orthogonal unit vectors such that the vectors obtained by applying the matrix to them remain pairwise orthogonal.

The singular value of the image represents the inherent property of the image and has better stability, when the image has slight change, the singular value of the image does not change greatly, and the singular value of the matrix has transposition invariance and rotation invariance, which have important significance for realizing the robustness of the digital watermark.
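These properties can be checked numerically. The following is a small sketch with NumPy on a random matrix standing in for an image; multiplication by a random orthogonal matrix is used as a stand-in for a rotation, and the stability check uses Weyl's inequality for singular values (the shift of each singular value is bounded by the spectral norm of the perturbation), which is a standard result rather than something stated in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.uniform(0, 255, size=(6, 4))                   # stand-in for an image matrix

sigma = np.linalg.svd(A, compute_uv=False)             # singular values, descending
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]            # eigenvalues of A^T A, descending

sigma_transposed = np.linalg.svd(A.T, compute_uv=False)    # transposition invariance
Q, _ = np.linalg.qr(rng.normal(size=(6, 6)))                # random orthogonal matrix
sigma_rotated = np.linalg.svd(Q @ A, compute_uv=False)      # invariance under Q

noise = rng.normal(scale=0.5, size=A.shape)             # slight disturbance of the image
sigma_noisy = np.linalg.svd(A + noise, compute_uv=False)
max_shift = np.max(np.abs(sigma_noisy - sigma))
bound = np.linalg.norm(noise, 2)                        # spectral norm of the disturbance
```

Here `sigma` matches the square roots of `eigvals`, `sigma_transposed` and `sigma_rotated` match `sigma`, and `max_shift` does not exceed `bound`.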

As shown in fig. 5, the watermark extraction step is basically the reverse operation of the watermark embedding step, and is as follows:

S41: extracting pixel points of the electronic document image embedded with the watermark to obtain a carrier image embedded with the watermark;

S42: carrying out discrete wavelet transform on the carrier image embedded with the watermark, and selecting the HH frequency domain part as a watermark extraction area;

S43: performing singular value decomposition on the watermark extraction area, and extracting the singular values of the watermark extraction area;

S44: restoring the watermark image from the obtained singular values by adopting inverse operation.
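Steps S43 and S44 can be sketched as below, starting from the HH sub-band of the watermarked carrier image. One assumption needs stating loudly: singular values alone do not determine an image, so the sketch assumes the singular vectors of the original watermark (`U_s`, `Vt_s`) were retained at embedding time and are available to the extractor. The text does not say where these come from, and the function name `extract_watermark` is hypothetical.

```python
import numpy as np

def extract_watermark(hh_marked, U_s, Vt_s):
    """Take the singular values of the watermarked HH sub-band (S43) and
    recombine them with the retained watermark singular vectors (S44)."""
    S_marked = np.linalg.svd(hh_marked, compute_uv=False)
    return U_s @ np.diag(S_marked) @ Vt_s

# Demo: embed by singular value substitution (as in S34), then extract.
rng = np.random.default_rng(2)
hh = rng.normal(size=(8, 8))                      # stand-in for the HH sub-band
watermark = rng.uniform(0, 255, size=(8, 8))      # stand-in for the watermark image
U_h, _, Vt_h = np.linalg.svd(hh)
U_s, S_s, Vt_s = np.linalg.svd(watermark)
hh_marked = U_h @ np.diag(S_s) @ Vt_h

recovered = extract_watermark(hh_marked, U_s, Vt_s)

# A mild disturbance of the sub-band only slightly changes the recovered watermark.
disturbed = hh_marked + rng.normal(scale=0.01, size=hh.shape)
recovered_noisy = extract_watermark(disturbed, U_s, Vt_s)
```

Without disturbance the watermark is recovered up to floating-point error; with the small disturbance the recovered pixel values stay close to the original ones.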

The algorithm of this embodiment was tested in a simulation experiment in the MATLAB R2012b environment. The experimental carrier image is a 512 × 512 electronic document image, as shown in FIG. 6a; the watermark image is a 256 × 256 grayscale palm image, as shown in FIG. 6b; the image after embedding the watermark is shown in FIG. 6c; and the extracted watermark image is shown in FIG. 6d. The invisibility and robustness of the algorithm are evaluated experimentally and compared with other watermarking algorithms.

The watermarked image is transmitted over a channel, where it generally suffers various types of attacks, and the embedded watermark information is damaged to different degrees. Some malicious attackers steal the watermark information, and some go further and attempt to tamper with or forge the data, so a digital watermarking algorithm must be secure against attacks. Common attacks mainly include compression, noise, filtering, rotation, and scaling; to judge the attack resistance of a watermarking algorithm, it must be tested against certain evaluation criteria. The evaluation indexes of a digital image watermarking algorithm mainly include the invisibility and stability of the watermark. Testing shows that, after the watermark is embedded into the electronic document, the method of this embodiment has high stability and good invisibility.

The following is the process of evaluation and verification of watermark invisibility.

The invisibility of the watermark is also called imperceptibility: intuitively, the human eye cannot perceive the difference between the original image and the image embedded with the watermark. The better the invisibility, the harder it is to discover the hidden watermark information and the better the security of the watermark information is guaranteed; good invisibility also means that the quality of the original image is not greatly affected.

The peak signal-to-noise ratio (PSNR) and the Structural Similarity (SSIM) are generally selected as evaluation indexes.

The peak signal-to-noise ratio describes the change in image quality before and after watermark embedding. The watermark embedding system is regarded as a communication system: the original carrier image represents the signal to be transmitted, and the embedded watermark information represents noise added to the original signal. The peak signal-to-noise ratio can thus be understood as the ratio of the maximum value of the transmitted signal to the noise, and it is calculated as follows.

where MSE denotes the mean square error, which is calculated as follows:

where $I_{M \times N}$ represents the original carrier image and $I'_{M \times N}$ represents the watermarked image. The mean square error measures the difference between the two images, and its magnitude indicates how strongly the two images differ.
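For reference, the two quantities are commonly written as follows. This is the standard textbook form and may differ in detail from the formula of the embodiment. Here $I(i,j)$ and $I'(i,j)$ denote the pixel values of $I_{M \times N}$ and $I'_{M \times N}$, and $I_{\max}$ is an assumed symbol for the largest possible pixel value (255 for 8-bit grayscale images).

```latex
\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \bigl( I(i,j) - I'(i,j) \bigr)^{2},
\qquad
\mathrm{PSNR} = 10 \log_{10} \frac{I_{\max}^{2}}{\mathrm{MSE}} \ \text{(dB)}
```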

Generally, when the peak signal-to-noise ratio is not less than 36 dB, the damage caused by the watermarking algorithm to the quality of the original carrier image is considered acceptable. The larger the peak signal-to-noise ratio, the more similar the tested image is to the original image, the smaller the image distortion, and the higher the quality of the image after the watermark is embedded.
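The 36 dB criterion can be checked with a short sketch such as the one below. It assumes 8-bit grayscale images stored as lists of rows and a peak value of 255; the function names `mse`, `psnr` and `is_acceptable` are hypothetical and are not part of the embodiment, whose experiments were run in Matlab.

```python
import math


def mse(original, watermarked):
    """Mean square error between two equally sized grayscale images."""
    if len(original) != len(watermarked):
        raise ValueError("images must have the same number of rows")
    total, count = 0.0, 0
    for row_o, row_w in zip(original, watermarked):
        if len(row_o) != len(row_w):
            raise ValueError("images must have the same number of columns")
        for p, q in zip(row_o, row_w):
            total += (p - q) ** 2
            count += 1
    return total / count


def psnr(original, watermarked, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` = 255 assumes 8-bit pixels."""
    err = mse(original, watermarked)
    if err == 0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(peak ** 2 / err)


def is_acceptable(original, watermarked, threshold_db=36.0):
    """Apply the 'not less than 36 dB' rule stated in the text."""
    return psnr(original, watermarked) >= threshold_db


carrier = [[100, 100], [100, 100]]
marked = [[101, 100], [100, 100]]
print(round(psnr(carrier, marked), 2), is_acceptable(carrier, marked))
```

Changing one of four pixels by a single gray level gives an MSE of 0.25 and a PSNR of roughly 54 dB, well above the threshold.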

Another indicator for measuring the similarity between two images is Structural Similarity (SSIM), which is calculated as follows:

where $u_x$ is the mean of $x$, $u_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$. $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are constants used to maintain stability, where $L$ is the dynamic range of the pixel values; based on experimental experience, $k_1 = 0.01$ and $k_2 = 0.03$. The larger the structural similarity value, the smaller the image distortion.
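With the symbols defined here, the structural similarity index in its standard published form reads as follows. This is given for reference under the assumption that the embodiment uses the standard definition.

```latex
\mathrm{SSIM}(x, y) =
\frac{(2 u_x u_y + c_1)(2 \sigma_{xy} + c_2)}
     {(u_x^{2} + u_y^{2} + c_1)(\sigma_x^{2} + \sigma_y^{2} + c_2)}
```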

Images are highly structured, and strong correlations exist between pixels, especially between pixels that are spatially close. Most quality assessment methods based on error sensitivity, such as the peak signal-to-noise ratio and the mean square error, rely on a linear decomposition of the image signal and do not take these image-specific correlations into account. Therefore, an evaluation index that measures the degree of structural similarity is also selected.
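A simplified sketch of the structural similarity computation is given below. It evaluates the index once over the whole image, whereas common implementations average it over local sliding windows, so its values will not match those implementations exactly. The function name `ssim_global` is hypothetical, and a dynamic range of 255 (8-bit pixels) is assumed.

```python
def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM of two equally sized grayscale images.

    Simplification: statistics are taken over the whole image instead of
    local windows. k1 = 0.01 and k2 = 0.03 follow the values in the text.
    """
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    if len(xs) != len(ys) or not xs:
        raise ValueError("images must be non-empty and of equal size")
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((p - mu_x) ** 2 for p in xs) / n
    var_y = sum((q - mu_y) ** 2 for q in ys) / n
    cov_xy = sum((p - mu_x) * (q - mu_y) for p, q in zip(xs, ys)) / n
    c1 = (k1 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


image = [[10, 20], [30, 40]]
inverted = [[40, 30], [20, 10]]
print(ssim_global(image, image), ssim_global(image, inverted))
```

Identical images give a value of 1, while an image compared with its reversed copy gives a lower value, because the covariance term becomes negative even though the means and variances are equal.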

In the experiment, three different electronic document images are selected as carrier images, and the same watermark image shown in fig. 6b is embedded into each of them for comparison. The experimental results are shown in fig. 7. Table 1-1 lists the peak signal-to-noise ratio and the structural similarity of the three watermarked images; the data in the table show that the algorithm has good invisibility and data fidelity.

TABLE 1-1 PSNR and SSIM values for watermarking algorithms

In the experiment, the watermark image shown in fig. 6b is embedded into fig. 6a, the peak signal-to-noise ratio and the structural similarity are calculated, and the results are compared with those of other algorithms. Table 1-2 shows the results of the comparative experiments. The experimental data show that the peak signal-to-noise ratio of the algorithm proposed herein is 58.81 dB, which is higher than that of the other algorithms, and that its structural similarity is also the highest, so the algorithm has better concealment than the other algorithms.

TABLE 1-2 PSNR and SSIM comparative tests

The following describes watermarking experiments on electronic documents based on multiple features.

Two electronic document pictures are selected as carrier images, one entirely in Chinese and the other entirely in English. Both documents are filled with as much text as possible, that is, they represent the case in which white pixel points are relatively fewest, so the two carrier images adequately demonstrate the capacity of the text watermarking algorithm. In the experiment, three different watermark images are embedded in each of the two carrier images: three palm images are embedded in the Chinese electronic document picture, and two face images and one palm image are embedded in the English electronic document picture.

The carrier images used in the experiment are 1024 × 1024, the embedded watermark images are 128 × 128, and the watermark images were captured with an Android mobile phone. As shown in fig. 8e and fig. 9e, the experimental results show that embedding multiple biometric images does not affect the readability of the original carrier image, and the extracted watermark images remain highly similar to the original feature images, which also lays the groundwork for biometric identification and multi-biometric fusion.

For the embedding experiments with the two different carrier images and different watermark images, the fidelity and invisibility of the watermarked images are evaluated by measuring the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) after embedding. The data in table 2-1 show that the PSNR values of the algorithm of this embodiment are high, so embedding multiple watermark images does not affect the readability of the original carrier.

TABLE 2-1 Experimental results for embedding multiple watermark images

Watermark extraction is then performed on the two watermarked images, and the normalized correlation coefficient (NCC) between each extracted watermark and the original watermark image is calculated to judge whether the watermark image is distorted. The experimental data are shown in table 2-2. Whether the carrier is the English document image or the Chinese document image, the three watermark images extracted from it remain highly correlated with the original watermark images, which shows that the algorithm of this embodiment is feasible and robust when applied to multiple-watermark embedding.

TABLE 2-2 NCC values of the extracted watermark images
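The text does not give the formula of the normalized correlation coefficient. The sketch below uses one common definition: the inner product of the two images divided by the product of their Euclidean norms. The choice of this definition and the function name `ncc` are assumptions; other variants, for example with mean subtraction, are also in use.

```python
import math


def ncc(original, extracted):
    """Normalized correlation between an original and an extracted watermark.

    Both arguments are equally sized grayscale images given as lists of rows.
    A value close to 1 means the extracted watermark matches the original.
    """
    ws = [p for row in original for p in row]
    es = [q for row in extracted for q in row]
    if len(ws) != len(es) or not ws:
        raise ValueError("images must be non-empty and of equal size")
    numerator = sum(w * e for w, e in zip(ws, es))
    denominator = math.sqrt(sum(w * w for w in ws)) * math.sqrt(sum(e * e for e in es))
    if denominator == 0:
        raise ValueError("correlation is undefined for an all-zero image")
    return numerator / denominator


watermark = [[1, 2], [3, 4]]
recovered = [[1, 2], [3, 5]]
print(ncc(watermark, recovered))
```

With this definition, a uniformly brightened or dimmed copy of the watermark still scores 1, so the measure reflects the pattern of the watermark and not its overall intensity.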

In summary, the present invention provides a digital watermark processing method and system for electronic documents that is tailored to the pixel distribution of electronic document images. An electronic document contains a large number of white pixel points interspersed among the lines of characters, and such a large number of identical pixel values is a characteristic that natural images do not have. The method uses these pixel points to reconstruct a new carrier image, embeds the watermark into the new carrier image, and then restores the pixel points to their original positions; this reconstruction and restoration also serves as the scrambling operation of watermark embedding. In addition, by analyzing the capacity of the white pixel points in the electronic document, the invention can embed 2 to 3 biometric features as watermark images. Experiments show that embedding multiple watermarks does not greatly affect the quality of the carrier image and that the watermarking algorithm retains good invisibility and robustness.

The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.

Claims

1. A digital watermark processing method for electronic documents is characterized by comprising a watermark embedding step, wherein the watermark embedding step specifically comprises the following steps:

acquiring user authority;

uploading the watermark information to a server;

2. The digital watermarking method of an electronic document according to claim 1, further comprising a watermark extraction step, wherein the watermark extraction step specifically includes:

3. The digital watermarking method for electronic documents according to claim 1 or 2, wherein the watermark information is biometric information of the user, including one or more of facial features, hand features, fingerprint features and voice features of the user.

4. A digital watermark processing system of an electronic document is characterized by comprising a watermark embedding unit, wherein the watermark embedding unit specifically comprises:

the authorization module is used for acquiring user permission;

5. The digital watermarking system for electronic documents according to claim 4, further comprising a watermark extraction unit, wherein the watermark extraction unit specifically includes:

6. The electronic document digital watermark processing system of claim 5, wherein the electronic document digital watermark processing system is composed of a mobile client and a server, wherein the mobile client comprises the authorization module, an information acquisition module and an information uploading module, and the server comprises the database, a database access module, a watermark processing module, a watermark embedding module, a watermark decryption module, a lookup module and an extraction module.

7. The digital watermarking system for electronic documents according to claim 6, wherein the watermark information is biometric information of the user, including one or more of facial features, hand features, fingerprint features and voice features of the user;

8. A digital watermark processing method for electronic documents is characterized by comprising a watermark embedding step of embedding watermark images into the electronic documents, wherein the watermark embedding step specifically comprises the following steps:

performing singular value decomposition on the watermark image;

9. The digital watermarking method of an electronic document according to claim 8, further comprising a watermark extraction step, wherein the watermark extraction step specifically includes:

10. The digital watermarking method for electronic documents according to claim 8 or 9, wherein the watermark image contains biometric information of the user, including one or more of facial features, hand features, fingerprint features and voice features of the user.