CN112651879A - Text image digital watermarking method capable of resisting printing and scanning - Google Patents
Text image digital watermarking method capable of resisting printing and scanning
- Publication number
- CN112651879A CN202011555431.6A
- Authority
- CN
- China
- Prior art keywords
- text
- word
- character
- watermark
- line
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
- G06T1/005—Robust watermarking, e.g. average attack or collusion attack resistant
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/10—Image enhancement or restoration using non-spatial domain filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
- G06T5/30—Erosion or dilatation, e.g. thinning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2201/00—General purpose image data processing
- G06T2201/005—Image watermarking
- G06T2201/0065—Extraction of an embedded watermark; Reliable detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20052—Discrete cosine transform [DCT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20061—Hough transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20064—Wavelet transform [DWT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30176—Document
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Processing (AREA)
- Editing Of Facsimile Originals (AREA)
Abstract
The invention discloses a print-scan resistant digital watermarking method for text images, comprising the following steps: (1) text line recognition; (2) text character recognition; (3) watermark information preprocessing; (4) watermark embedding preprocessing; (5) watermark embedding; (6) image rectification; (7) watermark extraction preprocessing; (8) watermark extraction. The invention addresses the problems that text watermarking methods have little redundancy available for embedding and that the watermark resists print-scan attacks poorly. By exploiting the advantages of spatial-domain text watermarking, the invention is strongly robust to print-scan attacks, keeps the watermark invisible, is simple to compute, and offers high watermark embedding capacity.
Description
Technical Field
The invention belongs to the technical field of image processing, and more specifically relates to a print-scan resistant digital watermarking method for text images in the field of information hiding. The invention can be used to embed watermark information into a text image when the image is output by printing, and enables copyright protection of printed products by scanning the printed image and extracting the embedded watermark information.
Background
Infringement of the copyright of paper books and other paper materials has become increasingly serious in the internet era, severely harming the legitimate rights and interests of authors and publishers. Text-based digital watermarking can hide information identifying a group or an individual in an electronic or paper text in a specific way that is imperceptible to the human eye, thereby protecting the copyright of text works. Text digital watermarking methods fall mainly into two categories: spatial-domain methods and frequency-domain methods.
Huang Hua et al., in "A new text digital watermark marking strategy and detection method" (Journal of Xi'an Jiaotong University, 2002, 36(2): 165-168), propose a new line-shift marking strategy that does not use reference lines: each text line is shifted up or down relative to the previous text line to embed the watermark. This simplifies detection and achieves blind extraction of the watermark, but the method still has a drawback: because the number of lines in a Chinese text is limited, the watermark capacity of the algorithm is small.
Tan Zheng et al., in "Anti-print-scan digital watermarking technology based on document images" (Application Research of Computers, 2007, 24(12): 199-), propose a text watermarking method based on a three-level DWT; however, text images have simple texture details, uneven distribution, and little redundant space, so the invisibility and the robustness of the watermark are difficult to balance.
Disclosure of Invention
The invention aims to provide a print-scan resistant digital watermarking method for text images, mainly used to embed watermarks into printed works at print time and to provide a basis for copyright protection. The main problems it addresses are that existing text watermarking methods resist print-scan attacks poorly, have low watermark capacity, are complex to implement, and in particular struggle to balance the invisibility of the watermark against its robustness.
The invention comprises two processes: watermark embedding and watermark extraction.
The watermark embedding process comprises the following specific steps:
(1) Text line recognition:
1a) for each pixel row of the carrier image, counting the black pixels and calculating the width of the pixel row;
1b) calculating the ratio of the total number of black pixels to the width of each pixel row, and judging the row to be a blank pixel row if the ratio is 0 and a text pixel row if the ratio is not 0;
1c) traversing all pixel rows from top to bottom; if the current pixel row is a text pixel row and the previous pixel row is a blank pixel row, judging the current pixel row to be the upper boundary of a text line; if the current pixel row is a blank pixel row and the previous pixel row is a text pixel row, judging the previous pixel row to be the lower boundary of a text line; and storing the recognized text line boundaries in a line boundary array;
(2) Text character recognition:
2a) counting the black pixels in each pixel column of each recognized text line;
2b) traversing the pixel columns of the current text line from left to right; if the total number of black pixels in the current column is not 0 and that of the previous column is 0, judging the current column to be the left boundary of a text character; if the total number of black pixels in the current column is 0 and that of the previous column is not 0, judging the previous column to be the right boundary of a text character; and storing the recognized text character boundaries in a character boundary array;
2c) computing and sorting the character widths and character spacings from the determined text character boundaries, and setting the median character width and the median character spacing as thresholds T1 and T2 respectively; if the widths of two adjacent text characters are both smaller than T1 and the spacing between them is smaller than T2, judging that the two text characters together form one Chinese character, deleting the right boundary of the left text character and the left boundary of the right text character from the character boundary array, and updating the character boundary array;
2d) computing and sorting the character widths from the updated text character boundaries, and setting the median character width as threshold T3; splitting each text character whose width is larger than 1.8 × T3 by searching leftwards and rightwards from the middle position of the text character for columns whose total number of black pixels is 0, and storing these columns in the character boundary array as newly added left and right boundaries;
(3) Watermark information preprocessing:
converting the ten-digit Arabic numeral string into the corresponding ASCII codes, and performing a cyclic operation on the ASCII codes to obtain the binary watermark information sequence to be embedded;
(4) Watermark embedding preprocessing:
computing and sorting the character widths and character spacings from the character boundary array obtained in step (2), and setting the median character width and the median character spacing as thresholds T4 and T5 respectively; setting the boundaries of each text character whose width is smaller than [expression missing in the source] to 0 as a special mark;
(5) Watermark embedding:
5a) the first and last text characters of each recognized text line are not used for embedding, and only the even-numbered text characters of each text line are used; if the left and right boundaries of the current even-numbered text character are not 0, and the right boundary of its left neighbor and the left boundary of its right neighbor are not 0, judging the current text character to be embeddable, and otherwise non-embeddable;
5b) comparing the left and right character spacings of each embeddable text character; if the left spacing and the right spacing are both greater than 4 × T5, changing the text character to non-embeddable, and otherwise keeping it embeddable;
5c) if the current watermark bit to be embedded is 0, moving the embeddable text character so that its left spacing is smaller than its right spacing; if the current watermark bit to be embedded is 1, moving the embeddable text character so that its left spacing is larger than its right spacing;
5d) embedding watermark bits in the embeddable text characters of all text lines, and printing and scanning the watermarked text image to obtain a scanned watermarked image;
The watermark extraction process comprises the following specific steps:
(6) Image rectification:
6a) taking the upper half of the scanned watermarked image, inverting it into an image of white characters on a black background, and removing the white border produced by skew during scanning;
6b) dilating the white-on-black image in the horizontal and vertical directions so that separate characters are connected into longer line segments;
6c) performing edge detection on the dilated image, and applying the Hough transform to the edge points to find the inclination angle θ of the longest line segment;
6d) rotating the scanned text image by the angle θ, and removing the black border produced by the rotation;
(7) Watermark extraction preprocessing:
7a) performing text line recognition according to step (1), and text character recognition according to step (2);
7b) computing and sorting the character widths and character spacings from the character boundary array obtained in step (2), and setting the median character width and the median character spacing as thresholds T'4 and T'5 respectively; setting the boundaries of each text character whose width is smaller than [expression missing in the source] to 0 as a special mark;
(8) Watermark extraction:
8a) the first and last text characters of each recognized text line are not used for extraction, and only the even-numbered text characters of each text line are used; if the left and right boundaries of the current even-numbered text character are not 0, and the right boundary of its left neighbor and the left boundary of its right neighbor are not 0, judging the current text character to be extractable, and otherwise non-extractable;
8b) comparing the left and right character spacings of each extractable text character; if the left spacing and the right spacing are both larger than 4 × T'5, changing the text character to non-extractable, and otherwise keeping it extractable;
8c) if the left spacing of the current extractable text character is smaller than its right spacing, the extracted watermark bit is 0; if the left spacing is larger than the right spacing, the extracted watermark bit is 1;
8d) extracting watermark bits from the extractable text characters of all text lines, concatenating the extracted bits into a binary watermark sequence, and converting the binary sequence from ASCII codes into the corresponding Arabic numeral characters to obtain the finally extracted watermark digital information.
Compared with the prior art, the invention has the following advantages:
First, the invention embeds the watermark in the spatial domain, through small movements of text characters, which overcomes the poor resistance to print-scan attacks in the prior art and gives the invention strong robustness and good invisibility.
Second, the invention requires neither complex matrix transforms such as DWT and DCT nor partitioning of the image into blocks, so the watermark can be extracted more quickly and accurately.
Third, the watermark embedding method of the invention both increases the watermark capacity and removes the need, found in similar prior techniques, to refer to the original image data during watermark extraction, giving the invention high watermark capacity and blind extraction.
Description of the drawings:
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic of an experiment according to the present invention;
FIG. 3 is a schematic diagram of the system. Referring to FIG. 3, the print-scan resistant digital watermarking system of the invention first recognizes the text lines and text characters of the carrier image, then embeds the watermark using the ASCII codes converted from the watermark information, then prints and scans the watermarked image, and finally rectifies the scanned image and extracts the watermark to obtain the extracted watermark information.
Detailed description:
The present invention is described in detail below with reference to the accompanying drawings.
The specific steps of the method of the invention are as follows, with reference to FIG. 1.
Step 1. Text line recognition.
For each pixel row of the carrier image, count the black pixels and calculate the width of the pixel row, then calculate the ratio of the total number of black pixels to the row width. The row is judged to be a blank pixel row if the ratio is 0 and a text pixel row if the ratio is not 0. Traverse all pixel rows from top to bottom: if the current pixel row is a text pixel row and the previous pixel row is a blank pixel row, the current pixel row is judged to be the upper boundary of a text line; if the current pixel row is a blank pixel row and the previous pixel row is a text pixel row, the previous pixel row is judged to be the lower boundary of a text line. Store the recognized text line boundaries in a line boundary array.
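The row-projection rule of Step 1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes a binarized image supplied as a list of rows with 1 for black and 0 for white, and the function name `find_text_lines` is hypothetical. Closing a text line that touches the bottom edge of the image is also an assumption, since the source does not cover that case.

```python
def find_text_lines(image):
    """Return (top, bottom) pixel-row indices of each text line.

    `image` is a list of rows; each pixel is 1 for black (ink), 0 for white.
    """
    boundaries = []
    top = None
    previous_is_text = False
    for y, row in enumerate(image):
        ratio = sum(row) / len(row)          # black pixels / row width
        is_text = ratio != 0
        if is_text and not previous_is_text:
            top = y                          # upper boundary of a text line
        elif not is_text and previous_is_text:
            boundaries.append((top, y - 1))  # lower boundary is the previous row
        previous_is_text = is_text
    if previous_is_text:                     # assumption: text reaching the last row
        boundaries.append((top, len(image) - 1))
    return boundaries


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(find_text_lines(page))
```

The returned pairs play the role of the line boundary array described above.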
Step 2. Text character recognition.
Count the black pixels in each pixel column of each recognized text line. Traverse the pixel columns from left to right: if the total number of black pixels in the current column is not 0 and that of the previous column is 0, the current column is judged to be the left boundary of a text character; if the total number of black pixels in the current column is 0 and that of the previous column is not 0, the previous column is judged to be the right boundary of a text character. Store the recognized text character boundaries in a character boundary array.
Compute and sort the character widths and character spacings from the text character boundaries, and set the median character width and the median character spacing as thresholds T1 and T2 respectively. If the widths of two adjacent text characters are both smaller than T1 and the spacing between them is smaller than T2, the two text characters are judged to form one Chinese character together; delete the right boundary of the left text character and the left boundary of the right text character from the character boundary array, and update the character boundary array.
Compute and sort the character widths from the updated text character boundaries, and set the median character width as threshold T3. Split each text character whose width is larger than 1.8 × T3: starting from the middle position of the text character, search leftwards and rightwards for columns whose total number of black pixels is 0, and store these columns in the character boundary array as newly added left and right boundaries.
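The segmentation, merging, and splitting rules of Step 2 can be sketched on the column projection of a single text line. The sketch below is one reading of the text under stated assumptions: `col_counts` holds the black-pixel count of each column, all function names are hypothetical, the thresholds are medians computed once before merging, and an over-wide character is split at the blank run nearest its middle column (the source only says to search left and right from the middle).

```python
from statistics import median


def segment_columns(col_counts):
    """Return [left, right] spans of consecutive inked columns."""
    spans, left = [], None
    for x, count in enumerate(col_counts):
        if count != 0 and left is None:
            left = x                        # left boundary of a character
        elif count == 0 and left is not None:
            spans.append([left, x - 1])     # right boundary is the previous column
            left = None
    if left is not None:
        spans.append([left, len(col_counts) - 1])
    return spans


def merge_components(spans):
    """Merge adjacent narrow, close spans into one Chinese character."""
    if len(spans) < 2:
        return spans
    t1 = median(r - l + 1 for l, r in spans)                        # median width
    t2 = median(b[0] - a[1] - 1 for a, b in zip(spans, spans[1:]))  # median spacing
    merged = [list(spans[0])]
    for left, right in spans[1:]:
        prev = merged[-1]
        both_narrow = (prev[1] - prev[0] + 1) < t1 and (right - left + 1) < t1
        if both_narrow and (left - prev[1] - 1) < t2:
            prev[1] = right                 # drop the two inner boundaries
        else:
            merged.append([left, right])
    return merged


def split_wide(spans, col_counts):
    """Split spans wider than 1.8 * T3 at the blank run nearest the middle."""
    t3 = median(r - l + 1 for l, r in spans)
    result = []
    for left, right in spans:
        blanks = [x for x in range(left, right + 1) if col_counts[x] == 0]
        if (right - left + 1) > 1.8 * t3 and blanks:
            mid = (left + right) // 2
            lo = hi = min(blanks, key=lambda x: abs(x - mid))
            while col_counts[lo - 1] == 0:  # extend to the whole blank run
                lo -= 1
            while col_counts[hi + 1] == 0:
                hi += 1
            result += [[left, lo - 1], [hi + 1, right]]
        else:
            result.append([left, right])
    return result


cols = [1 if ch == "#" else 0 for ch in "####..#.#..####..####"]
chars = split_wide(merge_components(segment_columns(cols)), cols)
print(chars)
```

In the example line, the two one-column components separated by a single blank column are merged into one character, and no character is wide enough to be split.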
Step 3. Watermark information preprocessing.
Convert the ten-digit Arabic numeral string into the corresponding ASCII codes, and perform a cyclic operation on the ASCII codes to obtain the binary watermark information sequence to be embedded.
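A possible reading of this preprocessing is sketched below. The source does not define the "cyclic operation" or the number of bits per character, so two assumptions are made and should be treated as such: each digit is written as an 8-bit ASCII code, and the bit sequence is repeated cyclically until it fills a given embedding capacity. The names `digits_to_bits` and `repeat_to_capacity` are hypothetical.

```python
from itertools import cycle, islice


def digits_to_bits(digits, bits_per_char=8):
    """Convert a numeral string to a list of bits, most significant bit first."""
    bits = []
    for ch in digits:
        code = ord(ch)  # ASCII code of the digit, e.g. '1' -> 49
        bits.extend((code >> shift) & 1 for shift in range(bits_per_char - 1, -1, -1))
    return bits


def repeat_to_capacity(bits, capacity):
    """Assumed meaning of the cyclic operation: repeat bits to fill `capacity`."""
    return list(islice(cycle(bits), capacity))


watermark_bits = digits_to_bits("1803121134")
print(len(watermark_bits))       # ten characters, eight bits each
print(watermark_bits[:8])        # bits of the first digit '1'
```

Repeating the sequence would give redundancy when a page offers more embeddable characters than watermark bits, which is one plausible motivation for a cyclic step.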
Step 4. Watermark embedding preprocessing.
Compute and sort the character widths and character spacings from the character boundary array obtained in step (2), and set the median character width and the median character spacing as thresholds T4 and T5 respectively. The boundaries of each text character whose width is smaller than [expression missing in the source] are set to 0 as a special mark.
Step 5. Watermark embedding.
First, the first and last text characters of each recognized text line are not used for embedding, and only the even-numbered text characters of each text line are used. If the left and right boundaries of the current even-numbered text character are not 0, and the right boundary of its left neighbor and the left boundary of its right neighbor are not 0, the current text character is judged to be embeddable; otherwise it is non-embeddable.
Second, compare the left and right character spacings of each embeddable text character. If the left spacing and the right spacing are both larger than 4 × T5, the text character is changed to non-embeddable; otherwise it remains embeddable.
Third, if the current watermark bit to be embedded is 0, move the embeddable text character so that its left spacing is smaller than its right spacing; if the current watermark bit to be embedded is 1, move the embeddable text character so that its left spacing is larger than its right spacing.
Fourth, embed watermark bits in the embeddable text characters of all text lines, then print and scan the watermarked text image to obtain a scanned watermarked image.
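The selection and shifting rules of Step 5 can be illustrated on character boundaries alone, without touching pixels. The sketch below rests on several assumptions not stated in the source: characters are numbered from 1, so the even-numbered ones sit at list indices 1, 3, 5, and so on; a character marked in Step 4 is represented as `[0, 0]`; the size of the shift is controlled by a hypothetical `margin` parameter; and characters with too little surrounding space are skipped. The function name `embed_line` is also hypothetical.

```python
def embed_line(spans, bits, t5, margin=2):
    """Shift even-numbered characters of one text line to encode bits.

    spans: list of [left, right] column boundaries of the characters.
    Returns (new_spans, number_of_bits_embedded).
    """
    out = [list(s) for s in spans]
    used = 0
    for i in range(1, len(out) - 1, 2):      # 2nd, 4th, ...; never first or last
        if used >= len(bits):
            break
        prev, cur, nxt = out[i - 1], out[i], out[i + 1]
        if 0 in (cur[0], cur[1], prev[1], nxt[0]):
            continue                         # special mark from Step 4
        left_gap = cur[0] - prev[1] - 1
        right_gap = nxt[0] - cur[1] - 1
        if left_gap > 4 * t5 and right_gap > 4 * t5:
            continue                         # both spacings too large: not embeddable
        total = left_gap + right_gap
        if total <= margin:
            continue                         # assumption: no room to shift
        small = (total - margin) // 2        # the smaller of the two new spacings
        new_left_gap = small if bits[used] == 0 else total - small
        shift = new_left_gap - left_gap
        cur[0] += shift                      # move the character horizontally
        cur[1] += shift
        used += 1
    return out, used


line = [[0, 3], [6, 9], [12, 15], [18, 21], [24, 27]]
new_line, used = embed_line(line, [0, 1], t5=2)
print(new_line, used)
```

Because only even-numbered characters move, the odd-numbered neighbors stay fixed and serve as the reference for both spacings. In a real embedder the pixel block of each character would be translated by `shift` columns, and an extractor would need to apply the same skip rules to stay synchronized.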
Step 6. Image rectification.
Take the upper half of the scanned watermarked image, invert it into an image of white characters on a black background, and remove the white border produced by skew during scanning. Dilate the white-on-black image in the horizontal and vertical directions so that separate characters are connected into long line segments. Perform edge detection on the dilated image, apply the Hough transform to the edge points, and find the inclination angle θ of the longest line segment. Rotate the scanned text image by the angle θ, and remove the black border produced by the rotation.
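Only the Hough voting part of Step 6 is sketched below; inversion, dilation, edge detection, and rotation are assumed to be done elsewhere, typically with an image processing library. The sketch takes a list of edge points and returns the inclination of the line that collects the most votes, which is used here as a stand-in for the "longest line segment" of the text. The function name `dominant_line_angle` and the one-degree angular resolution are assumptions.

```python
import math
from collections import Counter


def dominant_line_angle(points, angle_step=1.0):
    """Estimate the inclination (degrees) of the strongest line through `points`.

    Uses the normal form rho = x*cos(theta) + y*sin(theta). The angle is
    measured from the x axis toward the y axis in image coordinates.
    """
    votes = Counter()
    n_angles = int(round(180 / angle_step))
    for k in range(n_angles):
        theta = math.radians(k * angle_step)
        c, s = math.cos(theta), math.sin(theta)
        for x, y in points:
            rho = round(x * c + y * s)       # quantize the distance to whole pixels
            votes[(k, rho)] += 1
    (k, _rho), _count = votes.most_common(1)[0]
    # theta is the direction of the line's normal; the line itself is 90 degrees off
    return k * angle_step - 90.0


horizontal = [(x, 10) for x in range(50)]
diagonal = [(x, x) for x in range(50)]
print(dominant_line_angle(horizontal))
print(dominant_line_angle(diagonal))
```

For scanned documents the skew is usually small, so a practical version might restrict the angle range around 0 degrees and use a finer `angle_step`.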
Step 7. Watermark extraction preprocessing.
Perform text line recognition according to step (1), and text character recognition according to step (2). Compute and sort the character widths and character spacings from the character boundary array obtained in step (2), and set the median character width and the median character spacing as thresholds T'4 and T'5 respectively. The boundaries of each text character whose width is smaller than [expression missing in the source] are set to 0 as a special mark.
Step 8. Watermark extraction.
First, the first and last text characters of each recognized text line are not used for extraction, and only the even-numbered text characters of each text line are used. If the left and right boundaries of the current even-numbered text character are not 0, and the right boundary of its left neighbor and the left boundary of its right neighbor are not 0, the current text character is judged to be extractable; otherwise it is non-extractable.
Second, compare the left and right character spacings of each extractable text character. If the left spacing and the right spacing are both larger than 4 × T'5, the text character is changed to non-extractable; otherwise it remains extractable.
Third, if the left spacing of the current extractable text character is smaller than its right spacing, the extracted watermark bit is 0; if the left spacing is larger than the right spacing, the extracted watermark bit is 1.
Fourth, extract watermark bits from the extractable text characters of all text lines, concatenate the extracted bits into a binary watermark sequence, and convert the binary watermark sequence from ASCII codes into the corresponding Arabic numeral characters to obtain the finally extracted watermark digital information.
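The extraction rules of Step 8 can be sketched on character boundaries in the same spirit. The assumptions are labeled here and in the comments: characters are numbered from 1, a marked character has boundaries equal to 0, a character whose two spacings are equal yields no bit (the source does not cover this case), and each numeral is coded on 8 bits. The names `extract_line` and `bits_to_digits` are hypothetical.

```python
def extract_line(spans, t5):
    """Read one bit from each extractable even-numbered character of a line."""
    bits = []
    for i in range(1, len(spans) - 1, 2):    # 2nd, 4th, ...; never first or last
        prev, cur, nxt = spans[i - 1], spans[i], spans[i + 1]
        if 0 in (cur[0], cur[1], prev[1], nxt[0]):
            continue                         # special mark from Step 7
        left_gap = cur[0] - prev[1] - 1
        right_gap = nxt[0] - cur[1] - 1
        if left_gap > 4 * t5 and right_gap > 4 * t5:
            continue                         # not extractable
        if left_gap < right_gap:
            bits.append(0)
        elif left_gap > right_gap:
            bits.append(1)
        # assumption: equal spacings carry no information and are skipped
    return bits


def bits_to_digits(bits, bits_per_char=8):
    """Convert a bit list back to characters, assuming 8-bit ASCII codes."""
    chars = []
    for i in range(0, len(bits) - bits_per_char + 1, bits_per_char):
        code = 0
        for b in bits[i:i + bits_per_char]:
            code = (code << 1) | b
        chars.append(chr(code))
    return "".join(chars)


scanned_line = [[0, 3], [5, 8], [12, 15], [19, 22], [24, 27]]
print(extract_line(scanned_line, t5=2))
print(bits_to_digits([0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0]))
```

No reference to the original image is needed, which matches the blind extraction property claimed for the method.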
The effects of the present invention are further described below with reference to the experimental figures.
The printer used in the experiment is an HP Color Laser MFP 178nw, and the printer's built-in scanner is used for scanning; the printing resolution and the scanning resolution are both 300 dpi. The quality of the watermarked image is evaluated by SSIM (structural similarity), and the robustness of the watermark against attack is evaluated by DR (correct extraction rate). Watermark embedding and extraction on a text image using the method of the invention proceed as follows:
Referring to FIG. 2, FIG. 2(a) shows the carrier image, of size 4958 × 7017, and 1803121134 is the watermark information to be embedded. Using the method of the invention, the resulting watermarked image is shown in FIG. 2(b), with SSIM = 0.9429; the image after printing, scanning, and rectification is shown in FIG. 2(c); and the extracted watermark information is shown in FIG. 2(d), with DR = 94.23%.
The experimental results in FIG. 2(b) show that the carrier image retains good invisibility after the watermark is embedded. FIG. 2(d) shows that the watermark can still be extracted correctly after the watermarked image is printed and scanned, which indicates that the method of the invention is strongly robust against print-scan attacks.
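The source does not give a formula for DR. A common reading, assumed here, is the percentage of embedded bits that are recovered correctly; the sketch below computes it under that assumption, counting bits missing from the extracted sequence as errors. The function name `correct_extraction_rate` is hypothetical.

```python
def correct_extraction_rate(embedded, extracted):
    """Percentage of embedded bits recovered correctly (assumed definition of DR)."""
    if not embedded:
        raise ValueError("embedded sequence is empty")
    # zip stops at the shorter sequence, so missing bits are not counted as matches
    matches = sum(1 for a, b in zip(embedded, extracted) if a == b)
    return 100.0 * matches / len(embedded)


print(correct_extraction_rate([0, 1, 1, 0], [0, 1, 0, 0]))
```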
Claims (3)
1. A print-scan resistant digital watermarking method for text images, comprising two processes, watermark embedding and watermark extraction;
the watermark embedding process comprises the following specific steps:
(1) text line recognition:
1a) for each pixel row of the carrier image, counting the black pixels and calculating the width of the pixel row;
1b) calculating the ratio of the total number of black pixels to the width of each pixel row, and judging the row to be a blank pixel row if the ratio is 0 and a text pixel row if the ratio is not 0;
1c) traversing all pixel rows from top to bottom; if the current pixel row is a text pixel row and the previous pixel row is a blank pixel row, judging the current pixel row to be the upper boundary of a text line; if the current pixel row is a blank pixel row and the previous pixel row is a text pixel row, judging the previous pixel row to be the lower boundary of a text line; and storing the recognized text line boundaries in a line boundary array;
(2) text character recognition:
2a) counting the black pixels in each pixel column of each recognized text line;
2b) traversing the pixel columns of the current text line from left to right; if the total number of black pixels in the current column is not 0 and that of the previous column is 0, judging the current column to be the left boundary of a text character; if the total number of black pixels in the current column is 0 and that of the previous column is not 0, judging the previous column to be the right boundary of a text character; and storing the recognized text character boundaries in a character boundary array;
2c) computing and sorting the character widths and character spacings from the determined text character boundaries, and setting the median character width and the median character spacing as thresholds T1 and T2 respectively; if the widths of two adjacent text characters are both smaller than T1 and the spacing between them is smaller than T2, judging that the two text characters together form one Chinese character, deleting the right boundary of the left text character and the left boundary of the right text character from the character boundary array, and updating the character boundary array;
2d) computing and sorting the character widths from the updated text character boundaries, and setting the median character width as threshold T3; splitting each text character whose width is larger than 1.8 × T3 by searching leftwards and rightwards from the middle position of the text character for columns whose total number of black pixels is 0, and storing these columns in the character boundary array as newly added left and right boundaries;
(3) watermark information preprocessing:
converting the ten-digit Arabic numeral string into the corresponding ASCII codes, and performing a cyclic operation on the ASCII codes to obtain the binary watermark information sequence to be embedded;
(4) watermark embedding preprocessing:
computing and sorting the character widths and character spacings from the character boundary array obtained in step (2), and setting the median character width and the median character spacing as thresholds T4 and T5 respectively; setting the boundaries of each text character whose width is smaller than [expression missing in the source] to 0 as a special mark;
(5) watermark embedding:
5a) the first and last text characters of each recognized text line are not used for embedding, and only the even-numbered text characters of each text line are used; if the left and right boundaries of the current even-numbered text character are not 0, and the right boundary of its left neighbor and the left boundary of its right neighbor are not 0, judging the current text character to be embeddable, and otherwise non-embeddable;
5b) comparing the left and right character spacings of each embeddable text character; if the left spacing and the right spacing are both greater than 4 × T5, changing the text character to non-embeddable, and otherwise keeping it embeddable;
5c) if the current watermark bit to be embedded is 0, moving the embeddable text character so that its left spacing is smaller than its right spacing; if the current watermark bit to be embedded is 1, moving the embeddable text character so that its left spacing is larger than its right spacing;
5d) embedding watermark bits in the embeddable text characters of all text lines, and printing and scanning the watermarked text image to obtain a scanned watermarked image;
the watermark extraction process comprises the following specific steps:
(6) image rectification:
6a) taking the upper half of the scanned watermarked image, inverting it into a white-text-on-black-background image, and removing the white border introduced by skew during scanning;
6b) performing a dilation operation on the white-text-on-black image in the horizontal and vertical directions so that separate characters are connected into longer line segments;
6c) performing edge detection on the dilated image and applying the Hough transform to the edge points to find the inclination angle θ of the longest line segment;
6d) rotating the scanned text image by the angle θ and removing the black border produced by the rotation;
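The angle search in step 6c) can be sketched with a small Hough-style vote in pure Python. This is a simplified stand-in, not the patent's procedure: it takes edge points as given (dilation and edge detection are omitted), searches only a limited range of inclinations, and returns the inclination whose single best distance bin collects the most points, which approximates but is not identical to finding the longest line segment. The function name, the ±10° range and the 0.5° step are assumptions, and the sign of the angle depends on whether the y axis points up or down.

```python
import math
from collections import Counter

def estimate_skew(points, max_deg=10.0, step_deg=0.5):
    """Return the inclination in degrees that aligns the most edge points."""
    if not points:
        raise ValueError("no edge points given")
    steps = int(round(max_deg / step_deg))
    best_angle, best_votes = 0.0, -1
    for k in range(-steps, steps + 1):
        angle = k * step_deg
        a = math.radians(angle)
        # Signed distance of each point from a line through the origin
        # with this inclination, quantised to whole pixels.
        votes = Counter(round(-x * math.sin(a) + y * math.cos(a))
                        for x, y in points)
        top = max(votes.values())
        if top > best_votes:
            best_angle, best_votes = angle, top
    return best_angle
```

Points that lie on one straight line fall into a single distance bin only at the correct inclination, so that angle receives the highest vote.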
(7) watermark extraction preprocessing:
7a) performing text line recognition according to the step (1), and performing text column recognition according to the step (2);
7b) performing statistical sorting of the character widths and character spacings derived from the word boundary array obtained in step (2), taking the median character width and the median character spacing as threshold T'4 and threshold T'5 respectively, and setting to 0, as a special mark, the boundaries of every text character whose width is smaller than [value missing in the source text];
(8) watermark extraction process:
8a) the first and last text characters of each recognized text line are not used for extraction, and only the even-numbered text characters of each text line are examined; if neither the left nor the right boundary of the current even-numbered text character is 0, and neither the right boundary of the text character to its left nor the left boundary of the text character to its right is 0, the current text character is judged to be extractable; otherwise it is non-extractable;
8b) comparing the left and right character spacings of each extractable text character: if the left spacing and the right spacing are both greater than 4 × T'5, the text character is changed to non-extractable; otherwise it remains extractable;
8c) if the left-side character spacing of the current extractable text character is smaller than its right-side character spacing, the extracted watermark bit is 0, and if the left-side character spacing is larger than the right-side character spacing, the extracted watermark bit is 1;
8d) extracting watermark bits from the extractable text characters of all text lines, concatenating the extracted bits into a binary watermark sequence, and converting the ASCII codes in the binary sequence into the corresponding Arabic numeral characters to obtain the finally extracted watermark digital information.
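Steps 8a) to 8d) can be sketched in the same boundary-only form. The assumptions are the same as for embedding and are not stated in the source: 1-based character numbering, `(0, 0)` as the special mark, and 8 bits per ASCII code with the most significant bit first. The source does not say what happens when the two spacings are equal, so this sketch simply skips such a character. Both function names are hypothetical.

```python
def extract_line(chars, t5):
    """Read watermark bits from the even-numbered characters of one line."""
    bits = []
    for i in range(1, len(chars) - 1, 2):            # skip head and tail
        (_, prev_right), (left, right), (next_left, _) = chars[i - 1], chars[i], chars[i + 1]
        if 0 in (left, right, prev_right, next_left):        # step 8a
            continue
        left_gap, right_gap = left - prev_right, next_left - right
        if left_gap > 4 * t5 and right_gap > 4 * t5:         # step 8b
            continue
        if left_gap < right_gap:                             # step 8c
            bits.append(0)
        elif left_gap > right_gap:
            bits.append(1)
        # Equal gaps: behaviour not given in the source, skipped here.
    return bits

def bits_to_digits(bits, bits_per_char=8):
    """Group bits into ASCII codes and return the decoded characters (step 8d)."""
    chars = []
    for start in range(0, len(bits) - bits_per_char + 1, bits_per_char):
        code = 0
        for bit in bits[start:start + bits_per_char]:
            code = (code << 1) | bit                 # most significant bit first
        chars.append(chr(code))
    return "".join(chars)
```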
2. The method of claim 1 for print scan resistant digital watermarking of text images, wherein: the black pixel point in the step 1a) is a pixel with a pixel value of 0.
3. The method of claim 1 for print scan resistant digital watermarking of text images, wherein: the pixel line width in the step 1a) refers to the number of pixels between the first black pixel and the last black pixel (including the head and tail black pixels) in the pixel line.
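The definitions in claims 2 and 3 amount to the following small helper. It is a sketch that assumes a pixel line is a sequence of grey values; returning 0 for a line with no black pixel is an assumption, since the claims do not cover that case.

```python
def pixel_line_width(row):
    """Count pixels from the first to the last black pixel, both inclusive."""
    black = [i for i, value in enumerate(row) if value == 0]   # black = value 0
    if not black:
        return 0            # assumed result for a line with no black pixel
    return black[-1] - black[0] + 1
```

For the row `[255, 0, 255, 0, 0, 255]` the first black pixel is at index 1 and the last at index 4, so the width is 4 even though one pixel in between is white.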
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011555431.6A CN112651879A (en) | 2020-12-25 | 2020-12-25 | Text image digital watermarking method capable of resisting printing and scanning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112651879A true CN112651879A (en) | 2021-04-13 |
Family
ID=75362738
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011555431.6A Pending CN112651879A (en) | 2020-12-25 | 2020-12-25 | Text image digital watermarking method capable of resisting printing and scanning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112651879A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113393360A (en) * | 2021-06-08 | 2021-09-14 | 陕西科技大学 | Correction method for printing and scanning resistant digital watermark image |
CN113393360B (en) * | 2021-06-08 | 2022-10-21 | 陕西科技大学 | Correction method for printing and scanning resistant digital watermark image |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
KR101016712B1 (en) | Watermark information detection method | |
CN107248134B (en) | Method and device for hiding information in text document | |
Bhattacharjya et al. | Data embedding in text for a copier system | |
US20110052094A1 (en) | Skew Correction for Scanned Japanese/English Document Images | |
Gebhardt et al. | Document authentication using printing technique features and unsupervised anomaly detection | |
US8275168B2 (en) | Orientation free watermarking message decoding from document scans | |
Kumar et al. | Segmentation of printed text in devanagari script and gurmukhi script | |
US10949509B2 (en) | Watermark embedding and extracting method for protecting documents | |
CN102495833A (en) | Document watermark copyright information protection device based on Opentype vector outline fonts | |
Tan et al. | Print-Scan Resilient Text Image Watermarking Based on Stroke Direction Modulation for Chinese Document Authentication. | |
CN100498834C (en) | Digital water mark embedding and extracting method and device | |
Wu et al. | A printer forensics method using halftone dot arrangement model | |
CN112651879A (en) | Text image digital watermarking method capable of resisting printing and scanning | |
US20110170133A1 (en) | Image forming apparatus, method of forming image and method of authenticating document | |
JP2011139449A (en) | Method and system for embedding messages into structure shapes | |
Cu et al. | Watermarking for security issue of handwritten documents with fully convolutional networks | |
EP1310940A1 (en) | Color display device and method | |
US7221795B2 (en) | Document processing method, recording medium having recorded thereon document processing program, document processing program, document processing apparatus, and character-input document | |
JP2003115031A (en) | Image processor and its method | |
KR100814029B1 (en) | Method for digital watermarking | |
JP5517028B2 (en) | Image processing device | |
Cheng et al. | Steganalysis of binary text images | |
Xia et al. | Print-scan resilient watermarking for the Chinese text image | |
US8125691B2 (en) | Information processing apparatus and method, computer program and computer-readable recording medium for embedding watermark information | |
CN115239605A (en) | Anti-printing scanning method for text image based on pixel invariance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||