CN100534033C - Text numerical watermark method for resisting analog domain attack - Google Patents

Text numerical watermark method for resisting analog domain attack

Info

Publication number
CN100534033C
Authority
CN
China
Prior art keywords
word
block
watermark
text
embedding
Prior art date
Application number
CN 200510060488
Other languages
Chinese (zh)
Other versions
CN1801707A (en
Inventor
何一兵
尹树田
张云明
梁源松
斌 罗
裘正定
鹏 高
Original Assignee
北京交通大学;杭州天谷信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京交通大学;杭州天谷信息科技有限公司
Priority to CN 200510060488
Publication of CN1801707A
Application granted
Publication of CN100534033C


Abstract

The invention relates to a text digital watermarking method that resists analog-domain attacks, comprising the following steps: a) pre-processing before embedding: the word blocks in the original text image are segmented, merged and filtered to form a sequence of watermark embedding blocks; b) watermark embedding: the binary sequence of the watermark information is embedded, bit by bit, into the selected sequence of embedding blocks, one bit being embedded by shifting the position of each embedding block horizontally; c) pre-processing before extraction: the watermarked image is segmented into word blocks in the same way as in a), and the original text image is used to help generate and locate the embedding block sequence and its positioning block sequence in the watermarked image; d) extraction decision: the watermark information is extracted from the shifts of the word blocks and a decision is made. The watermark information embedded by this method is highly robust, and the watermarked text retains a good appearance and good watermark concealment.

Description

A text digital watermarking method resistant to analog-domain attacks

Technical Field

The invention relates to an information hiding technique, in particular to a text digital watermarking method that resists analog-domain attacks, and belongs to the technical field of security authentication.

Background

As a novel information hiding technique, digital watermarking offers a new approach to a range of problems on open networks, such as copyright protection, source authentication, tamper detection, online distribution, user tracking and identity authentication. Most research, however, has concentrated on digital watermarking of digital images, audio and video; digital watermarking algorithms that use a text image as the carrier have seen little research or application. Yet the demand for text watermarking is pressing. Government agencies, large commercial organizations and similar bodies often need to distribute documents to a large number of subordinate units. For various reasons, these units, which are responsible for protecting the documents they receive, may leak them by printing, scanning, photocopying, reduced-size printing, faxing, photographing and so on. If unique watermark information identifying each subordinate unit is added to the distributed documents, the source of a leaked document can be traced once the document is intercepted. Digital watermarking algorithms that use a text image as the carrier nevertheless face the following design difficulties:

1) A text image can be represented with just a few colors. It has almost no texture or detail, only different geometric shapes, and it is hard to design an algorithm that meets the requirements of digital watermarking on such a carrier.

2) A text image consists mostly of contiguous regions with the same color or gray value. If the watermark is embedded in the usual way, by modifying the color or gray value of selected pixels, pixels of visibly different color or brightness may appear within a uniform region, destroying both the concealment of the watermark and the appearance of the text.

3) A text watermarking method must work not only in the digital domain. It must also allow the embedded watermark to be extracted correctly after the text has undergone a series of analog-domain attacks (including but not limited to printing, scanning, photocopying, faxing, scaling, photographing and other processes involving analog-to-digital or digital-to-analog conversion).

Existing general-purpose electronic encryption techniques and digital watermarking algorithms do not solve these difficulties well.

Summary of the Invention

To overcome these design difficulties, the invention provides a text digital watermarking method that resists analog-domain attacks. Watermark information embedded with this method is highly robust: it can still be extracted correctly after the noise, rotation, translation, scaling, blurring of character edges and other changes caused by various analog-domain attacks. This robustness holds not only for text images in the digital domain but also for hard copies of the text in the analog domain (such as paper documents), and the watermarked text retains a good appearance and good watermark concealment.

The invention achieves this through the following technical solution: a text digital watermarking method resistant to analog-domain attacks, comprising the steps of a) pre-processing before embedding; b) watermark embedding; c) pre-processing before extraction; d) watermark extraction decision. The steps are detailed below.

a) Pre-processing before embedding:

The method operates on text images that contain only text. Such text has line gaps and word gaps, which can be used to segment it into rectangular word blocks. Where a word block contains an internal gap in the horizontal direction and has therefore been split into several partial blocks, the partial blocks are merged. The word blocks are then filtered: each block is checked against preset requirements on word width, complexity and completeness, all of which must be met:

1) Word width E, measured in pixels, must lie within a range defined relative to E_m (the inequality itself is garbled in the source text), where E_m is the maximum width in the text image of a complete word block for a fixed font, font size and print resolution.

2) Complexity, defined as the ratio of the number of foreground pixels in the block to the total number of pixels in the block, must be greater than 1/15.

3) Completeness: the block must not be an incomplete Chinese character, letter or symbol.

Blocks that meet these conditions are added to the set of embeddable word blocks, which is used for the subsequent watermark embedding and extraction.

The sequence of embedding blocks is then generated. Embedding blocks are picked from the set of embeddable word blocks using a key and a pseudo-random sequence. An embedding block must be at least 1/50 inch away from its nearest neighbor on each side, and both of those neighbors must themselves belong to the set of embeddable word blocks. These two neighbors are the positioning blocks. Their positions are not changed during embedding (once a block has been chosen as a positioning block it can no longer be chosen as an embedding block), and during extraction they are used to determine how the relative position of the embedding block has changed. Enough embedding blocks are selected in turn, according to the amount of watermark information, to form the embedding block sequence.

b) Watermark embedding:

The binary sequence of the watermark information is embedded bit by bit into the embedding block sequence. Depending on whether the bit is 1 or 0, the embedding block is moved as a whole to the left or to the right in the horizontal direction by a fixed distance of at least 1/100 inch (printed as "1A00" in the source text). Once all bits have been embedded in turn, the embedding of the watermark information is complete. Depending on the application, the watermark information may include error-correcting codes and check codes.
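
The horizontal shift can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the image is assumed to be a list of pixel rows with 1 for foreground and 0 for background, the names shift_block and embed_bit are hypothetical, the mapping of bit 1 to a left shift is an assumption (the text only says that the two bit values map to the two directions), and the 3-pixel default assumes a shift of roughly 1/100 inch at 300 dpi.

```python
def shift_block(image, left, right, row_start, row_end, dx):
    """Move columns left..right of rows row_start..row_end by dx pixels, in place."""
    width = len(image[0])
    if left + dx < 0 or right + dx >= width:
        raise ValueError("shift would move the block outside the image")
    for y in range(row_start, row_end + 1):
        row = image[y]
        segment = row[left:right + 1]
        row[left:right + 1] = [0] * len(segment)   # clear the old position
        row[left + dx:right + 1 + dx] = segment    # paint at the new position


def embed_bit(image, block, bit, shift=3):
    """Embed one bit by shifting block = (left, right, row_start, row_end)."""
    dx = -shift if bit == 1 else shift  # assumed rule: 1 -> left, 0 -> right
    shift_block(image, *block, dx)
```

The sketch overwrites whatever lies at the destination, so it relies on the gap on each side of the block being blank and wider than the shift, which is what the minimum-spacing condition on embedding blocks is meant to ensure.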

c) Pre-processing before extraction:

The text image used for extraction may have undergone various operations, including analog-domain attacks, so both its foreground and background colors may differ from those of the original image. To separate the foreground (text) from the background correctly, the watermarked image (the text image carrying the watermark) is first binarized. It is then segmented into word blocks by the same rules as in a), but the follow-up operations of merging and filtering are not performed. Next, the original digital-domain text image (the text image without a watermark) is segmented exactly as it was during embedding, and the same embedding block sequence is generated. A coordinate transformation gives the position in the watermarked image that corresponds to each embedding block, which provides a coarse localization of the embedding block sequence in the watermarked image. Because the coarse result differs slightly from the actual value, the word blocks segmented from the watermarked image are searched, and the one that differs least from the coarse result gives the exact position. Since the selection in a) guarantees that every embedding block has a positioning block on each side, the corresponding positioning block sequence is generated from the embedding block sequence of the original image. The same coarse-localization-plus-search procedure then yields the exact positions of all positioning blocks in the watermarked image, producing the positioning block sequence that corresponds to the embedding block sequence.

Coarse localization of the blocks in the embedding block sequence and the positioning block sequence of the watermarked text image works as follows: from the position coordinates of the block in the original image, the abscissas of the left and right ends of the text line containing the block in the original image, and the abscissas of the left and right ends of the corresponding text line in the watermarked text image, a coordinate transformation gives the approximate position coordinates of the corresponding block in the watermarked text image.

The exact position of a block is obtained by search as follows: in the watermarked image, the left-end abscissa obtained by coarse localization is compared with the left-end abscissas of all word blocks in the same text line, and the one with the smallest deviation is taken as the exact left-end abscissa of the block; likewise, the right-end abscissa obtained by coarse localization is compared with the right-end abscissas of all word blocks in the text line, and the one with the smallest deviation is taken as the exact right-end abscissa.
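
The coarse localization and the search can be illustrated with a short Python sketch. The text does not give the form of the coordinate transformation; the linear mapping between the text-line end points used here is an assumption, and the function names coarse_x and refine_x are hypothetical.

```python
def coarse_x(x, orig_line, marked_line):
    """Map abscissa x from the original image to the watermarked image.

    orig_line and marked_line are (left_end, right_end) abscissas of the same
    text line in the two images. A linear mapping is assumed.
    """
    a0, b0 = orig_line
    a1, b1 = marked_line
    return a1 + (x - a0) * (b1 - a1) / (b0 - a0)


def refine_x(estimate, candidates):
    """Return the candidate edge abscissa closest to the coarse estimate."""
    return min(candidates, key=lambda c: abs(c - estimate))
```

For example, with a text line spanning 100 to 2100 in the original image and 120 to 2180 in the scanned image, the abscissa 150 maps to 171.5, and the search among block edges at 118, 170 and 240 selects 170. The left and right edges of a block would each be refined in this way against the left and right edges, respectively, of the blocks in the same text line.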

Because the watermarked image has been through a series of analog-domain attacks, noise, rotation, translation, scaling, blurred character edges and similar changes are unavoidable, and both block widths and block spacing are affected. Merging and filtering the blocks with the same procedure as for the original image would easily produce errors. Operations on the original image, by contrast, are exact and reproducible, so the blocks segmented, merged and filtered from the original image can be used to obtain the embedding blocks and positioning blocks in the watermarked image. This coarse-localization-plus-search approach makes the watermarking algorithm more robust against a variety of attacks, including analog-domain attacks.

d) Watermark extraction decision:

From the change in position of each embedding block in the watermarked text image relative to the original image, it can be determined whether the block was moved left or right when the bit was embedded. Applying the embedding rule then yields the embedded bit, and extracting the bits of all blocks in the embedding block sequence in turn yields the embedded watermark information.

From the centroids of an embedding block and its positioning blocks, the ratio of the centroid distances between the embedding block and the positioning block on each side can be computed in both the watermarked image and the original image. Between the original image and the watermarked image this ratio changes because the embedding block has been shifted left or right. The direction of the shift can be inferred from the change, and the embedding rule then determines whether the embedded bit is 1 or 0. Error correction, check-code verification and other follow-up processing can be applied as needed.

The relative position of the centroid within a word block is robust to rotation, translation and scaling, and the centroid distance between an embedding block and a positioning block is robust to translation and rotation. The ratio of the centroid distance between the embedding block and the left positioning block to the centroid distance between the embedding block and the right positioning block is robust to rotation, translation and scaling, and it changes little even after various analog-domain attacks. Extracting the watermark from the change in relative centroid positions of the blocks therefore makes the method highly robust: it withstands a variety of analog-domain attacks, and the embedded watermark information is extracted correctly.
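
The centroid-ratio decision can be sketched in Python as follows. This is an illustration under assumptions rather than the patented procedure: the image is a list of pixel rows with 1 for foreground, the function names are hypothetical, and the rule that a smaller left-to-right ratio means a left shift and therefore bit 1 is an assumed embedding rule.

```python
import math


def centroid(image, left, right, row_start, row_end):
    """Centroid (x, y) of the foreground pixels inside a rectangular block."""
    points = [(x, y)
              for y in range(row_start, row_end + 1)
              for x in range(left, right + 1)
              if image[y][x]]
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def distance_ratio(c_left, c_mid, c_right):
    """Left centroid distance divided by right centroid distance."""
    return math.dist(c_left, c_mid) / math.dist(c_mid, c_right)


def decide_bit(ratio_original, ratio_marked):
    """Assumed rule: the ratio shrinks when the block moved left (bit 1)."""
    return 1 if ratio_marked < ratio_original else 0
```

Because the ratio is a quotient of two distances, a uniform scaling of the page multiplies numerator and denominator by the same factor and cancels out, which is the property the paragraph above relies on. A practical decision would probably also need a tolerance for ratios that are nearly unchanged; that is not covered here.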

Beneficial effects of the invention:

1. The invention provides a digital watermarking method for text images that resists analog-domain attacks. It exploits the line gaps and word gaps of text images and uses the segmented word blocks as the embedding targets, so the watermark information is embedded simply and effectively.

2. The watermark information is embedded by moving the horizontal position of word blocks, which preserves appearance and concealment and avoids the visual impact of foreground-colored points appearing in background regions or background-colored points appearing in foreground regions.

3. The watermark is embedded and extracted through the change, relative to the original image, in the position of an embedding block with respect to the positioning blocks on its left and right, and the distances between blocks are computed from block centroids. This effectively resists the noise, rotation, translation, scaling, blurred character edges and similar changes caused by analog-domain attacks.

4. The embedding block sequence is generated from a key and a pseudo-random sequence. This scrambles the binary watermark sequence and also varies the positions of the embedding blocks, improving the security of the watermark information.

5. During extraction, the blocks of the digital-domain image are used as a reference for coarse localization, and a search in the watermarked image then yields the exact embedding blocks and positioning blocks. This avoids block segmentation errors caused by analog-domain attacks and effectively improves extraction accuracy.

Brief Description of the Drawings:

Fig. 1 is a flowchart of watermark embedding according to the invention;

Fig. 2 is a flowchart of the watermark extraction decision according to the invention;

Fig. 3 is the original digital-domain text image used for watermark embedding in Embodiment 1;

Fig. 4 is the digital-domain watermarked text image obtained by embedding watermark information in Fig. 3;

Fig. 5 is the grayscale watermarked text image obtained by printing Fig. 4 and then scanning it;

Fig. 6 is a schematic diagram of the word block segmentation of one line of text in Fig. 3 before embedding;

Fig. 7 is a schematic diagram of one embedding block in Fig. 3 and its positioning blocks.

Detailed Description:

Embodiment 1: The invention is further described below through an embodiment, with reference to the watermark embedding and extraction decision process for an actual text image:

Fig. 1 is the flowchart of watermark embedding and Fig. 2 is the flowchart of the watermark extraction decision. Fig. 3 is the binary original text image I in bmp format, 2481x3509 pixels at 300 dpi, generated from an electronic text on A4 paper set in the FangSong typeface at size "small three" (小三). Fig. 4 is the watermarked image I_w obtained by embedding into Fig. 3, with the integer 211 as the seed key of the pseudo-random sequence, the 20-bit binary sequence corresponding to the integer 94728 (padded with leading zeros if shorter than 20 bits); 12 bits of error-correction information are added before embedding, so the final watermark information is 32 bits. The key and the watermark information can be chosen freely and are not special values. Fig. 5 is the grayscale watermarked text image I', in bmp format with 256 gray levels and 2550x3509 pixels, obtained by printing Fig. 4 on an ordinary laser printer at 300 dpi and scanning it with an ordinary scanner at 300 dpi. (The specific parameters in this example are for illustration only; in practice they are set according to the actual situation. The same applies to the description below.)
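
As a small worked example of the payload preparation, the following Python sketch converts the integer 94728 into a 20-bit sequence padded with leading zeros. The function name to_bits is hypothetical, and the 12 bits of error-correction information are not generated here because the text does not specify the code used.

```python
def to_bits(value, length=20):
    """Return value as a list of bits, most significant first, zero-padded."""
    if value < 0 or value >= 1 << length:
        raise ValueError("value does not fit in the requested number of bits")
    return [int(b) for b in format(value, f"0{length}b")]


payload = to_bits(94728)
# payload is the bit sequence 0001 0111 0010 0000 1000
```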

The complete embedding and detection process of the method can be described in the following steps: a) pre-processing before embedding; b) watermark embedding; c) pre-processing before extraction; d) watermark extraction decision.

a) Pre-processing before embedding:

I denotes the original digital-domain text image (1). The background color (e.g. white) is denoted W and the foreground (the text content, e.g. black) is denoted B. Row segmentation first determines the ordinates of the top and bottom of all word blocks in a line of text; the word gaps within each segmented text line then determine the abscissas of the left and right ends of each block, which completes the segmentation of all word blocks.

Row segmentation (2) uses the gaps between text lines. A rectangular coordinate system is set up with the lower-left pixel of the original image I as the origin (0,0). The image is projected horizontally, that is, the number of foreground pixels in each pixel row is counted (a row here means a row of pixels, not a line of text). This gives a one-dimensional array ProY[] with 3509 elements, where ProY[i] is the number of foreground pixels in the pixel row with ordinate i (i, like all array indices below, starts from 0). ProY[i] is zero in the blank margins of the text image and in the blank gaps between lines, and non-zero in the regions occupied by lines of text. The array ProY[] is checked in order starting from the pixel row with ordinate 0 (because the origin is at the lower-left corner of I, the rows are visited from bottom to top). When ProY[i] > 0 and ProY[i-1] = 0, row i is the starting pixel row of a line of text; when ProY[i] > 0 and ProY[i+1] = 0, row i is the ending pixel row of a line of text. After all 3509 rows have been scanned, 22 lines of text are found in I, together with the ordinates of the starting and ending pixel rows of each line. These are stored in the two-dimensional array RowPos[22][2], where RowPos[j][0] is the ordinate of the starting pixel row of text line j and RowPos[j][1] is the ordinate of its ending pixel row. This completes the row segmentation of I.
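
The row segmentation described above can be sketched in Python as follows. It is a minimal illustration assuming the image is a list of pixel rows with 1 for foreground and 0 for background, and that image[0] is the bottom pixel row, to match the lower-left origin used in the text. The function names are hypothetical; the returned pairs play the role of RowPos[j][0] and RowPos[j][1].

```python
def row_profile(image):
    """ProY: number of foreground pixels in each pixel row."""
    return [sum(row) for row in image]


def find_runs(profile):
    """Return (start, end) index pairs of maximal runs of non-zero entries."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # previous entry was zero: run begins
        elif count == 0 and start is not None:
            runs.append((start, i - 1))    # previous entry ended the run
            start = None
    if start is not None:                  # run touches the image border
        runs.append((start, len(profile) - 1))
    return runs
```

Scanning for transitions between zero and non-zero entries is equivalent to the two conditions on ProY[i-1], ProY[i] and ProY[i+1] given above, with the image border handled explicitly.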

Word block segmentation (3) is illustrated with one line of text. For text line j, the rectangular text region bounded by RowPos[j][0] and RowPos[j][1] is extracted and projected vertically: the number of foreground pixels in each pixel column is counted, giving a one-dimensional array ProX[2481] with 2481 elements, where ProX[k] is the number of foreground pixels in the pixel column with abscissa k. ProX[] is checked in order starting from the pixel column with abscissa 0. When ProX[k] > 0 and ProX[k-1] = 0, column k is the starting pixel column of a word block; when ProX[k] > 0 and ProX[k+1] = 0, column k is the ending pixel column of a word block. Text line 0 (the text "系列报道"), for example, is segmented into 6 word blocks. These are stored in the array WordPos[22][6][2], where WordPos[j][n][0] is the abscissa of the starting column of block n in text line j and WordPos[j][n][1] is the abscissa of its ending column. Inspecting WordPos[0][][] shows that blocks 1 and 2 are the left and right parts of the character "列". Merging them (4) leaves 5 word blocks; the segmentation before and after merging is shown in Fig. 6. Each word block is determined by its upper-left corner X(left, top) and its lower-right corner Y(right, bottom). Here each block is defined by four values, the two in WordPos[j][n][] and the two in RowPos[j][] of its line: the upper-left corner is X(left, top) = (WordPos[j][n][0], RowPos[j][1]) and the lower-right corner is Y(right, bottom) = (WordPos[j][n][1], RowPos[j][0]). The remaining lines are segmented, and their partial blocks merged, in the same way.
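
The column projection and the merging of partial blocks can be sketched in Python as follows. The image representation (list of pixel rows, 1 for foreground) and all function names are assumptions. In particular, the text does not say how partial blocks are recognized; the rule used here, merging neighbors whose gap is at most max_gap as long as the merged width stays within max_width, is only one plausible heuristic.

```python
def column_profile(image, row_start, row_end):
    """ProX: foreground pixels per column within rows row_start..row_end."""
    width = len(image[0])
    return [sum(image[y][x] for y in range(row_start, row_end + 1))
            for x in range(width)]


def split_blocks(profile):
    """Return (left, right) column pairs of maximal non-zero runs."""
    blocks, start = [], None
    for k, count in enumerate(profile):
        if count > 0 and start is None:
            start = k
        elif count == 0 and start is not None:
            blocks.append((start, k - 1))
            start = None
    if start is not None:
        blocks.append((start, len(profile) - 1))
    return blocks


def merge_parts(blocks, max_gap, max_width):
    """Merge adjacent parts separated by a small gap (assumed heuristic)."""
    merged = []
    for left, right in blocks:
        if merged:
            prev_left, prev_right = merged[-1]
            gap = left - prev_right - 1
            if gap <= max_gap and right - prev_left + 1 <= max_width:
                merged[-1] = (prev_left, right)
                continue
        merged.append((left, right))
    return merged
```

With such a rule, a left-right structured character such as "列" would be rejoined when its internal gap is narrower than the normal word gap, while separate characters stay apart.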

Checking whether each segmented block meets the requirements on word width, complexity and completeness removes blocks that are too simple, such as '一' and similar marks (two further examples are illegible in the source text), as well as partial blocks that were not merged. This completes word block filtering (4). The remaining blocks form the set Q of embeddable word blocks, used for the subsequent watermark embedding and extraction. A block in Q must satisfy the following conditions: 1) Word width E, in pixels, must lie within a range defined relative to E_m (the inequality is garbled in the source text), where E_m is the maximum width in the text image of a complete word block for a fixed font, font size and print resolution; in this embodiment the word width must be greater than 38 pixels and less than 64 pixels. 2) Complexity, defined as the ratio of foreground pixels to all pixels in the block, must be greater than 1/15. 3) Completeness: the block must not be an incomplete Chinese character, letter or symbol.
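
The width and complexity checks can be sketched in Python using the numbers of this embodiment (width between 38 and 64 pixels, complexity above 1/15). A block is assumed to be a list of pixel rows with 1 for foreground; the function names are hypothetical. The completeness condition is left out because it cannot be decided from pixel counts alone.

```python
def complexity(block):
    """Ratio of foreground pixels to all pixels in a rectangular block."""
    total = sum(len(row) for row in block)
    foreground = sum(sum(row) for row in block)
    return foreground / total if total else 0.0


def is_embeddable(block, min_width=38, max_width=64, min_complexity=1 / 15):
    """Width and complexity checks only; completeness is not tested here."""
    width = len(block[0]) if block else 0
    return min_width < width < max_width and complexity(block) > min_complexity
```

A thin horizontal stroke such as '一' fills only a small fraction of its bounding rectangle when the rectangle spans the full height of the text line, which is how the complexity threshold rejects it.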

Generation of the embedding block sequence (5): a number of text lines are selected, and the blocks in those lines that belong to the set Q are concatenated into a sequence A. Using the key K (6) and a pseudo-random sequence, a block s is picked from A and checked against all of the following conditions: the block has not already been chosen as an embedding block; in the original image I, the nearest positions to the left and right of s are both occupied by positioning blocks f1 and f2 that belong to Q (they are used during extraction to determine the direction in which the embedding block was shifted), and neither of them is an embedding block; the distance from s to each of f1 and f2 exceeds a threshold t, so that after the shift the blocks do not overlap and the change in spacing is not visually obvious. The threshold t varies with typeface and font size and can be specified in advance or computed adaptively. If all conditions hold, s is added to the embedding block sequence; otherwise the next block is tried. This continues until an embedding block sequence of length M, S = { s_i (X_i, Y_i), i = 1, 2, ..., M }, has been obtained; exceeding a set number of search attempts means that not enough blocks can be found to embed the watermark. In multi-line text, generation of S can start from a base number of text lines. If generation fails, a text line is added and generation is retried; once S has been generated, generation of another S is attempted starting from the next line, so that the watermark information is embedded repeatedly. Depending on the embedding capacity of the text, up to p (p >= 0) sequences S can be generated, each carrying the same watermark information. This increases redundancy, which effectively improves the accuracy and robustness of watermark extraction.

The watermark information to be embedded is a 32-bit binary sequence, and generation of S starts from a base of 4 text lines. Taking the lines numbered 0-3 as an example, the blocks in these 4 lines that belong to Q are concatenated into a one-dimensional block sequence A, ordered by increasing line number and, within a line, from left to right; in this example A has length 62. The key 211 is used as the seed of the pseudo-random sequence to produce a random number random, and random % 62 (modulo operation) gives the position index iPos of a block in A. The block at iPos must satisfy all of the following conditions: it has not been selected as a watermark embedding block; its nearest locating blocks on the left and right both belong to A and have not been selected as watermark embedding blocks; and the blank gap to the nearest block on each side is at least 1/50 inch, which at the 300 dpi resolution of this embodiment is 6 pixels or more. If the conditions hold, the block is selected as a watermark embedding block and added to S; otherwise the pseudo-random sequence is used to find the next candidate. If 5000 consecutive attempts fail to add a new block to S, not enough blocks can be found, so one more line is added, A is regenerated, and the search is retried. In this example, the A built from lines 0-3 has 62 blocks and yields only 23 qualifying embedding blocks, whereas the A built from lines 0-4 has 83 blocks and successfully yields S0, 32 blocks long.
Generation of the next S then starts from lines 5-8. In the end, lines 5-9, 10-15 and 16-20 each yield a distinct sequence, S1, S2 and S3, so the watermark information can be embedded 4 times in this text.
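The pixel figures in this example follow from the print resolution. The short sketch below shows the conversion; the helper name `min_pixels` is hypothetical, and exact fractions are used only to avoid floating-point rounding at the boundary.

```python
import math
from fractions import Fraction


def min_pixels(inches, dpi):
    """Smallest whole number of pixels spanning at least `inches` at `dpi`."""
    return math.ceil(Fraction(inches) * dpi)


# Minimum blank gap between an embedding block and each neighbour (1/50 inch).
gap_px = min_pixels(Fraction(1, 50), 300)
# Minimum shift distance stated in claim 6 (1/100 inch).
shift_px = min_pixels(Fraction(1, 100), 300)
```

At 300 dpi this gives a 6-pixel gap and a 3-pixel shift, so a block moved by the minimum distance still keeps at least 3 blank columns on the side it moves toward.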

b) Watermark embedding:

The embedding process (7) is illustrated here with S0. The watermark information (8) to be embedded is the binarized sequence of the important information to be protected, C = {c_j, j = 1, 2, ..., M}, c_j ∈ {0, 1}; depending on the application, it may include an error-correcting code and a check code. Here the integer 94728 is written as a 20-bit binary number and a 12-bit error-correcting code is appended, giving the 32 bits of watermark information to embed. Each block s_j in S0 = {s_j(X_j, Y_j), j = 1, 2, ..., 32} is the rectangular region bounded by the four values WordPos[j][n][2] and RowPos[j][2] of its text line, where j is the number of the text line and n is the index of the block within that line. With the error-correcting code added, the 32-bit binary watermark is C = "00010111001000001000101100001100". The bits are embedded from the lowest-order bit to the highest-order bit (the lowest is c0, the highest c31): if the watermark bit c_j = 1, the corresponding embedding block s_j is shifted as a whole 3 pixels to the left; if c_j = 0, it is shifted as a whole 3 pixels to the right. This completes the embedding for S0. The same watermark information is embedded into S1, S2 and S3 by the same steps, producing the digital-domain watermarked image Iw.

c) Pre-extraction preprocessing:

Iw is printed on an ordinary laser printer at 300 dpi and then scanned on an ordinary scanner at 300 dpi, giving a 256-level grayscale image I' in BMP format, 2550x3509 pixels in size. To separate foreground from background, I' is binarized. The threshold used here is the empirically chosen value 150 (it could also be determined adaptively): every pixel whose gray value is below 150 is classed as foreground and set to 0 (black); every other pixel is classed as background and set to 255 (white). The print-and-scan process (9) inevitably introduces noise, and noise in the background region in particular causes errors in block merging and screening, so a denoising step is required. If watermark extraction fails, the image can be checked by hand for large noise spots that survived denoising, smudges that are not text, and blocks that have run together, and then repaired manually. Experiments show that the method is robust to rotations of the text image within (-0.5 degrees, 0.5 degrees), so no registration is needed.

Next, the watermark embedding blocks and their locating blocks are extracted from the grayscale watermarked text image I' (14). The original image I is first divided into word blocks exactly as during embedding (3), and the same key (6) and pseudo-random sequence are used to obtain the embedding block sequence S (5). I' undergoes the same line division (11) and block division (12), but without the subsequent operations such as merging partial blocks. With I as the template, the embedding block sequence S0' in I' is extracted as follows (14). Let s ∈ S be a block of I that carries a watermark bit, and let D and E be the abscissas of the leftmost and rightmost foreground pixels of its line V; let D' and E' be the abscissas of the leftmost and rightmost foreground pixels of the corresponding line V' in I'. With L and R the abscissas of the left and right ends of s, the left and right abscissas L' and R' of the corresponding block s' in V' follow from the translation and scaling relation by the coordinate transform

$$L' = \frac{(L - D + 1)(E' - D' + 1)}{E - D + 1} + D' - 1, \qquad R' = \frac{(R - D + 1)(E' - D' + 1)}{E - D + 1} + D' - 1 \tag{1}$$

The L' and R' given by formula (1) differ somewhat from the actual values. To obtain exact values, the deviations between the computed L' and R' and the left and right abscissas of the blocks already segmented in V' are calculated, and the edges with the smallest deviation are taken as the true L' and R'. All embedding blocks in I' are found in turn in this way. One condition for selecting an embedding block in a) is that Q contains two locating blocks f1 and f2, one on its left and one on its right; the locating blocks f1' and f2' on either side of s' are obtained by the same method used for s'.

Extracting the locating blocks is straightforward: when s is successfully generated, its nearest neighbours on the left and right in the corresponding block sequence A are the locating blocks f1 and f2. Applying the same seed key, the integer 211, and the same pseudo-random sequence to I yields exactly the same embedding block sequences S0, S1, S2 and S3. Taking S0 as the example, the resulting locating block sequence is written F0 = {(f_j^1, f_j^2), j = 1, 2, ..., 32}. To obtain the embedding block sequence S0' and the locating block sequence F0' in I', I' is first given a preliminary block division using the line gaps and word gaps, again without the subsequent operations such as merging partial blocks.

Take s0 ∈ S0 as an example. In I, s0 is the boxed character "金" in line 2, with top-left corner (1210, 706) and bottom-right corner (1271, 641), as shown in Fig. 7. The leftmost foreground pixel of line 2 has abscissa 379 and the rightmost 2097, so the foreground width of the line is 2097 - 379 + 1 = 1719. In I', the leftmost and rightmost foreground pixels of line 2 have abscissas 411 and 2084, so the foreground width is 2084 - 411 + 1 = 1674. After the analog-domain attack, line 2 of I' has therefore been translated and scaled relative to line 2 of I (the rotation introduced is small and does not affect extraction). Formula (1) gives the left and right abscissas of the corresponding embedding block s0' in I' as L' = 1220 and R' = 1279. These computed values deviate somewhat from the actual ones, so they are compared with the left and right abscissas of every block in line 2 of I', and the edges with the smallest deviation are taken as the true L' and R'. Line 2 of I' has 28 blocks; the search gives the true values L' = 1223 and R' = 1283, which deviate from the computed 1220 and 1279 by 3 and 4 respectively. Together with the ordinates 687 and 751 of line 2, these bound the block s0', whose top-left corner is (1223, 751) and bottom-right corner is (1283, 687). All embedding blocks in I and I', with their left and right locating blocks, are obtained in the same way; the locating blocks of s0 are shown in Fig. 7.

d) Watermark extraction decision:

Following the order in which the binarized watermark sequence C was embedded, watermark extraction (15) proceeds as follows. For each s ∈ S and the corresponding s' ∈ S', f1, f2, f1', f2', the centroids Zs, Zs', Z1, Z2, Z1', Z2' of the blocks are computed. For s, bounded by its top-left corner (left, top) and bottom-right corner (right, bottom), the centroid is given by formula (2) (the formula itself is not legible in this text).

The selection criteria for an embedding block s guarantee that locating blocks f1 and f2 exist on its left and right and that neither is selected as an embedding block, so the horizontal positions of f1 and f2 do not change. If s was shifted left, it lies closer to the left locating block f1 and farther from the right locating block f2; if s was shifted right, it lies farther from f1 and closer to f2. The relative positions of the centroids of s, f1 and f2 therefore reflect the direction in which s was shifted relative to f1 and f2, which is enough to extract the embedded bit. The relative position of a centroid within its block is resistant to rotation, translation and scaling, and the distances from the centroid of s to those of f1 and f2 are resistant to translation and rotation; consequently the ratio of the distance between the centroids of s and f1 to the distance between the centroids of s and f2 is resistant to rotation, translation and scaling.

The shift direction is judged from Zs, Zs', Z1, Z2, Z1', Z2' by formula (3), whose body is illegible in this text; according to the description above, it compares the distance ratio D(Zs, Z1)/D(Zs, Z2) in the original image with the ratio D(Zs', Z1')/D(Zs', Z2') in the watermarked image. If the resulting value is greater than 1, the embedding block s was shifted left; if it is less than 1, s was shifted right. Here D(Z1, Z2) denotes the Euclidean distance between the points Z1 and Z2. By the embedding rule, a left shift and a right shift correspond to watermark bits 1 and 0 respectively, so the complete watermark information can be recovered. Error correction, check-code verification and similar post-processing can follow as the application requires.

In the example, block s0' has top-left corner (1223, 751) and bottom-right corner (1283, 687); its left locating block has top-left corner (1092, 751) and bottom-right corner (1147, 687); its right locating block has top-left corner (1288, 751) and bottom-right corner (1339, 687). In the original image I, block s0 has top-left corner (1210, 706) and bottom-right corner (1271, 641); its left locating block has top-left corner (1078, 706) and bottom-right corner (1134, 641); its right locating block has top-left corner (1279, 706) and bottom-right corner (1332, 641). The centroids of these six blocks, computed by formula (2), are: s0' (1252.55, 714.736); left locating block of s0' (1117.97, 720.988); right locating block of s0' (1312.09, 716.948); s0 (1240.31, 669.198); left locating block of s0 (1106.21, 675.004); right locating block of s0 (1304.22, 671.273). Formula (3) then gives 0.8615 < 1, so the embedding block is judged to have been shifted right, and by the embedding rule a right shift represents the watermark bit 0. The bits embedded in all embedding blocks are extracted by the same steps, joined in order into a binary sequence, and error-corrected to give the extraction result (16). In this embodiment the embedded integer 94728 is correctly extracted from each of S0, S1, S2 and S3.
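The decision in step d) can be tried on the six centroids quoted in the worked example. The sketch below is an assumption-laden illustration: the body of formula (3) is not legible in this text, so the statistic is taken to be the original-image distance ratio divided by the watermarked-image distance ratio, with plain Euclidean distances, and the function names are hypothetical. With these choices the quoted centroids give roughly 0.93 rather than the 0.8615 reported in the description; the value falls on the same side of 1, so the decision is the same, but the exact form of formula (3) should be treated as unconfirmed.

```python
import math


def shift_statistic(zs, z1, z2, zs_w, z1_w, z2_w):
    """Distance ratio in the original image divided by the same ratio after scanning."""
    before = math.dist(zs, z1) / math.dist(zs, z2)
    after = math.dist(zs_w, z1_w) / math.dist(zs_w, z2_w)
    return before / after


def extract_bit(statistic):
    """A left shift (statistic > 1) encodes 1; a right shift encodes 0."""
    return 1 if statistic > 1 else 0


# Centroids of s0 and its left/right locating blocks in I, then in I'.
r = shift_statistic(
    (1240.31, 669.198), (1106.21, 675.004), (1304.22, 671.273),
    (1252.55, 714.736), (1117.97, 720.988), (1312.09, 716.948),
)
bit = extract_bit(r)
```

Dividing one ratio by the other cancels a uniform scale factor, which is why the statistic is insensitive to the scaling that printing and scanning introduce.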

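The embedding rule of step b) can likewise be sketched: build the 32-bit payload, then turn each bit into a signed horizontal shift. The function names are hypothetical. The 12 error-correction bits are copied from the example bit string rather than computed, because the description does not name the code that produces them.

```python
def build_payload(value, value_bits, ecc_bits):
    """Fixed-width binary form of value followed by the given error-correction bits."""
    if not 0 <= value < (1 << value_bits):
        raise ValueError("value does not fit in value_bits")
    if set(ecc_bits) - {"0", "1"}:
        raise ValueError("ecc_bits must be a string of 0s and 1s")
    return format(value, f"0{value_bits}b") + ecc_bits


def shifts_for(payload, step_px=3):
    """Signed shift per embedding block, lowest-order bit first (negative = left)."""
    return [-step_px if bit == "1" else step_px for bit in reversed(payload)]


payload = build_payload(94728, 20, "101100001100")
shifts = shifts_for(payload)
```

The first entry of `shifts` belongs to s0 and is a shift to the right, which agrees with the bit 0 recovered for s0 in the extraction example.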
Claims (10)

1. A text digital watermarking method resistant to analog-domain attacks, characterized by comprising the following steps: a) pre-embedding preprocessing: dividing, merging and screening the word blocks in the original text image to generate a set of embeddable word blocks, and selecting watermark embedding blocks from that set to form a watermark embedding block sequence; b) watermark embedding: embedding one bit of watermark information, with value 0 or 1, through a horizontal shift of the position of each watermark embedding block, the binarized sequence of the watermark information being embedded in order into the watermark embedding block sequence selected from the original text image to obtain the watermarked text image; c) pre-extraction preprocessing: separating the foreground region and background region of the watermarked text image; performing on the original text image exactly the same operations as in a) to generate the watermark embedding block sequence and its locating block sequence; performing on the watermarked text image the same word block division as in a), and using the watermark embedding block sequence and locating block sequence of the original text image to help generate and locate the watermark embedding block sequence and its locating block sequence in the watermarked text image; d) watermark extraction decision: extracting the watermark information according to the direction in which each watermark embedding block in the watermarked text image has moved relative to the corresponding watermark embedding block in the original text image, and making the decision; wherein, in the watermark extraction decision stage of d), the embedded watermark information is extracted by determining how the position of a watermark embedding block relative to the locating blocks on its left and right in the watermarked text image has changed compared with the corresponding relative position in the original image, the position of a watermark embedding block being represented by its centroid coordinates.
2. The text digital watermarking method resistant to analog-domain attacks according to claim 1, characterized in that the word blocks are divided by means of the line gaps and word gaps in the text image; the merging of word blocks joins partial blocks that were wrongly split because a gap exists in the horizontal direction inside a single word block; and the screening of word blocks admits into the embeddable block set only those blocks whose word width, complexity and integrity all meet preset requirements.
3. The text digital watermarking method resistant to analog-domain attacks according to claim 2, characterized in that the requirements on word width, complexity and integrity applied when screening word blocks are: 1) the word width E, in pixels, must lie between a lower and an upper bound defined relative to E_max (the inequality itself is illegible in this text), where E_max is the maximum width in the text image of a complete word block for the given typeface, font size and print resolution; 2) the complexity, defined as the ratio of the number of foreground pixels in the block to the number of all pixels in the block, must be greater than 1/15; 3) integrity requires that the block not be an incomplete Chinese character, letter or symbol.
4. The text digital watermarking method resistant to analog-domain attacks according to claim 1, characterized in that, when each watermark embedding block of the watermark embedding block sequence is generated, it is ensured that locating blocks which are not watermark embedding blocks exist on both its left and right sides, and that its spacing to the block on each side is at least 1/50 inch.
5. The text digital watermarking method resistant to analog-domain attacks according to claim 1, characterized in that the formation of the watermark embedding block sequence is controlled by a key and a pseudo-random sequence.
6. The text digital watermarking method resistant to analog-domain attacks according to claim 1, characterized in that embedding one bit of watermark information is defined as follows: moving a selected watermark embedding block of the original text image to the left embeds one bit of watermark information with value "1", and moving it to the right embeds one bit with value "0", or the reverse; the moving distance must be at least 1/100 inch, and the maximum moving distance is bounded by the requirement that the blocks not overlap after the move.
7. The text digital watermarking method resistant to analog-domain attacks according to claim 1, characterized in that the locating blocks are the two blocks of the embeddable block set that lie on the left and right of the watermark embedding block and are nearest to it.
8. The text digital watermarking method resistant to analog-domain attacks according to any one of claims 1 to 7, characterized in that, before watermark extraction, the blocks of the watermark embedding block sequence and the locating block sequence in the watermarked text image are coarsely positioned with reference to the position coordinates of the watermark embedding block sequence and locating block sequence generated by dividing the original text image, and the exact position of each block of the two sequences is then obtained by searching the watermarked text image.
9. The text digital watermarking method resistant to analog-domain attacks according to claim 8, characterized in that the blocks of the watermark embedding block sequence and the locating block sequence in the watermarked text image are coarsely positioned as follows: from the position coordinates of the block in the original text image, the abscissas of the left and right ends of the text line containing the block in the original text image, and the abscissas of the left and right ends of the corresponding text line in the watermarked text image, the approximate position coordinates of the corresponding block in the watermarked text image are calculated by a coordinate transform.
10. The text digital watermarking method resistant to analog-domain attacks according to claim 8, characterized in that the exact position of a block is obtained by searching as follows: in the watermarked text image, the deviation between the left abscissa obtained for the block by coarse positioning and the left abscissa of every block in the block's text line is calculated, and the left abscissa with the smallest deviation is the exact left abscissa of the block sought; likewise, the deviation between the right abscissa obtained by coarse positioning and the right abscissa of every block in the text line is calculated, and the right abscissa with the smallest deviation is the exact right abscissa of the block sought.
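The coarse positioning of claim 9 and the search of claim 10 can be checked against the figures given in the description for block s0. The sketch below is illustrative: the function names are hypothetical, integer floor division is an assumption chosen because it reproduces the quoted estimates 1220 and 1279, and only the three block edges quoted in the description are listed instead of all 28 blocks of the line.

```python
def coarse_x(x, d, e, d_w, e_w):
    """Formula (1): map abscissa x on original line [d, e] onto scanned line [d_w, e_w]."""
    return (x - d + 1) * (e_w - d_w + 1) // (e - d + 1) + d_w - 1


def nearest_edge(estimate, edges):
    """Pick the segmented block edge with the smallest deviation from the estimate."""
    return min(edges, key=lambda edge: abs(edge - estimate))


# Line 2 spans 379..2097 in I and 411..2084 in I'; s0 spans 1210..1271 in I.
left_est = coarse_x(1210, 379, 2097, 411, 2084)
right_est = coarse_x(1271, 379, 2097, 411, 2084)

# Left and right edges of the three blocks of line 2 of I' quoted in the description.
left_x = nearest_edge(left_est, [1092, 1223, 1288])
right_x = nearest_edge(right_est, [1147, 1283, 1339])
```

The estimates land within a few pixels of the true edges 1223 and 1283, which is close enough for the nearest-edge search to pick the right block because neighbouring blocks are dozens of pixels apart.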
CN 200510060488 2005-08-25 2005-08-25 Text numerical watermark method for resisting analog domain attack CN100534033C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200510060488 CN100534033C (en) 2005-08-25 2005-08-25 Text numerical watermark method for resisting analog domain attack

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200510060488 CN100534033C (en) 2005-08-25 2005-08-25 Text numerical watermark method for resisting analog domain attack

Publications (2)

Publication Number Publication Date
CN1801707A CN1801707A (en) 2006-07-12
CN100534033C true CN100534033C (en) 2009-08-26

Family

ID=36811491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200510060488 CN100534033C (en) 2005-08-25 2005-08-25 Text numerical watermark method for resisting analog domain attack

Country Status (1)

Country Link
CN (1) CN100534033C (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103065101A (en) * 2012-12-14 2013-04-24 北京思特奇信息技术股份有限公司 Anti-counterfeiting method for documents
CN103414892B (en) * 2013-07-25 2016-08-10 西安空间无线电技术研究所 The Image Hiding that a kind of Large Copacity is incompressible
CN105848010B (en) * 2016-03-31 2018-12-25 天津大学 The insertion of mobile device video watermark and extracting method based on piecemeal combination

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1450495A (en) 2002-03-29 2003-10-22 佳能株式会社 Image process device and method
CN1558595A (en) 2004-01-18 2004-12-29 哈尔滨工业大学 Method for making and verifying digital signature and digital watermark bar code

Also Published As

Publication number Publication date
CN1801707A (en) 2006-07-12

Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
C41 Transfer of the right of patent application or the patent right
ASS Succession or assignment of patent right

Owner name: HANGZHOU TIMEVALE INFORMATION TECHNOLOGY CO., LTD

Free format text: FORMER OWNER: HANGZHOU TIMEVALE INFORMATION TECHNOLOGY CO., LTD.

Effective date: 20070309

C14 Granted