CN103761459B

CN103761459B - Method and device for embedding and extracting multiple digital watermarks in a document

Info

Publication number: CN103761459B
Application number: CN201410035906.7A
Authority: CN
Inventors: 陈小军; 时金桥; 徐睿; 蒲以国; 赵亮; 张锐
Original assignee: Institute of Information Engineering of CAS
Current assignee: Institute of Information Engineering of CAS
Priority date: 2014-01-24
Filing date: 2014-01-24
Publication date: 2016-08-17
Anticipated expiration: 2034-01-24
Also published as: CN103761459A

Abstract

The present invention relates to a method and device for embedding and extracting multiple digital watermarks in a document. The embedding method comprises the following steps: obtaining the original watermark information, a key and the document to be processed, as input by the user; computing a digest of the original watermark information to generate new watermark information; storing the original watermark information and the new watermark information together in a database as one database record; dividing the characters of the document into two layers; obtaining, from the total number of characters in the first layer and the bit length of the new watermark information, the number of groups of new watermark information to be embedded in the first layer, and embedding the groups into the attribute bits of the first layer in front-to-back order; and embedding groups of new watermark information into the attribute bits of the second layer in back-to-front order. The invention is based on the character attributes of Word-format documents: the key improves security, repeated embedding strengthens robustness, and multiple embedding increases watermark capacity.

Description

Method and device for embedding and extracting multiple digital watermarks in a document

Technical field

The present invention relates to the field of digital watermarking, and in particular to a method and device for embedding and extracting multiple digital watermarks in a document.

Background art

In recent years, with the rapid development of multimedia and the Internet, protecting the copyright of digital works has become a hot research topic in academia. Digital watermarking, an important research direction within information hiding, is of significant value for the copyright protection of multimedia such as text, video and audio. Digital watermarking embeds copyright information such as serial numbers, text or logos into multimedia data, and serves purposes such as copyright protection, covert communication, authentication of data files and product marking.

The more practical existing text watermarking methods fall into two broad classes: format-based text watermarking and natural-language-based text watermarking. Format-based text watermarking is the most common class to date. It developed from the initial line shifting, word shifting and feature coding to later methods that alter font size, color and the like. Research on this class is the most active, but these methods suffer from weak security and low watermark capacity. Natural-language-based text watermarking was first proposed in 2002 by Mikhail J. Atallah, Victor Raskin and others at Purdue University in the United States. It adds watermark information mainly by changing sentence structure, substituting synonyms and similar means. Natural-language watermarking changes the content of the text but not its meaning or format, so the watermark is hard to notice and not easily destroyed. For normative documents, however, whose format requirements are strict, such methods may change the semantics and are therefore unsuitable. In addition, computer processing of natural language is not yet mature, which has become the bottleneck of natural-language-based text watermarking.

Summary of the invention

The technical problem to be solved by the present invention is to provide a method and device for embedding and extracting multiple digital watermarks in a document, based on the character attributes of Word-format documents, in which a key improves security, repeated embedding strengthens robustness, and multiple embedding increases watermark capacity.

The technical solution by which the present invention solves the above problem is as follows: a method for embedding multiple digital watermarks in a document, comprising the following steps:

Step 1: obtain the original watermark information, the key and the document to be processed, as input by the user;

Step 2: use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and obtain the bit length of the new watermark information;

Step 3: store the original watermark information and the new watermark information together in a database as one database record, used to look up the original watermark information when the watermark is extracted;

Step 4: divide the characters of the document into two layers; from the total number of characters in the first layer and the bit length of the new watermark information, obtain the number of groups of new watermark information to be embedded in the first layer; embed the groups of new watermark information into the attribute bits of the first layer in front-to-back order, with the groups separated by separators;

Step 5: embed groups of new watermark information into the attribute bits of the second layer in back-to-front order, with the groups separated by separators; the number of groups embedded in the second layer is twice the number of groups embedded in the first layer.

The beneficial effects of the invention are as follows: the invention is based on the character attributes of Word-format documents; the key improves security, repeated embedding strengthens robustness, and multiple embedding increases watermark capacity.

On the basis of the above technical solution, the present invention can be further improved as follows.

Further, dividing the characters of the document into two layers specifically comprises the following steps:

obtaining the Unicode code of the character used as the key, converting it into a binary sequence, and taking the last two bits of the binary sequence as the key sequence;

obtaining the Unicode codes of all characters in the document and converting the Unicode code of each character into a binary sequence;

performing an XOR operation between the key sequence and the binary sequence of each character in the document; if the result is 00 or 10, the character is assigned to the first layer of the document; if the result is 01 or 11, it is assigned to the second layer.
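The layering rule can be illustrated with a minimal Python sketch. It assumes, as the detailed description later states, that only the last two bits of each document character are XORed with the key; the function names `key_bits`, `layer_of` and `split_layers` are hypothetical and chosen for illustration only.

```python
def key_bits(key_char: str) -> int:
    """Return the last two bits of the key character's Unicode code point."""
    return ord(key_char) & 0b11


def layer_of(ch: str, key: int) -> int:
    """Return 1 or 2: the layer that a document character is assigned to."""
    result = (ord(ch) & 0b11) ^ key
    # 00 and 10 -> first layer; 01 and 11 -> second layer
    return 1 if result in (0b00, 0b10) else 2


def split_layers(text: str, key_char: str) -> tuple[list[int], list[int]]:
    """Return the character indices that fall into layer 1 and layer 2."""
    key = key_bits(key_char)
    first: list[int] = []
    second: list[int] = []
    for i, ch in enumerate(text):
        (first if layer_of(ch, key) == 1 else second).append(i)
    return first, second
```

For example, `split_layers("abcd", "k")` places the characters at indices 0 and 2 in the first layer and those at indices 1 and 3 in the second layer.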

Further, the separator is the binary sequence of any rarely used non-visible character in the Unicode encoding.

Further, embedding the groups of new watermark information into the different attribute bits of the document specifically comprises the following steps:

for the first layer, modifying the NoProofing attribute value of the characters in the first layer one by one: if the watermark bit currently to be embedded is 1, the NoProofing attribute value is changed to True; otherwise the original value False is kept;

for the second layer, modifying the LanguageIDOther attribute value of the characters in the second layer one by one: if the watermark bits currently to be embedded are 00, the original value is kept; if they are 01, the LanguageIDOther attribute value is changed to wdBasque; if they are 10, it is changed to wdVenda; if they are 11, it is changed to wdEstonian.
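The bit-to-attribute mapping described above might be sketched as follows. This is a simulation in plain Python, not a call into Word: the enumeration names are kept as strings, `None` stands for "leave the attribute at its original value", and the function names `encode_layer1` and `encode_layer2` are hypothetical.

```python
from typing import Optional

# Two watermark bits -> LanguageIDOther value named in the text.
# "00" is absent on purpose: the original value is kept in that case.
LAYER2_VALUES = {"01": "wdBasque", "10": "wdVenda", "11": "wdEstonian"}


def encode_layer1(bits: str) -> list[bool]:
    """One bit per character: the NoProofing value each character should take."""
    return [b == "1" for b in bits]


def encode_layer2(bits: str) -> list[Optional[str]]:
    """Two bits per character: the LanguageIDOther value, or None to keep it."""
    if len(bits) % 2:
        raise ValueError("layer 2 consumes watermark bits in pairs")
    return [LAYER2_VALUES.get(bits[i:i + 2]) for i in range(0, len(bits), 2)]
```

In a real implementation the returned values would be written to the corresponding characters through the Word automation interface; that step is omitted here.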

Further, a method for extracting multiple digital watermarks from a document comprises the following steps:

Step 1a: detect whether watermark information is embedded in the document to be processed; if so, divide all characters into two layers according to the rule and proceed to step 2a; otherwise, end processing;

Step 2a: extract watermark information from the attribute bits of the first layer and from the attribute bits of the second layer, and obtain, by means of the separators, the actual number of groups of watermark information extracted from each layer;

Step 3a: from the total number of characters in the first layer and the bit length of the watermark information extracted from the first and second layers, obtain the predetermined number of groups of watermark information embedded in the first layer and in the second layer respectively;

Step 4a: when the extracted groups of watermark information are consistent with one another and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number, all watermark information is intact and the document has not been attacked; the original watermark information is output after querying the database; otherwise, error correction is performed.

Further, in step 4a, in addition to the above conditions, the number of groups of watermark information extracted from the attribute bits of the second layer must be twice the number extracted from the attribute bits of the first layer for all watermark information to be considered intact.

Further, the NoProofing and LanguageIDOther attribute values of every character in a document are preset by the system to default values. The character attributes of each character in the document from which watermark information is to be extracted are checked one by one; if there are characters whose NoProofing and LanguageIDOther attribute values differ from the defaults, the document is a document with embedded watermark information; otherwise, it is a document without embedded watermark information.

Further, the error correction specifically comprises the following steps:

Step 3a.1: for the groups of watermark information extracted by means of the separators, if the groups are not fully consistent with one another but at least one group matches a database record, return the extracted watermark information and report the damage to the document; otherwise, go to step 3a.2;

Step 3a.2: if none of the groups of watermark information matches any database record, report that the document is severely damaged and that extraction of the watermark information has failed.
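The decision logic of step 4a and the error correction steps can be sketched as below. Several details are assumptions made for illustration: the database is modelled as a dictionary from digest bit strings to original watermark text, the function name `check_extraction` is hypothetical, and a result whose groups agree but whose group counts are wrong is treated as "damaged", a case the text does not spell out.

```python
from typing import Optional


def check_extraction(
    groups: list[str],
    database: dict[str, str],
    expected_counts: tuple[int, int],
    actual_counts: tuple[int, int],
) -> tuple[str, Optional[str]]:
    """Classify an extraction result as 'intact', 'damaged' or 'failed'."""
    matched = [g for g in groups if g in database]
    consistent = len(set(groups)) == 1
    if consistent and matched and actual_counts == expected_counts:
        # Step 4a: all groups agree, match a record, and the counts are right.
        return "intact", database[groups[0]]
    if matched:
        # Step 3a.1: at least one group still matches a record.
        return "damaged", database[matched[0]]
    # Step 3a.2: no group matches any record.
    return "failed", None
```

The counts are passed as `(first layer, second layer)` pairs, so the requirement that the second layer holds twice as many groups as the first is covered by supplying expected counts such as `(K, 2 * K)`.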

Further, a device for embedding multiple digital watermarks in a document comprises an acquisition module, a generation module, a storage module, a first embedding module and a second embedding module;

the acquisition module is configured to obtain the original watermark information, the key and the document to be processed, as input by the user;

the generation module is configured to use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and to obtain the bit length of the new watermark information;

the storage module is configured to store the original watermark information and the new watermark information together in a database as one database record, used to look up the original watermark information when the watermark is extracted;

the first embedding module is configured to divide the characters of the document into two layers, to obtain, from the total number of characters in the first layer and the bit length of the new watermark information, the number of groups of new watermark information to be embedded in the first layer, and to embed the groups into the attribute bits of the first layer in front-to-back order, with the groups separated by separators;

the second embedding module is configured to embed groups of new watermark information into the attribute bits of the second layer in back-to-front order, with the groups separated by separators, the number of groups embedded in the second layer being twice the number embedded in the first layer.

Further, a device for extracting multiple digital watermarks from a document comprises a detection module, an extraction module, a computing module and a matching module;

the detection module is configured to detect whether watermark information is embedded in the document to be processed; if so, all characters are divided into two layers according to the rule and processing passes to the extraction module; otherwise, processing ends;

the extraction module is configured to extract watermark information from the attribute bits of the first layer and from the attribute bits of the second layer, and to obtain, by means of the separators, the actual number of groups of watermark information extracted from each layer;

the computing module is configured to obtain, from the total number of characters in the first layer and the bit length of the watermark information extracted from the first and second layers, the predetermined number of groups of watermark information embedded in the first layer and in the second layer respectively;

the matching module is configured such that, when the extracted groups of watermark information are consistent with one another and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number, all watermark information is intact and the document has not been attacked, and the original watermark information is output after querying the database; otherwise, error correction is performed.

Brief description of the drawings

Fig. 1 is a flow chart of the document multiple digital watermark embedding method of the present invention;

Fig. 2 is a flow chart of the document multiple digital watermark extraction method of the present invention;

Fig. 3 is a structural diagram of the document multiple digital watermark embedding device of the present invention;

Fig. 4 is a structural diagram of the document multiple digital watermark extraction device of the present invention.

In the drawings, the reference numerals denote the following parts:

1, acquisition module; 2, generation module; 3, storage module; 4, first embedding module; 5, second embedding module; 6, detection module; 7, extraction module; 8, computing module; 9, matching module.

Detailed description of the invention

The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.

Fig. 1 shows the flow chart of the document multiple digital watermark embedding method of the present invention; Fig. 2 shows the flow chart of the extraction method; Fig. 3 shows the structure of the embedding device; Fig. 4 shows the structure of the extraction device.

Embodiment 1

A method for embedding multiple digital watermarks in a document comprises the following steps:

Dividing the characters of the document into two layers specifically comprises the following steps:

The separator is the binary sequence of any rarely used non-visible character in the Unicode encoding.

Embedding the groups of new watermark information into the different attribute bits of the document specifically comprises the following steps:

A method for extracting multiple digital watermarks from a document comprises the following steps:

In step 4a, in addition to the extracted groups of watermark information being consistent with one another and all matching one database record, and the actual number of extracted groups in each layer equalling the predetermined number, the number of groups of watermark information extracted from the attribute bits of the second layer must be twice the number extracted from the attribute bits of the first layer for all watermark information to be considered intact.

In step 1a, watermark information is detected as follows:

The NoProofing and LanguageIDOther attribute values of every character in a document are preset by the system to default values. The character attributes of each character in the document from which watermark information is to be extracted are checked one by one; if there are characters whose NoProofing and LanguageIDOther attribute values differ from the defaults, the document is a document with embedded watermark information; otherwise, it is a document without embedded watermark information.

The error correction specifically comprises the following steps:

A device for embedding multiple digital watermarks in a document comprises an acquisition module 1, a generation module 2, a storage module 3, a first embedding module 4 and a second embedding module 5;

the acquisition module 1 is configured to obtain the original watermark information, the key and the document to be processed, as input by the user;

the generation module 2 is configured to use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and to obtain the bit length of the new watermark information;

the storage module 3 is configured to store the original watermark information and the new watermark information together in a database as one database record, used to look up the original watermark information when the watermark is extracted;

the first embedding module 4 is configured to divide the characters of the document into two layers, to obtain, from the total number of characters in the first layer and the bit length of the new watermark information, the number of groups of new watermark information to be embedded in the first layer, and to embed the groups into the attribute bits of the first layer in front-to-back order, with the groups separated by separators;

the second embedding module 5 is configured to embed groups of new watermark information into the attribute bits of the second layer in back-to-front order, with the groups separated by separators, the number of groups embedded in the second layer being twice the number embedded in the first layer.

A device for extracting multiple digital watermarks from a document comprises a detection module 6, an extraction module 7, a computing module 8 and a matching module 9;

the detection module 6 is configured to detect whether watermark information is embedded in the document to be processed; if so, all characters are divided into two layers according to the rule and processing passes to the extraction module 7; otherwise, processing ends;

the extraction module 7 is configured to extract watermark information from the attribute bits of the first layer and from the attribute bits of the second layer, and to obtain, by means of the separators, the actual number of groups of watermark information extracted from each layer;

the computing module 8 is configured to obtain, from the total number of characters in the first layer and the bit length of the watermark information extracted from the first and second layers, the predetermined number of groups of watermark information embedded in the first layer and in the second layer respectively;

the matching module 9 is configured such that, when the extracted groups of watermark information are consistent with one another and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number, all watermark information is intact and the document has not been attacked, and the original watermark information is output after querying the database; otherwise, error correction is performed.

In a specific implementation, the embedding method of the present invention comprises the following 6 steps:

1) input the original watermark information, the key and the Word document to be processed;

2) compute the digest of the original watermark with a message digest algorithm such as MD5 or SHA1, and use it as the watermark data from then on;

3) store the generated watermark information and the original watermark information in the database as one record, used to look up the original information at extraction time;

4) divide all characters of the Word document into two layers; for the different layers, the watermark information is embedded into different attribute bits;

5) if the total number of characters is N and the digest is M bits long, embed K = N/M groups of watermark, with the group count rounded down. A separator is needed between groups; for example, the non-visible Unicode character RLO, whose value is 0010000000101110, can be chosen as the separator between groups. For the first-layer characters, embed K groups of watermark in front-to-back order;

6) for the second-layer characters, as in step 5), embed 2*K groups of watermark in back-to-front order.
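Steps 2) and 5) above can be illustrated with a short Python sketch using the standard `hashlib` module. The UTF-8 encoding of the original watermark and the function names `digest_bits` and `group_count` are assumptions made for this example; the text does not specify how the input is encoded before hashing.

```python
import hashlib

# RLO (U+202E, right-to-left override) written as a 16-bit binary string.
RLO_SEPARATOR = format(0x202E, "016b")


def digest_bits(original: str, algorithm: str = "md5") -> str:
    """Hash the original watermark and return the digest as a bit string."""
    digest = hashlib.new(algorithm, original.encode("utf-8")).digest()
    return "".join(format(byte, "08b") for byte in digest)


def group_count(n_chars: int, m_bits: int) -> int:
    """K = N / M, rounded down."""
    return n_chars // m_bits
```

With MD5 the digest is 128 bits long and with SHA1 it is 160 bits long, so a count of 1000 characters gives `group_count(1000, 128)`, that is, 7 groups.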

Steps 4), 5) and 6) above are the core of the method.

In step 4), the text is layered as follows: obtain the Unicode code of the character used as the key, convert it into a binary sequence, and take the last two bits as the key. During embedding, obtain the Unicode code of each text character in turn, likewise convert it into a binary sequence, take the last two bits, and XOR them with the key. If

● the result is 00 or 10, the character is assigned to the first layer and its NoProofing attribute is modified;

● the result is 01 or 11, the character is assigned to the second layer and its LanguageIDOther attribute is modified.

In steps 5) and 6), the method uses Microsoft's official OLE interface technology to operate on character attributes. The basic principle of embedding the watermark is to use two attributes of individual characters in a Word document: NoProofing and LanguageIDOther. Their roles are as follows. For a Selection object (such as a single character), if the NoProofing attribute is True, the spelling and grammar checker ignores the specified text. The LanguageIDOther attribute of a character can be set to the enumeration value of a language with few users; Microsoft recommends using this attribute to set or return the language of Western-language text in documents created in a right-to-left language version of Microsoft Word. The LanguageIDOther attribute has 64 enumeration values. After study and screening, the method selects the enumeration values of three languages with few users (wdBasque, wdVenda, wdEstonian) as the modified values, so that each character can embed two watermark bits and the second layer can embed twice as much information as the first, which increases watermark capacity. Both character attributes can only be found, added and modified programmatically; ordinary operations in the Word program cannot remove this watermark feature, which gives the watermark strong concealment and resistance to attack. The watermark is embedded repeatedly to improve robustness: even after attacks such as deletion or modification, the original watermark information can be recovered as long as one group of watermark is intact.

The watermark extraction method is the inverse of the embedding method:

1) divide all characters of the Word document to be examined into two layers according to the rule;

2) read the data from the characters of each layer one by one, following the embedding method, to obtain n groups of watermark information;

3) when the n groups of watermark are consistent and match one database record, all watermark information is intact and the document has not been attacked; the original watermark information is output after querying the database. Otherwise, go to the error correction algorithm.
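Step 2) of the extraction might look like the following sketch, which turns attribute values back into bits and splits the bit stream into groups. It assumes that the separator bits are embedded in the same attribute stream as the watermark bits, which the text implies but does not state explicitly, and that the caller supplies the attribute values in embedding order (back to front for the second layer). All function names are hypothetical.

```python
# Inverse of the layer-2 mapping; any other value is read as "00".
LAYER2_BITS = {"wdBasque": "01", "wdVenda": "10", "wdEstonian": "11"}
SEPARATOR = "0010000000101110"  # RLO, as given in the embedding steps


def decode_layer1(no_proofing: list[bool]) -> str:
    """Read one bit per first-layer character from its NoProofing value."""
    return "".join("1" if value else "0" for value in no_proofing)


def decode_layer2(language_ids: list[str]) -> str:
    """Read two bits per second-layer character from its LanguageIDOther value."""
    return "".join(LAYER2_BITS.get(value, "00") for value in language_ids)


def split_groups(bits: str, group_len: int) -> list[str]:
    """Split a bit stream at separators and keep only full-length groups."""
    return [g for g in bits.split(SEPARATOR) if len(g) == group_len]
```

A digest that happens to contain the separator pattern would be split incorrectly by this simple approach; a production implementation would need to handle that case.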

The watermark error correction method is:

1) for the n groups of watermark extracted by means of the separators, if the n groups are not fully consistent but at least one group matches a database record, the document has been damaged by attacks such as insertion or deletion of characters; return the extracted watermark information and report the damage to the document. Otherwise, go to 2);

2) none of the n groups of watermark matches a database record, meaning that every group of watermark information has been damaged to some degree; report that the document is severely damaged and that the watermark information cannot be extracted.

The watermark detection method is:

In Word, the default values of NoProofing and LanguageIDOther for each character are FALSE and 1033 (wdEnglishUS) respectively. The attributes of the input characters are checked one by one; if there is a character for which these two attributes are not at their default values, the document is a document with an embedded watermark.
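The detection rule can be sketched as below. Each character is modelled as a `(NoProofing, LanguageIDOther)` pair; this representation and the function name `is_watermarked` are hypothetical, and the sketch interprets the rule as "either attribute differs from its default", since each layer modifies only one of the two attributes.

```python
from typing import Iterable

DEFAULT_NO_PROOFING = False
DEFAULT_LANGUAGE_ID_OTHER = 1033  # wdEnglishUS, the default stated in the text


def is_watermarked(char_attrs: Iterable[tuple[bool, int]]) -> bool:
    """Return True if any character has a non-default attribute value."""
    return any(
        no_proofing != DEFAULT_NO_PROOFING
        or language_id != DEFAULT_LANGUAGE_ID_OTHER
        for no_proofing, language_id in char_attrs
    )
```

A document whose default language is not US English would need a different default value here, which is one reason to treat the constant as an assumption.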

Beneficial effects

The character attributes in which the watermark information is embedded are invisible attributes, so the watermark is visually imperceptible after embedding and is well concealed.

Statistically, each layer contains on average 50% of the characters. If 100 characters are divided into two layers, each layer has on average 50 characters: the first layer carries 1 bit per character and the second layer 2 bits per character, so the watermark capacity is 150 bits per 100 characters, i.e. 150%. Experiments, with results shown in Table 1, confirm that the actual watermark capacity is close to 150%. This is a large improvement over other text watermarking algorithms, as shown in Table 2.
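The capacity figure follows from simple arithmetic, shown here as a small sketch. The function name `capacity_bits` and the `layer1_fraction` parameter are hypothetical; the even split between layers is the statistical average stated in the text, not a guarantee for any particular document.

```python
def capacity_bits(n_chars: int, layer1_fraction: float = 0.5) -> float:
    """Expected embeddable bits: 1 bit per layer-1 char, 2 bits per layer-2 char."""
    layer1 = n_chars * layer1_fraction
    layer2 = n_chars - layer1
    return layer1 * 1 + layer2 * 2
```

For 100 characters split evenly, `capacity_bits(100)` gives 150 bits, which matches the 150% figure.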

When the watermark is embedded, the original watermark information is hashed with a message digest algorithm, so that even if the embedded watermark information is obtained, the original watermark information cannot be recovered from it, which improves the security of the watermark. In addition, a key is used for layering: if the wrong key is supplied at extraction time, the characters are layered incorrectly and the extracted attributes are misaligned, so the watermark information cannot be obtained, which further improves security.

If the watermarked document suffers attacks such as insertion or deletion of characters, the extracted watermark information is judged by means of the separators; as shown in Fig. 2, underlined parts are separators and rectangular boxes are watermark information. If complete watermark information follows a separator, it is extracted, which ensures the robustness of the watermarking method to a certain extent.

Table 1: Watermark capacity statistics

Table 2: Capacity comparison of text watermarking algorithms

The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A method for embedding multiple digital watermarks in a document, characterized by comprising the following steps:

Step 1: obtain the original watermark information input by the user and the document to be processed;

Step 2: use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and obtain the bit length of the new watermark information;

Step 3: store the original watermark information and the new watermark information together in a database as one database record, used to look up the original watermark information when the watermark is extracted;

Step 4: divide the characters of the document into two layers; from the total number of characters in the first layer and the bit length of the new watermark information, obtain the number of groups of new watermark information to be embedded in the first layer; embed the groups of new watermark information into the attribute bits of the first layer in front-to-back order, with the groups separated by separators;

Step 5: embed groups of new watermark information into the attribute bits of the second layer in back-to-front order, with the groups separated by separators; the number of groups embedded in the second layer is twice the number of groups embedded in the first layer.

2. The method for embedding multiple digital watermarks in a document according to claim 1, characterized in that dividing the characters of the document into two layers specifically comprises the following steps:

obtaining the Unicode code of the character used as the key, converting it into a binary sequence, and taking the last two bits of the binary sequence as the key sequence;

obtaining the Unicode codes of all characters in the document and converting the Unicode code of each character into a binary sequence;

performing an XOR operation between the key sequence and the binary sequence of each character in the document; if the result is 00 or 10, the character is assigned to the first layer of the document; if the result is 01 or 11, it is assigned to the second layer.

3. The method for embedding multiple digital watermarks in a document according to claim 1, characterized in that the separator is the binary sequence of any rarely used non-visible character in the Unicode encoding.

4. The method for embedding multiple digital watermarks in a document according to claim 1, characterized in that embedding the groups of new watermark information into the different attribute bits of the document specifically comprises the following steps:

for the first layer, modifying the NoProofing attribute value of the characters in the first layer one by one: if the watermark bit currently to be embedded is 1, the NoProofing attribute value is changed to True; otherwise the original value False is kept;

for the second layer, modifying the LanguageIDOther attribute value of the characters in the second layer one by one: if the watermark bits currently to be embedded are 00, the original value is kept; if they are 01, the LanguageIDOther attribute value is changed to wdBasque; if they are 10, it is changed to wdVenda; if they are 11, it is changed to wdEstonian.

5. A document multiple digital watermarking extraction method, characterized by comprising the following steps:

Step 1a: detect whether watermark information is embedded in the document to be processed; if so, divide all characters into two layers according to the rule and proceed to step 2a; otherwise, end processing;

Step 2a: extract watermark information from the attribute bits of the first layer of the document and from the attribute bits of the second layer of the document, and use the separator to obtain the actual number of extracted groups of watermark information for each layer;

Step 3a: according to the total number of characters in the first layer of the document and the length of the watermark information bits extracted from the first layer and the second layer of the document, obtain the predetermined number of extraction groups of the watermark information embedded in the first layer and in the second layer respectively;

Step 4a: when the extracted groups of watermark information are consistent with one another and all match one database record, and the actual number of extracted groups of each layer equals the predetermined number of extraction groups, all watermark information is normal and the document has not been attacked, and the original watermark information is output after querying the database; otherwise, error correction is performed.
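The check in step 4a can be sketched as below. The sketch makes several assumptions that the claim does not fix: the extracted attribute bits of each layer are given as strings of "0" and "1", every group occupies a fixed-size slot consisting of the watermark bits followed by the separator, the predetermined group counts are supplied by the caller, and the database is a dict that maps a new watermark bit string to the original watermark information. All names are hypothetical.

```python
def split_groups(bits, wm_len, sep):
    """Walk fixed-size slots and collect the watermark part of each valid slot."""
    slot = wm_len + len(sep)
    groups = []
    for start in range(0, len(bits) - slot + 1, slot):
        chunk = bits[start:start + slot]
        if chunk.endswith(sep):
            groups.append(chunk[:wm_len])
    return groups


def check_document(l1_bits, l2_bits, wm_len, sep, expected1, expected2, database):
    """Return the original watermark if every check passes, else None."""
    g1 = split_groups(l1_bits, wm_len, sep)
    g2 = split_groups(l2_bits, wm_len, sep)
    groups = g1 + g2
    consistent = len(set(groups)) == 1          # all groups identical
    counts_ok = len(g1) == expected1 and len(g2) == expected2
    if consistent and counts_ok and groups[0] in database:
        return database[groups[0]]
    return None  # the caller proceeds to error correction
```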

6. The document multiple digital watermarking extraction method according to claim 5, characterized in that: in said step 4a, when the extracted groups of watermark information are consistent with one another and all match one database record, and the actual number of extracted groups of each layer equals the predetermined number of extraction groups, all watermark information is determined to be normal only if, in addition, the number of groups of watermark information extracted from the attribute bits of the second layer of the document is twice the number of groups extracted from the attribute bits of the first layer of the document.

7. The document multiple digital watermarking extraction method according to claim 5, characterized in that the method for detecting watermark information in said step 1a is:

The NoProofing attribute value and the LanguageIDOther attribute value of each character in a document are preset by the system to default values. Check the character attributes of each character in the document from which watermark information is to be extracted, one by one; if there is a character whose NoProofing attribute value and LanguageIDOther attribute value differ from the default values, the document is a document with embedded watermark information; otherwise, the document is a document without embedded watermark information.
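A minimal sketch of this detection follows, with characters again modeled as plain dicts rather than Word objects. The default values shown are placeholders, and the test is read here as "either attribute differs from its default", an interpretation chosen because each character belongs to only one layer and therefore has at most one of the two attributes modified.

```python
# Placeholder defaults; the real defaults come from the word processor.
DEFAULTS = {"NoProofing": False, "LanguageIDOther": "default"}


def has_watermark(chars, defaults=DEFAULTS):
    """Return True if any character has an attribute that differs from its default."""
    return any(
        ch[name] != value
        for ch in chars
        for name, value in defaults.items()
    )
```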

8. The document multiple digital watermarking extraction method according to claim 5, characterized in that: said error correction specifically comprises the following steps:

Step 3a.1: for the groups of watermark information extracted by means of the separator, if the groups are not entirely consistent and at least one group of watermark information matches a database record, return the extracted watermark information and report the damage status of the document; otherwise, go to step 3a.2;

Step 3a.2: if none of the groups of watermark information matches any database record, report that the document is seriously damaged and that watermark extraction has failed.
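The two error-correction steps can be sketched as follows. The return format, the dict-based database and the rule of preferring the most frequent matching group are assumptions for illustration; the claim only requires that a matching watermark be returned together with a damage report.

```python
from collections import Counter


def error_correct(groups, database):
    """Apply steps 3a.1 and 3a.2 to the extracted watermark groups."""
    matched = [g for g in groups if g in database]
    if matched:
        # Step 3a.1: at least one group matches a record.
        # Assumption: prefer the group that occurs most often.
        best, _count = Counter(matched).most_common(1)[0]
        return {
            "status": "damaged",
            "watermark": database[best],
            "damaged_groups": len(groups) - len(matched),
        }
    # Step 3a.2: nothing matches, so extraction fails.
    return {"status": "failed", "watermark": None, "damaged_groups": len(groups)}
```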

9. A document multiple digital watermarking embedding device, characterized by comprising an acquisition module (1), a generation module (2), a storage module (3), a first embedding module (4) and a second embedding module (5);

Said acquisition module (1) is configured to obtain the original watermark information input by the user and the document to be processed;

Said generation module (2) is configured to compute the digest of the original watermark information using a digest algorithm, generate the new watermark information, and obtain the bit length of the new watermark information from the new watermark information;

Said storage module (3) is configured to store the original watermark information and the new watermark information together in the database as one database record, which is used to look up the original watermark information when the watermark is extracted;

Said first embedding module (4) is configured to divide the characters in the document into two layers, obtain the number of groups of new watermark information to be embedded in the first layer of the document according to the total number of characters in the first layer and the bit length of the new watermark information, and embed the multiple groups of new watermark information into the attribute bits of the first layer of the document in front-to-back order, with the groups of new watermark information separated by the separator;

Said second embedding module (5) is configured to embed the multiple groups of new watermark information into the attribute bits of the second layer of the document in back-to-front order, with the groups of new watermark information separated by the separator, wherein the number of groups of new watermark information embedded in the second layer of the document is twice the number of groups embedded in the first layer.
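The work of the two embedding modules can be sketched as a plan that pairs character positions with the bits they should carry. The group-count formula (first-layer character count divided by the combined length of watermark and separator, rounded down) is an assumption, since the claim only names the inputs; the function name is hypothetical.

```python
def plan_embedding(layer1_pos, layer2_pos, watermark, sep):
    """Return (plan1, plan2): lists of (character position, bits) pairs."""
    # Assumed formula for the number of groups in the first layer.
    n = len(layer1_pos) // (len(watermark) + len(sep))
    stream1 = (watermark + sep) * n          # n groups for the first layer
    stream2 = (watermark + sep) * (2 * n)    # twice as many for the second layer
    # First layer: front to back, one bit per character.
    plan1 = list(zip(layer1_pos, stream1))
    # Second layer: back to front, two bits per character.
    pairs = [stream2[i:i + 2] for i in range(0, len(stream2) - 1, 2)]
    plan2 = list(zip(reversed(layer2_pos), pairs))
    return plan1, plan2
```

If the second layer has fewer characters than the doubled stream needs, `zip` silently truncates the plan; a real implementation would have to handle that case explicitly.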

10. A document multiple digital watermarking extraction device, characterized by comprising a detection module (6), an extraction module (7), a computing module (8) and a matching module (9);

Said detection module (6) is configured to detect whether watermark information is embedded in the document to be processed; if so, all characters are divided into two layers according to the rule and processing passes to the extraction module (7); otherwise, processing ends;

Said extraction module (7) is configured to extract watermark information from the attribute bits of the first layer of the document and from the attribute bits of the second layer of the document, and to use the separator to obtain the actual number of extracted groups of watermark information for each layer;

Said computing module (8) is configured to obtain, according to the total number of characters in the first layer of the document and the length of the watermark information bits extracted from the first layer and the second layer of the document, the predetermined number of extraction groups of the watermark information embedded in the first layer and in the second layer respectively;

Said matching module (9) is configured such that, when the extracted groups of watermark information are consistent with one another and all match one database record, and the actual number of extracted groups of each layer equals the predetermined number of extraction groups, all watermark information is normal and the document has not been attacked, and the original watermark information is output after querying the database; otherwise, error correction is performed.