CN103761459B - Method and device for embedding and extracting multiple digital watermarks in a document
- Publication number
- CN103761459B (application CN201410035906.7A)
- Authority
- CN
- China
- Prior art keywords
- document
- watermark information
- layer
- new
- character
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/16—Program or content traceability, e.g. by watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
Abstract
The present invention relates to a method and device for embedding and extracting multiple digital watermarks in a document. The embedding method comprises the following steps: obtain the original watermark information, the key, and the document to be processed, as input by the user; compute a digest of the original watermark information to generate new watermark information; store the original watermark information and the new watermark information together in a database as one record; divide the characters of the document into two layers and, from the total number of characters in the first layer and the bit length of the new watermark information, obtain the number of groups of new watermark information to embed in the first layer, then embed those groups into the attribute bits of the first layer in front-to-back order; embed multiple groups of new watermark information into the attribute bits of the second layer in back-to-front order. The invention is based on the character attributes of Word-format documents: it uses a key to improve security, repeated embedding to strengthen robustness, and multiple embedding to increase watermark capacity.
Description
Technical field
The present invention relates to the field of digital watermarking, and in particular to a method and device for embedding and extracting multiple digital watermarks in a document.
Background
In recent years, with the rapid development of multimedia and the Internet, protecting the copyright of digital works has become a popular research topic in academia. Digital watermarking, an important research direction within information hiding, is of significant value for the copyright protection of text, video, audio, and other multimedia. It embeds copyright information such as serial numbers, text, or logos into multimedia data for purposes such as copyright protection, covert communication, authenticity verification of data files, and product marking.
Existing practical text watermarking methods fall into two main classes: format-based text watermarking and natural-language-based text watermarking. Format-based text watermarking is the most common class to date. It has developed from the original line shifting, word shifting, and feature coding to later methods that change font size, color, and so on. Research on this class is the most active, but these methods suffer from weak security and low watermark capacity. Natural-language text watermarking was first proposed in 2002 by Mikhail J. Atallah, Victor Raskin, and others at Purdue University in the United States. It adds watermark information mainly by changing sentence structure, substituting synonyms, and similar methods. Natural-language watermarking changes the content of the text but not its meaning or format, so the watermark is hard to notice and not easy to destroy. For standards documents, however, whose requirements are strict, such methods may alter the semantics and are therefore unsuitable. In addition, computer processing of natural language is not yet mature, which has become the bottleneck of natural-language text watermarking.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method and device for embedding and extracting multiple digital watermarks in a document, based on the character attributes of Word-format documents, that uses a key to improve security, repeated embedding to strengthen robustness, and multiple embedding to increase watermark capacity.
The technical solution of the present invention is as follows: a method for embedding multiple digital watermarks in a document, comprising the following steps:
Step 1: obtain the original watermark information, the key, and the document to be processed, as input by the user;
Step 2: use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and obtain the bit length of the new watermark information;
Step 3: store the original watermark information and the new watermark information together in a database as one record, used to look up the original watermark information when the watermark is extracted;
Step 4: divide the characters of the document into two layers; from the total number of characters in the first layer and the bit length of the new watermark information, obtain the number of groups of new watermark information to embed in the first layer; embed the groups of new watermark information into the attribute bits of the first layer in front-to-back order, with a separator between groups;
Step 5: embed multiple groups of new watermark information into the attribute bits of the second layer in back-to-front order, with a separator between groups; the number of groups embedded in the second layer is twice the number of groups embedded in the first layer.
The beneficial effects of the invention are as follows: the invention is based on the character attributes of Word-format documents; it uses a key to improve security, repeated embedding to strengthen robustness, and multiple embedding to increase watermark capacity.
On the basis of the above technical solution, the present invention can be further improved as follows.
Further, dividing the characters of the document into two layers specifically comprises the following steps:
Obtain the Unicode code of the character used as the key, convert it to a binary sequence, and take the last two bits of that sequence as the key sequence;
Obtain the Unicode code of every character in the document and convert each one to a binary sequence;
XOR the last two bits of each character's binary sequence with the key sequence. If the result is 00 or 10, the character is assigned to the first layer of the document; if the result is 01 or 11, it is assigned to the second layer.
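The layering rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, and it assumes the key is a single character and that "Unicode code" means the code point returned by `ord`.

```python
def key_sequence(key_char):
    """Last two bits of the key character's Unicode code point."""
    return ord(key_char) & 0b11


def layer_of(ch, key_char):
    """Return 1 or 2: the layer a document character is assigned to."""
    result = (ord(ch) & 0b11) ^ key_sequence(key_char)
    # Results 00 and 10 go to the first layer; 01 and 11 go to the second.
    return 1 if result in (0b00, 0b10) else 2


def split_layers(text, key_char):
    """Return (first_layer_indices, second_layer_indices) for a text."""
    first = [i for i, ch in enumerate(text) if layer_of(ch, key_char) == 1]
    second = [i for i, ch in enumerate(text) if layer_of(ch, key_char) == 2]
    return first, second
```

Because results 00 and 10 share a low bit of 0, the rule effectively assigns a character by the lowest bit of the XOR result, so a different key can swap which characters land in which layer.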
Further, the separator is the binary sequence of any rarely used non-visible character in the Unicode encoding.
Further, embedding the groups of new watermark information into the different attribute bits of the document specifically comprises the following steps:
For the first layer, modify the NoProofing attribute value of each character in the layer: if the new watermark bit currently being embedded is 1, set NoProofing to True; otherwise, leave it at its original value False;
For the second layer, modify the LanguageIDOther attribute value of each character in the layer: if the new watermark bits currently being embedded are 00, leave the original value unchanged; if they are 01, set LanguageIDOther to wdBasque; if they are 10, set LanguageIDOther to wdVenda; if they are 11, set LanguageIDOther to wdEstonian.
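The bit-to-attribute mapping can be illustrated without touching Word at all. The following sketch only models the mapping: attribute values are represented by the enumeration names as strings, `UNCHANGED` is a hypothetical marker for "leave the original value", and all function names are assumptions made for illustration.

```python
UNCHANGED = None  # hypothetical marker: keep the attribute at its original value

# Second layer: two watermark bits per character, carried by LanguageIDOther.
LAYER2_VALUES = {
    "00": UNCHANGED,
    "01": "wdBasque",
    "10": "wdVenda",
    "11": "wdEstonian",
}
LAYER2_BITS = {v: k for k, v in LAYER2_VALUES.items() if v is not UNCHANGED}


def encode_layer1(bits):
    """One bit per character: NoProofing is True for 1, False for 0."""
    return [b == "1" for b in bits]


def decode_layer1(values):
    return "".join("1" if v else "0" for v in values)


def encode_layer2(bits):
    """Two bits per character, mapped to a LanguageIDOther value."""
    if len(bits) % 2:
        raise ValueError("second-layer payload must have an even bit count")
    return [LAYER2_VALUES[bits[i:i + 2]] for i in range(0, len(bits), 2)]


def decode_layer2(values):
    # Any value other than the three chosen languages reads back as 00.
    return "".join(LAYER2_BITS.get(v, "00") for v in values)
```

A real implementation would write these values to each character's NoProofing and LanguageIDOther properties through Word's automation interface; that part is deliberately left out here.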
Further, a method for extracting multiple digital watermarks from a document, comprising the following steps:
Step 1a: detect whether watermark information is embedded in the document to be processed; if so, divide all characters into two layers according to the rule and proceed to step 2a; otherwise, end processing;
Step 2a: extract watermark information from the attribute bits of the first layer and from the attribute bits of the second layer, and use the separator to obtain the actual number of groups extracted from each layer;
Step 3a: from the total number of characters in the first layer and the bit length of the watermark information extracted from the first and second layers, obtain the predetermined number of groups of watermark information embedded in the first layer and in the second layer;
Step 4a: if the extracted groups of watermark information are consistent and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number, then all watermark information is normal and the document has not been attacked; query the database and output the original watermark information. Otherwise, perform error correction.
Further, in step 4a, in addition to the extracted groups being consistent, all matching one database record, and each layer's actual group count equaling its predetermined group count, the number of groups extracted from the attribute bits of the second layer must also be twice the number of groups extracted from the attribute bits of the first layer for all watermark information to be considered normal.
Further, the NoProofing and LanguageIDOther attribute values of each character in a document are preset by the system to default values. The character attributes of each character in the document from which watermark information is to be extracted are checked one by one; if any character has a NoProofing or LanguageIDOther attribute value that differs from the default, the document contains embedded watermark information; otherwise, it does not.
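A minimal sketch of this detection check follows. It models each character as a `(no_proofing, language_id_other)` pair rather than reading a real document, and it uses the default values the description gives elsewhere (False, and 1033 for wdEnglishUS). Treating a deviation in either attribute as evidence of a watermark is an interpretation of the text, and the names are hypothetical.

```python
DEFAULT_NO_PROOFING = False
DEFAULT_LANGUAGE_ID_OTHER = 1033  # wdEnglishUS, the default stated in the description


def is_watermarked(chars):
    """chars: iterable of (no_proofing, language_id_other) pairs, one per character."""
    return any(
        no_proofing != DEFAULT_NO_PROOFING
        or language_id != DEFAULT_LANGUAGE_ID_OTHER
        for no_proofing, language_id in chars
    )
```

Note that a document whose author deliberately changed these attributes would also be flagged, so this check is only a pre-filter before extraction and database matching.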
Further, the error correction specifically comprises the following steps:
Step 3a.1: for the groups of watermark information extracted by separator, if the groups are not fully consistent but at least one group matches a database record, return the extracted watermark information and report the damage to the document; otherwise, go to step 3a.2;
Step 3a.2: if none of the groups of watermark information matches any database record, report that the document is severely damaged and that watermark extraction has failed.
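The decision logic of step 4a and the two error-correction steps can be sketched as a single function. This is an illustrative reading, not the patent's code: the database is modeled as a dictionary from digest bit strings to original watermark text, the status labels are invented, and returning the first matching group in the damaged case is an assumption.

```python
def classify_extraction(groups1, groups2, expected1, expected2, db):
    """Return (status, original_watermark_or_None).

    groups1/groups2: groups extracted from the first/second layer.
    expected1/expected2: predetermined group counts for each layer.
    db: dict mapping a digest bit string to the original watermark.
    """
    groups = list(groups1) + list(groups2)
    matched = [g for g in groups if g in db]
    consistent = len(groups) > 0 and len(set(groups)) == 1
    counts_ok = (
        len(groups1) == expected1
        and len(groups2) == expected2
        and len(groups2) == 2 * len(groups1)
    )
    if consistent and len(matched) == len(groups) and counts_ok:
        return "normal", db[groups[0]]      # step 4a: document not attacked
    if matched:
        return "damaged", db[matched[0]]    # step 3a.1: at least one group survives
    return "failed", None                   # step 3a.2: no group matches
```

One surviving group is enough to recover the original watermark, which is the point of embedding the same digest many times.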
Further, a device for embedding multiple digital watermarks in a document, comprising an acquisition module, a generation module, a storage module, a first embedding module, and a second embedding module;
The acquisition module is configured to obtain the original watermark information, the key, and the document to be processed, as input by the user;
The generation module is configured to use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and to obtain the bit length of the new watermark information;
The storage module is configured to store the original watermark information and the new watermark information together in a database as one record, used to look up the original watermark information when the watermark is extracted;
The first embedding module is configured to divide the characters of the document into two layers, to obtain the number of groups of new watermark information to embed in the first layer from the total number of characters in the first layer and the bit length of the new watermark information, and to embed the groups into the attribute bits of the first layer in front-to-back order, with a separator between groups;
The second embedding module is configured to embed multiple groups of new watermark information into the attribute bits of the second layer in back-to-front order, with a separator between groups; the number of groups embedded in the second layer is twice the number of groups embedded in the first layer.
Further, a device for extracting multiple digital watermarks from a document, comprising a detection module, an extraction module, a computing module, and a matching module;
The detection module is configured to detect whether watermark information is embedded in the document to be processed; if so, all characters are divided into two layers according to the rule and control passes to the extraction module; otherwise, processing ends;
The extraction module is configured to extract watermark information from the attribute bits of the first layer and from the attribute bits of the second layer, and to use the separator to obtain the actual number of groups extracted from each layer;
The computing module is configured to obtain, from the total number of characters in the first layer and the bit length of the watermark information extracted from the first and second layers, the predetermined number of groups of watermark information embedded in the first layer and in the second layer;
The matching module is configured so that, if the extracted groups of watermark information are consistent and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number, all watermark information is normal and the document has not been attacked, and the original watermark information is output after the database is queried; otherwise, error correction is performed.
Brief description of the drawings
Fig. 1 is a flow chart of the document multiple digital watermark embedding method of the present invention;
Fig. 2 is a flow chart of the document multiple digital watermark extraction method of the present invention;
Fig. 3 is a structural diagram of the document multiple digital watermark embedding device of the present invention;
Fig. 4 is a structural diagram of the document multiple digital watermark extraction device of the present invention.
In the drawings, the reference numerals denote the following parts:
1, acquisition module; 2, generation module; 3, storage module; 4, first embedding module; 5, second embedding module; 6, detection module; 7, extraction module; 8, computing module; 9, matching module.
Detailed description
The principles and features of the present invention are described below with reference to the drawings. The examples serve only to explain the invention and do not limit its scope.
Fig. 1 shows the flow chart of the document multiple digital watermark embedding method of the present invention; Fig. 2 shows the flow chart of the extraction method; Fig. 3 shows the structure of the embedding device; Fig. 4 shows the structure of the extraction device.
Embodiment 1
A method for embedding multiple digital watermarks in a document comprises the following steps:
Step 1: obtain the original watermark information, the key, and the document to be processed, as input by the user;
Step 2: use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and obtain the bit length of the new watermark information;
Step 3: store the original watermark information and the new watermark information together in a database as one record, used to look up the original watermark information when the watermark is extracted;
Step 4: divide the characters of the document into two layers; from the total number of characters in the first layer and the bit length of the new watermark information, obtain the number of groups of new watermark information to embed in the first layer; embed the groups of new watermark information into the attribute bits of the first layer in front-to-back order, with a separator between groups;
Step 5: embed multiple groups of new watermark information into the attribute bits of the second layer in back-to-front order, with a separator between groups; the number of groups embedded in the second layer is twice the number of groups embedded in the first layer.
Dividing the characters of the document into two layers specifically comprises the following steps:
Obtain the Unicode code of the character used as the key, convert it to a binary sequence, and take the last two bits of that sequence as the key sequence;
Obtain the Unicode code of every character in the document and convert each one to a binary sequence;
XOR the last two bits of each character's binary sequence with the key sequence. If the result is 00 or 10, the character is assigned to the first layer of the document; if the result is 01 or 11, it is assigned to the second layer.
The separator is the binary sequence of any rarely used non-visible character in the Unicode encoding.
Embedding the groups of new watermark information into the different attribute bits of the document specifically comprises the following steps:
For the first layer, modify the NoProofing attribute value of each character in the layer: if the new watermark bit currently being embedded is 1, set NoProofing to True; otherwise, leave it at its original value False;
For the second layer, modify the LanguageIDOther attribute value of each character in the layer: if the new watermark bits currently being embedded are 00, leave the original value unchanged; if they are 01, set LanguageIDOther to wdBasque; if they are 10, set LanguageIDOther to wdVenda; if they are 11, set LanguageIDOther to wdEstonian.
A method for extracting multiple digital watermarks from a document comprises the following steps:
Step 1a: detect whether watermark information is embedded in the document to be processed; if so, divide all characters into two layers according to the rule and proceed to step 2a; otherwise, end processing;
Step 2a: extract watermark information from the attribute bits of the first layer and from the attribute bits of the second layer, and use the separator to obtain the actual number of groups extracted from each layer;
Step 3a: from the total number of characters in the first layer and the bit length of the watermark information extracted from the first and second layers, obtain the predetermined number of groups of watermark information embedded in the first layer and in the second layer;
Step 4a: if the extracted groups of watermark information are consistent and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number, then all watermark information is normal and the document has not been attacked; query the database and output the original watermark information. Otherwise, perform error correction.
In step 4a, in addition to the extracted groups being consistent, all matching one database record, and each layer's actual group count equaling its predetermined group count, the number of groups extracted from the attribute bits of the second layer must also be twice the number of groups extracted from the attribute bits of the first layer for all watermark information to be considered normal.
In step 1a, watermark information is detected as follows:
The NoProofing and LanguageIDOther attribute values of each character in a document are preset by the system to default values. The character attributes of each character in the document from which watermark information is to be extracted are checked one by one; if any character has a NoProofing or LanguageIDOther attribute value that differs from the default, the document contains embedded watermark information; otherwise, it does not.
The error correction specifically comprises the following steps:
Step 3a.1: for the groups of watermark information extracted by separator, if the groups are not fully consistent but at least one group matches a database record, return the extracted watermark information and report the damage to the document; otherwise, go to step 3a.2;
Step 3a.2: if none of the groups of watermark information matches any database record, report that the document is severely damaged and that watermark extraction has failed.
A device for embedding multiple digital watermarks in a document comprises an acquisition module 1, a generation module 2, a storage module 3, a first embedding module 4, and a second embedding module 5;
The acquisition module 1 is configured to obtain the original watermark information, the key, and the document to be processed, as input by the user;
The generation module 2 is configured to use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and to obtain the bit length of the new watermark information;
The storage module 3 is configured to store the original watermark information and the new watermark information together in a database as one record, used to look up the original watermark information when the watermark is extracted;
The first embedding module 4 is configured to divide the characters of the document into two layers, to obtain the number of groups of new watermark information to embed in the first layer from the total number of characters in the first layer and the bit length of the new watermark information, and to embed the groups into the attribute bits of the first layer in front-to-back order, with a separator between groups;
The second embedding module 5 is configured to embed multiple groups of new watermark information into the attribute bits of the second layer in back-to-front order, with a separator between groups; the number of groups embedded in the second layer is twice the number of groups embedded in the first layer.
A device for extracting multiple digital watermarks from a document comprises a detection module 6, an extraction module 7, a computing module 8, and a matching module 9;
The detection module 6 is configured to detect whether watermark information is embedded in the document to be processed; if so, all characters are divided into two layers according to the rule and control passes to the extraction module 7; otherwise, processing ends;
The extraction module 7 is configured to extract watermark information from the attribute bits of the first layer and from the attribute bits of the second layer, and to use the separator to obtain the actual number of groups extracted from each layer;
The computing module 8 is configured to obtain, from the total number of characters in the first layer and the bit length of the watermark information extracted from the first and second layers, the predetermined number of groups of watermark information embedded in the first layer and in the second layer;
The matching module 9 is configured so that, if the extracted groups of watermark information are consistent and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number, all watermark information is normal and the document has not been attacked, and the original watermark information is output after the database is queried; otherwise, error correction is performed.
In a concrete implementation, the embedding method of the present invention comprises the following 6 steps:
1) Input the original watermark information, the key, and the Word document to be processed;
2) Compute the digest of the original watermark with a message digest algorithm such as MD5 or SHA1, and use this digest as the watermark data from then on;
3) Store the generated watermark information and the original watermark information in the database as one record, used to look up the original information during extraction;
4) Divide all characters of the Word document into two layers, and embed the watermark information into a different attribute for each layer;
5) If the total number of characters is N and the digest is M bits long, embed K = N/M groups of watermark, rounding the group count down. A separator is needed between groups; for example, the non-visible Unicode character RLO, whose value is 0010000000101110, can be chosen as the separator. For first-layer characters, embed K groups of watermark in front-to-back order;
6) For second-layer characters, proceed as in step 5) but embed 2*K groups of watermark in back-to-front order.
Steps 4), 5), and 6) are the core of the method.
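Steps 2) and 5) can be sketched with Python's standard `hashlib`. The helper names are hypothetical, UTF-8 encoding of the watermark text is an assumption, and the group count follows the text's K = N/M literally. The text does not say how separator bits are budgeted against capacity, so a real implementation would need to check that the payload fits.

```python
import hashlib

SEPARATOR = "0010000000101110"  # U+202E (RLO), the value given in the text


def digest_bits(original, algorithm="md5"):
    """Digest of the original watermark as a bit string (assumes UTF-8 text)."""
    digest = hashlib.new(algorithm, original.encode("utf-8")).digest()
    return "".join(f"{byte:08b}" for byte in digest)


def group_count(n_chars, m_bits):
    """K = N / M, rounded down, as stated in step 5)."""
    return n_chars // m_bits


def build_payload(bits, groups):
    """Repeat the digest `groups` times with a separator between groups."""
    return SEPARATOR.join([bits] * groups)


def embedding_order(layer_indices, back_to_front=False):
    """First layer is written front to back, second layer back to front."""
    return list(reversed(layer_indices)) if back_to_front else list(layer_indices)
```

For example, an MD5 digest is 128 bits, so a first layer of 1000 characters gives K = 7 groups, and the second layer would then carry 2*K = 14 groups at two bits per character.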
Step 4): the text is layered as follows. Obtain the Unicode code of the character used as the key, convert it to a binary sequence, and take the last two bits as the key. During embedding, obtain the Unicode code of each text character in turn, convert it to a binary sequence as well, take its last two bits, and XOR them with the key. If
● the result is 00 or 10, the character is assigned to the first layer, and its NoProofing attribute is modified;
● the result is 01 or 11, the character is assigned to the second layer, and its LanguageIDOther attribute is modified.
Steps 5) and 6): this method uses Microsoft's official OLE interface technology to operate on character attributes. The basic principle of embedding the watermark is to use two attributes of individual characters in a Word document: NoProofing and LanguageIDOther. The two attributes work as follows. For a Selection object (such as a single character), if the NoProofing attribute is True, the spelling and grammar checkers ignore the specified text. The LanguageIDOther attribute of a character can be set to the enumerated value of a language with few users; Microsoft recommends this attribute for setting or returning the language of Western-language text in documents created in a right-to-left language version of Microsoft Word. LanguageIDOther has 64 enumerated values. After research and screening, this method selects the enumerated values of three languages with few users (wdBasque, wdVenda, wdEstonian) as the modified values, so that each character can carry two watermark bits and the second layer can carry twice as much information as the first layer, which increases the watermark capacity. Both character attributes can only be found, added and modified programmatically; the watermark cannot be removed by ordinary operations in the Word program, so it is well concealed and resistant to attack. The watermark is embedded repeatedly to improve robustness: even if the document suffers attacks such as deletion or modification, the original watermark information can be recovered as long as one group of watermark is intact.
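As a sketch of how watermark bits map onto the two attributes, the following Python fragment uses plain strings for the enumeration names instead of calling Word through OLE. The function and table names are hypothetical. The 00 case (keep the default) follows the claims of this patent, and the default name wdEnglishUS follows the detection method described in the text.

```python
DEFAULT_LANGUAGE = "wdEnglishUS"  # default LanguageIDOther named in the text

# Two watermark bits -> LanguageIDOther name for one second-layer character.
BITS_TO_LANGUAGE = {
    "00": DEFAULT_LANGUAGE,  # value left unchanged
    "01": "wdBasque",
    "10": "wdVenda",
    "11": "wdEstonian",
}
LANGUAGE_TO_BITS = {name: bits for bits, name in BITS_TO_LANGUAGE.items()}


def encode_first_layer(bits: str) -> list[bool]:
    """One NoProofing flag per first-layer character: '1' -> True, '0' -> False."""
    return [bit == "1" for bit in bits]


def encode_second_layer(bits: str) -> list[str]:
    """One LanguageIDOther name per second-layer character, two bits each."""
    if len(bits) % 2:
        raise ValueError("second-layer payload needs an even number of bits")
    return [BITS_TO_LANGUAGE[bits[i:i + 2]] for i in range(0, len(bits), 2)]


def decode_second_layer(languages: list[str]) -> str:
    """Inverse of encode_second_layer."""
    return "".join(LANGUAGE_TO_BITS[name] for name in languages)
```

In a real implementation the resulting values would be written to the character attributes through the Word object model; that step is omitted here.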
The watermark extraction method is the inverse of the embedding method:
1) divide all characters of the Word document under examination into two layers according to the rule;
2) read the data from the characters of each layer one by one, following the embedding method, to obtain n groups of watermark information;
3) when the n groups of watermark are consistent and can be matched to a database record, all the watermark information is fully intact and the document has not been attacked; the original watermark information is output after querying the database. Otherwise, proceed to the error-correction algorithm.
The watermark error-correction method is:
1) take the n groups of watermark extracted by separator; if the n groups are not fully consistent but at least one group matches a database record, the document has been damaged by attacks such as inserting or deleting characters; return the extracted watermark information and report the damage to the document. Otherwise, go to 2);
2) none of the n groups of watermark matches a database record, which means every group of watermark information has been damaged to some degree; report that the document is severely damaged and that the watermark information cannot be extracted.
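The decision logic of extraction step 3) and the two error-correction steps can be sketched as follows. This is an illustration under assumptions: the database is modeled as a plain dictionary from embedded (digest) bit strings to original watermark text, and the function name and status labels are hypothetical.

```python
from typing import Optional


def classify(groups: list[str], database: dict[str, str]) -> tuple[str, Optional[str]]:
    """Return (status, original watermark or None) for the extracted groups."""
    matched = [group for group in groups if group in database]
    # All groups identical and found in the database: document not attacked.
    if matched and len(set(groups)) == 1:
        return "intact", database[groups[0]]
    # Groups disagree, but at least one still matches a record: damaged document.
    if matched:
        return "damaged", database[matched[0]]
    # No group matches any record: watermark cannot be recovered.
    return "severely damaged", None
```

With `database = {"1010": "Alice"}`, the groups `["1010", "1010"]` are classified as intact, `["1010", "1110"]` as damaged (the watermark is still recovered), and `["0000", "1110"]` as severely damaged.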
The watermark detection method is:
In Word, the default values of NoProofing and LanguageIDOther for each character are FALSE and 1033 (wdEnglishUS) respectively. The attributes of the characters in the input document are checked one by one; if there is a character whose value for these two attributes differs from the default, the document is one with an embedded watermark.
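A minimal sketch of this detection check is given below. The `CharAttributes` class is a hypothetical stand-in for the per-character attributes that would be read from Word, and the sketch assumes that a character counts as modified when either attribute is non-default, which is one reading of the text.

```python
from dataclasses import dataclass
from typing import Iterable

DEFAULT_NO_PROOFING = False
DEFAULT_LANGUAGE_ID_OTHER = 1033  # wdEnglishUS, as stated in the text


@dataclass
class CharAttributes:
    """Hypothetical container for the two attributes of one character."""
    no_proofing: bool = DEFAULT_NO_PROOFING
    language_id_other: int = DEFAULT_LANGUAGE_ID_OTHER


def has_watermark(chars: Iterable[CharAttributes]) -> bool:
    """Return True if any character carries a non-default attribute value."""
    return any(
        c.no_proofing != DEFAULT_NO_PROOFING
        or c.language_id_other != DEFAULT_LANGUAGE_ID_OTHER
        for c in chars
    )
```

Note that a document whose author deliberately changed the proofing language would also trigger this check, so the result is best treated as a first filter before extraction.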
Beneficial effects
The character attributes used to carry the watermark information are invisible attributes, so the document is visually unchanged after the watermark is embedded, and the watermark is well concealed.
Statistically, each layer holds on average 50% of the characters: if 100 characters are divided into two layers, each layer holds 50 characters on average, and the watermark capacity is then 150%. Experiments, with results shown in Table 1, confirm that the actual watermark capacity is close to 150%. This is a large improvement over other text watermarking algorithms, as shown in Table 2.
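The 150% figure follows from one bit per first-layer character and two bits per second-layer character. The short sketch below reproduces the arithmetic for arbitrary layer sizes; the function name is hypothetical, and the calculation counts raw attribute bits without subtracting the bits spent on separators.

```python
def capacity_percent(first_layer_chars: int, second_layer_chars: int) -> float:
    """Embedded bits per 100 characters: 1 bit per first-layer character,
    2 bits per second-layer character."""
    total = first_layer_chars + second_layer_chars
    if total == 0:
        raise ValueError("document has no characters")
    bits = first_layer_chars * 1 + second_layer_chars * 2
    return 100.0 * bits / total
```

An even 50/50 split gives 150%; a document whose characters mostly fall into the second layer approaches 200%, and one dominated by the first layer approaches 100%.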
When the watermark is embedded, the original watermark information is transformed with a message digest algorithm, so even if the embedded watermark information is obtained, the original watermark information cannot be derived from it, which improves the security of the watermark. In addition, a key is used for layering: if a wrong key is supplied at extraction time, the characters are layered incorrectly, the extracted attributes are misaligned, and no watermark information is obtained, which further improves the security of the watermark.
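The digest step can be sketched with Python's standard `hashlib` module. The patent text here does not name a specific digest algorithm, so SHA-256 is an assumption made only for illustration, and the function names and the dictionary used as a database are hypothetical.

```python
import hashlib


def digest_bits(original_watermark: str) -> str:
    """Return the digest of the original watermark as a string of '0'/'1' bits.

    SHA-256 is an assumed choice; any message digest algorithm fits the description.
    """
    digest = hashlib.sha256(original_watermark.encode("utf-8")).digest()
    return "".join(f"{byte:08b}" for byte in digest)


def make_record(original_watermark: str) -> dict[str, str]:
    """Build one database record mapping the embedded bits back to the original text."""
    return {digest_bits(original_watermark): original_watermark}
```

Only the digest bits are embedded in the document, so the original text can be recovered solely by looking the extracted bits up in the database.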
If the watermarked document suffers attacks such as insertion or deletion of characters, the extracted watermark information is examined by separator, as shown in Figure 2, where underlining marks the separators and rectangular boxes mark the watermark information. If complete watermark information follows a separator, it is extracted, which ensures the robustness of the watermarking method to a certain extent.
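Recovering the complete groups from a damaged bit stream can be sketched as below. The separator value is the 16-bit sequence given earlier in the description. The function name is hypothetical, and the sketch assumes that the group length is known and that the separator pattern does not occur by chance inside a watermark group.

```python
SEPARATOR = "0010000000101110"  # separator bit sequence given in the description


def complete_groups(stream: str, group_length: int) -> list[str]:
    """Split the extracted bit stream at separators and keep only the groups
    that still have the full expected length."""
    return [part for part in stream.split(SEPARATOR) if len(part) == group_length]
```

For example, a stream holding two intact 8-bit groups followed by a truncated third group yields the two intact groups, which can then be matched against the database.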
Table 1: Watermark capacity statistics
Table 2: Capacity comparison of text watermarking algorithms
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
1. A document multiple digital watermark embedding method, characterized by comprising the following steps:
Step 1: obtain the original watermark information input by the user and the document to be processed;
Step 2: use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and obtain the bit length of the new watermark information from it;
Step 3: store the original watermark information and the new watermark information together in a database as one database record, to be used for looking up the original watermark information when the watermark is extracted;
Step 4: divide the characters in the document into two layers; from the total number of characters in the first layer of the document and the bit length of the new watermark information, obtain the number of groups of new watermark information to be embedded in the first layer; embed the multiple groups of new watermark information, in front-to-back order, into the attribute bits of the first layer, with the groups separated from one another by a separator;
Step 5: embed multiple groups of new watermark information, in back-to-front order, into the attribute bits of the second layer of the document, with the groups separated from one another by a separator; the number of groups of new watermark information embedded in the second layer is twice the number embedded in the first layer.
2. The document multiple digital watermark embedding method according to claim 1, characterized in that the method of dividing the characters in the document into two layers specifically comprises the following steps:
obtain the Unicode code of the character used as the key, convert it into a binary sequence, and take the last two bits of the binary sequence as the key sequence;
obtain the Unicode codes of all characters in the document, and convert the Unicode code of each character into a binary sequence;
XOR the key sequence with the binary sequence of each character in the document; if the result is 00 or 10, the character is assigned to the first layer of the document; if the result is 01 or 11, it is assigned to the second layer of the document.
3. The document multiple digital watermark embedding method according to claim 1, characterized in that the separator is the binary sequence of any rarely used non-visible character in the Unicode code set.
4. The document multiple digital watermark embedding method according to claim 1, characterized in that embedding the multiple groups of new watermark information into the different attribute bits of the document specifically comprises the following steps:
for the first layer, modify the NoProofing attribute value of the characters in the first layer in turn: if the new watermark information bit currently to be embedded is 1, set the NoProofing attribute value to True; otherwise, keep the original value False unchanged;
for the second layer, modify the LanguageIDOther attribute value of the characters in the second layer in turn: if the new watermark information bits currently to be embedded are 00, keep the original value unchanged; if they are 01, set the LanguageIDOther attribute value to wdBasque; if they are 10, set it to wdVenda; if they are 11, set it to wdEstonian.
5. A document multiple digital watermark extraction method, characterized by comprising the following steps:
Step 1a: detect whether watermark information is embedded in the document to be processed; if so, divide all characters into two layers according to the rule and proceed to step 2a; otherwise, end processing;
Step 2a: extract watermark information from the attribute bits of the first layer of the document and from the attribute bits of the second layer of the document, and use the separator to obtain the actual number of extracted groups of watermark information for each layer;
Step 3a: from the total number of characters in the first layer of the document and the bit length of the watermark information extracted from the first and second layers, obtain the predetermined number of extraction groups of watermark information embedded in the first layer and in the second layer respectively;
Step 4a: when the extracted groups of watermark information are consistent and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number of extraction groups, all the watermark information is intact and the document has not been attacked; the original watermark information is output after querying the database; otherwise, error correction is performed.
6. The document multiple digital watermark extraction method according to claim 5, characterized in that, in step 4a, when the extracted groups of watermark information are consistent and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number of extraction groups, the method further requires that the number of groups of watermark information extracted from the attribute bits of the second layer of the document be twice the number extracted from the attribute bits of the first layer for all the watermark information to be considered intact.
7. The document multiple digital watermark extraction method according to claim 5, characterized in that the watermark information detection method in step 1a is:
the NoProofing attribute value and the LanguageIDOther attribute value of each character in a document are preset by the system to default values; the character attributes of each character in the document from which watermark information is to be extracted are checked one by one; if there is a character whose NoProofing attribute value and LanguageIDOther attribute value differ from the default values, the document is one with embedded watermark information; otherwise, the document is one without embedded watermark information.
8. The document multiple digital watermark extraction method according to claim 5, characterized in that the error correction specifically comprises the following steps:
Step 3a.1: take the groups of watermark information extracted by separator; if the groups are not fully consistent and at least one group matches a database record, return the extracted watermark information and report the damage to the document; otherwise, go to step 3a.2;
Step 3a.2: if none of the groups of watermark information matches any database record, report that the document is severely damaged and that extraction of the watermark information has failed.
9. A document multiple digital watermark embedding device, characterized by comprising an acquisition module (1), a generation module (2), a storage module (3), a first embedding module (4) and a second embedding module (5);
the acquisition module (1) is configured to obtain the original watermark information input by the user and the document to be processed;
the generation module (2) is configured to use a digest algorithm to compute the digest of the original watermark information, generating new watermark information, and to obtain the bit length of the new watermark information from it;
the storage module (3) is configured to store the original watermark information and the new watermark information together in a database as one database record, to be used for looking up the original watermark information when the watermark is extracted;
the first embedding module (4) is configured to divide the characters in the document into two layers, to obtain the number of groups of new watermark information to be embedded in the first layer from the total number of characters in the first layer of the document and the bit length of the new watermark information, and to embed the multiple groups of new watermark information, in front-to-back order, into the attribute bits of the first layer, with the groups separated from one another by a separator;
the second embedding module (5) is configured to embed multiple groups of new watermark information, in back-to-front order, into the attribute bits of the second layer of the document, with the groups separated from one another by a separator; the number of groups of new watermark information embedded in the second layer is twice the number embedded in the first layer.
10. A document multiple digital watermark extraction device, characterized by comprising a detection module (6), an extraction module (7), a computing module (8) and a matching module (9);
the detection module (6) is configured to detect whether watermark information is embedded in the document to be processed; if so, all characters are divided into two layers according to the rule and processing passes to the extraction module (7); otherwise, processing ends;
the extraction module (7) is configured to extract watermark information from the attribute bits of the first layer of the document and from the attribute bits of the second layer of the document, and to use the separator to obtain the actual number of extracted groups of watermark information for each layer;
the computing module (8) is configured to obtain, from the total number of characters in the first layer of the document and the bit length of the watermark information extracted from the first and second layers, the predetermined number of extraction groups of watermark information embedded in the first layer and in the second layer respectively;
the matching module (9) is configured so that, when the extracted groups of watermark information are consistent and all match one database record, and the actual number of extracted groups in each layer equals the predetermined number of extraction groups, all the watermark information is intact and the document has not been attacked, and the original watermark information is output after querying the database; otherwise, error correction is performed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410035906.7A CN103761459B (en) | 2014-01-24 | 2014-01-24 | A kind of document multiple digital watermarking embedding, extracting method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103761459A CN103761459A (en) | 2014-04-30 |
CN103761459B (en) | 2016-08-17 |
Family
ID=50528695
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410035906.7A Active CN103761459B (en) | 2014-01-24 | 2014-01-24 | A kind of document multiple digital watermarking embedding, extracting method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103761459B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |