CN110457873A - A kind of watermark embedding and detection method and device - Google Patents

Info

Publication number
CN110457873A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810432660.5A
Other languages
Chinese (zh)
Other versions
CN110457873B (en)
Inventor
董军
李莉
段云峰
王宝晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongchang (suzhou) Software Technology Co Ltd
China Mobile Communications Group Co Ltd
Original Assignee
Zhongchang (suzhou) Software Technology Co Ltd
China Mobile Communications Group Co Ltd
Application filed by Zhongchang (suzhou) Software Technology Co Ltd and China Mobile Communications Group Co Ltd
Priority to CN201810432660.5A
Publication of CN110457873A
Application granted
Publication of CN110457873B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/16 Program or content traceability, e.g. by watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0021 Image watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2201/00 General purpose image data processing
    • G06T2201/005 Image watermarking
    • G06T2201/0062 Embedding of the watermark in text images, e.g. watermarking text documents using letter skew, letter distance or row distance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2201/00 General purpose image data processing
    • G06T2201/005 Image watermarking
    • G06T2201/0065 Extraction of an embedded watermark; Reliable detection

Abstract

This application relates to the field of copyright protection technology, and in particular to a watermark embedding and detection method and device, intended to solve the problem that current watermark embedding methods increase the capacity overhead of the watermark. The watermark embedding method provided by the embodiments of this application includes: obtaining a copyrighted file and copyright information provided by the copyright owner, where the copyrighted file is a text file whose rows of content have no interdependence; generating, based on the copyright information and the current timestamp, the watermark to be embedded in the copyrighted file; embedding at least one copy of the watermark into the copyrighted file, where what is embedded in any row of the copyrighted file is a sub-watermark, which is obtained by splitting the watermark and is determined according to the text content of the row, a hash algorithm, and the number of sub-watermarks; and determining, using a hash algorithm, the character corresponding to the text content of each row in the copyrighted file, and hiding the watermark information in the copyrighted file according to the characters corresponding to the rows' text content, the character string corresponding to the watermark, and a preset watermark hiding rule.

Description

A watermark embedding and detection method and device
Technical field
This application relates to the field of copyright protection technology, and in particular to a watermark embedding and detection method and device.
Background
With the progress of society, people's awareness of copyright has gradually increased. To protect the legitimate rights and interests of copyright owners, digital watermarking emerged. Digital watermarking is a technique that embeds information such as the identity of the lawful copyright owner and the time of lawful ownership into a copyrighted file, so as to enable anti-counterfeiting, source tracing, and copyright protection.
In the prior art, the watermark to be embedded in a copyrighted file is first split into multiple sub-watermarks, and these sub-watermarks are then embedded in the rows of the copyrighted file in the order in which they were split, so that a complete watermark in the correct order can be detected later. As a result, each row in which a sub-watermark is embedded must hold not only the sub-watermark but also the line number of the next row in which a sub-watermark is embedded. The space occupied by a watermark is generally small, and recording line numbers takes additional space, so this approach increases the capacity overhead of the watermark and is unsuitable for copyrighted files with limited redundant space.
Moreover, if an attacker obtains some of the sub-watermarks, the complete watermark can easily be recovered by following the line numbers, so the concealment of the watermark is also poor.
Summary of the invention
The embodiments of this application provide a watermark embedding and detection method and device, to solve the problems that watermark embedding methods in the prior art increase the capacity overhead of the watermark, are unsuitable for copyrighted files with limited redundant space, and conceal the watermark poorly.
In a first aspect, an embodiment of this application provides a watermark embedding method, comprising:
A copyrighted file and copyright information provided by the copyright owner are obtained, where the copyrighted file is a text file whose rows of content have no interdependence. The watermark to be embedded in the copyrighted file is then generated based on the copyright information and the current timestamp information, and at least one copy of the watermark is embedded into the copyrighted file. What is embedded in any row of the copyrighted file is a sub-watermark, which is obtained by splitting the watermark and is determined according to the text content of the row, a hash algorithm, and the number of sub-watermarks. In addition, a hash algorithm can be used to determine the character corresponding to the text content of each row in the copyrighted file, and the watermark information is hidden in the copyrighted file according to the characters corresponding to the rows' text content, the character string corresponding to the watermark, and a preset watermark hiding rule. In this way, during later watermark detection, the correctness of the watermark extracted from the copyrighted file can be verified against the hidden watermark information.
With the above scheme, the watermark is first split into multiple sub-watermarks, and one sub-watermark is then embedded in any given row, chosen according to the text content of the row, a hash algorithm, and the number of sub-watermarks. The sub-watermarks therefore need not be embedded in order, and no space for line numbers has to be reserved in the rows, which saves watermark capacity overhead. In addition, because the sub-watermarks embedded in different rows are not required to follow one another in sequence, an attacker who obtains part of the watermark cannot easily obtain the complete watermark information, so the watermark is also better concealed.
In a possible implementation, the copyright information and the timestamp information can be combined, and the combined copyright information and timestamp information are then encrypted and encoded to obtain the watermark to be embedded in the copyrighted file.
In this way, the watermark embedded in the copyrighted file has been encrypted, which enhances the security of the watermark and improves its resistance to attack.
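The combine, encrypt, and encode steps can be sketched as follows. This is a minimal illustration, not the method fixed by the application: the separator, the XOR keystream derived from SHA-256 (a stand-in for a real cipher, not secure), the bit-string encoding, and the names generate_watermark and parse_watermark are all assumptions made for the example.

```python
import hashlib

SEPARATOR = "|"  # hypothetical delimiter between copyright info and timestamp


def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` bytes from `key` by hashing key + counter (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def generate_watermark(copyright_info: str, timestamp: int, key: bytes) -> str:
    """Combine, 'encrypt' with a toy XOR stream, and encode as a string of 0s and 1s."""
    combined = f"{copyright_info}{SEPARATOR}{timestamp}".encode("utf-8")
    cipher = bytes(b ^ k for b, k in zip(combined, _keystream(key, len(combined))))
    return "".join(f"{byte:08b}" for byte in cipher)


def parse_watermark(watermark: str, key: bytes) -> tuple[str, int]:
    """Inverse of generate_watermark: decode, decrypt, and split."""
    cipher = bytes(int(watermark[i:i + 8], 2) for i in range(0, len(watermark), 8))
    combined = bytes(b ^ k for b, k in zip(cipher, _keystream(key, len(cipher))))
    info, _, ts = combined.decode("utf-8").rpartition(SEPARATOR)
    return info, int(ts)
```

A real deployment would replace the XOR stream with an established cipher; the sketch only shows that the watermark is a reversible encoding of the combined copyright information and timestamp.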
In a possible implementation, the watermark can be split according to the watermark length embedded per row, and the resulting sub-watermarks are numbered. Then, for each row of the copyrighted file in which a sub-watermark is to be embedded, the number of the sub-watermark to be embedded in that row is determined using the text content of the row, a hash algorithm, and the number of sub-watermarks, and the sub-watermark with that number is embedded in the row.
Because the sub-watermark for each row is selected from the row's own text content, a hash algorithm, and the number of sub-watermarks, the sub-watermarks need not be embedded in the order of their numbers and no space for line numbers has to be reserved in the rows, which saves watermark capacity overhead.
In a possible implementation, for each row of the copyrighted file in which a sub-watermark is to be embedded, the number k of the sub-watermark to be embedded in that row can be determined according to the following formula:
$$k = C_{\mathrm{HASH}} \bmod K$$
where $C_{\mathrm{HASH}}$ is the hash value of the first S items of text content in the row, S being an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
In this way, the sub-watermarks can be embedded evenly throughout the copyrighted file.
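The selection of a sub-watermark number from the row content can be sketched as below. The choice of SHA-256 as the hash algorithm, the reading of "the first S items" as the first S characters, and the function name are assumptions made for illustration.

```python
import hashlib


def sub_watermark_number(row_text: str, s: int, total: int) -> int:
    """Return k = C_HASH % K, where C_HASH hashes the first `s` characters of the row."""
    prefix = row_text[:s].encode("utf-8")
    # Interpret the digest as one large integer so the modulo is well defined.
    c_hash = int.from_bytes(hashlib.sha256(prefix).digest(), "big")
    return c_hash % total
```

Under these assumptions, two rows that share the same first S characters always map to the same sub-watermark number, and text placed after the hashed prefix does not change the result.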
In a possible implementation, the characters contained in each sub-watermark are visible characters. When the sub-watermark with a given number is embedded in the corresponding row, each visible character in the sub-watermark can be converted into an invisible character according to a preset conversion rule between visible and invisible characters, yielding the invisible character string corresponding to the sub-watermark, and the invisible character string is then embedded at a designated position in the row.
In this way, the watermarks embedded in the copyrighted file consist entirely of invisible characters, which does not change the readability of the copyrighted file and has little effect on it.
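One way to picture the conversion is with zero-width Unicode characters. The mapping below (0 to U+200B, 1 to U+200C) and the choice of the end of the row as the designated position are assumptions for this sketch; the application only requires some preset conversion rule and some designated position.

```python
# Hypothetical conversion rule between visible and invisible characters.
TO_INVISIBLE = {
    "0": "\u200b",  # ZERO WIDTH SPACE
    "1": "\u200c",  # ZERO WIDTH NON-JOINER
}


def to_invisible(sub_watermark: str) -> str:
    """Convert each visible character of the sub-watermark to its invisible form."""
    return "".join(TO_INVISIBLE[ch] for ch in sub_watermark)


def embed_in_row(row_text: str, sub_watermark: str) -> str:
    """Append the invisible string to the row (assumed designated position: row end)."""
    return row_text + to_invisible(sub_watermark)
```

For example, embed_in_row("a,b", "1110") returns the original text followed by four zero-width characters, so the row looks unchanged when displayed.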
In a possible implementation, for each row of the copyrighted file, a hash algorithm can be used to determine the hash value of the row's text content, the hash value is XORed with a preset value, and the result of the XOR operation is taken as the character corresponding to the row's text content.
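A minimal sketch of deriving one character per row follows. The application does not state the hash algorithm, the preset value, or how the XOR result is reduced to a single character; SHA-256, the value 0x5A, keeping only the last digest byte, and reducing to one bit are all assumptions made here.

```python
import hashlib

PRESET = 0x5A  # hypothetical preset value


def row_character(row_text: str, preset: int = PRESET) -> str:
    """Return '0' or '1' for a row: hash the row text, XOR with the preset value."""
    digest = hashlib.sha256(row_text.encode("utf-8")).digest()
    l_hash = digest[-1]                # assumption: keep only the last hash byte
    return str((l_hash ^ preset) & 1)  # assumption: reduce the XOR result to one bit
```

With a one-bit alphabet, the characters of consecutive rows form a bit string that can be compared with a binary watermark string.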
In a possible implementation, rows of the copyrighted file can be exchanged according to the characters corresponding to the rows' text content, the character string corresponding to the watermark, and the preset watermark hiding rule. The watermark hiding rule is: after the exchange, the character string formed by the characters corresponding to several consecutive rows of the copyrighted file is the character string corresponding to the watermark; the character string formed by the characters of the m consecutive rows before the start row is a preset character string identifying the watermark start position; and the character string formed by the characters of the n consecutive rows after the end row is a preset character string identifying the watermark end position. Here the start row is the first of the several consecutive rows, the end row is the last of them, and m and n are integers greater than zero.
With the above scheme, each row of the copyrighted file is assigned a character, and the rows are then exchanged according to those characters and the character string corresponding to the watermark, so that the watermark that should actually have been embedded in the copyrighted file is hidden in the file itself. This makes it easy to verify, during later watermark detection, the correctness of the watermark extracted from the copyrighted file. Moreover, hiding the watermark adds no information to the copyrighted file, so the demand on the file's redundant space is also small.
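The hiding rule can be illustrated with a simplified reordering. The sketch below does not perform pairwise row exchanges; it assumes it is acceptable to move the rows that spell the start marker, the watermark, and the end marker to the front of the file, and it takes the row-to-character function as a parameter. The function name and this placement are assumptions, not details from the application.

```python
from collections import defaultdict, deque
from typing import Callable, List


def hide_by_row_exchange(
    rows: List[str],
    row_char: Callable[[str], str],
    watermark: str,
    start_mark: str,
    end_mark: str,
) -> List[str]:
    """Reorder rows so their characters spell start_mark + watermark + end_mark."""
    target = start_mark + watermark + end_mark
    pools = defaultdict(deque)  # row character -> rows still available
    for row in rows:
        pools[row_char(row)].append(row)

    block = []
    for ch in target:
        if not pools[ch]:
            raise ValueError(f"no unused row whose character is {ch!r}")
        block.append(pools[ch].popleft())

    # Rows not needed for the hidden block follow it, grouped by character.
    leftovers = [row for pool in pools.values() for row in pool]
    return block + leftovers
```

The file keeps exactly the same rows, so nothing is added to it; hiding fails only when the file has too few rows carrying a required character.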
In a second aspect, an embodiment of this application provides a watermark embedding device, comprising:
an obtaining module, configured to obtain a copyrighted file and copyright information provided by the copyright owner, where the copyrighted file is a text file whose rows of content have no interdependence;
a generation module, configured to generate, based on the copyright information and the current timestamp information, the watermark to be embedded in the copyrighted file;
an embedding module, configured to embed at least one copy of the watermark into the copyrighted file, where what is embedded in any row of the copyrighted file is a sub-watermark, which is obtained by splitting the watermark and is determined according to the text content of the row, a hash algorithm, and the number of sub-watermarks;
a hiding module, configured to determine, using a hash algorithm, the character corresponding to the text content of each row in the copyrighted file, and to hide the watermark information in the copyrighted file according to the characters corresponding to the rows' text content, the character string corresponding to the watermark, and a preset watermark hiding rule.
For the technical effects of any design in the second aspect of this application, refer to the technical effects of the corresponding implementations in the first aspect; details are not repeated here.
In a third aspect, an embodiment of this application provides a watermark detection method, comprising:
A copyrighted file is obtained, where the copyrighted file is a text file whose rows of content have no interdependence. A hash algorithm is then used to determine the character corresponding to the text content of each row in the copyrighted file, the watermark hidden in the copyrighted file is determined according to those characters and a preset watermark hiding rule, and the watermark embedded in the copyrighted file is extracted. What is extracted from any row of the copyrighted file is a sub-watermark, which is obtained by splitting the watermark and is determined according to the text content of the row, a hash algorithm, and the number of sub-watermarks. If the extracted watermark is determined to be identical to the hidden watermark, either watermark can be parsed to obtain the copyright information of the copyright owner and the timestamp information of when the copyright owner came to possess the copyrighted file.
With the above scheme, both the watermark that should have been embedded in the copyrighted file and the watermark actually embedded in it can be obtained. Only when the two watermarks are identical is the watermark detection considered correct. Parsing the watermark to obtain the copyright owner's copyright information only after the extracted watermark has been found identical to the hidden watermark therefore ensures the credibility of the traced propagation source of the copyrighted file.
In a possible implementation, for each row of the copyrighted file, a hash algorithm is used to determine the hash value of the row's text content, the hash value is then XORed with a preset value, and the result of the XOR operation is taken as the character corresponding to the row's text content.
In a possible implementation, according to the characters corresponding to the rows' text content and the preset watermark hiding rule, when it is determined that the character string formed by the characters corresponding to m consecutive rows of the copyrighted file is the character string identifying the watermark start position, and that the character string formed by the characters corresponding to n consecutive rows is the character string identifying the watermark end position, the character string formed by the characters corresponding to the rows between those m rows and n rows is determined to be the watermark hidden in the copyrighted file, where m and n are integers greater than zero.
In this way, once the character corresponding to the text content of each row in the copyrighted file is known, the watermark hidden in the copyrighted file can be found quickly and conveniently from those characters together with the preset character string identifying the watermark start position and the preset character string identifying the watermark end position.
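Locating the hidden watermark then reduces to a substring search over the string of row characters. The sketch below assumes the row characters have already been concatenated in file order and that the two markers are chosen so that they cannot occur inside, or straddle, the watermark; the function name is hypothetical.

```python
from typing import Optional


def find_hidden_watermark(row_chars: str, start_mark: str, end_mark: str) -> Optional[str]:
    """Return the characters between the start and end markers, or None if absent."""
    start = row_chars.find(start_mark)
    if start < 0:
        return None
    body_start = start + len(start_mark)
    # Take the first end marker that begins at or after the end of the start marker.
    end = row_chars.find(end_mark, body_start)
    if end < 0:
        return None
    return row_chars[body_start:end]
```

If a watermark ending in 0 were followed by an end marker of all 0s, this search would cut the watermark short, which is why the marker choice is stated as an assumption.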
In a possible implementation, for each row of the copyrighted file in which a sub-watermark is embedded, the sub-watermark embedded in that row is determined according to the preset watermark embedding position and the watermark length embedded per row, and the number of that sub-watermark is determined using the row's text content, a hash algorithm, and the number of sub-watermarks. The watermark embedded in the copyrighted file is then determined from the sub-watermarks embedded in the rows and their numbers.
In a possible implementation, the characters contained in each sub-watermark embedded in the copyrighted file are invisible characters. The invisible character string corresponding to the sub-watermark can then be cut out of the row according to the preset watermark embedding position and the watermark length embedded per row, each invisible character in that string is converted into a visible character according to the preset conversion rule between visible and invisible characters, and the resulting visible character string is determined to be the sub-watermark embedded in the row.
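Extraction from a single row can be sketched as the reverse of the embedding conversion. As before, the zero-width mapping and the end of the row as the embedding position are assumptions for illustration, and `length` is assumed to be a positive per-row watermark length.

```python
from typing import Optional

# Hypothetical conversion rule from invisible back to visible characters.
TO_VISIBLE = {
    "\u200b": "0",  # ZERO WIDTH SPACE
    "\u200c": "1",  # ZERO WIDTH NON-JOINER
}


def extract_sub_watermark(row_text: str, length: int) -> Optional[str]:
    """Cut `length` invisible characters from the row end and make them visible."""
    tail = row_text[-length:]
    if len(tail) < length or any(ch not in TO_VISIBLE for ch in tail):
        return None  # the row carries no complete sub-watermark
    return "".join(TO_VISIBLE[ch] for ch in tail)
```

Rows whose tail is not made up entirely of the expected invisible characters yield None, so they can simply be skipped during detection.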
In a possible implementation, for each row of the copyrighted file in which a sub-watermark is embedded, the number k of the sub-watermark embedded in that row can be determined according to the following formula:
$$k = C_{\mathrm{HASH}} \bmod K$$
where $C_{\mathrm{HASH}}$ is the hash value of the first S items of text content in the row, S being an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
In a possible implementation, for each number, the sub-watermark that occurs most frequently can be identified and taken as the sub-watermark corresponding to that number. The sub-watermarks are then concatenated in ascending order of their numbers to obtain the watermark embedded in the copyrighted file.
In this way, even if the watermark is partially tampered with or removed, the watermark can still be extracted correctly, and the copyrighted file can still be protected effectively.
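The majority vote and concatenation can be sketched as follows, assuming the detector has already produced one (number, sub-watermark) pair per watermarked row; the function name and input shape are assumptions for the example.

```python
from collections import Counter, defaultdict
from typing import Iterable, Tuple


def reassemble(extracted: Iterable[Tuple[int, str]]) -> str:
    """Keep the most frequent sub-watermark per number and join in ascending order."""
    votes = defaultdict(Counter)  # number -> how often each sub-watermark was seen
    for number, sub_watermark in extracted:
        votes[number][sub_watermark] += 1
    return "".join(votes[k].most_common(1)[0][0] for k in sorted(votes))
```

In the usage below, one of the three copies with number 1 has been tampered with, and the vote still recovers the original watermark:

```python
pairs = [(0, "1110"), (1, "0001"), (2, "1100"), (1, "1111"), (1, "0001"), (0, "1110")]
print(reassemble(pairs))  # prints 111000011100
```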
In a possible implementation, the watermark is decoded and decrypted to obtain the combined copyright information and timestamp information, which are then split apart to obtain the copyright owner's copyright information and the timestamp information.
In a fourth aspect, an embodiment of this application provides a watermark detection device, comprising:
an obtaining module, configured to obtain a copyrighted file, where the copyrighted file is a text file whose rows of content have no interdependence;
a determining module, configured to determine, using a hash algorithm, the character corresponding to the text content of each row in the copyrighted file, and to determine the watermark hidden in the copyrighted file according to those characters and a preset watermark hiding rule;
an extraction module, configured to extract the watermark embedded in the copyrighted file, where what is extracted from any row of the copyrighted file is a sub-watermark, which is obtained by splitting the watermark and is determined according to the text content of the row, a hash algorithm, and the number of sub-watermarks;
a parsing module, configured to, if the extracted watermark is determined to be identical to the hidden watermark, parse either watermark to obtain the copyright information of the copyright owner and the timestamp information of when the copyright owner came to possess the copyrighted file.
For the technical effects of any design in the fourth aspect of this application, refer to the technical effects of the corresponding implementations in the third aspect; details are not repeated here.
In a fifth aspect, an embodiment of this application provides a computer comprising at least one processing unit and at least one storage unit, where the storage unit stores program code which, when executed by the processing unit, causes the computer to perform the steps of the above watermark embedding and/or watermark detection method.
In a sixth aspect, an embodiment of this application provides a computer-readable storage medium comprising program code which, when run on a computer, causes the computer to perform the steps of the above watermark embedding and/or watermark detection method.
These and other aspects of this application will be more readily understood from the following description.
Brief description of the drawings
Fig. 1 is a schematic diagram of an application scenario of the watermark embedding method provided by an embodiment of this application;
Fig. 2 is a flowchart of a watermark embedding method provided by an embodiment of this application;
Fig. 3 is a flowchart of another watermark embedding method provided by an embodiment of this application;
Fig. 4 is a flowchart of a watermark detection method provided by an embodiment of this application;
Fig. 5 is a schematic diagram of a watermark embedding and extraction system based on a big data platform provided by an embodiment of this application;
Fig. 6 is a flowchart of yet another watermark embedding method provided by an embodiment of this application;
Fig. 7 is a flowchart of another watermark detection method provided by an embodiment of this application;
Fig. 8 is a structural diagram of a watermark embedding device provided by an embodiment of this application;
Fig. 9 is a structural diagram of a watermark detection device provided by an embodiment of this application;
Fig. 10 is a schematic diagram of the hardware structure of a computer for implementing the watermark embedding and/or watermark detection method provided by an embodiment of this application.
Detailed description of the embodiments
To solve the problems that watermark embedding methods in the prior art increase the capacity overhead of the watermark, are unsuitable for copyrighted files with limited redundant space, and conceal the watermark poorly, the embodiments of this application provide a watermark embedding and detection method and device.
It should first be noted that the copyrighted file referred to in the embodiments of this application is a text file in which the rows of text content are independent of one another and have no interdependence, such as a CSV file. Exchanging rows of text content in such a copyrighted file does not affect the correctness or readability of the text content.
Referring to Fig. 1, which shows a schematic diagram of an application scenario of the watermark embedding method provided by an embodiment of this application, the figure includes a terminal 11 and a server 12. The terminal may be, for example, a personal computer, an iPad, or a mobile phone, and the server may be any device capable of providing Internet services.
In a specific implementation, the user (the original owner of the copyrighted file) uploads the copyrighted file in which a watermark is to be embedded, together with the copyright information of the legitimate buyer, to the server through the terminal and requests the server to embed the buyer's copyright information in the copyrighted file. After receiving the copyrighted file and the legitimate buyer's copyright information, the server can generate the watermark to be embedded in the copyrighted file from the copyright information and the current timestamp information, and then embed at least one copy of the watermark into the copyrighted file.
Specifically, the watermark can be split according to the watermark length embedded per row to obtain multiple sub-watermarks. Then, for each row of the copyrighted file in which a sub-watermark is to be embedded, the sub-watermark to be embedded in that row is determined according to the row's text content, a hash algorithm, and the number of sub-watermarks. Because the sub-watermark for each row is determined in this way, the sub-watermarks embedded in different rows are not required to follow one another in sequence; that is, a row in which a sub-watermark is embedded no longer needs to record the line number of the next row in which a sub-watermark is embedded. Embedding the watermark therefore takes less space, which makes the method better suited to copyrighted files with limited redundant space. Moreover, because the sub-watermarks embedded in the rows have no sequential relationship, an attacker who obtains part of the watermark cannot easily obtain the complete watermark information, so the watermark is also better concealed.
The server can also use a hash algorithm to determine the character corresponding to the text content of each row in the copyrighted file and then hide the watermark information in the copyrighted file according to those characters and the character string corresponding to the watermark, so that when the copyrighted file later undergoes watermark detection, the correctness of the watermark extracted from it can be verified against the hidden watermark information. Here, the watermark information that actually needs to be embedded in each copyrighted file is hidden in that file itself and need not be recorded file by file on the server side, which reduces the load on the server.
In practice, because of the distributed data processing mechanism of a big data platform, the order of the data in a copyrighted file cannot be guaranteed to match the original order after the file has been processed by the platform, whereas existing watermark embedding and detection methods rely strictly on the record order of the original data and are therefore unsuitable for big data platforms. The watermark embedding and detection method provided by the embodiments of this application places no requirement on the record order of the content in the copyrighted file and is therefore well suited to big data platforms.
It should be noted that the above application scenario is presented only to help the reader understand the spirit and principles of this application and does not limit the application scenarios of its embodiments.
Preferred embodiments of this application are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to describe and explain this application and are not intended to limit it, and that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another.
Fig. 2 is a flowchart of the watermark embedding method provided by an embodiment of this application, which comprises the following steps:
S201: Obtain a copyrighted file and copyright information provided by the copyright owner, where the copyrighted file is a text file whose rows of content have no interdependence.
The copyright information provided by the copyright owner can be used to identify a buyer, for example the buyer's mobile phone number or identity card number.
S202: Generate, based on the copyright information and the current timestamp information, the watermark to be embedded in the copyrighted file.
The current timestamp information can serve as the time at which the copyright owner came to possess the copyrighted file.
Specifically, the copyright information and the timestamp information are combined, the combined information is then encrypted and encoded, and the resulting character string is used as the watermark to be embedded in the copyrighted file.
S203: Embed at least one copy of the watermark into the copyrighted file, where what is embedded in any row of the copyrighted file is a sub-watermark, which is obtained by splitting the watermark and is determined according to the text content of the row, a hash algorithm, and the number of sub-watermarks.
To improve the accuracy and efficiency of later watermark detection, a redundancy for the watermark can be specified; the redundancy determines how many copies of the watermark are embedded in the copyrighted file. In general, the redundancy can be determined from the number of rows in the copyrighted file: the more rows the file has, the greater the redundancy; the fewer rows, the smaller the redundancy.
In a specific implementation, the watermark is first split according to the watermark length embedded per row, and the resulting sub-watermarks are numbered starting from zero.
For example, if the character string corresponding to the watermark is 111000011100 and the fixed length of the watermark embedded per row is 4, the watermark can be split into 3 sub-watermarks, namely 1110, 0001, and 1100, whose numbers are 0, 1, and 2 respectively.
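The splitting and numbering in this example can be reproduced with a few lines of code. The function name and the dictionary return type are choices made for this sketch only.

```python
def split_watermark(watermark: str, length: int) -> dict[int, str]:
    """Split the watermark into pieces of `length` characters, numbered from zero."""
    return {
        i // length: watermark[i:i + length]
        for i in range(0, len(watermark), length)
    }


print(split_watermark("111000011100", 4))  # prints {0: '1110', 1: '0001', 2: '1100'}
```

If the watermark length is not a multiple of the per-row length, the last piece comes out shorter; how such a remainder should be padded is not specified in the text and is left open here.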
Next, for each row of the copyrighted file in which a sub-watermark is to be embedded, the number of the sub-watermark to be embedded in that row is determined using the row's text content, a hash algorithm, and the number of sub-watermarks.
For example, for each row of the copyrighted file in which a sub-watermark is to be embedded, the number k of the sub-watermark to be embedded in that row can be determined according to the following formula:
$$k = C_{\mathrm{HASH}} \bmod K$$
where $C_{\mathrm{HASH}}$ is the hash value of the first S items of text content in the row, S being an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
The sub-watermark corresponding to that number is then embedded in the row.
In practical application, the character for including in each sub- watermark is character visible, will son corresponding with any number , can be according to the transformation rule between preset character visible and invisible character when watermark is embedded into the row, it will be in sub- watermark Each character visible be converted to invisible character, obtain the corresponding invisible character string of sub- watermark, and then by invisible character The designated position that string is embedded into the row, in this way, the concealment of watermark can be enhanced, and it is possible to reduce to rights file It influences.
S204: Determine the character corresponding to the text content of each row of the copyrighted file using a hash algorithm, and hide watermark information in the copyrighted file according to the characters corresponding to the rows, the character string corresponding to the watermark and a preset watermark hiding rule, so that during watermark detection the hidden watermark information can be used to verify the correctness of the watermark extracted from the copyrighted file.
In a specific implementation, the character corresponding to the text content of each row of the copyrighted file can be determined as follows:
For each row of the copyrighted file, the hash value of the row's text content is computed using the hash algorithm, the hash value is XORed with a preset value, and the result of the XOR operation is taken as the character corresponding to the row's text content.
For example, the character i corresponding to the text content of each row can be determined according to the following formula:
where $L_{HASH}$ is the hash value of the row's text content.
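One way to turn a row into a single character by hashing and XOR is sketched below. The text fixes neither the hash algorithm nor the preset value, and says nothing about how the XOR result becomes one character; MD5, a preset value of 1 and keeping only the lowest bit are all assumptions of this sketch, as is the name `row_character`.

```python
import hashlib


def row_character(row_text: str, preset: int = 1) -> str:
    """Map a row to '0' or '1' from its hash XORed with a preset value.

    Assumptions: L_HASH is the MD5 digest read as an integer, and only
    the lowest bit of the XOR result is kept as the row's character.
    """
    digest = hashlib.md5(row_text.encode("utf-8")).digest()
    l_hash = int.from_bytes(digest, "big")
    return str((l_hash ^ preset) & 1)


for row in ["alice,100", "bob,200", "carol,300"]:
    print(row, row_character(row))
```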
Further, rows of the copyrighted file are exchanged according to the characters corresponding to the rows, the character string corresponding to the watermark and the preset watermark hiding rule. Here, the watermark hiding rule is: after the exchange, the characters corresponding to a run of consecutive rows in the copyrighted file form the character string corresponding to the watermark; the characters corresponding to the m consecutive rows before the initial row form a preset character string identifying the watermark start position; and the characters corresponding to the n consecutive rows after the end row form a preset character string identifying the watermark end position. The initial row is the first row of the run, the end row is its last row, m and n are integers greater than zero, and m and n may have the same value.
Taking as an example a start marker of m 1s and an end marker of n 0s, the above process can be carried out in the following steps:
Step 1: A random number generator generates a pseudo-random number, and the starting row i in the copyrighted file is chosen according to that number.
Step 2: Rows of the copyrighted file are exchanged starting from row i so that the characters corresponding to rows i to i+m are all 1, the characters corresponding to rows i+m+1 to i+m+$L_{wm}$ form, in order, the character string corresponding to the watermark, and the characters corresponding to rows i+m+$L_{wm}$+1 to i+m+n+$L_{wm}$+1 are all 0, at which point the watermark is hidden. Here $L_{wm}$ is the watermark length.
It should be noted that steps S203 and S204 need not be executed in any particular order.
The above process is described in detail below with reference to a specific embodiment.
After the server obtains the copyrighted file and the copyright information provided by the copyright owner, it can concatenate the copyright information with the current timestamp generated automatically by the system into a fixed-length plaintext string, apply symmetric encryption and encoding to the plaintext string to produce a binary string consisting of 0s and 1s, and take that binary string as the watermark to be embedded in the copyrighted file.
Next, the redundancy of the watermark to be embedded in the copyrighted file is determined, and the watermark string is split according to the fixed watermark length embedded in each row, giving multiple sub-watermark strings, each of which is numbered. Then, for each row in which a watermark is to be embedded, the number of the sub-watermark string to be embedded in that row is determined, the sub-watermark string with that number is mapped to an invisible character string, and the invisible string is embedded at the end of the row.
Specifically, the above process can follow the flow shown in Fig. 3:
S301: Split the watermark and number the resulting sub-watermarks in order.
Specifically, the watermark is split into K parts according to the formula $K = L_{wm} / l$, and the K sub-watermarks are numbered, where $L_{wm}$ is the length of the watermark string and $l$ is the watermark length embedded in each row.
S302: Read a row of the copyrighted file in which a sub-watermark is to be embedded, and determine the number of the sub-watermark to be embedded in that row.
Specifically, the hash value $C_{HASH}$ of the text content before a fixed position in the row is computed, and $C_{HASH}$ is taken modulo the count of sub-watermarks K to obtain the number of the sub-watermark to be embedded in the row: $k = C_{HASH} \bmod K$.
S303: Map the sub-watermark string with that number to an invisible character string, and embed the invisible string at the end of the row.
For example, if the binary string of a sub-watermark is 1001 and the preset mapping rule between visible and invisible characters is 0 -> space, 1 -> Tab, then 1001 maps to the invisible string Tab, space, space, Tab, which is then embedded at the end of the row.
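The mapping in this example can be sketched as follows. The rule 0 -> space, 1 -> Tab comes from the text; the helper names `to_invisible` and `embed_at_row_end` are hypothetical.

```python
# Mapping rule from the example: 0 -> space, 1 -> Tab.
VISIBLE_TO_INVISIBLE = {"0": " ", "1": "\t"}


def to_invisible(sub_watermark: str) -> str:
    """Convert a binary sub-watermark into its invisible character string."""
    return "".join(VISIBLE_TO_INVISIBLE[c] for c in sub_watermark)


def embed_at_row_end(row_text: str, sub_watermark: str) -> str:
    """Append the invisible form of the sub-watermark to the end of a row."""
    return row_text + to_invisible(sub_watermark)


marked = embed_at_row_end("10086,Suzhou", "1001")
print(repr(marked))
```

Printing with `repr` makes the appended Tab and space characters visible for inspection.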
S304: Determine whether the copyrighted file still contains rows in which a sub-watermark is to be embedded; if so, return to S302; otherwise, end watermark embedding.
Here, assuming the redundancy of the watermark is r, the total number of sub-watermarks embedded in the copyrighted file is N = K*r. It is then only necessary to select that many rows from the copyrighted file according to some row selection rule and embed a sub-watermark in each.
Examples of row selection rules: embed one sub-watermark every 5 rows, or, after every 10 rows, embed a sub-watermark in each of the next 3 rows. These are only examples and do not limit how the rows to receive sub-watermarks are determined in this application.
In addition, the character corresponding to the text content of each row of the copyrighted file can be determined using a hash algorithm, after which the watermark information is hidden in the copyrighted file according to the characters corresponding to the rows, the character string corresponding to the watermark and the preset watermark hiding rule.
For example, suppose the character string corresponding to the watermark is 111000011100, so that the watermark length $L_{wm}$ is 12, and suppose the start marker is 1111 and the end marker is 0000, so that m = n = 4.
Then, after the starting row i in the copyrighted file is determined, rows of the file can be exchanged according to the characters corresponding to the rows and the character string corresponding to the watermark so that the characters corresponding to rows i to i+4 are all 1, the characters corresponding to rows i+4+1 to i+4+$L_{wm}$ form, in order, the string 111000011100, and the characters corresponding to rows i+4+$L_{wm}$+1 to i+2*4+$L_{wm}$+1 are all 0.
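A row exchange of this kind can be sketched with a simple greedy swap. This is only one possible strategy, not necessarily the one used in the application. The sketch assumes 0-based row indices, exactly m start-marker rows beginning at the starting row, and a caller-supplied function `bit_of` that returns the character of a row; the demo uses a toy `bit_of` (the last character of the row) instead of a hash so the result is easy to check.

```python
from typing import Callable, List


def hide_by_row_exchange(rows: List[str], bit_of: Callable[[str], str],
                         start: int, watermark: str, m: int, n: int) -> List[str]:
    """Reorder rows so their characters, from `start`, spell
    m ones + watermark + n zeros. Rows before `start` are untouched."""
    pattern = "1" * m + watermark + "0" * n
    out = list(rows)
    if start < 0 or start + len(pattern) > len(out):
        raise ValueError("not enough rows after the starting row")
    for offset, needed in enumerate(pattern):
        target = start + offset
        if bit_of(out[target]) == needed:
            continue
        # Swap in the nearest later row that yields the needed character.
        for j in range(target + 1, len(out)):
            if bit_of(out[j]) == needed:
                out[target], out[j] = out[j], out[target]
                break
        else:
            raise ValueError("no remaining row yields the required character")
    return out


# Toy data: the row character is simply the last character of the row.
rows = ["record-%02d-%d" % (k, k % 2) for k in range(40)]
toy_bit = lambda row: row[-1]  # hypothetical stand-in for the hash-based character
hidden = hide_by_row_exchange(rows, toy_bit, start=3,
                              watermark="111000011100", m=4, n=4)
chars = "".join(toy_bit(r) for r in hidden)
print(chars[3:23])
```

Only the order of rows changes, which is why the method requires a file whose rows do not depend on one another.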
Fig. 4 is a flowchart of the watermark detection method provided by an embodiment of this application, which comprises the following steps:
S401: Obtain a copyrighted file, where the copyrighted file is a text file whose rows have no interdependence.
S402: Determine the character corresponding to the text content of each row of the copyrighted file using a hash algorithm, and determine the watermark hidden in the copyrighted file according to the characters corresponding to the rows and the preset watermark hiding rule.
Specifically, determining the character corresponding to the text content of each row of the copyrighted file using a hash algorithm comprises:
For each row of the copyrighted file, the hash value of the row's text content is computed using the hash algorithm, the hash value is XORed with a preset value, and the result of the XOR operation is taken as the character corresponding to the row's text content.
For example, the character i corresponding to the text content of each row can be calculated according to the following formula:
where $L_{HASH}$ is the hash value of the row's text content.
Further, according to the characters corresponding to the rows and the preset watermark hiding rule, when it is determined that the characters corresponding to m consecutive rows of the copyrighted file form, in order, the character string identifying the watermark start position, and the characters corresponding to n consecutive rows form, in order, the character string identifying the watermark end position, the string formed in order by the characters corresponding to the rows between those m rows and n rows is determined to be the watermark hidden in the copyrighted file, where m and n are integers greater than zero and may have the same value.
Taking as an example a start marker of m 1s and an end marker of n 0s: when the characters corresponding to the rows are examined row by row, if it is determined that, starting from some row i, the characters corresponding to rows i to i+m are all 1 and the characters corresponding to rows i+m+$L_{wm}$+1 to i+m+n+$L_{wm}$+1 are all 0, then the string formed in order by the characters corresponding to rows i+m+1 to i+m+$L_{wm}$ can be determined to be the character string corresponding to the watermark.
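The scan for the hidden watermark can be sketched over the sequence of row characters. The sketch assumes the row characters have already been computed and joined into one string, that the watermark length is known, that exactly m start-marker characters are used, and that the first match is accepted; the name `find_hidden_watermark` is hypothetical.

```python
from typing import Optional


def find_hidden_watermark(row_chars: str, m: int, n: int, wm_len: int) -> Optional[str]:
    """Return the watermark found between m ones and n zeros, or None.

    `row_chars` holds one character ('0' or '1') per row, in row order.
    """
    start_marker, end_marker = "1" * m, "0" * n
    total = m + wm_len + n
    for i in range(len(row_chars) - total + 1):
        body_start = i + m
        body_end = body_start + wm_len
        if (row_chars[i:body_start] == start_marker
                and row_chars[body_end:body_end + n] == end_marker):
            return row_chars[body_start:body_end]
    return None


chars = "010" + "1111" + "111000011100" + "0000" + "101"
print(find_hidden_watermark(chars, m=4, n=4, wm_len=12))
```

Accepting the first match can misalign the result when the rows just before the start marker also happen to map to 1, which is one reason the extracted watermark is cross-checked against the hidden one before it is parsed.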
S403: Extract the watermark embedded in the copyrighted file, where what is extracted from any row of the copyrighted file is a sub-watermark; the sub-watermark is obtained by splitting the watermark and is determined according to the row's text content, a hash algorithm and the count of sub-watermarks.
In a specific implementation, for each row of the copyrighted file in which a sub-watermark is embedded, the sub-watermark embedded in the row can be determined according to the preset watermark embedding position and the watermark length embedded in each row.
Specifically, for each such row, the invisible character string corresponding to the sub-watermark is cut out of the row according to the preset watermark embedding position and the watermark length embedded in each row; each invisible character in that string is then converted to a visible character according to the preset conversion rule between visible and invisible characters, and the resulting visible string is taken as the sub-watermark embedded in the row.
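Extraction from one row can be sketched as below, assuming the embedding position is the end of the row and the conversion rule is the space/Tab mapping used in the embedding example; the name `split_row` is hypothetical.

```python
# Inverse of the example mapping: space -> 0, Tab -> 1.
INVISIBLE_TO_VISIBLE = {" ": "0", "\t": "1"}


def split_row(row: str, per_row_len: int):
    """Return (text_content, sub_watermark).

    sub_watermark is None when the last `per_row_len` characters of the
    row are not all invisible watermark characters.
    """
    if per_row_len <= 0:
        raise ValueError("per-row watermark length must be positive")
    tail = row[-per_row_len:]
    if len(tail) == per_row_len and all(c in INVISIBLE_TO_VISIBLE for c in tail):
        visible = "".join(INVISIBLE_TO_VISIBLE[c] for c in tail)
        return row[:-per_row_len], visible
    return row, None


print(split_row("10086,Suzhou\t  \t", 4))
print(split_row("10086,Suzhou", 4))
```

A row whose original text already ends in spaces or Tabs would be misread by this sketch, so the approach suits files whose rows carry no trailing whitespace of their own.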
In addition, the number of the sub-watermark embedded in the row can be determined from the row's text content, the hash algorithm and the count of sub-watermarks.
For example, for each row of the copyrighted file in which a watermark is embedded, the number k of the sub-watermark embedded in the row can be determined according to the following formula:
$k = C_{HASH} \bmod K$;
where $C_{HASH}$ is the hash value of the first S characters of the row's text content, S being an integer greater than zero, and K is the count of sub-watermarks obtained by splitting the watermark.
Further, the watermark embedded in the copyrighted file is determined from the sub-watermarks embedded in the rows and their numbers.
Specifically, for each number, the sub-watermark that occurs most often is identified and taken as the sub-watermark corresponding to that number; the sub-watermarks are then concatenated in ascending order of number to obtain the watermark embedded in the copyrighted file.
For example, several different sub-watermarks may be extracted for number 0, but if 1110 occurs most often, 1110 is taken as the unique sub-watermark for number 0; likewise, if 0001 occurs most often for number 1, 0001 is taken as the unique sub-watermark for number 1; and if 1100 occurs most often for number 2, 1100 is taken as the unique sub-watermark for number 2. Concatenating 1110, 0001 and 1100 in ascending order of number then gives the watermark string 111000011100.
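The majority vote and concatenation can be sketched as follows. The input format, a list of (number, sub-watermark) pairs collected from the rows, and the name `reconstruct_watermark` are assumptions of the sketch.

```python
from collections import Counter, defaultdict


def reconstruct_watermark(extracted, num_subs: int) -> str:
    """Pick the most frequent sub-watermark for each number and join
    the winners in ascending order of number."""
    votes = defaultdict(Counter)
    for number, sub in extracted:
        votes[number][sub] += 1
    parts = []
    for number in range(num_subs):
        if not votes[number]:
            raise ValueError("no sub-watermark was extracted for number %d" % number)
        parts.append(votes[number].most_common(1)[0][0])
    return "".join(parts)


extracted = [(0, "1110"), (1, "0001"), (0, "1110"), (2, "1100"),
             (0, "1010"), (1, "0001"), (2, "1100"), (2, "0000")]
print(reconstruct_watermark(extracted, 3))
```

In the sample data, the tampered values 1010 and 0000 are outvoted by the intact copies, which is the effect the redundancy is meant to provide.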
The reason is that, in practice, the sub-watermarks embedded in the copyrighted file may be deleted maliciously or partly deleted by mistake, so the extracted sub-watermarks may take more distinct values than the original split produced. To address this, in the embodiments of this application the sub-watermark that occurs most often for each number is taken as the sub-watermark corresponding to that number. In this way, even if the watermark is partly tampered with or removed, it can still be extracted correctly and the copyrighted file remains effectively protected.
S404: If the extracted watermark is determined to be identical to the hidden watermark, parse either watermark to obtain the copyright owner's copyright information and the timestamp at which the copyright owner came to own the copyrighted file.
Specifically, either watermark is decoded and decrypted to obtain the combined copyright information and timestamp, which are then split apart to obtain the copyright owner's copyright information and the timestamp at which the copyright owner came to own the copyrighted file.
It should be noted that steps S402 and S403 need not be executed in any particular order.
In the embodiments of this application, the watermark hidden in the copyrighted file is the watermark that should have been embedded in the file, while the extracted watermark is the watermark actually embedded in it. Only when the two are identical is the detected watermark correct. Parsing the watermark to obtain the copyright owner's information only after the two are found identical therefore ensures that the distribution source of the copyrighted file traced from that information is reliable.
The above process is described in detail below with reference to a specific embodiment.
After the server obtains the copyrighted file, it can use the hash algorithm to compute the character corresponding to the text content of each row. When the character string identifying the watermark start position is detected, the characters corresponding to the following rows are recorded until the character string identifying the watermark end position is detected, and the string formed in order by the recorded characters is taken as the watermark string to be compared.
Taking as an example a start marker of m 1s and an end marker of m 0s: when the server detects that the characters corresponding to m consecutive rows of the copyrighted file are 1, it records the character corresponding to each row from the next row onward; when it detects that the characters corresponding to m consecutive rows are 0, it stops detection and takes the string formed in order by the recorded characters as the binary string to be compared.
For example, for each row of the copyrighted file, the character i corresponding to the row's text content can be calculated according to the following formula:
where $L_{HASH}$ is the hash value of the row's text content.
Further, for each row of the copyrighted file in which a sub-watermark is embedded, the number of the sub-watermark embedded in the row is determined from the row's text content, the hash algorithm and the count of sub-watermarks, and the invisible characters at the end of the row are converted to visible binary characters. Then, for each number, the sub-watermark that occurs most often is identified and taken as the sub-watermark corresponding to that number, and the sub-watermarks are concatenated in ascending order of number to obtain the watermark actually embedded in the copyrighted file.
In a specific implementation, the above process can be carried out in the following steps:
Step 1: Read the text content of the copyrighted file row by row, and determine the sub-watermark embedded in each row and its number.
Specifically, for each row of the copyrighted file, check whether invisible characters are present at the end of the row; if so, separate the text content of the row from the invisible watermark characters at its end, then compute the hash value $C_{HASH}$ of the first S characters of the row's text content and from it the number of the sub-watermark embedded in the row, $k = C_{HASH} \bmod K$, where K is the count of sub-watermarks obtained by splitting the watermark.
The invisible watermark characters at the end of the row are also converted to a visible binary string, which is stored as the k-th sub-watermark.
Step 2: Determine the watermark embedded in the copyrighted file from the sub-watermarks embedded in the rows and their numbers.
Specifically, for each number, the sub-watermark that occurs most often is identified and taken as the sub-watermark corresponding to that number; the sub-watermarks are then concatenated in ascending order of number to obtain the watermark string embedded in the copyrighted file.
Further, if the watermark string to be compared is determined to be identical to the watermark string embedded in the copyrighted file, either binary watermark string can be decrypted using the symmetric encryption algorithm and the decrypted plaintext string decoded to obtain the plaintext watermark element information. This information contains the copyright information provided by the copyright owner and the time at which the copyright owner came to own the copyrighted file, where that time can be represented by the system timestamp at the moment the watermark was embedded.
Fig. 5 is a schematic diagram of a watermark embedding and extraction system based on a big data platform provided by an embodiment of this application, comprising a watermark embedding module and a watermark extraction module, in which:
The watermark embedding module comprises a watermark generation unit and a watermark embedding unit, where the watermark generation unit converts the watermark information string into a binary watermark string, and the watermark embedding unit embeds the binary watermark string in the copyrighted file.
The watermark extraction module comprises a watermark extraction unit and a watermark recovery unit, where the watermark extraction unit extracts the binary watermark string embedded in the copyrighted file, and the watermark recovery unit converts the binary watermark string back into the watermark information string to obtain the copyright information of the copyrighted file.
Corresponding to Fig. 5, Fig. 6 is a flowchart of another watermark embedding method provided by an embodiment of this application, comprising:
S601: Generate a binary watermark string from the copyright information provided by the copyright owner and the current timestamp.
Specifically, the copyright information provided by the copyright owner and the current timestamp generated automatically by the system are concatenated into a fixed-length plaintext string, and the plaintext string is symmetrically encrypted with the Advanced Encryption Standard (AES) and BASE64-encoded to produce a binary watermark string consisting of the characters 0 and 1.
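The encoding half of this step can be sketched with the standard library. AES is not part of the Python standard library, so the sketch assumes the ciphertext is already available as bytes and shows only the BASE64 step and the conversion to a 0/1 string; representing each BASE64 character by its 8-bit ASCII code is an assumption, as are the function names.

```python
import base64


def bytes_to_watermark_bits(ciphertext: bytes) -> str:
    """BASE64-encode the ciphertext and write each resulting ASCII
    character as 8 bits (an assumed bit representation)."""
    encoded = base64.b64encode(ciphertext)
    return "".join(format(b, "08b") for b in encoded)


def watermark_bits_to_bytes(bits: str) -> bytes:
    """Inverse of bytes_to_watermark_bits."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    encoded = bytes(int(bits[p:p + 8], 2) for p in range(0, len(bits), 8))
    return base64.b64decode(encoded)


# Placeholder bytes standing in for an AES ciphertext.
bits = bytes_to_watermark_bits(b"demo")
print(len(bits), bits[:16])
```

The inverse function is what a detector would apply before decryption to get back the encrypted bytes.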
S602: Split the watermark into multiple sub-watermarks and number each sub-watermark.
S603: Read a row of the copyrighted file in which a sub-watermark is to be embedded, and determine the number of the sub-watermark to be embedded in that row.
Optionally, the MD5 value $h_{MD5}$ of the first 5 characters of the row's text content can be computed, and $h_{MD5}$ taken modulo the count of sub-watermarks K to obtain the number of the sub-watermark to be embedded in the row.
S604: Map the binary sub-watermark string with that number to an invisible character string, and embed the invisible string at the end of the row.
S605: Determine whether the copyrighted file still contains rows in which a sub-watermark is to be embedded; if so, return to S603; otherwise, proceed to S606.
S606: Determine the row number of the starting row for the row exchange.
For example, a pseudo-random number i is generated using a random number generator, and i is used as the starting row number.
S607: Exchange rows of the copyrighted file based on the row number of the starting row.
Specifically, rows are exchanged so that the character values of rows i to i+m are all 1, the character values of rows i+m+1 to i+m+$L_{wm}$ form the binary watermark code, and the character values of rows i+m+$L_{wm}$+1 to i+2*m+$L_{wm}$+1 are all 0, which ends watermark embedding. Here m is the preset length of the strings identifying the watermark start and the watermark end.
Corresponding to Fig. 5, Fig. 7 is a flowchart of another watermark detection method provided by an embodiment of this application, comprising:
S701: Determine the watermark hidden in the copyrighted file.
Specifically, the text content of the copyrighted file is read row by row and the character corresponding to each row's text content is determined. If the character values of m consecutive rows are detected to be 1, the character values are recorded starting from the next row until the character values of m consecutive rows are detected to be 0, at which point detection stops; the recorded binary string is denoted watermark1 (the hidden watermark).
S702: Determine the sub-watermark embedded in each row and its number.
Optionally, while the text content of the copyrighted file is read row by row, the end of each row can also be checked for invisible characters; if any are present, the text content of the row is separated from the invisible watermark characters at its end.
Further, the MD5 value $h_{MD5}$ of the first 5 characters of the text content is computed, and then the number of the sub-watermark embedded in the row, $k = h_{MD5} \bmod K$, where K is the count of sub-watermarks.
The invisible watermark characters of the current row are also converted to a binary string and stored as the k-th sub-watermark, until the whole file has been read.
S703: Determine the watermark embedded in the copyrighted file from the sub-watermarks embedded in the rows and their numbers.
Specifically, for each number, the sub-watermark that occurs most often is identified and taken as the sub-watermark corresponding to that number; the sub-watermarks are then combined in ascending order of number to obtain the complete binary watermark string, denoted watermark2.
S704: Determine whether the hidden watermark and the extracted watermark are identical; if so, proceed to S705; otherwise, determine that watermark detection has failed.
That is, watermark1 and watermark2 are compared, and if they are identical, either watermark1 or watermark2 is output.
S705: Parse either watermark to obtain the watermark element information.
Specifically, the output binary watermark string is decrypted with AES and BASE64-decoded to obtain the original plaintext watermark element information, which is then split to obtain the copyright information and the timestamp.
In plain-text files on a big data platform, the order of data records is not fixed and the redundant space is limited. The watermark embedding method provided by the embodiments of this application embeds invisible characters in the copyrighted text in a way that does not depend on the order of the data records and needs little space for the watermark, so it is well suited to plain-text files on big data platforms.
Based on the same inventive concept, an embodiment of this application also provides a watermark embedding device corresponding to the watermark embedding method. Because the device solves the problem on the same principle as the watermark embedding method of the embodiments, its implementation may refer to the implementation of the method, and repeated details are omitted.
Fig. 8 is a structural diagram of the watermark embedding device provided by an embodiment of this application, comprising:
an obtaining module 801, configured to obtain a copyrighted file and the copyright information provided by the copyright owner, the copyrighted file being a text file whose rows have no interdependence;
a generation module 802, configured to generate the watermark to be embedded in the copyrighted file based on the copyright information and the current timestamp;
an embedding module 803, configured to embed at least one copy of the watermark in the copyrighted file, where what is embedded in any row of the copyrighted file is a sub-watermark, and the sub-watermark is obtained by splitting the watermark and is determined according to the row's text content, a hash algorithm and the count of sub-watermarks;
a hiding module 804, configured to determine the character corresponding to the text content of each row of the copyrighted file using a hash algorithm, and to hide watermark information in the copyrighted file according to the characters corresponding to the rows, the character string corresponding to the watermark and a preset watermark hiding rule.
In one possible embodiment, the generation module 802 is specifically configured to:
combine the copyright information and the timestamp;
encrypt and encode the combined copyright information and timestamp to obtain the watermark to be embedded in the copyrighted file.
In one possible embodiment, the embedding module 803 is specifically configured to:
split the watermark according to the watermark length embedded in each row, and number the resulting sub-watermarks;
for each row of the copyrighted file in which a sub-watermark is to be embedded, determine the number of the sub-watermark to be embedded in the row from the row's text content, the hash algorithm and the count of sub-watermarks, and embed the sub-watermark corresponding to that number in the row.
In one possible embodiment, for each row of the copyrighted file in which a sub-watermark is to be embedded, the embedding module 803 is specifically configured to determine the number k of the sub-watermark to be embedded in the row according to the following formula:
$k = C_{HASH} \bmod K$;
where $C_{HASH}$ is the hash value of the first S characters of the row's text content, S being an integer greater than zero, and K is the count of sub-watermarks obtained by splitting the watermark.
In one possible embodiment, the characters in each sub-watermark are visible characters, and the embedding module 803 is specifically configured to:
convert each visible character in the sub-watermark to an invisible character according to a preset conversion rule between visible and invisible characters, obtaining the invisible character string corresponding to the sub-watermark;
embed the invisible character string at a designated position in the row.
In one possible embodiment, the hiding module 804 is specifically configured to:
for each row of the copyrighted file, determine the hash value of the row's text content using a hash algorithm;
XOR the hash value with a preset value, and take the result of the XOR operation as the character corresponding to the row's text content.
In one possible embodiment, the hiding module 804 is specifically configured to:
exchange rows of the copyrighted file according to the characters corresponding to the rows, the character string corresponding to the watermark and the preset watermark hiding rule, the watermark hiding rule being: after the exchange, the characters corresponding to a run of consecutive rows in the copyrighted file form the character string corresponding to the watermark, the characters of the m consecutive rows before the initial row form a preset character string identifying the watermark start position, and the characters of the n consecutive rows after the end row form a preset character string identifying the watermark end position, where the initial row is the first row of the run, the end row is its last row, and m and n are integers greater than zero.
Similarly, a kind of watermark detecting apparatus corresponding with method of detecting watermarks is additionally provided in the embodiment of the present application, by It is similar to the embodiment of the present application method of detecting watermarks in the principle that the device solves the problems, such as, therefore the implementation of the device may refer to The implementation of method, overlaps will not be repeated.
As shown in figure 9, being the structure chart of watermark embedding device provided by the embodiments of the present application, comprising:
Module 901 is obtained, for obtaining rights file, the rights file does not have between each row content to interdepend The text file of relationship;
Determining module 902, for determining the corresponding word of each row content of text in the rights file using hash algorithm Symbol hides rule according to the corresponding character of each line of text content and preset watermark, determines and hides in the rights file Watermark;
Extraction module 903, for extracting the watermark being embedded in the rights file, wherein appoint from the rights file A line extraction is sub- watermark, and it and is in the text according to the row that the sub- watermark, which is split to the watermark, What appearance, hash algorithm and sub- watermark number determined;
Parsing module 904, for if it is determined that extract watermark with hide watermark it is identical, then parse any watermark and obtain version The copyright information and copyright owner for weighing people start to possess the timestamp information when rights file.
In a possible implementation, the determining module 902 is specifically configured to:
For each row in the rights file, determine the hash value of the row's text content using a hash algorithm;
Perform an XOR operation on the hash value and a preset value, and determine the result of the XOR operation as the character corresponding to the row's text content.
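The two steps above can be sketched as follows. This is a minimal sketch under assumptions that the application does not fix: SHA-256 is used as the hash algorithm, only the first byte of the digest takes part in the XOR, the preset value is 0x5A, and the result is reduced to a one-bit character ('0' or '1'). The names `PRESET_VALUE`, `line_char`, and `file_chars` are hypothetical.

```python
import hashlib

PRESET_VALUE = 0x5A  # hypothetical preset value shared by embedder and detector


def line_char(line: str, preset: int = PRESET_VALUE) -> str:
    """Map one row of text to a single character ('0' or '1')."""
    digest = hashlib.sha256(line.encode("utf-8")).digest()
    mixed = digest[0] ^ preset   # XOR the hash value (first byte here) with the preset value
    return str(mixed & 1)        # assumption: keep only the lowest bit as the row's character


def file_chars(lines) -> str:
    """Concatenate the characters of all rows, in file order."""
    return "".join(line_char(line) for line in lines)
```

A small alphabet is chosen here so that a file with a moderate number of rows is likely to contain rows for every character a watermark string needs; a larger alphabet would carry more information per row but require more rows.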
In a possible implementation, the determining module 902 is specifically configured to:
According to the character corresponding to each row of text content and the preset watermark hiding rule, when it is determined that the character string formed by the characters corresponding to m consecutive rows in the rights file is the character string identifying the watermark start bit, and that the character string formed by the characters corresponding to n consecutive rows is the character string identifying the watermark end bit, determine the character string formed by the characters corresponding to the rows between the m rows and the n rows as the watermark hidden in the rights file, where m and n are integers greater than zero.
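As an illustrative sketch only, the search for the hidden watermark can be expressed over the string of per-row characters. The function name `find_hidden_watermark` is hypothetical, and the sketch assumes that the first occurrence of the start marker is the genuine one and that the end marker does not occur inside the watermark itself.

```python
def find_hidden_watermark(chars: str, start_marker: str, end_marker: str):
    """Return the characters between the start marker and the next end marker.

    `chars` holds one character per row, in file order. The start marker spans
    m = len(start_marker) rows and the end marker n = len(end_marker) rows.
    Returns None when either marker is missing.
    """
    start = chars.find(start_marker)
    if start < 0:
        return None
    body_start = start + len(start_marker)      # first row after the m marker rows
    end = chars.find(end_marker, body_start)    # first end marker after the start marker
    if end < 0:
        return None
    return chars[body_start:end]
```

For example, with per-row characters "xxSSabcEEyy", a start marker "SS", and an end marker "EE", the rows between the markers yield "abc". In practice the markers would need to be chosen so that they cannot be confused with watermark content.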
In a possible implementation, the extraction module 903 is specifically configured to:
For each row of the rights file in which a sub-watermark is embedded, determine the sub-watermark embedded in the row according to a preset watermark embedding position and the watermark length embedded per row, and determine the sequence number of the sub-watermark embedded in the row using the text content of the row, the hash algorithm, and the total number of sub-watermarks;
Determine the watermark embedded in the rights file according to the sub-watermarks embedded in the rows and their sequence numbers.
In a possible implementation, the characters contained in each sub-watermark embedded in the rights file are all invisible characters, and the extraction module 903 is specifically configured to:
Split the invisible character string corresponding to the sub-watermark out of the row according to the preset watermark embedding position and the watermark length embedded per row;
Convert each invisible character in the character string into a visible character according to a preset conversion rule between visible and invisible characters, and determine the resulting visible character string as the sub-watermark embedded in the row.
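The extraction of a sub-watermark from one row might look like the following sketch. It assumes, purely for illustration, that the embedding position is the end of the row and that the conversion rule maps the zero-width space (U+200B) to '0' and the zero-width non-joiner (U+200C) to '1'. Neither the position nor the mapping is specified in the application, and the names used are hypothetical.

```python
# Hypothetical conversion rule between invisible and visible characters.
INVISIBLE_TO_VISIBLE = {"\u200b": "0", "\u200c": "1"}


def extract_sub_watermark(line: str, length: int):
    """Cut `length` invisible characters from the end of the row and convert them.

    Returns the visible sub-watermark string, or None when the row does not end
    with `length` characters covered by the conversion rule.
    """
    if length <= 0 or len(line) < length:
        return None
    hidden = line[len(line) - length:]            # assumed embedding position: end of row
    if any(ch not in INVISIBLE_TO_VISIBLE for ch in hidden):
        return None                               # no sub-watermark at the expected position
    return "".join(INVISIBLE_TO_VISIBLE[ch] for ch in hidden)
```

Rows that were truncated or re-typed lose their invisible characters, so returning None for such rows lets the caller skip them instead of contributing a corrupted sub-watermark.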
In a possible implementation, for each row of the rights file in which a sub-watermark is embedded, the extraction module 903 is specifically configured to determine the sequence number k of the sub-watermark embedded in the row according to the following formula:
k = C_HASH % K;
where C_HASH is the hash value of the first S characters of text content in the row, S is an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
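A minimal sketch of the formula follows, assuming SHA-256 as the hash algorithm and interpreting the digest as a big-endian integer; both choices are assumptions, as is the reading of S as a count of leading characters. The function name `sub_watermark_number` is hypothetical.

```python
import hashlib


def sub_watermark_number(line: str, s: int, total: int) -> int:
    """Compute k = C_HASH % K for one row.

    `s` is S, the number of leading characters that are hashed, and
    `total` is K, the number of sub-watermarks the watermark was split into.
    """
    prefix = line[:s]                                   # first S characters of the row
    digest = hashlib.sha256(prefix.encode("utf-8")).digest()
    c_hash = int.from_bytes(digest, "big")              # C_HASH as a non-negative integer
    return c_hash % total
```

Because k depends only on the first S characters, the embedder and the detector obtain the same k for a row as long as those characters are unchanged, which holds when the sub-watermark is embedded outside that prefix.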
In a possible implementation, the extraction module 903 is specifically configured to:
For each sequence number, count the occurrences of each sub-watermark extracted with that number, and determine the sub-watermark that occurs most often as the sub-watermark corresponding to that number;
Splice the sub-watermarks in ascending order of sequence number to obtain the watermark embedded in the rights file.
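The counting and splicing steps can be sketched as a majority vote per sequence number. The function name `reassemble_watermark` and the choice to return None when a number never appears are assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def reassemble_watermark(numbered_subs, total):
    """Rebuild the watermark from (sequence number, sub-watermark) pairs.

    For each number 0..total-1 the most frequent sub-watermark wins, and the
    winners are spliced in ascending order of number. Returns None when some
    number was never extracted from any row.
    """
    votes = defaultdict(Counter)
    for number, sub in numbered_subs:
        votes[number][sub] += 1

    parts = []
    for number in range(total):
        if not votes[number]:
            return None                                  # this part of the watermark is missing
        parts.append(votes[number].most_common(1)[0][0])
    return "".join(parts)
```

Since at least one copy of the watermark is embedded and many rows share the same sequence number, the vote tolerates a limited number of tampered or damaged rows.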
In a possible implementation, the parsing module 904 is specifically configured to:
Decode and decrypt the watermark to obtain the combined copyright information and timestamp information;
Split the combined copyright information and timestamp information to obtain the copyright information and the timestamp information.
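The parsing steps, together with the matching generation steps needed for a round trip, might be sketched as follows. The application does not name the cipher, the encoding, or the way the two fields are combined, so everything concrete here is an assumption: a repeating-key XOR stands in for the encryption (it is a placeholder and not secure), Base64 is used as the encoding, and a "|" separator joins the fields. All names are hypothetical.

```python
import base64

SEPARATOR = "|"  # hypothetical separator between copyright information and timestamp


def _xor(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher: repeating-key XOR (its own inverse, not secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def build_watermark(copyright_info: str, timestamp: str, key: bytes) -> str:
    """Combine, encrypt, and encode the two fields into a watermark string."""
    combined = f"{copyright_info}{SEPARATOR}{timestamp}".encode("utf-8")
    return base64.b64encode(_xor(combined, key)).decode("ascii")


def parse_watermark(watermark: str, key: bytes):
    """Decode, decrypt, and split a watermark into (copyright_info, timestamp)."""
    combined = _xor(base64.b64decode(watermark), key).decode("utf-8")
    # Split on the last separator so the copyright information may contain it.
    copyright_info, timestamp = combined.rsplit(SEPARATOR, 1)
    return copyright_info, timestamp
```

A real deployment would replace the placeholder cipher with an established encryption algorithm; the order of operations (decode, then decrypt, then split) is the part that follows the description.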
As shown in Figure 10, which is a schematic diagram of the hardware structure of a computer provided by an embodiment of the present application for implementing the watermark embedding or watermark detection method, the computer includes a processor 1010, a communication interface 1020, a memory 1030, and a communication bus 1040, where the processor 1010, the communication interface 1020, and the memory 1030 communicate with one another through the communication bus 1040.
The memory 1030 is configured to store a computer program;
The processor 1010 is configured to, when executing the program stored in the memory 1030, cause the computer to perform the steps of the above watermark embedding or watermark detection method.
An embodiment of the present application provides a computer-readable storage medium including program code which, when run on a computer, causes the computer to perform the steps of the above watermark embedding and/or watermark detection method.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present application and their equivalent technologies, the present application is also intended to include them.

Claims (32)

1. A watermark embedding method, characterized by comprising:
Obtaining a rights file and copyright information provided by a copyright owner, where the rights file is a text file whose rows of content have no interdependence;
Generating, based on the copyright information and current timestamp information, a watermark to be embedded in the rights file;
Embedding at least one copy of the watermark into the rights file, where what is embedded in any row of the rights file is a sub-watermark, the sub-watermark is obtained by splitting the watermark, and the sub-watermark for a row is determined according to the text content of the row, a hash algorithm, and the total number of sub-watermarks;
Determining, using a hash algorithm, the character corresponding to each row of text content in the rights file, and hiding watermark information in the rights file according to the character corresponding to each row of text content, the character string corresponding to the watermark, and a preset watermark hiding rule.
2. The method according to claim 1, characterized in that generating, based on the copyright information and current timestamp information, the watermark to be embedded in the rights file comprises:
Combining the copyright information and the timestamp information;
Encrypting and encoding the combined copyright information and timestamp information to obtain the watermark to be embedded in the rights file.
3. The method according to claim 1, characterized in that embedding at least one copy of the watermark into the rights file comprises:
Splitting the watermark according to the watermark length embedded per row, and numbering the sub-watermarks obtained by the splitting;
For each row of the rights file in which a sub-watermark is to be embedded, determining the sequence number of the sub-watermark to be embedded in the row using the text content of the row, the hash algorithm, and the total number of sub-watermarks, and embedding the sub-watermark corresponding to that sequence number into the row.
4. The method according to claim 3, characterized in that, for each row of the rights file in which a sub-watermark is to be embedded, the sequence number k of the sub-watermark to be embedded in the row is determined according to the following formula:
k = C_HASH % K;
where C_HASH is the hash value of the first S characters of text content in the row, S is an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
5. The method according to claim 3, characterized in that the characters contained in each sub-watermark are visible characters, and embedding the sub-watermark corresponding to the sequence number into the row comprises:
Converting each visible character in the sub-watermark into an invisible character according to a preset conversion rule between visible and invisible characters, to obtain the invisible character string corresponding to the sub-watermark;
Embedding the invisible character string at a specified position in the row.
6. The method according to claim 1, characterized in that determining, using a hash algorithm, the character corresponding to each row of text content in the rights file comprises:
For each row in the rights file, determining the hash value of the row's text content using a hash algorithm;
Performing an XOR operation on the hash value and a preset value, and determining the result of the XOR operation as the character corresponding to the row's text content.
7. The method according to claim 1, characterized in that hiding watermark information in the rights file according to the character corresponding to each row of text content, the character string corresponding to the watermark, and the preset watermark hiding rule comprises:
Exchanging rows of the rights file according to the character corresponding to each row of text content, the character string corresponding to the watermark, and the preset watermark hiding rule, where the watermark hiding rule is as follows: in the rights file after the exchange, the character string formed by the characters corresponding to several consecutive rows is the character string corresponding to the watermark; the character string formed by the characters of the m consecutive rows before the initial row is a preset character string identifying the watermark start bit; and the character string formed by the characters of the n consecutive rows after the end row is a preset character string identifying the watermark end bit, where the initial row is the first of the several consecutive rows, the end row is the last of them, and m and n are integers greater than zero.
8. A watermark detection method, characterized by comprising:
Obtaining a rights file, where the rights file is a text file whose rows of content have no interdependence;
Determining, using a hash algorithm, the character corresponding to each row of text content in the rights file, and determining the watermark hidden in the rights file according to the character corresponding to each row of text content and a preset watermark hiding rule;
Extracting the watermark embedded in the rights file, where what is extracted from any row of the rights file is a sub-watermark, the sub-watermark is obtained by splitting the watermark, and the sub-watermark for a row is determined according to the text content of the row, the hash algorithm, and the total number of sub-watermarks;
If the extracted watermark is determined to be identical to the hidden watermark, parsing either watermark to obtain the copyright information of the copyright owner and the timestamp information of when the copyright owner began to own the rights file.
9. The method according to claim 8, characterized in that determining, using a hash algorithm, the character corresponding to each row of text content in the rights file comprises:
For each row in the rights file, determining the hash value of the row's text content using a hash algorithm;
Performing an XOR operation on the hash value and a preset value, and determining the result of the XOR operation as the character corresponding to the row's text content.
10. The method according to claim 8, characterized in that determining the watermark hidden in the rights file according to the character corresponding to each row of text content and the preset watermark hiding rule comprises:
According to the character corresponding to each row of text content and the preset watermark hiding rule, when it is determined that the character string formed by the characters corresponding to m consecutive rows in the rights file is the character string identifying the watermark start bit, and that the character string formed by the characters corresponding to n consecutive rows is the character string identifying the watermark end bit, determining the character string formed by the characters corresponding to the rows between the m rows and the n rows as the watermark hidden in the rights file, where m and n are integers greater than zero.
11. The method according to claim 8, characterized in that extracting the watermark embedded in the rights file comprises:
For each row of the rights file in which a sub-watermark is embedded, determining the sub-watermark embedded in the row according to a preset watermark embedding position and the watermark length embedded per row, and determining the sequence number of the sub-watermark embedded in the row using the text content of the row, the hash algorithm, and the total number of sub-watermarks;
Determining the watermark embedded in the rights file according to the sub-watermarks embedded in the rows and their sequence numbers.
12. The method according to claim 11, characterized in that the characters contained in each sub-watermark embedded in the rights file are all invisible characters, and determining the sub-watermark embedded in the row according to the preset watermark embedding position and the watermark length embedded per row comprises:
Splitting the invisible character string corresponding to the sub-watermark out of the row according to the preset watermark embedding position and the watermark length embedded per row;
Converting each invisible character in the character string into a visible character according to a preset conversion rule between visible and invisible characters, and determining the resulting visible character string as the sub-watermark embedded in the row.
13. The method according to claim 11, characterized in that, for each row of the rights file in which a sub-watermark is embedded, the sequence number k of the sub-watermark embedded in the row is determined according to the following formula:
k = C_HASH % K;
where C_HASH is the hash value of the first S characters of text content in the row, S is an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
14. The method according to claim 11, characterized in that determining the watermark embedded in the rights file according to the sub-watermarks embedded in the rows and their sequence numbers comprises:
For each sequence number, counting the occurrences of each sub-watermark extracted with that number, and determining the sub-watermark that occurs most often as the sub-watermark corresponding to that number;
Splicing the sub-watermarks in ascending order of sequence number to obtain the watermark embedded in the rights file.
15. The method according to claim 8, characterized in that parsing either watermark to obtain the copyright information of the copyright owner and the timestamp information of when the copyright owner began to own the rights file comprises:
Decoding and decrypting the watermark to obtain the combined copyright information and timestamp information;
Splitting the combined copyright information and timestamp information to obtain the copyright information and the timestamp information.
16. A watermark embedding device, characterized by comprising:
An obtaining module, configured to obtain a rights file and copyright information provided by a copyright owner, where the rights file is a text file whose rows of content have no interdependence;
A generation module, configured to generate, based on the copyright information and current timestamp information, a watermark to be embedded in the rights file;
An embedding module, configured to embed at least one copy of the watermark into the rights file, where what is embedded in any row of the rights file is a sub-watermark, the sub-watermark is obtained by splitting the watermark, and the sub-watermark for a row is determined according to the text content of the row, a hash algorithm, and the total number of sub-watermarks;
A hiding module, configured to determine, using a hash algorithm, the character corresponding to each row of text content in the rights file, and to hide watermark information in the rights file according to the character corresponding to each row of text content, the character string corresponding to the watermark, and a preset watermark hiding rule.
17. The device according to claim 16, characterized in that the generation module is specifically configured to:
Combine the copyright information and the timestamp information;
Encrypt and encode the combined copyright information and timestamp information to obtain the watermark to be embedded in the rights file.
18. The device according to claim 16, characterized in that the embedding module is specifically configured to:
Split the watermark according to the watermark length embedded per row, and number the sub-watermarks obtained by the splitting;
For each row of the rights file in which a sub-watermark is to be embedded, determine the sequence number of the sub-watermark to be embedded in the row using the text content of the row, the hash algorithm, and the total number of sub-watermarks, and embed the sub-watermark corresponding to that sequence number into the row.
19. The device according to claim 18, characterized in that, for each row of the rights file in which a sub-watermark is to be embedded, the embedding module is specifically configured to determine the sequence number k of the sub-watermark to be embedded in the row according to the following formula:
k = C_HASH % K;
where C_HASH is the hash value of the first S characters of text content in the row, S is an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
20. The device according to claim 18, characterized in that the characters contained in each sub-watermark are visible characters, and the embedding module is specifically configured to:
Convert each visible character in the sub-watermark into an invisible character according to a preset conversion rule between visible and invisible characters, to obtain the invisible character string corresponding to the sub-watermark;
Embed the invisible character string at a specified position in the row.
21. The device according to claim 16, characterized in that the hiding module is specifically configured to:
For each row in the rights file, determine the hash value of the row's text content using a hash algorithm;
Perform an XOR operation on the hash value and a preset value, and determine the result of the XOR operation as the character corresponding to the row's text content.
22. The device according to claim 16, characterized in that the hiding module is specifically configured to:
Exchange rows of the rights file according to the character corresponding to each row of text content, the character string corresponding to the watermark, and the preset watermark hiding rule, where the watermark hiding rule is as follows: in the rights file after the exchange, the character string formed by the characters corresponding to several consecutive rows is the character string corresponding to the watermark; the character string formed by the characters of the m consecutive rows before the initial row is a preset character string identifying the watermark start bit; and the character string formed by the characters of the n consecutive rows after the end row is a preset character string identifying the watermark end bit, where the initial row is the first of the several consecutive rows, the end row is the last of them, and m and n are integers greater than zero.
23. A watermark detection device, characterized by comprising:
An obtaining module, configured to obtain a rights file, where the rights file is a text file whose rows of content have no interdependence;
A determining module, configured to determine, using a hash algorithm, the character corresponding to each row of text content in the rights file, and to determine the watermark hidden in the rights file according to the character corresponding to each row of text content and a preset watermark hiding rule;
An extraction module, configured to extract the watermark embedded in the rights file, where what is extracted from any row of the rights file is a sub-watermark, the sub-watermark is obtained by splitting the watermark, and the sub-watermark for a row is determined according to the text content of the row, the hash algorithm, and the total number of sub-watermarks;
A parsing module, configured to, if the extracted watermark is determined to be identical to the hidden watermark, parse either watermark to obtain the copyright information of the copyright owner and the timestamp information of when the copyright owner began to own the rights file.
24. The device according to claim 23, characterized in that the determining module is specifically configured to:
For each row in the rights file, determine the hash value of the row's text content using a hash algorithm;
Perform an XOR operation on the hash value and a preset value, and determine the result of the XOR operation as the character corresponding to the row's text content.
25. The device according to claim 23, characterized in that the determining module is specifically configured to:
According to the character corresponding to each row of text content and the preset watermark hiding rule, when it is determined that the character string formed by the characters corresponding to m consecutive rows in the rights file is the character string identifying the watermark start bit, and that the character string formed by the characters corresponding to n consecutive rows is the character string identifying the watermark end bit, determine the character string formed by the characters corresponding to the rows between the m rows and the n rows as the watermark hidden in the rights file, where m and n are integers greater than zero.
26. The device according to claim 23, characterized in that the extraction module is specifically configured to:
For each row of the rights file in which a sub-watermark is embedded, determine the sub-watermark embedded in the row according to a preset watermark embedding position and the watermark length embedded per row, and determine the sequence number of the sub-watermark embedded in the row using the text content of the row, the hash algorithm, and the total number of sub-watermarks;
Determine the watermark embedded in the rights file according to the sub-watermarks embedded in the rows and their sequence numbers.
27. The device according to claim 26, characterized in that the characters contained in each sub-watermark embedded in the rights file are all invisible characters, and the extraction module is specifically configured to:
Split the invisible character string corresponding to the sub-watermark out of the row according to the preset watermark embedding position and the watermark length embedded per row;
Convert each invisible character in the character string into a visible character according to a preset conversion rule between visible and invisible characters, and determine the resulting visible character string as the sub-watermark embedded in the row.
28. The device according to claim 26, characterized in that, for each row of the rights file in which a sub-watermark is embedded, the extraction module is specifically configured to determine the sequence number k of the sub-watermark embedded in the row according to the following formula:
k = C_HASH % K;
where C_HASH is the hash value of the first S characters of text content in the row, S is an integer greater than zero, and K is the number of sub-watermarks obtained by splitting the watermark.
29. The device according to claim 26, characterized in that the extraction module is specifically configured to:
For each sequence number, count the occurrences of each sub-watermark extracted with that number, and determine the sub-watermark that occurs most often as the sub-watermark corresponding to that number;
Splice the sub-watermarks in ascending order of sequence number to obtain the watermark embedded in the rights file.
30. The device according to claim 23, characterized in that the parsing module is specifically configured to:
Decode and decrypt the watermark to obtain the combined copyright information and timestamp information;
Split the combined copyright information and timestamp information to obtain the copyright information and the timestamp information.
31. A computer, characterized by comprising at least one processing unit and at least one storage unit, where the storage unit stores program code which, when executed by the processing unit, causes the computer to perform the steps of the method according to any one of claims 1 to 7 and/or 8 to 15.
32. A computer-readable storage medium, characterized by comprising program code which, when run on a computer, causes the computer to perform the steps of the method according to any one of claims 1 to 7 and/or 8 to 15.
CN201810432660.5A 2018-05-08 2018-05-08 Watermark embedding and detecting method and device Active CN110457873B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810432660.5A CN110457873B (en) 2018-05-08 2018-05-08 Watermark embedding and detecting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810432660.5A CN110457873B (en) 2018-05-08 2018-05-08 Watermark embedding and detecting method and device

Publications (2)

Publication Number Publication Date
CN110457873A true CN110457873A (en) 2019-11-15
CN110457873B CN110457873B (en) 2021-04-27

Family

ID=68480476

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810432660.5A Active CN110457873B (en) 2018-05-08 2018-05-08 Watermark embedding and detecting method and device

Country Status (1)

Country Link
CN (1) CN110457873B (en)

Similar Documents

Publication Publication Date Title
CN110457873A (en) A kind of watermark embedding and detection method and device
US10176309B2 (en) Systems and methods for authenticating video using watermarks
CN109040341B (en) Intelligent contract address generation method and device, computer equipment and readable storage medium
US20190158296A1 (en) Redactable document signatures
CN103605950B (en) Method and system for hiding signature in credible two-dimensional code
Tayan et al. A hybrid digital-signature and zero-watermarking approach for authentication and protection of sensitive electronic documents
CN113065169B (en) File storage method, device and equipment
EP3637674A1 (en) Computer system, secret information verification method, and computer
CN104850765A (en) Watermark processing method, device and system
CN105303075B (en) Adaptive Text Watermarking method based on PDF format
Melkundi et al. A robust technique for relational database watermarking and verification
CN109977684A (en) A kind of data transmission method, device and terminal device
CN104320253B (en) A kind of Quick Response Code Verification System and method based on CBS signature mechanisms
CN110232021A (en) The method and device of page test
CN104168117B (en) A kind of speech digit endorsement method
CN110322386A (en) A kind of insertion of digital text watermarking and detection method and device
CN114356919A (en) Watermark embedding method, tracing method and device for structured database
CN107171808B (en) A kind of verification method and device of electronic record authenticity
JP2997483B2 (en) Verification data generator
KR20210109164A (en) First copyright holder authentication system using blockchain and its method
Alkhudaydi et al. Integrating light-weight cryptography with diacritics Arabic text steganography improved for practical security applications
CN110457916A (en) A kind of electronic contract encryption method, device and terminal device
JP5788681B2 (en) Handwritten signature acquisition apparatus, handwritten signature acquisition program, and handwritten signature acquisition method
CN111970237A (en) Encryption and decryption method, system and medium based on water depth measurement data
CN109064379A (en) The mask method and the method for inspection and device of a kind of digital watermarking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant