CN115114597A - Tracing watermark embedding and extracting method based on character information - Google Patents

Publication number
CN115114597A
Authority
CN
China
Prior art keywords
font
information
watermark
data
carrier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210693843.9A
Other languages
Chinese (zh)
Inventor
陈明志
梁镇
施友安
翁才杰
姚宏玮
许春耀
张瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Beika Technology Co ltd
Original Assignee
Beijing Beika Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Beika Technology Co ltd
Priority to CN202210693843.9A
Publication of CN115114597A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/16 Program or content traceability, e.g. by watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/106 Enforcing content protection by specific content processing
    • G06F21/1063 Personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/109 Font handling; Temporal or kinetic typography
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0021 Image watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2201/00 General purpose image data processing
    • G06T2201/005 Image watermarking
    • G06T2201/0065 Extraction of an embedded watermark; Reliable detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Technology Law (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

The invention discloses a traceable watermark embedding and extraction method based on text information. During embedding, a steganographic algorithm is used and the target font is replaced with fusion fonts to obtain a stego carrier; during extraction, the watermark information is recovered from the font type of each character in the stego carrier data and the number of writing strokes of that character. The method uses the stroke count and font type of characters as the carrier and embeds the watermark by replacing the target font with fusion fonts that are highly similar to it. This keeps the traceable watermark invisible while meeting the requirements of concealment and robustness: after sensitive information has been photographed, screen-captured, or recorded, the watermark information can still be extracted from the leaked medium, so the source of the leak can be traced and located.

Description

Tracing watermark embedding and extracting method based on character information
Technical Field
The invention belongs to the technical field of information security, and in particular relates to a traceable watermark embedding and extraction method based on text information.
Background
A traceable watermark makes it possible to trace leaks of sensitive information. Common visible watermarks are easy to erase and tamper with, while conventional invisible watermarks have poor robustness and concealment; neither meets the requirements of practical applications, and improvement is needed.
Disclosure of Invention
The invention aims to provide a traceable watermark embedding and extraction method based on text information that keeps the traceable watermark invisible while meeting the requirements of concealment and robustness.
In order to achieve the above purpose, the solution of the invention is:
A traceable watermark embedding method based on text information uses a steganographic algorithm to replace the target font with fusion fonts, thereby obtaining a stego carrier.
The steganographic algorithm may be an information hiding method based on the (7,4) Hamming code, an LSB algorithm, an information hiding method based on matrix coding, or an information hiding method based on STC coding.
The method specifically comprises the following steps:
Step A1, select one font as the target font and collect $n$ styles of fusion fonts as candidate replacement fonts for the target font, where $n = 2^{\alpha} - 1$ and $\alpha$ is the codeword width of the information carried by each character, an integer not less than 1;
Step A2, let the watermark information be binary data $M$ of $L_4$ bits, and let
$X = \{x_0, x_1, \ldots, x_{2^{\alpha}-1}\}$
denote the set of fonts, where $x_0$ is the target font and $x_j$ is the $j$-th fusion font, $j = 1, 2, \ldots, 2^{\alpha}-1$;
Step A3, let the carrier data contain $L_0$ characters; if the capacity condition given in Figure BDA0003701659480000012 is satisfied, continue with the subsequent operations, otherwise return a prompt message indicating insufficient capacity;
Step A4, extract the number of writing strokes of each of the first $L_1$ characters, denoted $SN_i$, where $i = 0, 1, \ldots, L_1-1$, $L_1 = L_0 - L_0 \,\%\, 7$, and $\%$ denotes the remainder operation;
Step A5, calculate the information represented by each character in the carrier, denoted $R = [r_0, r_1, \ldots, r_{L_1-1}]$, specifically:
$r_i = SN_i \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1-1$
Step A6, convert each element of $R$ into an $\alpha$-bit binary sequence, denoted $C = [c_0, c_1, \ldots, c_{L_2-1}]$, $L_2 = \alpha L_1$, where $c_j$ takes the value:
$c_j = \lfloor r_i / 2^{\beta-1} \rfloor \,\%\, 2$
where $j = 0, 1, \ldots, L_2-1$, $i = \lfloor j/\alpha \rfloor$, $\beta = \alpha - j \,\%\, \alpha$, and $\lfloor \cdot \rfloor$ denotes rounding down;
Step A7, divide $C$ into $L_3$ sub-blocks of 7 bits each; each sub-block is represented by a row vector denoted $D_k$, $k = 0, 1, \ldots, L_3-1$, $L_3 = L_2/7$;
Step A8, expand the watermark information $M$ into $3L_3$ bits of data as the information to be embedded, denoted $M'$; $M'$ is obtained by concatenating $\lceil 3L_3/L_4 \rceil$ copies of $M$ and taking the first $3L_3$ bits of the result, where $\lceil \cdot \rceil$ denotes rounding up;
Step A9, divide $M'$ into $L_3$ sub-blocks of 3 bits each; each sub-block is represented by a row vector denoted $m_k$, $k = 0, 1, \ldots, L_3-1$;
Step A10, calculate the bit position to be modified in $D_k$: if $d_k = [0, 0, 0]^{T}$, the carrier data does not need to be modified and $D_k' = D_k$; otherwise, find the position at which $d_k$ appears as a column of the check matrix $H$, invert the element of $D_k$ at that position, and denote the result $D_k'$. Repeat this operation for successive values of $k$ until the watermark information is completely embedded in the carrier. $d_k$ is calculated as follows:
$d_k = (H \odot D_k^{T}) \oplus m_k^{T}$
where $\odot$ denotes matrix-vector multiplication in which the addition is replaced by a modulo-2 sum, $\oplus$ denotes the XOR operation, $m_k$ is the $k$-th group of information to be embedded, $m_k = [z_{k0}, z_{k1}, z_{k2}]$, $z_{ki} \in \{0, 1\}$, $i = 0, 1, 2$, and $H$ is the check matrix, whose specific form is given in Figure BDA0003701659480000034;
Step A11, replace the corresponding $D_k$ in $C$ with $D_k'$ to obtain the stego data $C'$;
Step A12, divide $C'$ into $L_1$ sub-blocks of $\alpha$ bits each, convert the data of each sub-block into the corresponding decimal number, denoted $r_i'$, and replace the corresponding $r_i$ in $R$ with $r_i'$ to obtain $R'$;
Step A13, perform font replacement according to $R'$ and $R$: if $r_i' = r_i$, the original font is kept unchanged; if $r_i' \neq r_i$, the font $x_0$ is replaced with the font $x_{\lambda_i}$, where $\lambda_i = (r_i' - r_i + 2^{\alpha}) \,\%\, 2^{\alpha}$, thereby obtaining the stego carrier.
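The bit-level bookkeeping of steps A4 to A6 and A12 can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, the stroke counts are assumed to be available as a list of integers, and the most-significant-bit-first order is inferred from $\beta = \alpha - j \,\%\, \alpha$ rather than taken from the original formula image. The capacity helper simply evaluates $3L_3$ from the definitions $L_1 = L_0 - L_0 \,\%\, 7$, $L_2 = \alpha L_1$ and $L_3 = L_2/7$.

```python
from typing import List


def residues(stroke_counts: List[int], alpha: int) -> List[int]:
    """Steps A4-A5: keep the first L1 = L0 - L0 % 7 characters and
    reduce each stroke count modulo 2**alpha."""
    l1 = len(stroke_counts) - len(stroke_counts) % 7
    return [sn % (2 ** alpha) for sn in stroke_counts[:l1]]


def to_bits(r: List[int], alpha: int) -> List[int]:
    """Step A6: write each residue as alpha bits, most significant bit first
    (assumed order, inferred from beta = alpha - j % alpha)."""
    bits = []
    for value in r:
        for beta in range(alpha, 0, -1):
            bits.append((value >> (beta - 1)) & 1)
    return bits


def from_bits(bits: List[int], alpha: int) -> List[int]:
    """Step A12: regroup the bit sequence alpha bits at a time and
    convert each group back into a decimal residue."""
    out = []
    for start in range(0, len(bits), alpha):
        value = 0
        for b in bits[start:start + alpha]:
            value = (value << 1) | b
        out.append(value)
    return out


def capacity_bits(char_count: int, alpha: int) -> int:
    """Number of message bits one pass can carry: 3 bits per 7-bit block."""
    l1 = char_count - char_count % 7
    l3 = alpha * l1 // 7
    return 3 * l3
```

For example, with $\alpha = 2$, `to_bits([3, 0, 1], 2)` yields the bits 11 00 01, and `from_bits` maps them back to `[3, 0, 1]`.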
A traceable watermark extraction method based on text information extracts the watermark information according to the font type of each character in the stego carrier data and the number of writing strokes of that character.
The method specifically comprises the following steps:
Step B1, let the stego carrier contain $L_0'$ characters; extract the number of writing strokes and the font type of each of the first $L_1'$ characters, denoted $SN_i'$ and $y_i$ respectively, where $y_i \in X$, $X = \{x_0, x_1, \ldots, x_{2^{\alpha}-1}\}$ is the same set of fonts, consisting of the target font and the fusion fonts, as in the embedding process, $i = 0, 1, \ldots, L_1'-1$, $L_1' = L_0' - L_0' \,\%\, 7$, and $\%$ denotes the remainder operation;
Step B2, calculate the information represented by the number of writing strokes of each character in the stego carrier data, denoted $R' = [r_0', r_1', \ldots, r_{L_1'-1}']$, specifically:
$r_i' = SN_i' \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1'-1$
Step B3, calculate the information $R = [r_0, r_1, \ldots, r_{L_1'-1}]$ carried by the stego carrier from the information represented by the number of writing strokes and the font type:
$r_i = (r_i' + \lambda_i) \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1'-1$
where $\lambda_i = j$ when $y_i = x_j$, $j = 0, 1, \ldots, 2^{\alpha}-1$ (so $\lambda_i = 0$ for the target font $x_0$);
Step B4, convert each element of $R$ into an $\alpha$-bit binary sequence, denoted $C' = [c_0', c_1', \ldots, c_{L_2'-1}']$, $L_2' = \alpha L_1'$, where $c_j'$ takes the value:
$c_j' = \lfloor r_i / 2^{\beta-1} \rfloor \,\%\, 2$
where $j = 0, 1, \ldots, L_2'-1$, $i = \lfloor j/\alpha \rfloor$, $\beta = \alpha - j \,\%\, \alpha$, and $\lfloor \cdot \rfloor$ denotes rounding down;
Step B5, divide $C'$ into $L_3'$ sub-blocks of 7 bits each; each sub-block is represented by a row vector denoted $D_k'$, $k = 0, 1, \ldots, L_3'-1$, $L_3' = L_2'/7$;
Step B6, calculate the watermark information carried by each $D_k'$; it is represented by a row vector denoted $m_k'$, and the set of these vectors is denoted $M' = \{m_0', m_1', \ldots, m_{L_3'-1}'\}$. $m_k'$ is calculated as follows:
$m_k' = (H \odot D_k'^{T})^{T}$
where $H$ is the check matrix, $\odot$ is the modulo-2 matrix-vector product used in step A10, $m_k' = [z_{k0}, z_{k1}, z_{k2}]$, $z_{ki} \in \{0, 1\}$, $i = 0, 1, 2$;
Step B7, divide the first $3L_3' - 3L_3' \,\%\, L_4$ bits of $M'$ into $L_5$ blocks of $L_4$ bits each; each block is represented by a row vector denoted $wk_h = [\xi_{h0}, \xi_{h1}, \ldots, \xi_{h(L_4-1)}]$, $\xi_{hi} \in M'$, $h = 0, 1, \ldots, L_5-1$, $L_5 = \lfloor 3L_3'/L_4 \rfloor$;
Step B8, construct a matrix $M''$ from the $wk_h$, count the occurrence frequency of each element value in each column vector of $M''$, and denote the most frequent element of column $g$ as $\xi_g'$; the row vector formed by the most frequent element of each column is the extracted watermark information, denoted $WK = [\xi_0', \xi_1', \ldots, \xi_{L_4-1}']$. The specific form of $M''$ is as follows:
$M'' = [wk_0;\ wk_1;\ \ldots;\ wk_{L_5-1}]$, an $L_5 \times L_4$ matrix whose $h$-th row is $wk_h$.
when the same watermark is circularly embedded, the font type of the previous word at each repeated watermark information embedding position is modified
Figure BDA0003701659480000053
Having the same characteristics as the fonts in the merged font library X but with the same characteristics
Figure BDA0003701659480000054
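The majority vote of steps B7 and B8 can be sketched as follows. This is a minimal illustration, assuming the extracted bits are already available as a flat list and that the watermark length $L_4$ is known to the extractor; the function name is hypothetical, and the handling of ties (the value seen first in the column wins) is a choice made here, not something the text specifies.

```python
from collections import Counter
from typing import List


def majority_vote(extracted: List[int], l4: int) -> List[int]:
    """Steps B7-B8: split the extracted bit stream into complete copies of
    length l4 and take the most frequent value in each column."""
    l5 = len(extracted) // l4  # number of complete copies; trailing bits are dropped
    if l5 == 0:
        raise ValueError("fewer extracted bits than one watermark copy")
    # Row h of the matrix M'' is the h-th extracted copy wk_h.
    copies = [extracted[h * l4:(h + 1) * l4] for h in range(l5)]
    watermark = []
    for g in range(l4):
        column = [copy[g] for copy in copies]
        watermark.append(Counter(column).most_common(1)[0][0])
    return watermark
```

With three extracted copies of a 4-bit watermark, a single corrupted bit in one copy is outvoted by the other two copies.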
With the above scheme, the invention has the following beneficial effects:
(1) the invention combines fusion fonts with steganography based on the (7,4) Hamming code to provide a new invisible traceable watermarking technique; the watermark generated by the method is invisible to the human eye, and after sensitive information has been photographed, screen-captured, or recorded, the watermark information can be extracted from the leaked medium so that the leak source can be traced and located;
(2) the stroke count of each character is used as the carrier for embedding the traceable watermark, so different characters in the same font represent different carrier information; compared with a scheme that directly replaces fonts (in which different characters in the same font represent the same carrier information), the carrier data is more diverse. On this basis, steganography based on the (7,4) Hamming code is used, which markedly improves embedding efficiency and reduces the amount of carrier modification needed to embed information;
(3) the invention designs a unique font replacement rule under which stego carriers containing the same information can correspond to different font types; this disrupts, to some extent, the statistical regularity of font types in the stego carrier data after embedding and strengthens the security of the watermark information.
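The efficiency claim in (2) can be made concrete with a short calculation. The figures below are a sketch under one stated assumption, namely that the 3 message bits of each block are uniformly random and independent of the carrier; they are not taken from the patent text.

```latex
% Assumption: message bits are uniform, so the syndrome difference d_k
% is uniform over the 8 possible 3-bit values.
\Pr[d_k = 0] = \tfrac{1}{8}, \qquad
\mathbb{E}[\text{changed bits per 7-bit block}] = 1 - \tfrac{1}{8} = \tfrac{7}{8}

% Embedding efficiency = message bits embedded per changed carrier bit.
e_{(7,4)} = \frac{3}{7/8} = \frac{24}{7} \approx 3.43, \qquad
e_{\text{LSB}} = \frac{1}{1/2} = 2
```

Under that assumption, each changed carrier bit (that is, each character whose font is replaced) carries about 3.43 watermark bits, compared with 2 bits for plain LSB replacement.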
Drawings
Fig. 1 is a flow chart of watermark embedding in the present invention;
fig. 2 is a flow chart of watermark extraction in the present invention;
fig. 3 is a schematic diagram of information correspondence represented by a character and a font type thereof when α is 2.
Detailed Description
The technical solution and the advantages of the present invention will be described in detail with reference to the accompanying drawings.
The invention provides a traceable watermarking method based on text information, where the text information is the stroke count and font type of each character. The method consists of watermark embedding and watermark extraction. Embedding is achieved mainly by replacing the target font with fusion fonts (Tian Y. zi2zi: Master Chinese calligraphy with conditional adversarial networks, 2017; He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778), while an information hiding technique based on the (7,4) Hamming code (Cao et al., 2015) is used to improve the embedding efficiency of the watermark. Extraction recovers the watermark information from the font type of each character in the stego carrier data and the stroke count of that character.
As shown in Fig. 1 and Fig. 2, the specific steps of watermark embedding and extraction are as follows:
1. Watermark embedding
1) Select one font as the target font and collect $n$ styles of fusion fonts as candidate replacement fonts for the target font. The fusion fonts must differ from the target font to some degree while remaining highly similar to it: the human eye cannot distinguish a fusion font from the target font, but a machine learning algorithm can identify the specific type of the target font and of each candidate replacement font. Here $n = 2^{\alpha} - 1$, where $\alpha$ is the codeword width of the information carried by each character, an integer not less than 1 whose specific value can be set according to actual requirements;
2) Let the watermark information be binary data $M$ of $L_4$ bits, and let
$X = \{x_0, x_1, \ldots, x_{2^{\alpha}-1}\}$
denote the set of fonts, where $x_0$ is the target font, i.e. the font of the original text, and $x_j$ is the $j$-th fusion font, $j = 1, 2, \ldots, 2^{\alpha}-1$;
3) Let the carrier data contain $L_0$ characters; if the capacity condition given in Figure BDA0003701659480000062 is satisfied, continue with the subsequent operations, otherwise return a prompt message indicating insufficient capacity;
4) Extract the number of writing strokes of each of the first $L_1$ characters, denoted $SN_i$, where $i = 0, 1, \ldots, L_1-1$, $L_1 = L_0 - L_0 \,\%\, 7$, and $\%$ denotes the remainder operation;
5) Calculate the information represented by each character in the carrier, denoted $R = [r_0, r_1, \ldots, r_{L_1-1}]$, specifically:
$r_i = SN_i \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1-1$
6) Convert each element of $R$ into an $\alpha$-bit binary sequence, denoted $C = [c_0, c_1, \ldots, c_{L_2-1}]$, $L_2 = \alpha L_1$, where $c_j$ takes the value:
$c_j = \lfloor r_i / 2^{\beta-1} \rfloor \,\%\, 2$
where $j = 0, 1, \ldots, L_2-1$, $i = \lfloor j/\alpha \rfloor$, $\beta = \alpha - j \,\%\, \alpha$, and $\lfloor \cdot \rfloor$ denotes rounding down.
7) Divide $C$ into $L_3$ sub-blocks of 7 bits each; each sub-block is represented by a row vector denoted $D_k$, $k = 0, 1, \ldots, L_3-1$, $L_3 = L_2/7$;
8) Expand the watermark information $M$ into $3L_3$ bits of data as the information to be embedded, denoted $M'$; $M'$ is obtained by concatenating $\lceil 3L_3/L_4 \rceil$ copies of $M$ and taking the first $3L_3$ bits of the result, where $\lceil \cdot \rceil$ denotes rounding up;
9) Divide $M'$ into $L_3$ sub-blocks of 3 bits each; each sub-block is represented by a row vector denoted $m_k$, $k = 0, 1, \ldots, L_3-1$;
10) Calculate the bit position to be modified in $D_k$: if $d_k = [0, 0, 0]^{T}$, the carrier data does not need to be modified and $D_k' = D_k$; otherwise, find the position at which $d_k$ appears as a column of the check matrix $H$, invert the element of $D_k$ at that position, and denote the result $D_k'$. Repeat this operation for successive values of $k$ until the watermark information is completely embedded in the carrier. $d_k$ is calculated as follows:
$d_k = (H \odot D_k^{T}) \oplus m_k^{T}$
where $\odot$ denotes matrix-vector multiplication in which the addition is replaced by a modulo-2 sum, $\oplus$ denotes the XOR operation, $m_k$ is the $k$-th group of information to be embedded, $m_k = [z_{k0}, z_{k1}, z_{k2}]$, $z_{ki} \in \{0, 1\}$, $i = 0, 1, 2$, and $H$ is the check matrix, whose specific form is given in Figure BDA0003701659480000081;
11) Replace the corresponding $D_k$ in $C$ with $D_k'$ to obtain the stego data $C'$;
12) Divide $C'$ into $L_1$ sub-blocks of $\alpha$ bits each, convert the data of each sub-block into the corresponding decimal number, denoted $r_i'$, and replace the corresponding $r_i$ in $R$ with $r_i'$ to obtain $R'$;
13) Perform font replacement according to $R'$ and $R$: if $r_i' = r_i$, the original font is kept unchanged; if $r_i' \neq r_i$, the font $x_0$ is replaced with the font $x_{\lambda_i}$, where $\lambda_i = (r_i' - r_i + 2^{\alpha}) \,\%\, 2^{\alpha}$.
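The block-level embedding of step 10) and the font selection of step 13) can be sketched as follows. The check matrix `H` below is an assumption: it is the common form of the (7,4) Hamming parity-check matrix in which column $p$ (counting from 1) is the binary representation of $p$, chosen here because the patent's own matrix is only available as an image. The function names are likewise hypothetical.

```python
from typing import List

# Assumed check matrix: column p (1-based) is the 3-bit binary form of p.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(block: List[int]) -> List[int]:
    """Compute H * block^T with additions taken modulo 2."""
    return [sum(h * b for h, b in zip(row, block)) % 2 for row in H]


def embed_block(block: List[int], message: List[int]) -> List[int]:
    """Step 10): change at most one of the 7 carrier bits so that the
    syndrome of the result equals the 3 message bits."""
    d = [s ^ m for s, m in zip(syndrome(block), message)]
    if d == [0, 0, 0]:
        return list(block)  # carrier already encodes the message
    columns = [[row[p] for row in H] for p in range(7)]
    pos = columns.index(d)  # position where d appears as a column of H
    out = list(block)
    out[pos] ^= 1
    return out


def font_index(r_old: int, r_new: int, alpha: int) -> int:
    """Step 13): lambda_i = (r_i' - r_i + 2**alpha) % 2**alpha.
    A result of 0 means the target font x_0 is kept."""
    return (r_new - r_old + 2 ** alpha) % (2 ** alpha)
```

With this `H`, embedding the message `[0, 1, 1]` into the block `[1, 1, 0, 0, 0, 1, 0]` flips only the sixth bit. If that flip changes a character's residue from 1 to 0 with $\alpha = 2$, `font_index(1, 0, 2)` selects fusion font $x_3$.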
2. Watermark extraction
1) Extract the text information from the stego carrier: let the stego carrier contain $L_0'$ characters; extract the number of writing strokes and the font type of each of the first $L_1'$ characters, denoted $SN_i'$ and $y_i$ respectively, where $y_i \in X$, $X = \{x_0, x_1, \ldots, x_{2^{\alpha}-1}\}$ is the same set of fonts, consisting of the target font and the fusion fonts, as in the embedding process, $i = 0, 1, \ldots, L_1'-1$, $L_1' = L_0' - L_0' \,\%\, 7$, and $\%$ denotes the remainder operation;
2) Calculate the information represented by the number of writing strokes of each character in the stego carrier data, denoted $R' = [r_0', r_1', \ldots, r_{L_1'-1}']$, specifically:
$r_i' = SN_i' \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1'-1$
3) Calculate the information $R = [r_0, r_1, \ldots, r_{L_1'-1}]$ carried by the stego carrier from the information represented by the number of writing strokes and the font type:
$r_i = (r_i' + \lambda_i) \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1'-1$
where $\lambda_i = j$ when $y_i = x_j$, $j = 0, 1, \ldots, 2^{\alpha}-1$ (so $\lambda_i = 0$ for the target font $x_0$);
4) Convert each element of $R$ into an $\alpha$-bit binary sequence, denoted $C' = [c_0', c_1', \ldots, c_{L_2'-1}']$, $L_2' = \alpha L_1'$, where $c_j'$ takes the value:
$c_j' = \lfloor r_i / 2^{\beta-1} \rfloor \,\%\, 2$
where $j = 0, 1, \ldots, L_2'-1$, $i = \lfloor j/\alpha \rfloor$, $\beta = \alpha - j \,\%\, \alpha$, and $\lfloor \cdot \rfloor$ denotes rounding down.
5) Divide $C'$ into $L_3'$ sub-blocks of 7 bits each; each sub-block is represented by a row vector denoted $D_k'$, $k = 0, 1, \ldots, L_3'-1$, $L_3' = L_2'/7$;
6) Calculate the watermark information carried by each $D_k'$; it is represented by a row vector denoted $m_k'$, and the set of these vectors is denoted $M' = \{m_0', m_1', \ldots, m_{L_3'-1}'\}$. $m_k'$ is calculated as follows:
$m_k' = (H \odot D_k'^{T})^{T}$
where $H$ is the check matrix, $\odot$ is the modulo-2 matrix-vector product used during embedding, $m_k' = [z_{k0}, z_{k1}, z_{k2}]$, $z_{ki} \in \{0, 1\}$, $i = 0, 1, 2$;
7) Divide the first $3L_3' - 3L_3' \,\%\, L_4$ bits of $M'$ into $L_5$ blocks of $L_4$ bits each; each block is represented by a row vector denoted $wk_h = [\xi_{h0}, \xi_{h1}, \ldots, \xi_{h(L_4-1)}]$, $\xi_{hi} \in M'$, $h = 0, 1, \ldots, L_5-1$, $L_5 = \lfloor 3L_3'/L_4 \rfloor$;
8) Construct a matrix $M''$ from the $wk_h$, count the occurrence frequency of each element value in each column vector of $M''$, and denote the most frequent element of column $g$ as $\xi_g'$; the row vector formed by the most frequent element of each column is the extracted watermark information, denoted $WK = [\xi_0', \xi_1', \ldots, \xi_{L_4-1}']$. The specific form of $M''$ is as follows:
$M'' = [wk_0;\ wk_1;\ \ldots;\ wk_{L_5-1}]$, an $L_5 \times L_4$ matrix whose $h$-th row is $wk_h$.
the font type of the previous word in each repeated watermark information embedding position can be modified when the same watermark is embedded circularly
Figure BDA00037016594800000911
Having the same characteristics as the fonts in the merged font library X but with the same characteristics
Figure BDA00037016594800000912
Fig. 3 shows a schematic diagram of information correspondence represented by a character and its font type when α is 2.
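For the case $\alpha = 2$, extraction steps 2), 3) and 6) can be sketched as follows: the carried value of each character is recovered from its stroke count and the index of its recognised font, and each 7-bit block is then reduced to 3 watermark bits. The sketch assumes that stroke counts and font indices have already been obtained (for example by a font classifier, which is not shown), that bits are ordered most significant first, and that `H` is the common (7,4) Hamming check matrix whose column $p$ is the binary form of $p$; the patent's own matrix is only available as an image, and all names here are hypothetical.

```python
from typing import List

# Assumed check matrix: column p (1-based) is the 3-bit binary form of p.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def carried_residues(stroke_counts: List[int], font_indices: List[int],
                     alpha: int) -> List[int]:
    """Steps 2)-3): r_i = (SN_i' % 2**alpha + lambda_i) % 2**alpha, where
    lambda_i is the index j of the recognised font x_j (0 = target font)."""
    q = 2 ** alpha
    l1 = len(stroke_counts) - len(stroke_counts) % 7
    return [(sn % q + lam) % q
            for sn, lam in zip(stroke_counts[:l1], font_indices[:l1])]


def extract_bits(residues: List[int], alpha: int) -> List[int]:
    """Steps 4)-6): expand residues to bits (MSB first), split into 7-bit
    blocks, and read 3 watermark bits per block as the syndrome H * D^T."""
    bits = [(r >> (beta - 1)) & 1
            for r in residues for beta in range(alpha, 0, -1)]
    message = []
    for k in range(len(bits) // 7):
        block = bits[7 * k:7 * k + 7]
        message += [sum(h * b for h, b in zip(row, block)) % 2 for row in H]
    return message
```

In this sketch a character with 5 strokes set in fusion font $x_3$ carries $(5 \,\%\, 4 + 3) \,\%\, 4 = 0$, whereas the same character in the target font $x_0$ carries 1.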
It should be noted that the steganography based on the (7,4) Hamming code used in the embedding and extraction of watermark information can be replaced as needed by another suitable steganographic algorithm, such as an LSB algorithm, an information hiding method based on matrix coding, or an information hiding method based on STC coding.
In summary, the traceable watermark embedding and extraction method based on text information uses the stroke count and font type of characters as the carrier and designs a unique font replacement rule, embedding the watermark by replacing the target font with fusion fonts that are highly similar to it. The method keeps the traceable watermark invisible while meeting the requirements of concealment and robustness, and after sensitive information has been photographed, screen-captured, or recorded, the watermark information can be extracted from the leaked medium so that the leak source can be traced and located.
The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modifications made on the basis of the technical scheme according to the technical idea of the present invention fall within the protection scope of the present invention.

Claims (5)

1. A traceable watermark embedding method based on text information, characterized in that: a steganographic algorithm is used to replace the target font with fusion fonts to obtain a stego carrier.
2. The traceable watermark embedding method based on text information according to claim 1, characterized in that: the steganographic algorithm may employ an information hiding method based on the (7,4) Hamming code, an LSB algorithm, an information hiding method based on matrix coding, or an information hiding method based on STC coding.
3. The traceable watermark embedding method based on text information according to claim 1, characterized by comprising the following steps:
Step A1, select one font as the target font and collect $n$ styles of fusion fonts as candidate replacement fonts for the target font, where $n = 2^{\alpha} - 1$ and $\alpha$ is the codeword width of the information carried by each character, an integer not less than 1;
Step A2, let the watermark information be binary data $M$ of $L_4$ bits, and let
$X = \{x_0, x_1, \ldots, x_{2^{\alpha}-1}\}$
denote the set of fonts, where $x_0$ is the target font and $x_j$ is the $j$-th fusion font, $j = 1, 2, \ldots, 2^{\alpha}-1$;
Step A3, let the carrier data contain $L_0$ characters; if the capacity condition given in Figure FDA0003701659470000012 is satisfied, continue with the subsequent operations, otherwise return a prompt message indicating insufficient capacity;
Step A4, extract the number of writing strokes of each of the first $L_1$ characters, denoted $SN_i$, where $i = 0, 1, \ldots, L_1-1$, $L_1 = L_0 - L_0 \,\%\, 7$, and $\%$ denotes the remainder operation;
Step A5, calculate the information represented by each character in the carrier, denoted $R = [r_0, r_1, \ldots, r_{L_1-1}]$, specifically:
$r_i = SN_i \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1-1$
Step A6, convert each element of $R$ into an $\alpha$-bit binary sequence, denoted $C = [c_0, c_1, \ldots, c_{L_2-1}]$, $L_2 = \alpha L_1$, where $c_j$ takes the value:
$c_j = \lfloor r_i / 2^{\beta-1} \rfloor \,\%\, 2$
where $j = 0, 1, \ldots, L_2-1$, $i = \lfloor j/\alpha \rfloor$, $\beta = \alpha - j \,\%\, \alpha$, and $\lfloor \cdot \rfloor$ denotes rounding down;
Step A7, divide $C$ into $L_3$ sub-blocks of 7 bits each; each sub-block is represented by a row vector denoted $D_k$, $k = 0, 1, \ldots, L_3-1$, $L_3 = L_2/7$;
Step A8, expand the watermark information $M$ into $3L_3$ bits of data as the information to be embedded, denoted $M'$; $M'$ is obtained by concatenating $\lceil 3L_3/L_4 \rceil$ copies of $M$ and taking the first $3L_3$ bits of the result, where $\lceil \cdot \rceil$ denotes rounding up;
Step A9, divide $M'$ into $L_3$ sub-blocks of 3 bits each; each sub-block is represented by a row vector denoted $m_k$, $k = 0, 1, \ldots, L_3-1$;
Step A10, calculate the bit position to be modified in $D_k$: if $d_k = [0, 0, 0]^{T}$, the carrier data does not need to be modified and $D_k' = D_k$; otherwise, find the position at which $d_k$ appears as a column of the check matrix $H$, invert the element of $D_k$ at that position, and denote the result $D_k'$. Repeat this operation for successive values of $k$ until the watermark information is completely embedded in the carrier. $d_k$ is calculated as follows:
$d_k = (H \odot D_k^{T}) \oplus m_k^{T}$
where $\odot$ denotes matrix-vector multiplication in which the addition is replaced by a modulo-2 sum, $\oplus$ denotes the XOR operation, $m_k$ is the $k$-th group of information to be embedded, $m_k = [z_{k0}, z_{k1}, z_{k2}]$, $z_{ki} \in \{0, 1\}$, $i = 0, 1, 2$, and $H$ is the check matrix, whose specific form is given in Figure FDA00037016594700000211;
Step A11, replace the corresponding $D_k$ in $C$ with $D_k'$ to obtain the stego data $C'$;
Step A12, divide $C'$ into $L_1$ sub-blocks of $\alpha$ bits each, convert the data of each sub-block into the corresponding decimal number, denoted $r_i'$, and replace the corresponding $r_i$ in $R$ with $r_i'$ to obtain $R'$;
Step A13, perform font replacement according to $R'$ and $R$: if $r_i' = r_i$, the original font is kept unchanged; if $r_i' \neq r_i$, the font $x_0$ is replaced with the font $x_{\lambda_i}$, where $\lambda_i = (r_i' - r_i + 2^{\alpha}) \,\%\, 2^{\alpha}$, thereby obtaining the stego carrier.
4. A traceable watermark extraction method based on text information, characterized in that: the watermark information is extracted according to the font type of each character in the stego carrier data and the number of writing strokes of that character.
5. The traceable watermark extraction method based on text information according to claim 4, characterized by comprising the following steps:
Step B1, let the stego carrier contain $L_0'$ characters; extract the number of writing strokes and the font type of each of the first $L_1'$ characters, denoted $SN_i'$ and $y_i$ respectively, where $y_i \in X$, $X = \{x_0, x_1, \ldots, x_{2^{\alpha}-1}\}$ is the same set of fonts, consisting of the target font and the fusion fonts, as in the embedding process, $i = 0, 1, \ldots, L_1'-1$, $L_1' = L_0' - L_0' \,\%\, 7$, and $\%$ denotes the remainder operation;
Step B2, calculate the information represented by the number of writing strokes of each character in the stego carrier data, denoted $R' = [r_0', r_1', \ldots, r_{L_1'-1}']$, specifically:
$r_i' = SN_i' \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1'-1$
Step B3, calculate the information $R = [r_0, r_1, \ldots, r_{L_1'-1}]$ carried by the stego carrier from the information represented by the number of writing strokes and the font type:
$r_i = (r_i' + \lambda_i) \,\%\, 2^{\alpha}, \quad i = 0, 1, \ldots, L_1'-1$
where $\lambda_i = j$ when $y_i = x_j$, $j = 0, 1, \ldots, 2^{\alpha}-1$ (so $\lambda_i = 0$ for the target font $x_0$);
Step B4, convert each element of $R$ into an $\alpha$-bit binary sequence, denoted $C' = [c_0', c_1', \ldots, c_{L_2'-1}']$, $L_2' = \alpha L_1'$, where $c_j'$ takes the value:
$c_j' = \lfloor r_i / 2^{\beta-1} \rfloor \,\%\, 2$
where $j = 0, 1, \ldots, L_2'-1$, $i = \lfloor j/\alpha \rfloor$, $\beta = \alpha - j \,\%\, \alpha$, and $\lfloor \cdot \rfloor$ denotes rounding down;
Step B5, divide $C'$ into $L_3'$ sub-blocks of 7 bits each; each sub-block is represented by a row vector denoted $D_k'$, $k = 0, 1, \ldots, L_3'-1$, $L_3' = L_2'/7$;
Step B6, calculate the watermark information carried by each $D_k'$; it is represented by a row vector denoted $m_k'$, and the set of these vectors is denoted $M' = \{m_0', m_1', \ldots, m_{L_3'-1}'\}$. $m_k'$ is calculated as follows:
$m_k' = (H \odot D_k'^{T})^{T}$
where $H$ is the check matrix, $\odot$ is the modulo-2 matrix-vector product used during embedding, $m_k' = [z_{k0}, z_{k1}, z_{k2}]$, $z_{ki} \in \{0, 1\}$, $i = 0, 1, 2$;
Step B7, divide the first $3L_3' - 3L_3' \,\%\, L_4$ bits of $M'$ into $L_5$ blocks of $L_4$ bits each; each block is represented by a row vector denoted $wk_h = [\xi_{h0}, \xi_{h1}, \ldots, \xi_{h(L_4-1)}]$, $\xi_{hi} \in M'$, $h = 0, 1, \ldots, L_5-1$, $L_5 = \lfloor 3L_3'/L_4 \rfloor$;
Step B8, construct a matrix $M''$ from the $wk_h$, count the occurrence frequency of each element value in each column vector of $M''$, and denote the most frequent element of column $g$ as $\xi_g'$; the row vector formed by the most frequent element of each column is the extracted watermark information, denoted $WK = [\xi_0', \xi_1', \ldots, \xi_{L_4-1}']$. The specific form of $M''$ is as follows:
$M'' = [wk_0;\ wk_1;\ \ldots;\ wk_{L_5-1}]$, an $L_5 \times L_4$ matrix whose $h$-th row is $wk_h$.
when the same watermark is circularly embedded, the font type of the previous word at the embedding position of each repeated watermark information is modified
Figure FDA0003701659470000051
Figure FDA0003701659470000052
Having the same characteristics as the fonts in the merged font library X but with the same characteristics
Figure FDA0003701659470000053
CN202210693843.9A 2022-06-19 2022-06-19 Tracing watermark embedding and extracting method based on character information Pending CN115114597A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210693843.9A CN115114597A (en) 2022-06-19 2022-06-19 Tracing watermark embedding and extracting method based on character information

Publications (1)

Publication Number Publication Date
CN115114597A true CN115114597A (en) 2022-09-27

Family

ID=83328952

Country Status (1)

Country Link
CN (1) CN115114597A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116433454A (en) * 2023-06-12 2023-07-14 北京和人广智科技有限公司 Method, device and storage medium for embedding document watermark based on micro-variant
CN116433454B (en) * 2023-06-12 2023-09-01 北京和人广智科技有限公司 Method, device and storage medium for embedding document watermark based on micro-variant
CN117648681A (en) * 2024-01-30 2024-03-05 北京点聚信息技术有限公司 OFD format electronic document hidden information extraction and embedding method
CN117648681B (en) * 2024-01-30 2024-04-05 北京点聚信息技术有限公司 OFD format electronic document hidden information extraction and embedding method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination