CN115455383A - Method, device and equipment for processing watermark information of database - Google Patents

Info

Publication number
CN115455383A
Authority
CN
China
Prior art keywords
database
watermark
data
decision tree
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211417478.5A
Other languages
Chinese (zh)
Other versions
CN115455383B (en)
Inventor
李公宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yizhixuan Technology Co ltd
Original Assignee
Beijing Yizhixuan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yizhixuan Technology Co ltd
Priority to CN202211417478.5A
Publication of CN115455383A
Application granted
Publication of CN115455383B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/16 Program or content traceability, e.g. by watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Storage Device Security (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

The invention provides a method, a device and equipment for processing watermark information of a database. The method comprises: generating watermark information; embedding the watermark information into a preset database to obtain a watermark database; generating a first decision tree from first data in the preset database and a second decision tree from second data in the watermark database; and modifying the second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database, wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree. The scheme addresses the threat to data security that arises when a database undergoes copyright authentication and tracing, preserves the validity of the data, improves the accuracy of watermark extraction, and has good robustness.

Description

Method, device and equipment for processing watermark information of database
Technical Field
The invention relates to the technical field of watermarks, in particular to a method, a device and equipment for processing watermark information of a database.
Background
Big data is an active research field that has drawn attention across disciplines and increasingly influences how people think, how businesses operate, and how scientific research, education and health care are conceived. Big data mining and analysis connects simple, isolated, scattered and fragmented data through data sharing and fusion, so that deep, hidden and valuable information and knowledge can be discovered.
At present, a bottleneck to the further development of big data is data security in data sharing and transactions. Big data often contains sensitive data, including classified data and personal private information, and leakage, damage or tampering can have serious consequences. Big data technology is a double-edged sword: it brings convenience but also poses a serious threat to data security. Existing database watermarking technology enables copyright authentication and leak-source tracing during data transmission, but it cannot guarantee that the data remains usable for data mining algorithms: embedding the watermark changes the data mining result and damages the practical value of the data.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method, a device and equipment for processing watermark information of a database that address the threat to data security arising when a database undergoes copyright authentication and tracing, preserve the validity of the data, improve the accuracy of watermark extraction, and have good robustness.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A method for processing watermark information of a database comprises the following steps:
generating watermark information;
embedding the watermark information into a preset database to obtain a watermark database;
generating a first decision tree according to first data in the preset database and generating a second decision tree according to second data in the watermark database;
modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database; wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree.
Optionally, the watermark information is a random sequence of length n, $W = \{w_1, w_2, \ldots, w_n\}$, where $w_i$ is the i-th bit of watermark information in the random sequence W [Formula images: constraints on $w_i$];
embedding the watermark information into a preset database to obtain a watermark database comprises:
performing hash grouping on the M tuples in the preset database according to the primary key value of each tuple to obtain a plurality of groups;
embedding the i-th bit of watermark information $w_i$ into the $t_i$-th group to obtain the watermark database.
Optionally, modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database, including:
acquiring a first segmentation value of a first decision tree and a second segmentation value of a second decision tree;
determining the modification direction of second data in the watermark database according to the first segmentation value and the second segmentation value; the modification direction specifies that data carrying the target label in a first database segment of the watermark database is modified so that it falls into a second database segment;
modifying the second data in the watermark database according to a preset constraint equation and the modification direction to obtain a modified watermark database until a third decision tree generated according to the data in the modified watermark database is the same as the first decision tree;
and determining the modified watermark database as a target watermark database.
Optionally, the first segmentation value divides a preset database into two database segments; the second partitioning value divides the watermark database into two database segments;
determining a modification direction of second data in the watermark database according to the first segmentation value and the second segmentation value, including:
determining a first index and a second index of the first segmentation value in a preset database;
determining a third index and a fourth index of the second segmentation value in the watermark database; each index represents the probability that a randomly selected piece of data in the database segment is misclassified;
calculating the difference value of the first index and the third index to obtain a first difference;
calculating a difference value between the second index and the fourth index to obtain a second difference;
and comparing the first difference with the second difference to determine the modification direction of the second data in the watermark database.
Optionally, modifying the second data in the watermark database according to a preset constraint equation and the modification direction to obtain a modified watermark database, including:
acquiring a first variation of the first segmentation value in the watermark database and a second variation of a second segmentation value in the watermark database;
obtaining a third difference according to the first variation and the second variation;
and modifying the second data in the watermark database according to the modification direction under the condition of meeting the preset constraint equation according to the third difference to obtain a modified watermark database.
Optionally, the method for processing watermark information in a database further includes:
and carrying out watermark extraction on the target watermark database to obtain target watermark information.
Optionally, performing watermark extraction on the target watermark database to obtain target watermark information comprises:
performing hash grouping on the watermark data in the target watermark database to obtain a plurality of target groups;
and performing watermark extraction on the watermark data in each target group to obtain the target watermark information.
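The two extraction steps above can be illustrated with a minimal Python sketch. It is not the patent's own procedure: the keyed SHA-256 grouping, the function names `group_of` and `extract_watermark`, reading the least significant bit of an integer attribute, and the majority vote within each group are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def group_of(primary_key, key: str, n: int) -> int:
    """Assumed grouping rule: keyed SHA-256 of the primary key, modulo n groups."""
    digest = hashlib.sha256(f"{key}|{primary_key}".encode()).digest()
    return int.from_bytes(digest, "big") % n


def extract_watermark(rows, key: str, n: int):
    """rows: iterable of (primary_key, integer_value).

    Returns a list of n bits; an entry is None if its group holds no rows.
    Each group votes with the least significant bit of its values
    (majority voting is an assumption, not stated in the source).
    """
    votes = [Counter() for _ in range(n)]
    for primary_key, value in rows:
        votes[group_of(primary_key, key, n)][value & 1] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]
```

With a majority vote, a small number of altered rows in a group does not change the extracted bit, which is one plausible way to obtain the robustness the text refers to.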
The invention also provides a watermark information processing device of the database, which comprises:
the generating module is used for generating watermark information;
the processing module is used for embedding the watermark information into a preset database to obtain a watermark database; generating a first decision tree according to first data in the preset database and generating a second decision tree according to second data in the watermark database; modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database; wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree.
The present invention also provides a computing device comprising: a processor, a memory storing a computer program which, when executed by the processor, performs the method as described above.
The present invention also provides a computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to perform the method as described above.
The scheme of the invention at least comprises the following beneficial effects:
According to the scheme of the invention, watermark information is generated; the watermark information is embedded into a preset database to obtain a watermark database; a first decision tree is generated from first data in the preset database and a second decision tree from second data in the watermark database; and the second data in the watermark database is modified according to the first decision tree and the second decision tree to obtain a target watermark database, wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree. The scheme addresses the threat to data security that arises when a database undergoes copyright authentication and tracing, preserves the validity of the data, improves the accuracy of watermark extraction, and has good robustness.
Drawings
Fig. 1 is a schematic flow chart of a method for processing watermark information of a database according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the positioning of a first partition value and a second partition value in a watermark database according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a first decision tree in a specific embodiment provided by the present invention;
Fig. 4 is a schematic diagram of a second decision tree in a specific embodiment provided by the present invention;
Fig. 5 is a schematic diagram of a third decision tree in a specific embodiment provided by the present invention;
Fig. 6 is a flowchart illustrating a watermark information processing method in an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a watermark information processing apparatus of a database according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in fig. 1, an embodiment of the present invention provides a method for processing watermark information of a database, including:
step 11, generating watermark information;
step 12, embedding the watermark information into a preset database to obtain a watermark database;
step 13, generating a first decision tree according to the first data in the preset database and generating a second decision tree according to the watermark database;
step 14, modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database; wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree.
In this embodiment, processing the watermark information of the database involves two stages: embedding the watermark information into the preset database, and correcting the decision tree. The watermark information is preferably embedded using an LSB (Least Significant Bit) watermarking technique, which the embodiment regards as giving a better robustness effect. Embedding yields the watermark database, after which a first decision tree is generated from the first data in the preset database and a second decision tree from the second data in the watermark database. Because the watermark database contains the added watermark information, the second decision tree generated from it may cause the data result to change when the watermark is identified, so the second data in the watermark database needs to be modified according to the first decision tree and the second decision tree to obtain the target watermark database. In this way the threat to data security brought by copyright authentication and tracing of the target watermark database is addressed, the validity of the data is preserved, the accuracy of watermark extraction is improved, and the robustness is good.
In an optional embodiment of the present invention, the watermark information W is a random sequence of length n, $W = \{w_1, w_2, \ldots, w_n\}$, where $w_i$ is the i-th bit of watermark information in the random sequence W [Formula images: constraints on $w_i$];
step 12 comprises:
step 121, performing hash grouping on the M tuples in the preset database according to the primary key value of each tuple to obtain a plurality of groups;
step 122, embedding the i-th bit of watermark information $w_i$ in the watermark information W into the $t_i$-th group to obtain the watermark database.
In this embodiment, the watermark information W is a random sequence of length n, $W = \{w_1, w_2, \ldots, w_n\}$, which is embedded into the preset database; $w_i$ is the i-th bit of watermark information in W [Formula images: constraints on $w_i$].
The preset database, preferably a numeric database, comprises M tuples [Formula image: definition of the preset database D], where [symbol image] denotes the attribute columns; L is the label, and if the preset database is a binary data set the label takes one of two values [Formula image: label values]; and [symbol image] is the primary key value that uniquely identifies each tuple.
Embedding the watermark information W into the preset database according to the LSB watermark technology specifically comprises the following steps. According to the length n of the watermark information, the M tuples in the preset database are hash-grouped by primary key value to obtain a plurality of groups, using the formula
[Formula image: hash grouping formula]
where $t_i$ is the group number, [symbol image] is the concatenation operator, n is the total number of groups, H(·) is a hash function, K is a key, and mod is the modulo operator.
The i-th bit of watermark information $w_i$ is then embedded in turn into the $t_i$-th group to obtain the watermark database, using the formula
[Formula image: LSB embedding formula]
where $w_i$ is the i-th bit of watermark information, a is the data in the $t_i$-th group, and [symbol image] is the data after the i-th bit of watermark information $w_i$ has been embedded.
The random sequence W is traversed, and the n bits of watermark information are embedded in turn into the preset database D according to the above embedding method to obtain the watermark database.
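As an illustration of the grouping and embedding steps described above, the following Python sketch assigns each tuple to a group by a keyed hash of its primary key and overwrites the least significant bit of an integer attribute with that group's watermark bit. The exact grouping and embedding formulas are images in the source, so the SHA-256 construction, the `|` separator, the row layout `(primary_key, value, label)` and the function names are assumptions.

```python
import hashlib


def group_of(primary_key, key: str, n: int) -> int:
    """Assumed grouping rule: H(K concatenated with primary key) mod n."""
    digest = hashlib.sha256(f"{key}|{primary_key}".encode()).digest()
    return int.from_bytes(digest, "big") % n


def embed_lsb(value: int, bit: int) -> int:
    """Overwrite the least significant bit of an integer value with `bit`."""
    return (value & ~1) | bit


def embed_watermark(rows, watermark, key: str):
    """rows: list of (primary_key, integer_value, label).

    Returns a new list in which every value carries, in its LSB, the
    watermark bit of the group its primary key hashes to. Primary keys
    and labels are left unchanged.
    """
    n = len(watermark)
    return [
        (pk, embed_lsb(value, watermark[group_of(pk, key, n)]), label)
        for pk, value, label in rows
    ]
```

Because only the lowest bit changes, each value moves by at most 1. Even so, such small shifts can move a value across a decision tree split, which is the effect the later correction steps are meant to undo.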
In one embodiment, the preset database is a numeric database D [Formula image: definition of the numeric database D], where [symbol image] denotes the attribute columns of the numeric database, L is the label, and [symbol image] is the primary key value that uniquely identifies each tuple in the numeric database; the numeric database comprises M tuples.
Hash grouping is performed on the numeric database D to obtain a plurality of groups.
A random sequence W of length n that is related to the identity is generated, $W = \{w_1, w_2, \ldots, w_n\}$. The random sequence W is traversed, and the n bits of identity information are embedded in turn into the numeric database D to obtain the watermark database [Formula images: definition of the watermark database], where [symbol image] denotes the attribute columns of the watermark database, L is the label, and [symbol image] is the primary key value that uniquely identifies each tuple in the watermark database; the watermark database comprises M tuples.
After the watermark database is obtained, in order to ensure that a decision tree can still be built accurately from it, the second decision tree corresponding to the watermark database is adjusted with reference to the first decision tree of the preset database. That is, the second data in the watermark database is modified so that the modified second data generates the correct decision tree.
Specifically, in an optional embodiment of the present invention, step 14 includes:
step 141, obtaining a first partition value of the first decision tree and a second partition value of the second decision tree;
step 142, determining the modification direction of the second data in the watermark database according to the first segmentation value and the second segmentation value; the modification direction specifies that data carrying the target label in a first database segment of the watermark database is modified so that it falls into a second database segment;
step 143, modifying the second data in the watermark database according to a preset constraint equation and the modification direction to obtain a modified watermark database until a third decision tree generated according to the data in the modified watermark database is the same as the first decision tree;
step 144, determining the modified watermark database as a target watermark database.
In this embodiment, a first decision tree [Formula image: first decision tree] and a second decision tree [Formula image: second decision tree] are built from the first data in the preset database and the second data in the watermark database, respectively. Here F is the set of first segmentation values of the first decision tree [Formula images: definition of F and its elements], m1 is the total number of first segmentation values, and Y denotes the leaf nodes of the first decision tree; $F_w$ is the set of second segmentation values of the second decision tree [Formula image: definition of $F_w$], [symbol image] is a second segmentation value, m2 is the total number of second segmentation values, and $Y_w$ denotes the leaf nodes of the second decision tree.
It should be noted that the first decision tree generated from the preset database and the second decision tree generated from the watermark database are preferably built with the CART (Classification and Regression Tree) decision tree algorithm.
Together, the first segmentation value and the second segmentation value may divide the watermark database into three segments.
Specifically, the watermark database is sorted by the attribute column to obtain a sorted watermark database, and the first segmentation value and the second segmentation value are then located in the sorted watermark database by the formula
[Formula image: positioning formula relating a segmentation value to two adjacent second data in the sorted watermark database]
The formula means: if two adjacent data can be found in the attribute column of the sorted watermark database such that the segmentation value satisfies the formula, the position between those two data is taken as the position of the segmentation value.
In a further specific embodiment, as shown in fig. 2, the first segmentation value and the second segmentation value are each located in the sorted watermark database $D_w$: the first segmentation value lies between one pair of adjacent data, and the second segmentation value lies between another pair of adjacent data. It can be seen that, according to the positions of the first segmentation value and the second segmentation value, the sorted watermark database $D_w$ is divided into three segments.
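The positioning step can be sketched in Python with a binary search over the sorted attribute column. The positioning formula itself is an image in the source, so the convention used here (a value equal to a segmentation value belongs to the segment on its left) and the function name `split_into_three` are assumptions.

```python
from bisect import bisect_right


def split_into_three(sorted_values, f1, f2):
    """Cut a sorted attribute column at two segmentation values.

    Each segmentation value is located between the last value <= it and
    the first value > it (assumed convention). Returns the three segments
    in ascending order.
    """
    low, high = sorted((f1, f2))
    i = bisect_right(sorted_values, low)   # position of the lower cut
    j = bisect_right(sorted_values, high)  # position of the upper cut
    return sorted_values[:i], sorted_values[i:j], sorted_values[j:]
```

The middle segment holds exactly the data that the two trees assign to different sides, which is why the subsequent steps concentrate on moving data into or out of it.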
In an optional embodiment of the present invention, the first segmentation value divides a preset database into two database segments; the second partitioning value divides the watermark database into two database segments;
step 142 includes:
step 1421, determining a first index and a second index of the first segmentation value in a preset database;
step 1422, determining a third index and a fourth index of the second segmentation value in the watermark database; each index represents the probability that a randomly selected piece of data in the database segment is misclassified;
step 1423, calculating a difference between the first index and the third index to obtain a first difference;
step 1424, calculating a difference between the second index and the fourth index to obtain a second difference;
step 1425, comparing the first difference with the second difference, and determining a modification direction of the second data in the watermark database.
In this embodiment, the modification direction of the second data is determined so that modifying the second data effectively adjusts the second decision tree, making it similar or equal to the first decision tree. The modification of the second data in the watermark database may start either by determining the modification direction of the second data or by determining which second data is to be modified.
Here, determining the modification direction includes: dividing the preset database into two database segments by the first segmentation value, dividing the watermark database into two database segments by the second segmentation value, locating the first segmentation value in the watermark database, and determining a first index and a second index of the first segmentation value in the preset database. The first index is the index value of the first database segment of the preset database after segmentation by the first segmentation value, and the second index is the index value of the second database segment of the preset database after segmentation by the first segmentation value. Each index represents the probability that a randomly selected piece of data in the database segment is misclassified.
Specifically, the index is calculated by the formula
[Formula image: index formula]
where the symbols denote the index of the database segment, the variation of the database segment, and p, the proportion of data with the target label in the preset database.
respectively calculating to obtain a first index of the first database segment and a second index of the second database segment after the preset database is segmented by the first partitioning value according to the formula; the third index of the third database segment and the fourth index of the fourth database segment after the watermark database is divided by the second partition value can be respectively calculated through the formula;
It should be noted that the first database segment, obtained by dividing the preset database with the first segmentation value, lies on the first side of the first segmentation value, and the second database segment lies on its second side; the third database segment, obtained by dividing the watermark database with the second segmentation value, lies on the first side of the second segmentation value, and the fourth database segment lies on its second side. That is, the first database segment corresponds to the third database segment, and the second database segment corresponds to the fourth database segment.
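The index described above (the probability that a randomly selected piece of data in a segment is misclassified) matches the usual definition of the Gini impurity, so a minimal Python sketch can be written on that assumption. The source formula is an image and is not reproduced; the function names and the row layout `(value, label)` are illustrative.

```python
def gini(labels):
    """Gini impurity 1 - sum_k p_k**2 of a list of class labels.

    An empty segment is given impurity 0.0 by convention.
    """
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def segment_indices(rows, split):
    """rows: list of (value, label).

    Returns the indices of the two segments produced by `split`:
    (index of the segment with value <= split, index of the segment
    with value > split).
    """
    first_side = [label for value, label in rows if value <= split]
    second_side = [label for value, label in rows if value > split]
    return gini(first_side), gini(second_side)
```

Calling `segment_indices` on the preset database with the first segmentation value would give the first and second indices, and calling it on the watermark database with the second segmentation value would give the third and fourth.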
The difference between the first index and the third index (the first difference) can then be calculated by the formula
[Formula image: first difference formula]
and the difference between the second index and the fourth index (the second difference) by the formula
[Formula image: second difference formula]
where the symbols in the two formulas denote the first difference, the second difference, and the first, second, third and fourth indices.
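A short Python sketch of the two differences follows. It assumes that the index is the Gini impurity, that the first difference is the first index minus the third index, and that the second difference is the second index minus the fourth index; the sign convention and all names are assumptions, since the formulas are images in the source.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def side_indices(rows, split):
    """Indices of the segments with value <= split and value > split."""
    return (
        gini([label for value, label in rows if value <= split]),
        gini([label for value, label in rows if value > split]),
    )


def index_differences(original, watermarked, f1, f2):
    """original, watermarked: lists of (value, label).

    f1 is the segmentation value of the tree built on the original data,
    f2 that of the tree built on the watermarked data.
    Returns (first difference, second difference).
    """
    first, second = side_indices(original, f1)
    third, fourth = side_indices(watermarked, f2)
    return first - third, second - fourth
```

A pair of zero differences indicates that both sides of the split have the same label mix before and after watermarking; non-zero values indicate which side has drifted.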
Further, the first difference is compared with the second difference to obtain a comparison result, and the modification direction of the second data in the watermark database is determined from the comparison result. The comparison result includes at least the following three cases:
1) the first difference and the second difference have the same sign;
2) the first difference and/or the second difference is 0;
3) the first difference and/or the second difference is not 0.
When the first difference and the second difference have the same sign, the first segmentation value yields the same leaf nodes in the first data and the second data.
When the first difference and/or the second difference is 0, the target label has the same proportion in the preset database and/or the watermark database; the third differences obtained by modifying the second data in different directions can then be screened according to the first difference and/or the second difference.
When the first difference and/or the second difference is not 0, a modification of the second data in the watermark database that makes the first difference and/or the second difference smaller may be selected as a candidate modification direction; the third difference of each candidate direction is then calculated, and the direction with the largest third difference is determined as the modification direction of the data.
In an alternative embodiment of the present invention, step 143 comprises:
step 1431, obtaining a first variation of the first partition value in the watermark database and a second variation of the second partition value in the watermark database;
step 1432, obtaining a third difference according to the first variation and the second variation;
step 1433, according to the third difference, modifying the second data in the watermark database according to the modification direction under the condition that the preset constraint equation is satisfied, so as to obtain a modified watermark database.
In this embodiment, the first decision tree is obtained by the preset database according to a decision tree algorithm, and the decision tree algorithm selects the position of the second data with the smallest variation for segmentation each time, specifically, the position of the second data with the smallest variation may be segmented by a formula
Figure 832544DEST_PATH_IMAGE039
A target index is calculated for each segmentation, wherein,
Figure 630736DEST_PATH_IMAGE040
is the database segment on the first side when the second data is used as the segmentation value,
Figure 111527DEST_PATH_IMAGE041
is the number of tuples that the first-side database segment
Figure 780406DEST_PATH_IMAGE040
contains,
Figure 570507DEST_PATH_IMAGE042
is the database segment on the second side when the second data is used as the segmentation value,
Figure 590416DEST_PATH_IMAGE043
is the number of tuples that the second-side database segment
Figure 593007DEST_PATH_IMAGE042
contains,
Figure 398283DEST_PATH_IMAGE044
is the target index of
Figure 093706DEST_PATH_IMAGE040
(the first-side segment),
Figure 335332DEST_PATH_IMAGE045
is the target index of
Figure 407193DEST_PATH_IMAGE042
(the second-side segment), and
Figure 316243DEST_PATH_IMAGE024
is a segmentation value;
acquiring a first target index of the first segmentation value in the watermark database and a second target index of the second segmentation value in the watermark database according to the formula;
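The formula itself survives here only as an image placeholder, so the following is a minimal sketch rather than the patented formula. It assumes the target index is the weighted Gini index that a CART-style algorithm computes for a candidate split, which is consistent with the quantities defined above (two database segments, their tuple counts, and a per-segment index). The names `gini` and `split_index` and the sample rows are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index of one segment: 1 - sum_k p_k^2 (0.0 for an empty segment)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_index(rows, s):
    """Weighted Gini index of splitting (value, label) rows at split value s.

    Assumption: values <= s fall on the first (left) side, values > s on the
    second (right) side.
    """
    left = [label for value, label in rows if value <= s]
    right = [label for value, label in rows if value > s]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Illustrative data only: (attribute value, label)
rows = [(48.0, "L1"), (49.5, "L1"), (52.0, "L2"), (61.0, "L2")]
clean_split = split_index(rows, 50.6)   # both sides are pure
mixed_split = split_index(rows, 48.5)   # right side mixes L1 and L2
```

Under this assumption a lower value means a purer split, so the split value with the smallest index is the one the tree would choose.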
the following describes how the first variation of the first segmentation value in the watermark database is calculated after a modification, taking as an example the case where the first and second segmentation values divide the watermark database into three segments (the fifth, sixth and seventh database segments) and a second data in the fifth database segment is modified into the sixth database segment:
when a second data is modified from the left (first side) of the first segmentation value
Figure 182568DEST_PATH_IMAGE024
to the right (second side) of the first segmentation value
Figure 927801DEST_PATH_IMAGE024
a third target index of the first segmentation value in the watermark database is calculated:
Figure 537774DEST_PATH_IMAGE046
wherein,
Figure 301331DEST_PATH_IMAGE041
and
Figure 72978DEST_PATH_IMAGE043
respectively representing the number of tuples in a fifth database segment and a sixth database segment after the watermark database is divided by the first separation value;
Figure 820354DEST_PATH_IMAGE047
represents that a second data on the left (first side) of the first segmentation value
Figure 968439DEST_PATH_IMAGE024
is modified so that, relative to the first segmentation value
Figure 599884DEST_PATH_IMAGE024
it lies on the right (second side);
Figure 808011DEST_PATH_IMAGE048
and
Figure 777104DEST_PATH_IMAGE049
respectively represent the number of data with the first label and with the second label in the fifth database segment;
Figure 463300DEST_PATH_IMAGE050
and
Figure 467029DEST_PATH_IMAGE051
respectively representing the number of the data of the first label and the second label in the sixth database segment;
Figure 331210DEST_PATH_IMAGE052
representing the number of tuples in the watermark database;
Figure 787600DEST_PATH_IMAGE053
is a first constant,
Figure 277487DEST_PATH_IMAGE054
is a second constant; if the modified second data has the first label,
Figure 870142DEST_PATH_IMAGE055
and if the modified second data has the second label,
Figure 685651DEST_PATH_IMAGE056
Further, according to the first target index and the third target index, the first variation of the first segmentation value in the watermark database is calculated as follows:
Figure 363757DEST_PATH_IMAGE057
wherein,
Figure 408068DEST_PATH_IMAGE058
represents, among the data on the left (first side) of the segmentation value
Figure 855230DEST_PATH_IMAGE024
the data whose label is the k-th label,
Figure 841640DEST_PATH_IMAGE059
represents, among the data on the right (second side) of the segmentation value
Figure 007042DEST_PATH_IMAGE024
the data whose label is the k-th label,
Figure 104311DEST_PATH_IMAGE060
if the shifted label is the first label, then
Figure 156712DEST_PATH_IMAGE061
and if the shifted label is the second label, then
Figure 48445DEST_PATH_IMAGE062
Correspondingly, when a second data is modified from the right (second side) of the first segmentation value
Figure 966722DEST_PATH_IMAGE024
to the left (first side) of the first segmentation value
Figure 336524DEST_PATH_IMAGE024
a fourth target index of the first segmentation value in the watermark database is calculated:
Figure 758278DEST_PATH_IMAGE063
wherein,
Figure 86491DEST_PATH_IMAGE041
and
Figure 708709DEST_PATH_IMAGE043
respectively representing the tuple numbers in a fifth database segment and a sixth database segment after the watermark database is partitioned by the first partition value;
Figure 413360DEST_PATH_IMAGE064
represents that a second data on the right (second side) of the first segmentation value
Figure 689620DEST_PATH_IMAGE024
is modified so that, relative to the first segmentation value
Figure 188735DEST_PATH_IMAGE024
it lies on the left (first side);
Figure 550446DEST_PATH_IMAGE048
and
Figure 278362DEST_PATH_IMAGE049
respectively represent the number of data with the first label and with the second label in the fifth database segment;
Figure 674708DEST_PATH_IMAGE050
and
Figure 79144DEST_PATH_IMAGE051
respectively representing the number of the data of the first label and the second label in the sixth database segment;
Figure 193731DEST_PATH_IMAGE052
representing the number of tuples in the watermark database;
Figure 709026DEST_PATH_IMAGE053
is a first constant,
Figure 225458DEST_PATH_IMAGE054
is a second constant; if the modified second data has the first label,
Figure 551528DEST_PATH_IMAGE055
and if the modified second data has the second label,
Figure 153411DEST_PATH_IMAGE056
Further, according to the first target index and the fourth target index, the first variation of the first segmentation value in the watermark database is calculated as follows:
Figure DEST_PATH_IMAGE065
wherein,
Figure 737976DEST_PATH_IMAGE058
represents, among the data on the left (first side) of the segmentation value
Figure 108914DEST_PATH_IMAGE024
the data whose label is the k-th label,
Figure 855153DEST_PATH_IMAGE059
represents, among the data on the right (second side) of the segmentation value
Figure 429485DEST_PATH_IMAGE024
the data whose label is the k-th label,
Figure 817741DEST_PATH_IMAGE060
if the shifted label is the first label, then
Figure 777607DEST_PATH_IMAGE061
and if the shifted label is the second label, then
Figure 694748DEST_PATH_IMAGE062
Similarly, for the second segmentation value
Figure 271222DEST_PATH_IMAGE025
the second variation may also be calculated according to the above formulas, which is not repeated here. For the first segmentation value
Figure 945393DEST_PATH_IMAGE024
the first variation in the watermark database,
Figure 759765DEST_PATH_IMAGE066
and for the second segmentation value
Figure 113386DEST_PATH_IMAGE025
the second variation in the watermark database,
Figure 911578DEST_PATH_IMAGE067
are calculated respectively according to the above formulas; according to the first variation and the second variation, the third difference is obtained by the formula
Figure 110478DEST_PATH_IMAGE068
where d is the third difference;
the magnitude of the third difference d reflects the magnitude of the change in variation: the larger d is, the closer the second segmentation value moves to the first segmentation value after the second data is modified. When modifying the second data, the second data with the largest d value is therefore selected repeatedly, until the position of the second segmentation value is the same as the position of the first segmentation value;
furthermore, it should be noted that the second data is modified only after the modification direction has been determined and the second data to be modified has been selected according to the third difference; so that the modification does not affect the watermark information and the distortion of the data stays small, the modification of the second data is formulated as a constraint equation solving problem;
when the first and second segmentation values are located in the watermark database as shown in fig. 2 and, according to the modification direction, the second data labeled with the first label (L1) in the fifth database segment is modified into the sixth database segment, the preset constraint equation is as follows:
Figure 61248DEST_PATH_IMAGE069
wherein,
Figure 585770DEST_PATH_IMAGE070
is the second data labeled with the first label in the fifth database segment,
Figure 340099DEST_PATH_IMAGE071
represents the third data obtained after the modification,
Figure 873849DEST_PATH_IMAGE072
is an integer step (for the LSB watermark algorithm, a modification step of p = 2 does not destroy the watermark information),
Figure DEST_PATH_IMAGE073
is a fractional step (e.g., 3.98 with a fractional precision of 0.01, then l = 0.001),
Figure 193972DEST_PATH_IMAGE074
is the second segmentation value,
Figure 374548DEST_PATH_IMAGE024
is the first division value.
According to the third difference d, the second data in the watermark database is modified over multiple iterations according to the modification direction, subject to the preset constraint equation, to obtain the modified watermark database. Using a traversal search, if values of
Figure 147332DEST_PATH_IMAGE075
and
Figure 953614DEST_PATH_IMAGE076
that satisfy the preset constraint equation are found, the second data
Figure 862664DEST_PATH_IMAGE070
in the watermark database can be shifted without destroying the watermark information; if no values of
Figure 728989DEST_PATH_IMAGE075
and
Figure 474222DEST_PATH_IMAGE076
satisfying the preset constraint equation are found, the second data
Figure 084195DEST_PATH_IMAGE070
in the watermark database cannot be shifted without destroying the watermark information, and the next sample labeled with the first label L1 in this region must be found and the preset constraint equation solved for it, until a solution satisfying the preset constraint equation is found. Finally, the third decision tree obtained from the target watermark database is the same as the first decision tree of the preset database.
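The traversal search described above can be sketched as follows. Because the constraint equation is available here only as an image, this is an illustration under stated assumptions and not the patented equation: the modified value is assumed to have the form x + a·p + b·l for integers a and b, it must land strictly inside a target interval, and the watermark bit is assumed to be the parity of the integer part, which the shift must preserve. The function name `find_shift`, the bound `max_steps`, the default l = 0.01, and the sample numbers are all hypothetical.

```python
from decimal import Decimal


def find_shift(x, lo, hi, p=Decimal(2), l=Decimal("0.01"), max_steps=50):
    """Search for the value x + a*p + b*l closest to x that lies in (lo, hi).

    Assumption: the LSB watermark bit is the parity of the integer part,
    so only candidates with the same integer-part parity as x are accepted.
    Returns None when no candidate within max_steps satisfies the constraints.
    """
    best = None
    for a in range(-max_steps, max_steps + 1):
        for b in range(-max_steps, max_steps + 1):
            candidate = x + a * p + b * l
            if not (lo < candidate < hi):
                continue  # outside the target database segment
            if int(candidate) % 2 != int(x) % 2:
                continue  # would flip the assumed watermark bit
            if best is None or abs(candidate - x) < abs(best - x):
                best = candidate  # keep the least-distorting solution
    return best


# Illustrative values only: move 62.3 into the open interval (50.6, 60.5).
shifted = find_shift(Decimal("62.3"), Decimal("50.6"), Decimal("60.5"))
# No admissible shift exists when the interval forces a parity flip.
blocked = find_shift(Decimal("62.3"), Decimal("61"), Decimal("61.5"))
```

`Decimal` is used so that the fractional step is applied exactly; with binary floats the boundary comparisons could be off by rounding error. When the search returns `None`, the description above calls for trying the next sample with the same label.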
In a specific embodiment, the first data in the preset database D is shown in the following table:
TABLE 1
Figure 847752DEST_PATH_IMAGE077
wherein the first segmentation value
Figure 884978DEST_PATH_IMAGE078
in the preset database D is 50.6, and the first segmentation value
Figure 366775DEST_PATH_IMAGE078
divides the preset database D into a first database segment D1 and a second database segment D2;
the second data in the watermark database sorted according to the attribute column is shown in the following table:
TABLE 2
Figure 780439DEST_PATH_IMAGE079
wherein, in the watermark database Dw, the second segmentation value
Figure 146305DEST_PATH_IMAGE080
is 60.5, and the first segmentation value
Figure 354432DEST_PATH_IMAGE078
lies between numbers 2 and 4 of Table 2; when the first segmentation value
Figure 057946DEST_PATH_IMAGE078
is located in the watermark database Dw, the first segmentation value
Figure 009722DEST_PATH_IMAGE078
and the second segmentation value
Figure 747870DEST_PATH_IMAGE080
divide the watermark database Dw into a fifth database segment, a sixth database segment and a seventh database segment;
by the index calculation formula
Figure 126899DEST_PATH_IMAGE028
the first index of the first database segment is calculated as
Figure 334021DEST_PATH_IMAGE081
the second index of the second database segment and the seventh database segment as
Figure 823908DEST_PATH_IMAGE082
the third index of the fifth database segment and the sixth database segment as
Figure 682142DEST_PATH_IMAGE083
and the fourth index of the seventh database segment as
Figure 232072DEST_PATH_IMAGE084
wherein p represents the proportion of the label L1;
by the formula
Figure 644599DEST_PATH_IMAGE031
and the formula
Figure 954489DEST_PATH_IMAGE035
the difference between the first index and the third index is calculated to obtain the first difference
Figure 401651DEST_PATH_IMAGE085
and the difference between the second index and the fourth index is calculated to obtain the second difference
Figure 388061DEST_PATH_IMAGE086
Since the second difference satisfies
Figure 553464DEST_PATH_IMAGE087
this indicates that, in the watermark database, to the right of the first segmentation value
Figure 650733DEST_PATH_IMAGE078
(the sixth and seventh database segments), the proportion of data with label L1 is larger than the proportion of data with label L1 to the right of the first segmentation value
Figure 952401DEST_PATH_IMAGE078
in the preset database;
it should be noted that, here,
Figure 594866DEST_PATH_IMAGE032
and
Figure 247564DEST_PATH_IMAGE036
are calculated first to judge the modification direction, and the second data with the largest d value is then determined as the second data to be modified. Of course, it is also possible to calculate the second data to be modified first and then determine the modification direction; in that case, the d values obtained by moving data with different labels (L1, L2) to different positions are calculated first, then
Figure 882945DEST_PATH_IMAGE032
and
Figure 304699DEST_PATH_IMAGE036
are calculated, and the candidates are then screened by
Figure 632912DEST_PATH_IMAGE032
and
Figure 507327DEST_PATH_IMAGE036
;
in order to ensure that the leaf nodes of the second decision tree and the first decision tree are identical, in the watermark database Dw a second data with label L1 is selected from the right of the first segmentation value
Figure 694201DEST_PATH_IMAGE078
and modified into the database segment to the left of the first segmentation value
Figure 970462DEST_PATH_IMAGE078
(the fifth database segment), so that the label proportions in the leaf nodes become close; therefore, the modification direction is determined as: in the watermark database Dw, select a second data to the right of the first segmentation value
Figure 203997DEST_PATH_IMAGE078
and modify it into the database segment to the left of the first segmentation value
Figure 831288DEST_PATH_IMAGE078
;
based on the modification direction, the preset constraint equation is solved for the watermark data x1, giving
Figure 808471DEST_PATH_IMAGE088
Figure 221129DEST_PATH_IMAGE089
and then
Figure 625565DEST_PATH_IMAGE090
It can be seen that, after the second data is shifted according to the modification direction, the watermark data x1 becomes smaller while
Figure 474573DEST_PATH_IMAGE091
is unchanged. Here, the second data labeled L1 in the database segments to the right of the first segmentation value
Figure 255447DEST_PATH_IMAGE078
(the sixth and seventh database segments) lies between the first segmentation value
Figure 771879DEST_PATH_IMAGE078
and the second segmentation value
Figure 347217DEST_PATH_IMAGE080
so only the d value of this second data labeled L1 needs to be calculated; if second data labeled L1 also exists in the database segment to the right of the second segmentation value
Figure 168673DEST_PATH_IMAGE080
its d value also needs to be calculated, and the second data with the larger d value is selected as the second data to be modified.
In order to ensure that the distortion of the second data in the watermark database is small and the watermark information is unchanged, the tuple with primary key 3 is shifted first: the second data with primary key 3 is modified according to the preset constraint equation, and after this one modification a third decision tree is regenerated using the CART decision tree algorithm. The located positions of the first segmentation value
Figure 753238DEST_PATH_IMAGE078
and the second segmentation value
Figure 858598DEST_PATH_IMAGE080
then coincide, as shown in the following table (in Table 3 below, the first segmentation value
Figure 870416DEST_PATH_IMAGE078
and the second segmentation value
Figure 694016DEST_PATH_IMAGE080
are equal and both lie between numbers 3 and 4):
TABLE 3
Figure 567425DEST_PATH_IMAGE092
Further, the other segmentation values are adjusted in sequence, so that the segmentation values in the first decision tree and the third decision tree are completely the same.
In an optional embodiment of the present invention, the method for processing watermark information in a database further includes:
step 15, extracting the watermark from the target watermark database to obtain target watermark information.
In this embodiment, watermark extraction from the target watermark database means using a watermark extraction algorithm to extract the watermark from the target watermark database containing the watermark information; the extracted watermark information is used for copyright certification, piracy tracing, integrity authentication, and the like. The watermark extraction stage of the present application includes: data grouping, watermark information extraction and watermark voting.
Specifically, in an optional embodiment of the present invention, step 15 includes:
step 151, performing hash grouping on the watermark data in the target watermark database to obtain a plurality of target groups;
step 152, performing watermark extraction on the watermark data in each target packet to obtain target watermark information.
In this embodiment, the data grouping method is similar to that of the watermark embedding stage: the watermark data in the target watermark database is hash-grouped according to the primary key value of each tuple to obtain a plurality of target groups, and watermark extraction is performed on the watermark data in each target group to obtain the target watermark information. Specifically, by the formula
Figure 792870DEST_PATH_IMAGE093
the watermark information in each tuple is calculated, wherein
Figure 975589DEST_PATH_IMAGE094
; because the watermark information embedded in the same group is the same, when the watermark information extracted within a group differs, the watermark information in that group is voted on, and the watermark information of the group is determined by the principle that the minority obeys the majority.
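A minimal sketch of the three extraction stages (data grouping, per-tuple extraction, voting) follows. The extraction formula is not recoverable from this text, so the sketch assumes that the group is the SHA-256 hash of the primary key modulo the watermark length and that each tuple carries its bit in the least significant bit of the integer part of a non-negative attribute value. The names `group_of` and `extract` are hypothetical.

```python
import hashlib
from collections import Counter


def group_of(key, n_groups):
    """Map a primary key to a group index (assumed: SHA-256 of the key mod n)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_groups


def extract(rows, n_groups):
    """Recover one watermark bit per group from (primary_key, value) rows.

    Assumption: the bit is the LSB of the integer part of a non-negative value.
    Each tuple votes for its bit; the majority bit wins within a group.
    Groups with no tuples yield None.
    """
    votes = [Counter() for _ in range(n_groups)]
    for key, value in rows:
        votes[group_of(key, n_groups)][int(value) & 1] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]
```

Voting is what lets a group survive a few tampered tuples: a single flipped bit is outvoted as long as most tuples in the group are intact.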
In another specific embodiment, as shown in fig. 3 to 5, an original database containing a plurality of tuples of binary classification data is tested with the watermark information processing method of the database in the embodiment of the present application. Fig. 3 shows the first decision tree generated from the original database of classification data, and fig. 4 shows the second decision tree generated from the watermark database of classification data after 64 bits of watermark information are embedded; it can be seen that the second decision tree, generated after the watermark information is embedded, has changed and differs from the first decision tree;
fig. 5 shows the third decision tree generated from the target watermark data of the classification data after the second decision tree is modified; it can be seen that the watermark information processing method of the database in the embodiment of the present application ensures that the third decision tree of the target watermark data is the same as the first decision tree of the original database.
In another specific embodiment, in order to ensure that the target watermark database in the embodiment of the present application can effectively support copyright authentication and tracing when it is shared and traded in practice, the robustness of the target watermark database is tested:
watermark robustness test results are given for deletion of target watermark database tuples, insertion of tuples, and modification of attribute values in an attribute column; the results are measured by the bit error rate (BER) of the extracted watermark, where a larger BER value indicates more errors in the extracted watermark, and a BER of 0 indicates that the extracted watermark information is completely correct;
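For reference, the bit error rate can be computed as the fraction of watermark bits that differ between the embedded and the extracted sequence. The sketch below uses that standard definition; the function name `bit_error_rate` and the sample sequences are hypothetical and are not taken from the test tables.

```python
def bit_error_rate(embedded, extracted):
    """Fraction of positions where the extracted bit differs from the embedded bit."""
    if len(embedded) != len(extracted):
        raise ValueError("watermark sequences must have the same length")
    if not embedded:
        raise ValueError("watermark sequences must not be empty")
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return errors / len(embedded)


# Illustrative sequences only.
perfect = bit_error_rate([1, 0, 1, 1], [1, 0, 1, 1])   # no bit differs
damaged = bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0])   # two of four bits differ
```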
deletion of target watermark database tuples means randomly selecting and deleting a given proportion of the database tuples, which gives the watermark robustness test results shown in the following table:
TABLE 4
Figure 20906DEST_PATH_IMAGE095
As can be seen from table 4, under deletion, both the LSB watermark algorithm and the watermark information processing method of the database in the embodiment of the present application can correctly extract the watermark information, and as the deletion ratio increases, the robustness of the method in the embodiment of the present application remains consistent with that of the LSB watermark algorithm;
insertion of target watermark database tuples means randomly inserting a specified proportion of database tuples; the primary keys of the inserted tuples remain unique, the attribute values of the inserted tuples are randomly selected from the attribute values of the target watermark database, and the labels of the inserted tuples are randomly generated from the existing labels, which gives the watermark robustness test results shown in the following table:
TABLE 5
Figure 681694DEST_PATH_IMAGE096
As can be seen from table 5, under insertion, both the LSB watermark algorithm and the watermark information processing method of the database in the embodiment of the present application can correctly extract the watermark information, and as the insertion ratio increases, the robustness of the method in the embodiment of the present application remains consistent with that of the LSB watermark algorithm; the method thus not only makes the second decision tree the same as the first decision tree but also preserves robustness well;
modification of attribute values in the attribute column means randomly modifying attribute values such that the modified values do not violate the semantics of the current attribute, which gives the watermark robustness test results shown in the following table:
TABLE 6
Figure 27225DEST_PATH_IMAGE097
As can be seen from table 6, under modification of attribute values, both the LSB watermark algorithm and the watermark information processing method of the database in the embodiment of the present application can correctly extract the watermark information, and as the modification ratio increases, the robustness of the method in the embodiment of the present application remains consistent with that of the LSB watermark algorithm;
in summary, the watermark information processing method of the database in the embodiment of the present application not only makes the second decision tree the same as the first decision tree, but also preserves robustness well.
As shown in fig. 6, in a specific embodiment, watermark information processing is performed on the original database D: the watermark information W is embedded into the original database D using a watermark embedding algorithm (LSB) to obtain the watermark database Dw; a first decision tree is generated from the original database D and a second decision tree is generated from the watermark database Dw; to reconstruct the second decision tree, the segmentation values of the first decision tree and the second decision tree are located, the modification direction is judged according to these segmentation values, and the second data in the watermark database Dw is modified based on the modification direction to obtain a target watermark database; wherein a third decision tree generated according to third data in the target watermark database is the same as the first decision tree;
furthermore, watermark extraction can be performed on the target watermark database through a watermark extraction algorithm to obtain target watermark information, and the extracted target watermark information can be used for copyright confirmation and tracing, so that the effectiveness of data is ensured, the accuracy of watermark information extraction is improved, and the robustness is good.
Embodiments of the present invention generate watermark information; embedding the watermark information into a preset database to obtain a watermark database; generating a first decision tree according to first data in the preset database and generating a second decision tree according to second data in the watermark database; modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database; the third decision tree generated according to the third data in the target watermark database is the same as the first decision tree, so that the threat of the database on data safety in copyright authentication and tracing is overcome, the effectiveness of the data is ensured, the accuracy of extracting watermark information is improved, and the robustness is good.
As shown in fig. 7, an embodiment of the present invention further provides a watermark information processing apparatus 70 for a database, including:
a generating module 71, configured to generate watermark information;
the processing module 72 is configured to embed the watermark information into a preset database to obtain a watermark database; generating a first decision tree according to first data in the preset database and generating a second decision tree according to second data in the watermark database; modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database; wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree.
Optionally, the watermark information is a random sequence of length n
Figure 886507DEST_PATH_IMAGE001
where
Figure 419119DEST_PATH_IMAGE002
is the i-th bit of watermark information in the random sequence W,
Figure 149178DEST_PATH_IMAGE003
Figure 83636DEST_PATH_IMAGE004
embedding the watermark information into a preset database to obtain a watermark database, comprising:
performing hash grouping on M tuples in a preset database according to a key value of each tuple to obtain a plurality of groups;
embedding bit number
Figure 608158DEST_PATH_IMAGE098
of the watermark information,
Figure 628067DEST_PATH_IMAGE002
into group number
Figure 646970DEST_PATH_IMAGE099
to obtain the watermark database.
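The two embedding steps above (hash grouping by key, then writing one watermark bit into every tuple of the matching group) can be sketched as follows. This is an illustration under assumptions, not the patented procedure: the group is taken as the SHA-256 hash of the key modulo the watermark length, and the bit is written into the least significant bit of the integer part of a non-negative attribute value, in line with the LSB algorithm named in the description. The names `group_of` and `embed` are hypothetical.

```python
import hashlib


def group_of(key, n_groups):
    """Map a primary key to a group index (assumed: SHA-256 of the key mod n)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_groups


def embed(rows, watermark):
    """Embed watermark[i] into every (key, value) row whose key hashes to group i.

    Assumption: values are non-negative and the bit replaces the LSB of the
    integer part, leaving the fractional part untouched.
    """
    marked = []
    for key, value in rows:
        bit = watermark[group_of(key, len(watermark))]
        whole = int(value)
        fraction = value - whole
        marked.append((key, ((whole & ~1) | bit) + fraction))
    return marked


# Illustrative data only.
watermark = [1, 0, 1]
rows = [(k, 50.0 + k + 0.5) for k in range(30)]
marked_rows = embed(rows, watermark)
```

Each value changes by at most 1 under this scheme, which is why later shifts by an even integer step leave the embedded bit intact.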
Optionally, modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database, including:
acquiring a first segmentation value of a first decision tree and a second segmentation value of a second decision tree;
determining the modification direction of second data in the watermark database according to the first segmentation value and the second segmentation value; the modification direction is to modify the target tag in the first database segment in the watermark database to the second database segment;
modifying second data in the watermark database according to a preset constraint equation and the modification direction to obtain a modified watermark database until a third decision tree generated according to data in the modified watermark database is the same as the first decision tree;
and determining the modified watermark database as a target watermark database.
Optionally, the first segmentation value divides the preset database into two database segments; the second partitioning value divides the watermark database into two database segments;
determining a modification direction of second data in the watermark database according to the first segmentation value and the second segmentation value, including:
determining a first index and a second index of the first segmentation value in a preset database;
determining a third index and a fourth index of the second segmentation value in the watermark database; the index represents the probability that a randomly selected data item in the database segment is misclassified;
calculating the difference value of the first index and the third index to obtain a first difference;
calculating the difference value of the second index and the fourth index to obtain a second difference value;
and comparing the first difference with the second difference to determine the modification direction of the second data in the watermark database.
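The index and difference steps above can be sketched as follows. The sketch assumes the index is the two-label Gini index 2p(1 - p), where p is the proportion of label L1 in a segment, and that each difference is taken as the preset-database index minus the watermark-database index; both the formula and the sign convention are assumptions, since the original formulas are not recoverable here. The names `gini2`, `side_indices` and `differences` and the sample rows are hypothetical.

```python
def gini2(p):
    """Two-label Gini index, where p is the proportion of label L1."""
    return 2.0 * p * (1.0 - p)


def side_indices(rows, s, target="L1"):
    """Return (left index, right index) for (value, label) rows split at s."""
    sides = (
        [label for value, label in rows if value <= s],
        [label for value, label in rows if value > s],
    )
    indices = []
    for side in sides:
        p = side.count(target) / len(side) if side else 0.0
        indices.append(gini2(p))
    return tuple(indices)


def differences(preset_rows, s1, marked_rows, s2):
    """First difference = first - third index; second = second - fourth index."""
    first, second = side_indices(preset_rows, s1)
    third, fourth = side_indices(marked_rows, s2)
    return first - third, second - fourth


# Illustrative data only.
preset = [(48.0, "L1"), (50.0, "L1"), (52.0, "L2"), (61.0, "L2")]
marked = [(48.0, "L1"), (51.0, "L1"), (52.0, "L2"), (61.0, "L2")]
d1, d2 = differences(preset, 50.6, marked, 60.5)
```

In this example the left segment of the watermark database is less pure than the left segment of the preset database, so the first difference is negative while the second is zero; how the signs map to a modification direction follows the rules stated in the description.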
Optionally, modifying the second data in the watermark database according to a preset constraint equation and the modification direction to obtain a modified watermark database, including:
acquiring a first variation of the first segmentation value in the watermark database and a second variation of the second segmentation value in the watermark database;
obtaining a third difference according to the first variation and the second variation;
and modifying the second data in the watermark database according to the modification direction under the condition of meeting the preset constraint equation according to the third difference to obtain a modified watermark database.
Optionally, the method for processing watermark information in a database further includes:
and carrying out watermark extraction on the target watermark database to obtain target watermark information.
Optionally, the watermark extraction is performed on the target watermark database to obtain target watermark information, and the method includes:
performing hash grouping on watermark data in a target watermark database to obtain a plurality of target groups;
and watermark extraction is carried out on the watermark data in each target group to obtain target watermark information.
It should be noted that the apparatus is an apparatus corresponding to the above method, and all the implementations in the above method embodiment are applicable to the embodiment of the apparatus, and the same technical effects can be achieved.
Embodiments of the present invention also provide a computing device, comprising: a processor, a memory storing a computer program which, when executed by the processor, performs the method as described above. All the implementation manners in the above method embodiment are applicable to this embodiment, and the same technical effect can be achieved.
Embodiments of the present invention also provide a computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to perform the method as described above. All the implementation manners in the method embodiment are applicable to the embodiment, and the same technical effect can be achieved.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
Furthermore, it is to be noted that in the device and method of the invention, it is obvious that the individual components or steps can be decomposed and/or recombined. These decompositions and/or recombinations are to be regarded as equivalents of the present invention. Also, the steps of performing the series of processes described above may naturally be performed chronologically in the order described, but need not necessarily be performed chronologically, and some steps may be performed in parallel or independently of each other. It will be understood by those skilled in the art that all or any of the steps or elements of the method and apparatus of the present invention may be implemented in any computing device (including processors, storage media, etc.) or network of computing devices, in hardware, firmware, software, or any combination thereof, which can be implemented by those skilled in the art using their basic programming skills after reading the description of the present invention.
The object of the invention is thus also achieved by a program or a set of programs running on any computing device, which may be a well-known general-purpose device. The object of the invention is likewise achieved solely by providing a program product comprising program code for implementing the method or the apparatus. That is, such a program product also constitutes the present invention, and a storage medium storing such a program product also constitutes the present invention. The storage medium may be any known storage medium or any storage medium developed in the future.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A method for processing watermark information of a database, comprising:
generating watermark information;
embedding the watermark information into a preset database to obtain a watermark database;
generating a first decision tree according to first data in the preset database and generating a second decision tree according to second data in the watermark database;
modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database; wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree.
2. The method as claimed in claim 1, wherein the watermark information is a random sequence W of length n [formula images not reproduced], where w_i is the ith bit of watermark information in the random sequence W [formula images not reproduced];
wherein embedding the watermark information into a preset database to obtain a watermark database comprises:
performing hash grouping on M tuples in the preset database according to a key value of each tuple to obtain a plurality of groups;
embedding the ith bit of watermark information w_i of the watermark information into the t_i-th group to obtain the watermark database.
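The grouping and embedding steps of claim 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the keyed SHA-256 hash, the choice of one group per watermark bit (so that t_i = i), the least-significant-bit embedding rule, and the names `generate_watermark`, `group_index`, `embed`, `secret`, `id` and `value` are all hypothetical and do not come from the claim.

```python
import hashlib
import random


def generate_watermark(n, seed):
    """Generate a pseudo-random bit sequence W of length n."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n)]


def group_index(key, secret, num_groups):
    """Hash a tuple's key value to a group number (SHA-256 is an assumption)."""
    digest = hashlib.sha256(f"{secret}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_groups


def embed(rows, watermark, secret, attr="value"):
    """Return a watermarked copy of rows in which group i carries bit w_i.

    Assumption: the bit replaces the least significant bit of an integer
    attribute; the claim itself does not specify the embedding rule.
    """
    marked_rows = []
    for row in rows:
        i = group_index(row["id"], secret, len(watermark))
        marked = dict(row)  # leave the preset database untouched
        marked[attr] = (row[attr] & ~1) | watermark[i]
        marked_rows.append(marked)
    return marked_rows
```

Under these assumptions each value changes by at most 1, and any party holding `secret` can recompute which group a tuple belongs to.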
3. The method for processing watermark information of a database according to claim 1, wherein modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database comprises:
acquiring a first segmentation value of a first decision tree and a second segmentation value of a second decision tree;
determining the modification direction of second data in the watermark database according to the first segmentation value and the second segmentation value; the modification direction is to modify the target tag in the first database segment in the watermark database to the second database segment;
modifying second data in the watermark database according to a preset constraint equation and the modification direction to obtain a modified watermark database until a third decision tree generated according to data in the modified watermark database is the same as the first decision tree;
and determining the modified watermark database as a target watermark database.
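The repair loop of claim 3 can be illustrated with a deliberately small sketch. The assumptions are substantial and are not taken from the claims: the decision tree is reduced to a single split (a stump) chosen by weighted Gini index, two trees count as "the same" when they put the same tuples on the left side, the watermark is assumed to sit in the least significant bit so values are shifted in steps of 2, and the preset constraint equation is replaced by a simple round limit. The names `gini`, `stump` and `repair` are hypothetical.

```python
def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(y) / n) ** 2 for y in set(labels))


def stump(rows):
    """Depth-1 stand-in for a decision tree: (split value, ids on the left)."""
    values = sorted({r["value"] for r in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for r in rows if r["value"] <= t]
        right = [r for r in rows if r["value"] > t]
        score = (len(left) * gini([r["label"] for r in left])
                 + len(right) * gini([r["label"] for r in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, t, frozenset(r["id"] for r in left))
    if best is None:
        raise ValueError("need at least two distinct values")
    return best[1], best[2]


def repair(original, watermarked, max_rounds=10):
    """Shift watermarked values until the stump matches the original one."""
    split, left_ids = stump(original)
    current = [dict(r) for r in watermarked]
    for _ in range(max_rounds):
        if stump(current)[1] == left_ids:
            return current  # third tree equals first tree (same partition)
        for r in current:
            # Move tuples that crossed the original split back by 2,
            # which keeps the assumed least-significant watermark bit.
            if r["id"] in left_ids and r["value"] > split:
                r["value"] -= 2
            elif r["id"] not in left_ids and r["value"] <= split:
                r["value"] += 2
    raise RuntimeError("no matching tree within max_rounds")
```

For example, with values 10, 12, 14, 15, 17, 19 (labels A, A, A, B, B, B) and a watermark that swaps 14 and 15, one repair round yields 10, 12, 13, 16, 17, 19, which restores the original split while leaving every least significant bit as embedded.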
4. The method for processing the watermark information of the database according to claim 3, wherein the first segmentation value divides the preset database into two database segments; the second segmentation value divides the watermark database into two database segments;
determining a modification direction of second data in the watermark database according to the first segmentation value and the second segmentation value comprises:
determining a first index and a second index of the first segmentation value in the preset database;
determining a third index and a fourth index of the second segmentation value in the watermark database; wherein each index represents the probability that a randomly selected data item in the database segment is misclassified;
calculating the difference between the first index and the third index to obtain a first difference;
calculating the difference between the second index and the fourth index to obtain a second difference;
and comparing the first difference with the second difference to determine the modification direction of the second data in the watermark database.
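The index comparison of claim 4 can be sketched as follows, reading the index as the Gini index of each database segment. The sketch assumes that the first and second indices belong to the left and right segments of the preset database and the third and fourth to those of the watermark database. The final decision rule (move labels out of the segment whose index changed more) is an assumption made only for illustration, and `gini`, `segment_indices` and `modification_direction` are hypothetical names.

```python
def gini(labels):
    """Probability that a randomly drawn item of the segment is misclassified."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def segment_indices(rows, split, attr="value", label="label"):
    """Gini index of the two segments produced by a segmentation value."""
    left = [r[label] for r in rows if r[attr] <= split]
    right = [r[label] for r in rows if r[attr] > split]
    return gini(left), gini(right)


def modification_direction(original, watermarked, split_orig, split_wm):
    g1, g2 = segment_indices(original, split_orig)   # first and second index
    g3, g4 = segment_indices(watermarked, split_wm)  # third and fourth index
    first_difference = g1 - g3
    second_difference = g2 - g4
    # Assumed rule: correct the segment whose index drifted further.
    if abs(first_difference) >= abs(second_difference):
        return "left_to_right"
    return "right_to_left"
```

With three tuples labelled A below the split and three labelled B above it, a watermark that pushes one A tuple across the split leaves the left index at 0 and raises the right index to 0.375, so this rule returns `"right_to_left"`.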
5. The method for processing the watermark information of the database according to claim 3, wherein the step of modifying the second data in the watermark database according to a preset constraint equation and the modification direction to obtain a modified watermark database comprises:
acquiring a first variation of the first segmentation value in the watermark database and a second variation of the second segmentation value in the watermark database;
obtaining a third difference according to the first variation and the second variation;
and modifying, according to the third difference and subject to the preset constraint equation, the second data in the watermark database along the modification direction to obtain a modified watermark database.
6. The method for processing watermark information of a database according to claim 1, further comprising:
and extracting the watermark from the target watermark database to obtain target watermark information.
7. The method for processing watermark information of a database according to claim 6, wherein extracting a watermark from the target watermark database to obtain target watermark information comprises:
performing hash grouping on watermark data in a target watermark database to obtain a plurality of target groups;
and watermark extraction is carried out on the watermark data in each target group to obtain target watermark information.
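The extraction steps of claim 7 can be sketched in the same spirit. The sketch assumes that the same keyed hash used at embedding time is available, that each bit is stored in the least significant bit of an integer attribute, and that the bit of a group is recovered by majority vote; none of these details is stated in the claim, and `group_index`, `extract`, `secret`, `id` and `value` are hypothetical names.

```python
import hashlib
from collections import Counter


def group_index(key, secret, num_groups):
    """Hash a tuple's key value to a group number (SHA-256 is an assumption)."""
    digest = hashlib.sha256(f"{secret}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_groups


def extract(rows, n, secret, attr="value"):
    """Recover n watermark bits by majority vote inside each hash group.

    A group that is empty or tied yields None instead of a guessed bit.
    """
    votes = [Counter() for _ in range(n)]
    for row in rows:
        i = group_index(row["id"], secret, n)
        votes[i][row[attr] & 1] += 1
    bits = []
    for counter in votes:
        if counter[0] == counter[1]:
            bits.append(None)
        else:
            bits.append(1 if counter[1] > counter[0] else 0)
    return bits
```

Majority voting is chosen here so that a few modified tuples in a group do not change the extracted bit; other decision rules would fit the claim equally well.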
8. An apparatus for processing watermark information of a database, comprising:
the generating module is used for generating watermark information;
the processing module is used for embedding the watermark information into a preset database to obtain a watermark database; generating a first decision tree according to first data in the preset database and generating a second decision tree according to second data in the watermark database; modifying second data in the watermark database according to the first decision tree and the second decision tree to obtain a target watermark database; wherein a third decision tree generated from third data in the target watermark database is the same as the first decision tree.
9. A computing device, comprising: a processor, a memory storing a computer program which, when executed by the processor, performs the method of any of claims 1 to 7.
10. A computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the method of any one of claims 1 to 7.
CN202211417478.5A 2022-11-14 2022-11-14 Method, device and equipment for processing watermark information of database Active CN115455383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211417478.5A CN115455383B (en) 2022-11-14 2022-11-14 Method, device and equipment for processing watermark information of database

Publications (2)

Publication Number Publication Date
CN115455383A true CN115455383A (en) 2022-12-09
CN115455383B CN115455383B (en) 2023-03-24

Family

ID=84295622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211417478.5A Active CN115455383B (en) 2022-11-14 2022-11-14 Method, device and equipment for processing watermark information of database

Country Status (1)

Country Link
CN (1) CN115455383B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101217A (en) * 2013-03-15 2018-12-28 先进元素科技公司 Method and system for purposefully calculating
US20190220873A1 (en) * 2018-01-15 2019-07-18 The Nielsen Company (Us), Llc Methods and apparatus for campaign mapping for total audience measurement
CN111160335A (en) * 2020-01-02 2020-05-15 腾讯科技(深圳)有限公司 Image watermarking processing method and device based on artificial intelligence and electronic equipment
CN112559985A (en) * 2020-12-22 2021-03-26 深圳昂楷科技有限公司 Watermark embedding and extracting method
CN114611077A (en) * 2022-03-23 2022-06-10 浙江电力交易中心有限公司 Self-adaptive selection method, system and device for digital watermarks of database and storage medium
CN115114598A (en) * 2022-06-28 2022-09-27 上海艺赛旗软件股份有限公司 Watermark generation method, and method and device for file tracing by using watermark

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
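周钢 (Zhou Gang) et al.: "Research on a zero-watermark model for relational databases based on an improved C4.5 algorithm", 《计算机应用与软件》 (Computer Applications and Software) *
顾力平 (Gu Liping) et al.: "Research on a decision-tree-based fragile watermarking algorithm for databases", 《舰船电子工程》 (Ship Electronic Engineering) *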

Also Published As

Publication number Publication date
CN115455383B (en) 2023-03-24

Similar Documents

Publication Publication Date Title
CN110532353B (en) Text entity matching method, system and device based on deep learning
US7730037B2 (en) Fragile watermarks
US7894630B2 (en) Tamper-resistant text stream watermarking
EP2693356B1 (en) Detecting pirated applications
CN105653984B (en) File fingerprint method of calibration and device
Baehr et al. Machine learning and structural characteristics for reverse engineering
KR20070086522A (en) Watermarking computer code by equivalent mathematical expressions
Oosterwijk et al. Optimal suspicion functions for Tardos traitor tracing schemes
Zhao et al. Towards graph watermarks
US8661559B2 (en) Software control flow watermarking
Li et al. Modelling features-based birthmarks for security of end-to-end communication system
CN115455383B (en) Method, device and equipment for processing watermark information of database
Hadler et al. An improved version of a tool mark comparison algorithm
CN106650504B (en) A kind of abstract extraction method and detection method for Web page face data
Koch et al. Toward the detection of polyglot files
Bento et al. Full characterization of a class of graphs tailored for software watermarking
CN110147516A (en) The intelligent identification Method and relevant device of front-end code in Pages Design
CN109241706A (en) Software plagiarism detection method based on static birthmark
JP3651777B2 (en) Digital watermark system, digital watermark analysis apparatus, digital watermark analysis method, and recording medium
CN115310087A (en) Website backdoor detection method and system based on abstract syntax tree
Yuan et al. Verify a valid message in single tuple: A watermarking technique for relational database
CN109063097B (en) Data comparison and consensus method based on block chain
Schaathun Fighting two pirates
Li Searching and extracting digital image evidence
CN117220911B (en) Industrial control safety audit system based on protocol depth analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant