CN117708779A - Data watermarking processing method, tracing method and storage medium - Google Patents

Data watermarking processing method, tracing method and storage medium

Info

Publication number
CN117708779A
Authority
CN
China
Prior art keywords
data
watermark
tracing
check code
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410160736.9A
Other languages
Chinese (zh)
Other versions
CN117708779B (en
Inventor
肖艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Hongshu Technology Co ltd
Original Assignee
Guangdong Hongshu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Hongshu Technology Co ltd filed Critical Guangdong Hongshu Technology Co ltd
Priority to CN202410160736.9A priority Critical patent/CN117708779B/en
Publication of CN117708779A publication Critical patent/CN117708779A/en
Application granted granted Critical
Publication of CN117708779B publication Critical patent/CN117708779B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Editing Of Facsimile Originals (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a data watermarking processing method, a tracing method and a storage medium. The method is applied to a data sharing exchange scenario and comprises the following steps: when a data sharing instruction is received, acquiring an original data set and operation user information; adding a watermark of the selected type to the original data set according to a watermark adding strategy to generate a data set carrying the data watermark, wherein the watermark adding strategy determines, according to the intended use of the shared data indicated by the operation user information, whether to add an encrypted watermark or a hidden watermark; and sharing the data set with the corresponding shared users through a data sharing interface. The invention requires neither a large data volume nor the embedding of a large amount of information, and improves the security and imperceptibility of the data watermark.

Description

Data watermarking processing method, tracing method and storage medium
Technical Field
The present invention relates to the field of data security technologies, and in particular, to a data watermarking method, a tracing method, and a storage medium.
Background
The data watermarking technologies currently on the market mainly comprise pseudo-row watermarks, pseudo-column watermarks, data deformation watermarks, fingerprint watermarks and the like. They are mainly applied to database watermarking and tracing scenarios, and their security can be guaranteed only when the data volume is large enough. In data security sharing exchange scenarios such as API data sharing and data file downloading, where only a small amount of data is extracted, these watermarking technologies have obvious limitations, described as follows:
(1) the pseudo-row watermark generates rows that imitate the original row data by means of field regular expressions or data attribute rules, and inserts them into the original data set in a certain proportion to produce a data set carrying the pseudo-row watermark. Its security depends on the data set having enough rows; in a data exchange scenario the shared data volume is sometimes small, and if only one record needs to be shared, the pseudo-row watermark is no longer applicable;
(2) the pseudo-column watermark generates a column that imitates a user-provided sample or the field values in the data set by means of field regular expressions or data attribute rules, and inserts it into the data set to produce a data set carrying the pseudo-column watermark. Its security depends on the data set having enough columns; in a data exchange scenario the shared data volume is sometimes small, and if only one column needs to be shared, the pseudo-column watermark is no longer applicable;
(3) the data deformation watermark is a watermark embedding method aimed at desensitizing sensitive data, and it changes the original data considerably. In a data exchange scenario, if the shared data must be real data for data verification, data analysis, value mining and the like, the data deformation watermark is no longer applicable;
(4) the fingerprint watermark calculates a message digest (such as MD5 or SM3) of row and column data as fingerprint information; if a column or row in the data set is updated, the fingerprint watermark fails.
In short, these watermarks require a sufficiently large data volume, embed a large amount of information, and require substantial changes to the original data. The above technologies therefore have obvious limitations in data security sharing exchange scenarios.
Disclosure of Invention
According to one aspect of the invention, a data watermarking processing method, a data watermark tracing method, a data watermarking device and a storage medium are provided, which require neither a large data volume nor the embedding of a large amount of information, and which improve the security and imperceptibility of the data watermark.
To solve the above technical problem, a first aspect of the present invention discloses a data watermarking method, including:
when a data sharing instruction is received, acquiring an original data set and operation user information;
adding a watermark of the selected type to the original data set according to a watermark adding strategy to generate a data set carrying the data watermark; the watermark adding strategy determines, according to the intended use of the shared data indicated by the operation user information, whether to add an encrypted watermark or a hidden watermark;
and carrying out sharing processing on the corresponding shared users through the data sharing interface.
In some embodiments, adding an encrypted watermark or a hidden watermark according to a watermarking strategy generates a dataset carrying a data watermark, comprising:
acquiring operation user information, determining application of shared data, and determining to add an encrypted watermark or hide the watermark according to a watermark adding strategy;
reading an original data set, and preprocessing the original data set to obtain first mark data;
constructing encryption parameters according to the operation user information;
and calculating a check code according to the type of the added watermark, and combining the check code and the encryption parameter to generate a data set carrying the data watermark.
In some embodiments, the pre-treatment comprises: extracting numbers and letters in the original data set;
constructing encryption parameters according to the operation user information, including: and acquiring ID information of the operation user, generating an operation ID, and splicing the first marking data and the operation ID to generate encryption parameters.
In some embodiments, when the watermark is an encrypted watermark, calculating a check code according to the type of the added watermark, and combining the check code with the encryption parameter, generating the data set carrying the data watermark includes:
calculating a first check code according to the encryption parameter, and splicing the first check code and the encryption parameter to generate first check data;
generating a second check code according to the first check data, and splicing the second check code with the first check data to generate second check data;
removing the operation ID from the second check data and converting the result into a preset format to generate a data set carrying the data watermark;
or, when the watermark is a hidden watermark, calculating a check code according to the type of the added watermark and combining the check code with the first marked data to generate a data set carrying the data watermark comprises:
calculating a first check code according to the encryption parameter, encoding the first check code according to an encoding rule to generate invisible characters, and splicing the invisible characters and the first marked data to generate a data set carrying the data watermark.
In some embodiments, the preprocessing further comprises:
when the watermark is an encrypted watermark, removing the last two characters in the original data set;
the intended use of the shared data is characterized by whether the shared data needs to retain the statistical characteristics of the original data: if so, the hidden watermark is selected; if not, the encrypted watermark is selected.
In a second aspect, the present application provides a data tracing method, which applies a data watermarking method as described above, and further includes the following steps:
when a data tracing instruction is received, acquiring a data set carrying a data watermark and operating user information;
determining the type of the watermark according to the watermark adding strategy, and tracing the data set according to the operation user information and the type of the watermark;
and outputting a traceability report, wherein the traceability report comprises the original data set and user operation information.
In some embodiments, determining the kind of the watermark according to the watermark adding policy, performing a tracing process on the data set according to the operation user information and the kind of the watermark, including:
acquiring operation user information, determining application of shared data, and determining the type of watermark according to a watermark adding strategy;
reading a data set carrying a data watermark, and preprocessing the data set carrying the data watermark to obtain first tracing data;
determining a tracing parameter according to the information of the operation user;
and calculating a tracing check code according to the type of the watermark, verifying the tracing check code and tracing parameters, and performing credibility verification according to the user operation information.
In some embodiments, calculating the traceability check code according to the kind of the added watermark, and verifying the traceability check code and the traceability parameter includes:
when the type of the watermark is an encrypted watermark, acquiring a first tracing code and a second tracing code of first tracing data, calculating a first tracing check code, and judging whether the first tracing check code is consistent with the first tracing code in the first tracing data;
if they are consistent, splicing the first tracing check code and the tracing parameter;
calculating a second tracing check code according to the spliced tracing parameter, judging whether the second tracing check code is consistent with the second tracing code, and if so, performing credibility verification;
or when the type of the watermark is a hidden watermark, determining a first tracing code according to the data set carrying the data watermark and an invisible character reading rule;
calculating a first tracing check code according to the tracing parameter, and judging whether the first tracing check code is consistent with the first tracing code; if so, performing credibility verification.
In some embodiments, the credibility verification sequentially verifies the characters of the tracing parameter according to the user operation information, and succeeds when the number of successful verifications is higher than a preset number.
In a third aspect, the present invention also provides a computer-readable storage medium comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform a data watermarking method and/or a data watermarking trace-out method as described above.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a data watermarking processing method, a tracing method and a storage medium, which effectively improve the safety and imperceptibility of data, accurately extract watermark identification information embedded in the data, and effectively prevent watermark misidentification caused by tampering or deletion of unauthorized users through data verification; the encrypted data is visually imperceptible and does not affect the quality and experience of the original content in most usage scenarios.
Drawings
FIG. 1 is a flow chart of a data watermarking method according to the present invention;
FIG. 2 is a schematic flow chart of an encryption watermarking method of the present invention;
FIG. 3 is a schematic flow chart of a hidden watermark processing method according to the present invention;
FIG. 4 is a schematic flow chart of a data watermark tracing method of the present invention;
FIG. 5 is a schematic flow chart of an encrypted watermark tracing method of the present invention;
fig. 6 is a schematic flow chart of a hidden watermark tracing method of the present invention.
Detailed Description
For a better understanding and implementation, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules that are expressly listed or inherent to such process, method, article, or apparatus.
As shown in fig. 1 to 6, the invention provides a data watermarking processing method which requires neither a large data volume nor the embedding of a large amount of information, and which improves the security and imperceptibility of the data watermark.
Specifically, the method is applied to a data sharing exchange scene, which comprises API data sharing and data file sharing. The API data sharing service helps the enterprise build a secure channel for external applications to access the shared data. When an external application needs to access enterprise shared data, for example, a database table for sharing, a mode of calling through an API interface can be selected, so that the instantaneity of data acquisition is met.
The data file sharing service helps enterprises to construct a safe channel for users to access data resources, and the safe channel comprises data use requirements of business personnel, supervision departments and public inspection departments, and ensures the transmission and use safety after data distribution. When a user needs to access enterprise shared data, for example, a database table for sharing, a specified interface can be selected to meet the instantaneity of data acquisition in a file downloading mode.
In all of these data sharing exchange scenarios, data sets are queried on demand according to SQL conditions supplied by the user; the data volume is sometimes small, and the data is typically used for report display, development testing, data analysis and similar purposes. These scenarios place high demands on the security and imperceptibility of the data watermarking technology.
Based on the scene requirements, the method comprises the following steps:
Step S1, when a data sharing instruction is received, acquiring an original data set and operation user information. The original data set is the data to be shared, and the operation user information includes an operation ID generated when the user calls an API or downloads a shared file. The operation ID is of type long, auto-increments from 1, and is associated with the user operation audit log. A new operation ID is generated each time the user applies for data sharing.
Step S2, adding a watermark of the selected type to the original data set according to the watermark adding strategy to generate a data set carrying the data watermark; the watermark adding strategy determines, according to the intended use of the shared data indicated by the operation user information, whether to add an encrypted watermark or a hidden watermark.
In the data watermarking process, generally only one watermark type is selected, and different uses of the shared data require different watermark types. The deciding factor is whether the shared data needs to retain the statistical characteristics of the original data: if so, the hidden watermark is selected; if not, the encrypted watermark is selected.
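The selection rule above can be sketched in a few lines of Python. This is only an illustration: the names `WatermarkType` and `choose_watermark` and the boolean flag `keep_statistics` are hypothetical and do not appear in the patent.

```python
from enum import Enum


class WatermarkType(Enum):
    """The two watermark types named in the description (hypothetical enum)."""
    ENCRYPTED = "encrypted"
    HIDDEN = "hidden"


def choose_watermark(keep_statistics: bool) -> WatermarkType:
    """Pick the watermark type from the intended use of the shared data.

    keep_statistics is an assumed flag meaning "the recipient needs the
    original statistical characteristics", e.g. for data analysis.
    """
    # Hidden watermark leaves the original characters untouched;
    # encrypted watermark overwrites the last two characters.
    return WatermarkType.HIDDEN if keep_statistics else WatermarkType.ENCRYPTED
```

For example, a request for test data would map to `choose_watermark(False)` and receive the encrypted watermark.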
Further, adding an encrypted watermark or a hidden watermark according to a watermark adding strategy to generate a data set carrying a data watermark, comprising the following steps:
Step S21, acquiring operation user information, determining the intended use of the shared data, and determining according to the watermark adding strategy whether to add an encrypted watermark or a hidden watermark;
Step S22, reading an original data set, and preprocessing the original data set to obtain first marked data;
Step S23, constructing encryption parameters according to the operation user information;
Step S24, calculating a check code according to the type of the added watermark, and combining the check code with the encryption parameter to generate a data set carrying the data watermark.
As shown in fig. 2, the encryption process of the encrypted watermark is explained below in connection with the specific embodiment:
for example, time field samples: 2023-09-14 16:25:29.756
Step S21, acquiring operation user information, determining the intended use of the shared data, and determining according to the watermark adding strategy whether to add an encrypted watermark or a hidden watermark.
When the statistical characteristics of the original data do not need to be retained, for example when the data is used only for testing or report display, the encrypted watermark is selected. The encrypted watermark embeds the watermark identification by modifying characters at preset positions of a string in the original data set. In theory, the more characters are modified, the more secure the watermark, but the data must remain usable and should be modified as little as possible. In this application, therefore, the last 2 digits or letters of the string are modified. If special symbols occur among the last characters, they are set aside and added back after encryption. In addition, if the last character is a check digit of the data itself, the window is shifted forward by one position, and the 2nd and 3rd characters from the end are used instead.
Step S22, reading an original data set, and preprocessing the original data set to obtain first marked data. For example, for the time field sample 2023-09-14 16:25:29.756, the digits and letters are extracted from the original string C to give 20230914162529756, and the last two characters are removed to obtain the first marked data C1 = 202309141625297.
Step S23, encryption parameters are constructed according to the operation user information. The encryption parameter C2 is determined based on the operation ID in the operation user information.
Specifically, constructing encryption parameters according to the operation user information includes: and acquiring ID information of the operation user, generating an operation ID, and splicing the first marking data and the operation ID to generate encryption parameters. For example, ID is 119, the preprocessing result is 202309141625297, and the encryption parameter C2 is 119202309141625297.
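Steps S22 and S23 for the encrypted watermark can be sketched as follows. The function names are hypothetical, and the sketch assumes ASCII input and ignores the special case in which the last character is itself a check digit.

```python
def preprocess_encrypted(value: str) -> str:
    """Step S22: keep only ASCII digits and letters, then drop the last two."""
    alnum = "".join(ch for ch in value if ch.isascii() and ch.isalnum())
    return alnum[:-2]


def build_encryption_parameter(operation_id: int, marked: str) -> str:
    """Step S23: splice the operation ID in front of the first marked data."""
    return f"{operation_id}{marked}"


c1 = preprocess_encrypted("2023-09-14 16:25:29.756")  # first marked data C1
c2 = build_encryption_parameter(119, c1)              # encryption parameter C2
```

With the sample values from the text, `c1` is `202309141625297` and `c2` is `119202309141625297`.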
And step S24, calculating a check code according to the type of the added watermark, and combining the check code with the encryption parameter to generate a data set carrying the data watermark.
When the watermark is an encrypted watermark, a 2-bit check code needs to be calculated. The method comprises the following steps:
step S241, calculating a first check code k1 according to the encryption parameter, and splicing the first check code k1 and the encryption parameter C2 to generate first check data C3; by the Luhn algorithm, a Luhn check code of the encryption parameter 119202309141625297, that is, a first check code k1=5, is calculated, and after the first check code k1 is spliced to the encryption parameter C2, first check data C3 is obtained and is 1192023091416252975.
Step S242, generating a second check code according to the first check data, and splicing the second check code with the first check data to generate second check data. Calculating a Luhn check code of first check data C3, namely 1192023091416252975, namely a second check code k2=2 through a Luhn algorithm, and splicing the second check code k2 to the first check data C3 to obtain second check data 11920230914162529752;
Step S243, removing the operation ID from the second check data and converting the result into a preset format to generate a data set carrying the data watermark. The operation ID 119 is removed from the second check data 11920230914162529752, and the remainder is converted into a preset format, such as a normal time format or a preset log format, giving the data set carrying the data watermark: 2023-09-14 16:25:29.752.
In the watermarked value 2023-09-14 16:25:29.752, the last two digits, 5 and 2, are the check codes. The encrypted watermark modifies only 2 characters of a non-key part of the original string; after the watermark identification is embedded, the string length is unchanged and at most 2 characters of the original data differ, so the change is not perceived visually and the quality and experience of the original content are not affected in most usage scenarios. The embedding and extraction processes should be secure, effectively preventing unauthorized users from tampering with or deleting the watermark information.
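Steps S22 to S243 can be put together in one short Python sketch. It rests on several assumptions that are not stated in the patent: the field contains only digits besides separators, the check digits are written back into the positions of the last two alphanumeric characters, and the Luhn variant doubles every second digit starting from the second-to-last digit of the payload. That doubling convention was chosen because it reproduces the worked values k1 = 5 and k2 = 2. All function names are hypothetical.

```python
def check_digit(payload: str) -> int:
    """Luhn-style check digit (assumed variant, see lead-in)."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)            # raises ValueError for non-digit input
        if i % 2 == 1:         # 2nd, 4th, ... digit counted from the right
            d *= 2
            if d > 9:
                d -= 9         # same as adding the two digits of the product
        total += d
    return (10 - total % 10) % 10


def embed_encrypted(value: str, operation_id: int) -> str:
    """Overwrite the last two alphanumeric characters with k1 and k2."""
    positions = [i for i, ch in enumerate(value) if ch.isalnum()]
    c1 = "".join(value[i] for i in positions[:-2])   # first marked data
    c2 = f"{operation_id}{c1}"                       # encryption parameter
    k1 = check_digit(c2)                             # first check code
    k2 = check_digit(c2 + str(k1))                   # second check code
    chars = list(value)
    chars[positions[-2]] = str(k1)
    chars[positions[-1]] = str(k2)
    return "".join(chars)


watermarked = embed_encrypted("2023-09-14 16:25:29.756", 119)
```

Writing the check digits back by position keeps the separators in place, which stands in for the "convert into a preset format" step of the description.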
More specifically, the check code is calculated with the Luhn algorithm. For an m-digit number C, a weighted digit sum s is calculated, and the check code is t = 10 - s % 10.
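Since the formula for the sum s is not reproduced in this text, the following Python sketch is a reconstruction under assumptions rather than the patent's definition. It doubles every second digit counting from the second-to-last digit of the input, subtracting 9 from products above 9, which is the convention that reproduces all three check codes quoted in the embodiments (5, 2 and 2). The final `% 10` is an added assumption so that a sum ending in 0 yields the check code 0 instead of 10.

```python
def luhn_sum(number: str) -> int:
    """Weighted digit sum s (assumed weighting, see lead-in)."""
    s = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:      # every second digit from the right is doubled
            d = d * 2
            if d > 9:
                d -= 9
        s += d
    return s


def luhn_check_code(number: str) -> int:
    """Check code t = 10 - s % 10, reduced modulo 10 (assumption)."""
    return (10 - luhn_sum(number) % 10) % 10
```

For the encryption parameter 119202309141625297 the sum is 65, so the check code is 5, matching k1 in the encrypted-watermark example.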
Step S3, sharing the data set with the corresponding shared users through the data sharing interface.
The shared data is watermarked with the encrypted watermark, which modifies only 2 characters of the original string. The change is visually imperceptible as far as possible and largely preserves the quality and experience of the original content, and the data watermark can still be accurately detected and extracted if the data is leaked through cutting, copying and the like during circulation.
As shown in fig. 3, the following explains the encryption process of the hidden watermark in conjunction with the specific embodiment:
for example, telephone field: 020-38117659
Step S21, acquiring operation user information, determining the intended use of the shared data, and determining according to the watermark adding strategy whether to add an encrypted watermark or a hidden watermark;
when the statistical characteristics of the original data need to be retained, the hidden watermark is selected. The hidden watermark embeds the watermark identification as 4 invisible characters generated by binary encoding. After embedding, the characteristics of the original data are unchanged, the varchar string is only 4 bytes longer than the original, the data watermark is difficult to perceive visually, and the quality and experience of the original content are not affected.
Step S22, reading an original data set, and preprocessing the original data set to obtain first marked data. The digits of the string C in the original data set are extracted to obtain the first marked data C1, i.e. 02038117659.
Step S23, constructing encryption parameters according to the operation user information. Specifically, the ID information of the operation user is acquired, an operation ID is generated, and the operation ID and the first marked data are spliced to generate the encryption parameter. For example, if the operation ID is 10000234 and the first marked data C1 is 02038117659, the encryption parameter C2 is 1000023402038117659.
And step S24, calculating a check code according to the type of the added watermark, and combining the check code with the encryption parameter to generate a data set carrying the data watermark.
Specifically, a first check code k1 is calculated according to the encryption parameter, the first check code is encoded according to an encoding rule to generate invisible characters, and the invisible characters and the first marked data are spliced to generate a data set carrying the data watermark.
The Luhn check code of 1000023402038117659, i.e. the first check code k1 = 2, is calculated by the Luhn algorithm. According to the preset encoding rule, k1 = 2 is encoded as \0\0\2\0, and the resulting invisible characters are spliced after the first marked data to obtain the data set carrying the data watermark.
The coding rules are as follows:
check code   binary   invisible characters
0            0000     \0\0\0\0
1            0001     \0\0\0\2
2            0010     \0\0\2\0
3            0011     \0\0\2\2
4            0100     \0\2\0\0
5            0101     \0\2\0\2
6            0110     \0\2\2\0
7            0111     \0\2\2\2
8            1000     \2\0\0\0
9            1001     \2\0\0\2
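The coding rule maps each check code to its 4-bit binary form and then replaces bit 0 with \0 and bit 1 with \2. A minimal Python sketch of the hidden-watermark embedding follows. It assumes that \0 and \2 denote the control characters U+0000 and U+0002, that the invisible characters are appended to the original string, and that the Luhn variant doubles every second digit from the second-to-last digit (chosen because it reproduces k1 = 2). All function names are hypothetical.

```python
INVISIBLE = {"0": "\x00", "1": "\x02"}  # assumed meaning of \0 and \2


def check_digit(payload: str) -> int:
    """Luhn-style check digit (assumed variant, see lead-in)."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 1:
            d = d * 2 - 9 if d * 2 > 9 else d * 2
        total += d
    return (10 - total % 10) % 10


def encode_check_code(code: int) -> str:
    """Map a check code 0-9 to four invisible characters via 4-bit binary."""
    return "".join(INVISIBLE[bit] for bit in format(code, "04b"))


def embed_hidden(value: str, operation_id: int) -> str:
    """Append the encoded check code after the unchanged original value."""
    c1 = "".join(ch for ch in value if ch.isdigit())   # first marked data
    c2 = f"{operation_id}{c1}"                         # encryption parameter
    return value + encode_check_code(check_digit(c2))


watermarked = embed_hidden("020-38117659", 10000234)
```

The result is exactly four characters longer than the input, and stripping those four characters restores the original value.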
Step S3, sharing the data set with the corresponding shared users through the data sharing interface.
The hidden watermark only appends 4 invisible characters after the original data and does not change the original content. The watermark is also reversible: in application scenarios that require it, the original content can be completely restored without leaving any mark.
Based on the above data watermarking processing method, as shown in fig. 4, the application also provides a data watermark tracing method for tracing data sets produced by that method.
The method specifically comprises the following steps:
Step S1, when a data tracing instruction is received, acquiring a data set carrying a data watermark and operation user information.
Step S2, determining the type of the watermark according to the watermark adding strategy, and tracing the data set according to the operation user information and the type of the watermark.
Specifically, determining the type of the watermark according to the watermark adding strategy, and tracing the data set according to the operation user information and the type of the watermark, including:
Step S21, acquiring operation user information, determining the intended use of the shared data, and determining the type of the watermark according to the watermark adding strategy;
Step S22, reading a data set carrying a data watermark, and preprocessing the data set carrying the data watermark to obtain first tracing data;
Step S23, determining a tracing parameter according to the operation user information;
Step S24, calculating a tracing check code according to the type of the watermark, verifying the tracing check code and the tracing parameter, and performing credibility verification according to the user operation information.
Specifically, as shown in fig. 5, the following explanation is made in connection with the tracing process of the encrypted watermark according to the specific embodiment:
for example, time field: 2023-09-14 16:25:29.752
Step S21, acquiring operation user information, determining the intended use of the shared data, and determining the type of the watermark according to the watermark adding strategy;
Step S22, reading a data set carrying a data watermark, and preprocessing it to obtain first tracing data C1. The preprocessing extracts the digits of the watermarked string, for example 20230914162529752, and removes the last two digits to obtain the first tracing data C1, i.e. 202309141625297.
Step S23, determining the tracing parameter according to the operation user information. An operation ID is looked up in the operation audit log, and the operation ID and the first tracing data are spliced to generate the tracing parameter. For example, if the operation ID is 119, prepending it to the first tracing data C1 gives the tracing parameter C2 = 119202309141625297.
Step S24, calculating a tracing check code according to the type of the watermark, verifying the tracing check code and the tracing parameter, and performing credibility verification according to the user operation information.
When the watermark is an encrypted watermark, verifying the tracing check code and the tracing parameter includes:
acquiring the first tracing code and the second tracing code from the watermarked data, calculating a first tracing check code, and judging whether the first tracing check code is consistent with the first tracing code; if so, splicing the first tracing check code to the tracing parameter.
Specifically, the first tracing code k1 and the second tracing code k2 are the last two digits of the watermarked data, here 5 and 2. The Luhn check code of the tracing parameter C2, i.e. of 119202309141625297, is calculated first as the first tracing check code a1. If a1 != 5, a1 is inconsistent with the first tracing code k1; the next operation ID is obtained and the process returns to step S23. If a1 = 5, the first tracing check code a1 is spliced to the tracing parameter C2, giving 1192023091416252975;
a second tracing check code a2 is then calculated from the spliced tracing parameter, and it is judged whether a2 is consistent with the second tracing code k2; if so, credibility verification is performed.
Specifically, a2 is the Luhn check code of 1192023091416252975, i.e. of the tracing parameter C2 with a1 appended. If a2 != 2, the next operation ID is obtained and the process returns to step S23; if a2 = 2, the consistency check succeeds, the operation ID is hit, and credibility verification is performed.
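The search over operation IDs described above can be sketched in Python as follows. The sketch assumes a digit-only field, a caller-supplied iterable of candidate operation IDs (standing in for the operation audit log), and the same Luhn variant as the embedding examples, with doubling of every second digit from the second-to-last digit. Function names are hypothetical.

```python
from typing import Iterable, List


def check_digit(payload: str) -> int:
    """Luhn-style check digit (assumed variant, see lead-in)."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 1:
            d = d * 2 - 9 if d * 2 > 9 else d * 2
        total += d
    return (10 - total % 10) % 10


def trace_encrypted(watermarked: str, candidate_ids: Iterable[int]) -> List[int]:
    """Return the candidate operation IDs whose two check codes both match."""
    digits = "".join(ch for ch in watermarked if ch.isdigit())
    c1, k1, k2 = digits[:-2], int(digits[-2]), int(digits[-1])
    hits = []
    for op_id in candidate_ids:
        c2 = f"{op_id}{c1}"                 # tracing parameter for this ID
        a1 = check_digit(c2)
        if a1 != k1:
            continue                        # try the next operation ID
        a2 = check_digit(c2 + str(a1))
        if a2 == k2:
            hits.append(op_id)              # consistent: candidate is hit
    return hits
```

Because two check digits leave roughly a 1/100 chance that an unrelated ID also matches, a scan over many IDs can return more than one hit, which is why the description follows this step with a credibility verification.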
The credibility verification checks the characters of the second tracing data in sequence according to the user operation information, and succeeds when the number of successful verifications is higher than a preset number.
Specifically, each check code takes a value from 0 to 9, so the misrecognition rate with 1 check digit is 1/10, with 2 check digits it is 1/100, and so on; with 10 check digits it is 1/10^10. For an encrypted watermark, the confidence level M may be 5.
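The misrecognition arithmetic above can be reproduced with a short Python sketch. It assumes that each check digit is independent and uniformly distributed over 0-9, and that each encrypted-watermark record contributes two check digits (the first and second check codes); both assumptions are readings of this embodiment rather than statements taken from it, and the function names are hypothetical.

```python
from fractions import Fraction


def misrecognition_rate(check_digits: int) -> Fraction:
    """Probability that all check digits match by chance (1/10 per digit, assumed independent)."""
    return Fraction(1, 10) ** check_digits


def check_digits_collected(records: int, digits_per_record: int) -> int:
    """Total check digits verified across a number of distinct records."""
    return records * digits_per_record


# Encrypted watermark: M = 5 records, two check digits each (assumed).
n = check_digits_collected(5, 2)
print(n, misrecognition_rate(n))
```

Under these assumptions, M = 5 encrypted records yield 10 check digits, which matches the 10-digit case mentioned above.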
S3, outputting a tracing report, wherein the tracing report includes the original data set and the user operation information. The user operation audit log is retrieved according to the operation ID.
Specifically, as shown in Fig. 6, the tracing process for a hidden watermark is explained below with a specific embodiment:
Telephone field sample: 020-38117659, where the decoding result of the invisible characters is t1 = 2.
S21, acquiring operation user information, determining the application of the shared data, and determining the type of the watermark according to the watermark adding strategy;
S22, reading the data set carrying the data watermark and preprocessing it to obtain first tracing data; the preprocessing includes extracting the digits from the character string of the data set carrying the data watermark, for example 02038117659;
S23, determining the tracing parameter according to the operation user information: querying the operation audit log for an operation ID, acquiring the ID information of the operation user, generating the operation ID, and splicing the operation ID with the first tracing data to generate the tracing parameter. For example, if the operation ID is 10000234 and the preprocessing result is 02038117659, the tracing parameter is 1000023402038117659.
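Steps S22 and S23 can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the operation ID is placed before the extracted digits, as in the example value above; how invisible characters are actually encoded is not assumed here, only that they are not ASCII digits.

```python
import re


def extract_digits(field: str) -> str:
    """Keep only ASCII digits; separators and any invisible characters are dropped."""
    return re.sub(r"[^0-9]", "", field)


def build_trace_param(operation_id: str, first_trace_data: str) -> str:
    """Splice the operation ID with the preprocessed data (ID first, as in the example)."""
    return operation_id + first_trace_data


digits = extract_digits("020-38117659")
print(build_trace_param("10000234", digits))
```

For the telephone sample this prints 1000023402038117659, the tracing parameter used in the next step.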
S24, calculating a tracing check code according to the type of the watermark, verifying the tracing check code against the tracing parameter, and performing credibility verification according to the user operation information.
When the type of the watermark is a hidden watermark, verifying the tracing check code and the tracing parameter includes:
determining a first tracing code k1 according to the data set carrying the data watermark and the invisible character reading rule; calculating a first tracing check code a1 according to the tracing parameter, and judging whether a1 is consistent with the first tracing code k1; if they are consistent, performing credibility verification.
Specifically, the Luhn check code of the tracing parameter, namely 1000023402038117659, is calculated as the first tracing check code a1. If a1 = 2, the two are consistent; if a1 != 2, the next operation ID is obtained and step S23 is executed again.
The credibility verification checks the characters of the second tracing data in sequence according to the user operation information, and succeeds when the number of successful verifications is higher than a preset number. If M non-repeated data items hit the operation ID, the hidden watermark tracing is successful.
Specifically, each check code takes a value from 0 to 9, so the misrecognition rate with 1 check digit is 1/10, with 2 check digits it is 1/100, and so on; with 10 check digits it is 1/10^10. For a hidden watermark, the confidence level M may be 10.
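The hidden-watermark tracing loop, including the threshold M, can be sketched in Python as follows. This is a sketch under assumptions: the function and parameter names are hypothetical, the invisible characters are taken as already decoded into k1, "at least M distinct records" is used as the success criterion, and the check-digit weighting is the one that reproduces the value 2 for the tracing parameter 1000023402038117659 (doubling starts at the second-rightmost payload digit, which differs from textbook Luhn check-digit generation).

```python
from typing import Iterable, Optional, Tuple


def luhn_check_digit(payload: str) -> int:
    """Check digit under the weighting assumed for this example."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 1:      # assumed convention: double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def trace_hidden(records: Iterable[Tuple[str, int]],
                 candidate_ids: Iterable[str],
                 m: int) -> Optional[str]:
    """Return the first operation ID hit by at least m distinct records, else None.

    Each record is (extracted digits, decoded first tracing code k1).
    """
    records = list(records)
    for op_id in candidate_ids:
        # A set keeps only non-repeated data items that hit this operation ID.
        hits = {digits for digits, k1 in records
                if luhn_check_digit(op_id + digits) == k1}
        if len(hits) >= m:
            return op_id
    return None
```

With the single telephone sample, `trace_hidden([("02038117659", 2)], ["10000233", "10000234"], 1)` skips the first candidate and returns `"10000234"`; in practice m would be set to the confidence level M and more records would be supplied.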
S3, outputting a tracing report, wherein the tracing report includes the original data set and the user operation information. The user operation audit log is retrieved according to the operation ID.
Besides the Luhn algorithm used in this embodiment, the check codes and tracing codes may also be calculated with the Verhoeff algorithm, the Damm algorithm, or other user-defined check algorithms; the present application places no limitation on this.
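As an illustration of swapping in another check algorithm, the following Python sketch computes a Damm check digit. It is not part of this application: the function name is hypothetical, and the table is a widely reproduced order-10 quasigroup table for the Damm algorithm, supplied here as an assumption.

```python
# Order-10 quasigroup table commonly used for the Damm algorithm (assumed, not from this application).
DAMM_TABLE = (
    (0, 3, 1, 7, 5, 9, 8, 6, 4, 2),
    (7, 0, 9, 2, 1, 5, 4, 8, 6, 3),
    (4, 2, 0, 6, 8, 7, 1, 3, 5, 9),
    (1, 7, 5, 0, 9, 8, 3, 4, 2, 6),
    (6, 1, 2, 3, 0, 4, 5, 9, 7, 8),
    (3, 6, 7, 4, 2, 0, 9, 5, 8, 1),
    (5, 8, 6, 9, 7, 2, 0, 1, 3, 4),
    (8, 9, 4, 5, 3, 6, 2, 0, 1, 7),
    (9, 4, 3, 8, 6, 1, 7, 2, 0, 5),
    (2, 5, 8, 1, 4, 3, 6, 7, 9, 0),
)


def damm_check_digit(payload: str) -> int:
    """Run the digits through the table; the final interim value is the check digit."""
    interim = 0
    for ch in payload:
        interim = DAMM_TABLE[interim][int(ch)]
    return interim


print(damm_check_digit("572"))
```

A payload with its Damm check digit appended evaluates to 0, so the same function serves for both generation and validation; any routine with the signature `str -> int` could replace the Luhn step in the same way.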
The present application provides a data watermarking processing method, a tracing method and a storage medium, which effectively improve the security and imperceptibility of the data watermark, accurately extract the watermark identification information embedded in the data, and, through data verification, effectively prevent watermark misidentification caused by tampering or deletion by unauthorized users; the watermarked data is visually imperceptible and does not affect the quality or experience of the original content in most usage scenarios.
Based on the same inventive concept, the present application also provides a computer-readable storage medium, comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform a data watermarking method and a tracing method as described above.
Embodiments of the present invention disclose a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the data watermarking method and the tracing method described above.
The embodiments described above are illustrative only, and the modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, may be located in one place, or may be distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied, in essence or in part, in the form of a software product that may be stored in a computer-readable storage medium, including read-only memory (ROM), random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
Finally, it should be noted that the embodiments disclosed above are only preferred embodiments of the present invention and are intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the various embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A data watermarking method, characterized in that it is applied to a data sharing exchange scenario, the method comprising the steps of:
when a data sharing instruction is received, acquiring an original data set and operation user information;
adding a watermark of the determined type to the original data set according to the watermark adding strategy to generate a data set carrying the data watermark; the watermark adding strategy is used for determining whether to add an encrypted watermark or a hidden watermark according to the application of the shared data indicated by the operation user information;
and carrying out sharing processing on the corresponding shared users through the data sharing interface.
2. A data watermarking method according to claim 1, wherein adding an encrypted watermark or a hidden watermark according to a watermarking strategy generates a data set carrying the data watermark, comprising:
acquiring operation user information, determining the application of the shared data, and determining whether to add an encrypted watermark or a hidden watermark according to the watermark adding strategy;
reading an original data set, and preprocessing the original data set to obtain first mark data;
constructing encryption parameters according to the operation user information;
and calculating a check code according to the type of the added watermark, and combining the check code and the encryption parameter to generate a data set carrying the data watermark.
3. A data watermarking method according to claim 2, wherein the pre-processing comprises: extracting numbers and letters in the original data set;
constructing encryption parameters according to the operation user information, including: and acquiring ID information of the operation user, generating an operation ID, and splicing the first marking data and the operation ID to generate encryption parameters.
4. A data watermarking method according to claim 3, wherein when the watermark is an encrypted watermark, calculating a check code according to the type of added watermark, and combining the check code with encryption parameters, generating a data set carrying the data watermark comprises:
calculating a first check code according to the encryption parameter, and splicing the first check code and the encryption parameter to generate first check data;
generating a second check code according to the first check data, and splicing the second check code with the first check data to generate second check data;
removing the operation ID from the second check data and converting the result into a preset format to generate a data set carrying the data watermark;
or, when the watermark is a hidden watermark, calculating a check code according to the type of the added watermark, combining the check code with the second marked data, and generating a data set carrying the data watermark comprises:
and calculating a first check code according to the encryption parameters, encoding the first check code according to an encoding rule to generate invisible characters, and splicing the invisible characters and first marked data to generate a data set carrying a data watermark.
5. A method of watermarking data according to claim 4, wherein the preprocessing further comprises:
when the watermark is an encrypted watermark, removing the last two characters in the original data set;
the application of the shared data indicates whether the shared data needs to retain the statistical characteristics of the original data; if so, the hidden watermark is selected, and if not, the encrypted watermark is selected.
6. A data watermark tracing method, which is characterized in that the method is applied to the data watermark processing method as claimed in any one of claims 1 to 5, and further comprising the following steps:
when a data tracing instruction is received, acquiring a data set carrying a data watermark and operating user information;
determining the type of the watermark according to the watermark adding strategy, and tracing the data set according to the operation user information and the type of the watermark;
and outputting a traceability report, wherein the traceability report comprises the original data set and user operation information.
7. The method for tracing a watermark of data according to claim 6, wherein determining a watermark type according to a watermark adding policy, tracing the data set according to the operation user information and the watermark type, comprises:
acquiring operation user information, determining application of shared data, and determining the type of watermark according to a watermark adding strategy;
reading a data set carrying a data watermark, and preprocessing the data set carrying the data watermark to obtain first tracing data;
determining a tracing parameter according to the information of the operation user;
and calculating a tracing check code according to the type of the watermark, verifying the tracing check code and tracing parameters, and performing credibility verification according to the user operation information.
8. The method according to claim 7, wherein calculating a tracing check code according to the kind of the added watermark, and verifying the tracing check code and tracing parameters includes:
when the type of the watermark is an encrypted watermark, acquiring a first tracing code and a second tracing code of the first tracing data, calculating a first tracing check code, and judging whether the first tracing check code is consistent with the first tracing code in the first tracing data;
if they are consistent, splicing the first tracing check code with the tracing parameter;
calculating a second tracing check code according to the first tracing check code and the tracing parameter, judging whether the second tracing check code is consistent with the second tracing code, and if so, performing credibility verification;
or, when the type of the watermark is a hidden watermark, determining a first tracing code according to the data set carrying the data watermark and an invisible character reading rule;
calculating a first tracing check code according to the second tracing data, and judging whether the first tracing check code is consistent with the first tracing code; and if they are consistent, performing credibility verification.
9. The method for tracing a watermark of data according to claim 7, wherein the credibility verification is to verify the characters of the tracing parameter in sequence according to the user operation information, and the credibility verification is successful when the number of successful verifications is higher than a preset number.
10. A computer-readable storage medium, comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the data watermarking method according to any one of claims 1 to 5 and/or the data watermark tracing method according to any one of claims 6 to 9.
CN202410160736.9A 2024-02-05 2024-02-05 Data watermarking processing method, tracing method and storage medium Active CN117708779B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410160736.9A CN117708779B (en) 2024-02-05 2024-02-05 Data watermarking processing method, tracing method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410160736.9A CN117708779B (en) 2024-02-05 2024-02-05 Data watermarking processing method, tracing method and storage medium

Publications (2)

Publication Number Publication Date
CN117708779A true CN117708779A (en) 2024-03-15
CN117708779B CN117708779B (en) 2024-06-07

Family

ID=90157292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410160736.9A Active CN117708779B (en) 2024-02-05 2024-02-05 Data watermarking processing method, tracing method and storage medium

Country Status (1)

Country Link
CN (1) CN117708779B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850765A (en) * 2014-02-19 2015-08-19 中国移动通信集团福建有限公司 Watermark processing method, device and system
CN107992727A (en) * 2017-12-11 2018-05-04 北京安华金和科技有限公司 A kind of watermark processing realized based on legacy data deformation and data source tracing method
US20200125699A1 (en) * 2018-10-23 2020-04-23 Alibaba Group Holding Limited Data Processing, Watermark Embedding and Watermark Extraction
CN112650992A (en) * 2020-12-21 2021-04-13 江苏群杰物联科技有限公司 Document tracking encryption method based on digital watermark
CN112751823A (en) * 2020-11-11 2021-05-04 国网江苏省电力有限公司营销服务中心 Outgoing data generation method, outgoing safety control method and system
CN113536247A (en) * 2021-07-21 2021-10-22 中数通信息有限公司 Traceable information hidden data watermarking method with MD5 feature mobile phone number
CN114626968A (en) * 2022-03-30 2022-06-14 北京沃东天骏信息技术有限公司 Watermark embedding method, watermark extracting method and device
CN115114599A (en) * 2022-08-12 2022-09-27 南京星环智能科技有限公司 Method, device and equipment for processing database watermark and storage medium
CN116702103A (en) * 2023-06-19 2023-09-05 建信金融科技有限责任公司 Database watermark processing method, database watermark tracing method and device

Also Published As

Publication number Publication date
CN117708779B (en) 2024-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant