WO2024066271A1

WO2024066271A1 - Database watermark embedding method and apparatus, database watermark tracing method and apparatus, and electronic device

Info

Publication number: WO2024066271A1
Application number: PCT/CN2023/085945
Authority: WO
Inventors: 刘睿民; 丁若冰; 张锦
Original assignee: 北京柏睿数据技术股份有限公司
Priority date: 2022-09-27
Filing date: 2023-04-03
Publication date: 2024-04-04
Also published as: CN115495439B; CN115495439A

Abstract

Disclosed in the present invention are a database watermark embedding method and apparatus, a database watermark tracing method and apparatus, and an electronic device. The embedding method comprises: converting data to be processed into preliminarily coded data according to a preset coding rule; converting each character in the preliminarily coded data into a binary number of a preset length and obtaining binary data; mapping the binary data to zero-width character string data according to a preset mapping relation table; separately adding a preset zero-width character string before and after the zero-width character string data to obtain final coded data; and embedding the final coded data as a database watermark into an embedding position corresponding to the data to be processed. The preset length is not smaller than the length of an original binary number corresponding to the character, and the preset mapping relation table is determined according to the mapping relations between different binary numbers of a preset bit and different zero-width character strings. Therefore, efficient database watermark embedding is achieved while data presentation is not affected.

Description

Database watermark embedding method, tracing method, device and electronic device

Technical Field

The present application relates to the field of database technology, and more specifically, to a database watermark embedding method, a source tracing method, a device and an electronic device.

Background Art

Database watermarking technology uses covert means to embed watermark information, such as copyright descriptions and user identities, into table data and file data without affecting the use of the original data. It addresses the technical problem that leaked data cannot be traced during data sharing, distribution, and use, which protects data security in those scenarios and enhances the value of data sharing.

In the prior art, the algorithm for implementing database watermarking is usually based on the different types of data. Different transformation algorithms are used to make imperceptible transformations on the data, thereby hiding the watermark data in the specific data and completing the embedding of the database watermark. When tracing the data, the backtracking algorithm corresponding to the algorithm type is used to restore the watermark information, thereby realizing data tracing in cases of data leakage.

Although implementing database watermarks with different transformation and backtracking algorithms solves the technical problem that leaked data cannot be traced, it requires a different transformation algorithm for each type of data and therefore has poor generality. In addition, once the data has been transformed, the watermark data is inserted into the data and becomes a component of it, so the data value changes. Users cannot read the value directly and must apply the backtracking method to read it, which makes the calculation relatively complicated. At the same time, these algorithms consume part of the computing resources of the database system, which reduces database performance and seriously affects data display.

Therefore, how to achieve efficient database watermark embedding without affecting data display is a technical problem that needs to be solved.

Summary of the invention

The embodiments of the present application provide a database watermark embedding method, a traceability method, a device and an electronic device, which are used to achieve efficient database watermark embedding without affecting data display.

In a first aspect, a method for embedding a database watermark is provided, the method comprising:

Converting the data to be processed into preliminary coded data according to a preset coding rule;

Converting each character in the preliminary coded data into a binary number of a preset length to obtain binary data;

Mapping the binary data into zero-width character string data according to a preset mapping relationship table;

Adding preset zero-width character strings before and after the zero-width character string data to obtain final encoded data;

Embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed;

The preset length is not less than the length of the original binary number corresponding to the character, and the preset mapping relationship table is determined according to the mapping relationship between different binary numbers with preset bit numbers and different zero-width character strings.

In some embodiments, before embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed, the method further includes:

Determining the embedding position according to the tag mark of the data to be processed;

The tag mark is determined in advance according to the type, length, position and attribute of the data to be processed.

In some embodiments, the binary data is mapped to zero-width character string data according to a preset mapping relationship table, specifically:

Dividing the binary data into a plurality of groups of sub-data according to the preset number of bits;

Each group of the sub-data is mapped into a zero-width character string according to the preset mapping relationship table to obtain the zero-width character string data.

In some embodiments, each character in the preliminary encoded data is converted into a binary number of a preset length to obtain binary data, specifically:

Performing binary conversion on each character in sequence to obtain the original binary number;

If the length of the original binary number is less than the preset length, padding zeros before the highest bit of the original binary number so that the length of the original binary number reaches the preset length;

The binary data is obtained according to a binary number of a preset length corresponding to each of the characters.

In some embodiments, the preset encoding rule includes an encoding rule corresponding to hexadecimal Unicode encoding, or decimal Unicode encoding, or hexadecimal GBK encoding, or decimal GBK encoding.

In a second aspect, a method for tracing the database watermark as described in the first aspect is provided, the method comprising:

Determining the final encoded data according to the embedding position;

Removing the preset zero-width character strings before and after the final encoded data and obtaining the zero-width character string data;

Mapping the zero-width character string data to the binary data according to the preset mapping relationship table;

Dividing the binary data into multiple groups of binary numbers according to the preset length, and converting each group of binary numbers into each character of the preliminary coded data;

Each of the characters is converted into the data to be processed according to the preset encoding rule, and the data to be processed is used as the traceability result data.

In some embodiments, the embedding position is determined by a tag mark, and the tag mark is determined in advance according to the type, length, position and attribute of the data to be processed.

In a third aspect, a database watermark embedding device is provided, the device comprising:

A first conversion module, used to convert the data to be processed into preliminary coded data according to a preset coding rule;

A second conversion module, used for converting each character in the preliminary coded data into a binary number of a preset length and obtaining binary data;

A first mapping module, used for mapping the binary data into zero-width character string data according to a preset mapping relationship table;

An adding module, used for adding a preset zero-width character string before and after the zero-width character string data respectively to obtain final encoded data;

An embedding module, used for embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed.

In a fourth aspect, a device for tracing the source of a database watermark as described in the third aspect is provided, the device comprising:

A determination module, used for determining the final encoded data according to the embedding position;

A removal module, used for removing the preset zero-width character string before and after the final encoded data and obtaining the zero-width character string data;

A second mapping module, used for mapping the zero-width character string data into the binary data according to the preset mapping relationship table;

A third conversion module, used for dividing the binary data into a plurality of groups of binary numbers according to the preset length, and converting each group of binary numbers into each character of the preliminary coded data;

The fourth conversion module is used to convert each of the characters into the data to be processed according to the preset encoding rule, and use the data to be processed as the tracing result data.

In a fifth aspect, an electronic device is provided, including:

Processor; and

A memory, configured to store executable instructions of the processor;

The processor is configured to execute the embedding method described in any one of the first aspects or the tracing method described in any one of the second aspects by executing the executable instructions.

By applying the above technical solution, the data to be processed is converted into preliminary coded data according to the preset coding rule; each character in the preliminary coded data is converted into a binary number of a preset length to obtain binary data; the binary data is mapped into zero-width character string data according to the preset mapping relationship table; a preset zero-width character string is added before and after the zero-width character string data to obtain the final coded data; and the final coded data is embedded as a database watermark at the embedding position corresponding to the data to be processed. The preset length is not less than the length of the original binary number corresponding to the character, and the preset mapping relationship table is determined according to the mapping relationship between different binary numbers of the preset number of bits and different zero-width character strings. Because the data to be processed is first uniformly converted into preliminary coded data, different types of data no longer need different algorithms, which improves generality and preserves database performance. Moreover, because the final coded data consists of zero-width characters, embedding the database watermark does not affect data display, so efficient database watermark embedding is achieved without affecting data display.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required for use in the description of the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application. For those skilled in the art, other drawings can be obtained based on these drawings without creative work.

FIG. 1 shows a schematic flow chart of a database watermark embedding method according to an embodiment of the present invention;

FIG. 2 shows a schematic flow chart of a database watermark tracing method according to an embodiment of the present invention;

FIG. 3 shows a schematic structural diagram of a database watermark embedding device according to an embodiment of the present invention;

FIG. 4 shows a schematic structural diagram of a database watermark tracing device according to an embodiment of the present invention;

FIG. 5 shows a block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application, without creative work, fall within the scope of protection of this application.

An embodiment of the present application provides a method for embedding a database watermark. As shown in FIG. 1, the method comprises the following steps:

Step S101, converting the data to be processed into preliminary coded data according to a preset coding rule.

In this embodiment, the data to be processed may be sensitive data specified by the user. The user may determine the sensitive data by defining keywords or metadata information and then matching the keywords or metadata information; or may define regular expressions according to the structural composition rules of sensitive data by studying the characteristics of sensitive data, and then determine the sensitive data by matching the regular expressions. The data to be processed may include a variety of characters, such as text, numbers, letters, punctuation marks, graphic symbols, etc. The preset encoding rule is a general encoding rule that can uniformly encode different types of characters in the data to be processed. The data to be processed can be converted into preliminary encoded data according to the preset encoding rule.
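The regular-expression route for identifying sensitive data can be sketched as follows. This is only an illustration: the pattern (an 11-digit mobile number) and the function name `find_sensitive` are assumptions chosen for the example and do not appear in the application.

```python
import re

# Hypothetical rule: an 11-digit mobile number starting with 1 and a
# second digit of 3-9. A real deployment would define its own patterns.
SENSITIVE_PATTERN = re.compile(r"\b1[3-9]\d{9}\b")


def find_sensitive(values):
    """Return the values that contain a match for the sensitive-data pattern."""
    return [v for v in values if SENSITIVE_PATTERN.search(v)]


rows = ["13812345678", "hello", "call 15900001111 now"]
print(find_sensitive(rows))
```

Values selected this way would then be passed on as the data to be processed.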

In order to reliably obtain preliminary encoded data, in some embodiments of the present application, the preset encoding rules include encoding rules corresponding to hexadecimal Unicode encoding, or decimal Unicode encoding, or hexadecimal GBK encoding, or decimal GBK encoding.

In this embodiment, Unicode is a unified character encoding scheme, developed by an international organization, that is intended to accommodate all of the world's characters and symbols. GBK (Chinese Internal Code Specification) is a variable-length encoding that uses one or two bytes per character: English characters use a single byte and are fully compatible with ASCII, while Chinese characters use two bytes. The preset encoding rule can be the encoding rule corresponding to decimal or hexadecimal Unicode encoding, or the encoding rule corresponding to decimal or hexadecimal GBK encoding.

Those skilled in the art may also adopt other types of preset coding rules according to actual needs, which does not affect the protection scope of the present application.
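As a minimal sketch of step S101 with the hexadecimal Unicode rule, each character can be written as a `\uXXXX` escape, which is the form used in the worked example later in this description. The function name `to_hex_unicode` is illustrative, and the sketch assumes every character lies in the Basic Multilingual Plane (code point at most FFFF).

```python
def to_hex_unicode(text: str) -> str:
    """Encode each character as a \\uXXXX escape (assumes BMP code points)."""
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError("this sketch only handles code points up to U+FFFF")
    # ord() gives the code point; 04x renders it as four lowercase hex digits.
    return "".join(f"\\u{ord(ch):04x}" for ch in text)


print(to_hex_unicode("用户一"))  # \u7528\u6237\u4e00
```

The result contains only ASCII characters, which is what allows the later steps to treat every character uniformly.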

Step S102, converting each character in the preliminary encoded data into a binary number of a preset length to obtain binary data.

In this embodiment, in order to facilitate computer processing, each character in the preliminary coded data needs to be converted into a binary number of a preset length to achieve normalization of the preliminary coded data. In order to ensure that each character in the preliminary coded data is normalized, the preset length is not less than the length of the original binary number corresponding to the character.

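In order to obtain accurate binary data, in some embodiments of the present application, each character in the preliminary encoded data is converted into a binary number of a preset length to obtain binary data, specifically:

Performing binary conversion on each character in sequence to obtain the original binary number;

If the length of the original binary number is less than the preset length, padding zeros before the highest bit of the original binary number so that the length of the original binary number reaches the preset length;

The binary data is obtained according to a binary number of a preset length corresponding to each of the characters.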

In this embodiment, each character is first converted into a binary number in sequence to obtain an original binary number corresponding to each character. The length of the original binary number may not reach the preset length. If the length of the original binary number is less than the preset length, zeros are added before the highest bit of the original binary number to make the length of the original binary number reach the preset length. When the binary numbers corresponding to each character are all of the preset length, binary data is formed according to the binary numbers of each preset length. For example, if the preset length is 8 bits and the original binary number is 6 bits, two zeros are added before the highest bit of the original binary number.

Optionally, the preset length may be 8 bits or 16 bits.
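The padding rule of step S102 can be sketched as below. The function name `to_fixed_width_bits` is illustrative; the check that rejects characters whose original binary number is longer than the preset length reflects the stated constraint that the preset length is not less than the original length.

```python
def to_fixed_width_bits(coded: str, preset_length: int = 8) -> str:
    """Convert each character to a binary number padded to preset_length bits."""
    chunks = []
    for ch in coded:
        raw = format(ord(ch), "b")  # original binary number, no padding
        if len(raw) > preset_length:
            raise ValueError(f"{ch!r} needs {len(raw)} bits, more than {preset_length}")
        chunks.append(raw.zfill(preset_length))  # pad zeros before the highest bit
    return "".join(chunks)


# '7' is 110111 (6 bits), so two zeros are added in front.
print(to_fixed_width_bits("7"))  # 00110111
```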

Step S103: Map the binary data into zero-width character string data according to a preset mapping relationship table.

In this embodiment, the preset mapping relationship table is determined according to the mapping relationship between different binary numbers of the preset number of bits and different zero-width character strings. A zero-width character string consists of zero-width characters, which are non-printing Unicode characters with a display width of 0. They are invisible but real characters, and they serve control functions in browsers and general text editors. Binary data can be mapped to zero-width character string data according to the preset mapping relationship table.

In a specific application scenario of this application, when the preset number of bits is 2, the preset mapping relationship table can be as shown in Table 1.

Table 1

Binary number    Zero-width character
00               \u200b
01               \u200c
10               \u200d
11               \u200e

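In order to accurately obtain zero-width character string data, in some embodiments of the present application, the binary data is mapped to zero-width character string data according to a preset mapping relationship table, specifically:

Dividing the binary data into a plurality of groups of sub-data according to the preset number of bits;

Each group of the sub-data is mapped into a zero-width character string according to the preset mapping relationship table to obtain the zero-width character string data.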

In this embodiment, the binary data is first grouped according to a preset number of bits to obtain multiple groups of sub-data, and then a preset mapping relationship table is queried according to each group of sub-data, and each zero-width character string is determined according to the query result, so as to map each group of sub-data to zero-width character string data.

Optionally, in some embodiments of the present application, the preset number of bits is 2, and those skilled in the art may also adopt other preset number of bits according to actual needs.
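A minimal sketch of the grouping and table lookup in step S103 follows, using the 2-bit correspondence stated in the application scenario of this description (00 to \u200b, 01 to \u200c, 10 to \u200d, 11 to \u200e). The names `ZERO_WIDTH_TABLE` and `bits_to_zero_width` are illustrative.

```python
# 2-bit groups mapped to zero-width characters, as given for Table 1.
ZERO_WIDTH_TABLE = {
    "00": "\u200b",
    "01": "\u200c",
    "10": "\u200d",
    "11": "\u200e",
}
GROUP_BITS = 2  # the preset number of bits


def bits_to_zero_width(bits: str) -> str:
    """Split bits into 2-bit groups and look each group up in the table."""
    if len(bits) % GROUP_BITS:
        raise ValueError("bit string length must be a multiple of the group size")
    groups = (bits[i:i + GROUP_BITS] for i in range(0, len(bits), GROUP_BITS))
    return "".join(ZERO_WIDTH_TABLE[g] for g in groups)


encoded = bits_to_zero_width("01011100")
print(len(encoded))  # 4
```

With a preset length of 8 bits and groups of 2 bits, every character of the preliminary coded data becomes four zero-width characters.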

Step S104, adding a preset zero-width character string before and after the zero-width character string data to obtain final encoded data.

In this embodiment, in order to distinguish the zero-width character string data from other data (such as the main data), preset zero-width character strings are added before and after the zero-width character string data as delimiters, and the final encoded data is obtained.

Those skilled in the art may set different preset zero-width character strings according to actual needs, which does not affect the protection scope of the present application.
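Step S104 can be sketched as below. The delimiter \uFEFF is the one used in the application scenario later in this description; the check that the payload contains only the four mapped zero-width characters is an added assumption, included so that the delimiter can never be confused with payload content. The names `DELIMITER` and `wrap` are illustrative.

```python
DELIMITER = "\ufeff"  # preset zero-width string used as prefix and suffix
PAYLOAD_ALPHABET = set("\u200b\u200c\u200d\u200e")


def wrap(zero_width_data: str) -> str:
    """Add the preset zero-width string before and after the payload."""
    if not set(zero_width_data) <= PAYLOAD_ALPHABET:
        raise ValueError("payload must contain only the mapped zero-width characters")
    return DELIMITER + zero_width_data + DELIMITER


final_coded = wrap("\u200c\u200c\u200e\u200b")
print(len(final_coded))  # 6
```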

Step S105: embed the final encoded data as a database watermark into an embedding position corresponding to the data to be processed.

In this embodiment, the data to be processed corresponds to an embedding position, which can be a position specified by the user. The final encoded data can be fixed or it can be the data to be processed itself, and the final encoded data can be embedded in the embedding position as a database watermark.

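In order to accurately embed the database watermark, in some embodiments of the present application, before embedding the final encoded data as the database watermark into the embedding position corresponding to the data to be processed, the method further includes:

Determining the embedding position according to the tag mark of the data to be processed;

The tag mark is determined in advance according to the type, length, position and attribute of the data to be processed.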

In this embodiment, the tag mark is determined in advance according to the type, length, position and attribute of the data to be processed, and the embedding position of the database watermark can be determined according to the tag mark. Optionally, in some embodiments of the present application, a hash algorithm can be used to process the type, length, position and attribute of the data to be processed, and the tag mark is obtained according to the processing result.
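One possible way to derive a tag mark with a hash algorithm is sketched below. The application does not name a specific hash function or say how the four properties are combined, so the use of SHA-256, the `|` separator, the sample values and the `registry` lookup are all assumptions made for illustration.

```python
import hashlib


def make_tag(data_type: str, length: int, position: str, attribute: str) -> str:
    """Hash the type, length, position and attribute into a tag mark."""
    material = "|".join([data_type, str(length), position, attribute])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Hypothetical example: a 3-character value in column users.name, row 42.
tag = make_tag("varchar", 3, "users.name#42", "sensitive")

# A hypothetical lookup from tag mark to embedding position.
registry = {tag: ("users", "name", 42)}
print(registry[tag])
```

Because the hash is deterministic, the same four properties always give the same tag mark, so the tracing side can recompute or be given the tag and look up the embedding position.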

Corresponding to the database watermark embedding method in the embodiments of the present application, the present application also proposes a database watermark tracing method. As shown in FIG. 2, the method comprises the following steps:

Step S201, determining the final encoded data according to the embedding position.

The embedding position can be obtained according to the tracing instruction input by the user.

Step S202, removing the preset zero-width character strings before and after the final encoded data and obtaining the zero-width character string data.

Step S203: Map the zero-width character string data to the binary data according to the preset mapping relationship table.

Step S204, dividing the binary data into multiple groups of binary numbers according to the preset length, and converting each group of binary numbers into each character of the preliminary coded data.

Step S205: convert each of the characters into the data to be processed according to the preset encoding rule, and use the data to be processed as the tracing result data.

In order to accurately determine the embedding position, in some embodiments of the present application, the embedding position is determined by a tag mark, and the tag mark is pre-determined according to the type, length, position and attribute of the data to be processed. The traceability instruction input by the user may include the tag mark.

By applying the above technical scheme, the final encoded data is determined according to the embedding position; the preset zero-width character strings before and after the final encoded data are removed to obtain zero-width character string data; the zero-width character string data is mapped to binary data according to a preset mapping relationship table; the binary data is divided into multiple groups of binary numbers according to a preset length, and each group of binary numbers is converted into each character of the preliminary encoded data; each character is converted into data to be processed according to a preset encoding rule, and the data to be processed is used as the traceability result data, so that the traceability of the database watermark can be achieved with only a simple mapping, and the high performance of the database is guaranteed.
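Steps S202 to S205 can be sketched as a single function, assuming the \uFEFF delimiter, the 2-bit mapping and the hexadecimal Unicode rule from the application scenario in this description. The names `trace`, `DELIM` and `REVERSE_TABLE` are illustrative.

```python
import re

DELIM = "\ufeff"
# Inverse of the 2-bit mapping table: zero-width character -> bit pair.
REVERSE_TABLE = {"\u200b": "00", "\u200c": "01", "\u200d": "10", "\u200e": "11"}


def trace(final_coded: str, preset_length: int = 8) -> str:
    """Recover the original data from the final encoded data."""
    if len(final_coded) < 2 or final_coded[0] != DELIM or final_coded[-1] != DELIM:
        raise ValueError("final encoded data must be wrapped in the delimiter")
    zero_width = final_coded[1:-1]                           # S202: strip prefix/suffix
    bits = "".join(REVERSE_TABLE[ch] for ch in zero_width)   # S203: table lookup
    coded = "".join(                                         # S204: one char per group
        chr(int(bits[i:i + preset_length], 2))
        for i in range(0, len(bits), preset_length)
    )
    # S205: turn each \uXXXX escape back into the character it names.
    return re.sub(r"\\u([0-9a-fA-F]{4})", lambda m: chr(int(m.group(1), 16)), coded)
```

Every step is a lookup or a fixed-width split, which is why tracing needs no type-specific backtracking algorithm.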

In order to further explain the technical idea of the present invention, the technical solution of the present invention is now described in combination with specific application scenarios.

The present application provides a method for embedding a database watermark, comprising the following steps:

Step S301, receiving the data to be processed R0, and encoding it according to the encoding rules corresponding to the hexadecimal Unicode encoding to obtain preliminary encoded data R1.

Specifically, each character in the data to be processed is converted into hexadecimal Unicode encoding, and the encoding rule can be shown in Table 2.

Table 2

Step S302: character normalization.

Convert each hexadecimal character in R1 into an 8-bit binary number. If the number is less than 8 bits, use 0 to fill the high bit to 8 bits to ensure that each character corresponds to an 8-bit binary number, and obtain binary data R2.

Step S303, using zero-width string encoding.

According to the corresponding relationship in Table 1, the data in R2 is converted into a zero-width string two bits at a time, where 00 is converted into \u200b, 01 is converted into \u200c, 10 is converted into \u200d, and 11 is converted into \u200e, to obtain zero-width string data R3.

Step S304, adding prefixes and suffixes.

Add the preset zero-width character string \uFEFF before and after R3, using \uFEFF to isolate R3, and obtain the final encoded data R4.

Step S305: embed R4 as a database watermark into an embedding position corresponding to the data to be processed.

The present application provides a method for tracing the source of a database watermark, comprising the following steps:

Step S401, determining final encoded data according to the embedding position.

Find the \uFEFF marker at the embedding position, extract the data starting from the first \uFEFF and ending at the next \uFEFF, and obtain the final encoded data R4.

Step S402, remove the prefix and suffix.

Remove the leading \uFEFF and trailing \uFEFF from R4 to obtain zero-width character string data R3.

Step S403, decoding the zero-width character string data.

According to the corresponding relationship in Table 1, R3 is converted into binary data R2.

Step S404, restoring to preliminary coded data.

Convert the binary data of R2 into one character every 8 bits to obtain the preliminary hexadecimal encoded data R1.

Step S405, convert R1 into the data to be processed R0, and use the data to be processed R0 as the traceability result data.

Specifically, the hexadecimal R1 is converted according to the Unicode comparison table to obtain R0.
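The extraction in step S401 can be sketched with a regular expression that looks for a run of the four mapped zero-width characters between two \uFEFF markers. The names `WATERMARK_RE`, `extract_watermark` and `strip_watermark`, and the sample field value, are illustrative assumptions.

```python
import re

# \uFEFF, then any number of U+200B..U+200E characters, then \uFEFF.
WATERMARK_RE = re.compile("\ufeff[\u200b-\u200e]*\ufeff")


def extract_watermark(field_value: str):
    """Return the final encoded data (delimiters included), or None."""
    match = WATERMARK_RE.search(field_value)
    return match.group(0) if match else None


def strip_watermark(field_value: str) -> str:
    """Return the visible data with any watermark removed."""
    return WATERMARK_RE.sub("", field_value)


stored = "Alice" + "\ufeff\u200c\u200b\ufeff"
print(strip_watermark(stored))         # Alice
print(len(extract_watermark(stored)))  # 4
```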

The following takes "用户一" ("User 1") as the data to be processed to illustrate the embedding and tracing process of the database watermark.

The embedding process of database watermark is as follows:

S501, converting "用户一" ("User 1") into the hexadecimal Unicode encoding result S1: \u7528\u6237\u4e00.

S502, character normalization.

Convert each character of the string \u7528\u6237\u4e00 in S1 into an 8-bit binary number. If a character's binary number is shorter than 8 bits, pad the high bits with 0. The 18 characters give the 144-bit result S2:

010111000111010100110111001101010011001000111000010111000111010100110110001100100011001100110111010111000111010100110100011001010011000000110000

S503, use zero-width string encoding.

The result S2 is encoded as a zero-width string according to the preset mapping relationship table (Table 1). The final result is:

\u200c\u200c\u200e\u200b\u200c\u200e\u200c\u200c\u200b\u200e\u200c\u200e\u200b\u200e\u200c\u200c\u200b\u200e\u200b\u200d\u200b\u200e\u200d\u200b\u200c\u200c\u200e\u200b\u200c\u200e\u200c\u200c\u200b\u200e\u200c\u200d\u200b\u200e\u200b\u200d\u200b\u200e\u200b\u200e\u200b\u200e\u200c\u200e\u200c\u200c\u200e\u200b\u200c\u200e\u200c\u200c\u200b\u200e\u200c\u200b\u200c\u200d\u200c\u200c\u200b\u200e\u200b\u200b\u200b\u200e\u200b\u200b

This result is abbreviated below as [S3 result].

S504, add prefix and suffix.

Add the prefix and suffix to [S3 result] for isolation, and the final result is:

\uFEFF[S3 result]\uFEFF.

S505: Output is the final encoded data.

The final encoded data is \uFEFF[S3 result]\uFEFF.

S506, embed \uFEFF[S3 result]\uFEFF as the database watermark into the embedding position corresponding to "User 1".

The database watermark embedded in the data is:

\uFEFF\u200c\u200c\u200e\u200b\u200c\u200e\u200c\u200c\u200b\u200e\u200c\u200e\u200b\u200e\u200c\u200c\u200b\u200e\u200b\u200d\u200b\u200e\u200d\u200b\u200c\u200c\u200e\u200b\u200c\u200e\u200c\u200c\u200b\u200e\u200c\u200d\u200b\u200e\u200b\u200d\u200b\u200e\u200b\u200e\u200b\u200e\u200c\u200e\u200c\u200c\u200e\u200b\u200c\u200e\u200c\u200c\u200b\u200e\u200c\u200b\u200c\u200d\u200c\u200c\u200b\u200e\u200b\u200b\u200b\u200e\u200b\u200b\uFEFF.
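The embedding walk-through above can be checked end to end with a short script. It is a sketch under assumptions: the watermark is appended to the visible value (the application only says it is embedded at the corresponding embedding position), and the names `embed_watermark`, `TABLE` and `DELIM` are illustrative.

```python
TABLE = {"00": "\u200b", "01": "\u200c", "10": "\u200d", "11": "\u200e"}
DELIM = "\ufeff"


def embed_watermark(text: str, preset_length: int = 8):
    """Return (S1, S2, final encoded data) for the given text."""
    s1 = "".join(f"\\u{ord(ch):04x}" for ch in text)                      # S501
    s2 = "".join(format(ord(ch), "b").zfill(preset_length) for ch in s1)  # S502
    s3 = "".join(TABLE[s2[i:i + 2]] for i in range(0, len(s2), 2))        # S503
    return s1, s2, DELIM + s3 + DELIM                                     # S504


s1, s2, watermark = embed_watermark("用户一")
print(s1)              # \u7528\u6237\u4e00
print(len(s2))         # 144
print(len(watermark))  # 74

# Assumed embedding position: directly after the visible value.
stored = "用户一" + watermark
print(len(stored))     # 77
```

The stored value is 77 characters long, yet only the three visible characters have a display width, which is the sense in which the watermark does not affect data display.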

The tracing process of database watermark is as follows:

Step S601, extracting the final encoded data.

Extract the final encoded data from the embedded position according to \uFEFF: \uFEFF[S3 result]\uFEFF.

Step S602, remove the prefix and suffix.

Remove the prefix \uFEFF and the suffix \uFEFF to obtain [S3 result].

Step S603: Decode zero-width string

Restore [S3 result] to binary data S2 according to the corresponding relationship in Table 1.

Step S604, restore to hexadecimal Unicode encoding.

Convert each 8-bit binary number in S2 to the corresponding hexadecimal result S1: \u7528\u6237\u4e00.

Step S605, restore S1 to original information.

Find the corresponding characters in the hexadecimal Unicode encoding table to obtain "用户一" ("User 1").

An embodiment of the present application also proposes a database watermark embedding device. As shown in FIG. 3, the device includes:

The first conversion module 301 is used to convert the data to be processed into preliminary coded data according to a preset coding rule;

A second conversion module 302, used to convert each character in the preliminary encoded data into a binary number of a preset length and obtain binary data;

A first mapping module 303, configured to map the binary data into zero-width character string data according to a preset mapping relationship table;

An adding module 304 is used to add a preset zero-width character string before and after the zero-width character string data to obtain final encoded data;

An embedding module 305, used for embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed.

An embodiment of the present application further proposes a database watermark tracing device. As shown in FIG. 4, the device includes:

A determination module 401, configured to determine the final encoded data according to the embedding position;

A removal module 402, used for removing the preset zero-width character string before and after the final encoded data and obtaining the zero-width character string data;

A second mapping module 403, configured to map the zero-width character string data into the binary data according to the preset mapping relationship table;

A third conversion module 404, configured to divide the binary data into a plurality of groups of binary numbers according to the preset length, and convert each group of binary numbers into each character of the preliminary coded data;

The fourth conversion module 405 is used to convert each of the characters into the data to be processed according to the preset encoding rule, and use the data to be processed as the tracing result data.

An embodiment of the present invention further provides an electronic device which, as shown in FIG. 5, includes a processor 501, a communication interface 502, a memory 503 and a communication bus 504, wherein the processor 501, the communication interface 502, and the memory 503 communicate with each other via the communication bus 504.

Memory 503, used to store executable instructions of the processor;

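The processor 501 is configured to execute, by executing the executable instructions:

Converting the data to be processed into preliminary coded data according to a preset coding rule;

Converting each character in the preliminary coded data into a binary number of a preset length to obtain binary data;

Mapping the binary data into zero-width character string data according to a preset mapping relationship table;

Adding preset zero-width character strings before and after the zero-width character string data to obtain final encoded data;

Embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed;

or,

Determining the final encoded data according to the embedding position;

Removing the preset zero-width character strings before and after the final encoded data to obtain the zero-width character string data;

Mapping the zero-width character string data to the binary data according to the preset mapping relationship table;

Dividing the binary data into multiple groups of binary numbers according to a preset length, and converting each group of binary numbers into each character of the preliminary coded data;

Converting each of the characters into the data to be processed according to the preset encoding rule, and using the data to be processed as the tracing result data.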

The communication bus can be a PCI (Peripheral Component Interconnect) bus or an EISA (Extended Industry Standard Architecture) bus. The communication bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the above electronic device and other devices.

The memory may include RAM (Random Access Memory), or may include a non-volatile memory, such as at least one disk storage. Optionally, the memory may also be at least one storage device located away from the aforementioned processor.

The above-mentioned processor can be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), etc.; it can also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In another embodiment of the present invention, a computer-readable storage medium is provided, in which a computer program is stored. When the computer program is executed by a processor, the database watermark embedding method or database watermark tracing method as described above is implemented.

In another embodiment of the present invention, a computer program product including instructions is provided. When the computer program product is run on a computer, the computer executes the database watermark embedding method or database watermark tracing method as described above.

In the above embodiments, all or part of the embodiments may be implemented by software, hardware, firmware, or any combination thereof. When implemented by software, all or part of the embodiments may be implemented in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive), etc.

It should be noted that, in this document, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements, but also other elements not explicitly listed, or also includes elements inherent to such process, method, article or device. In the absence of further restrictions, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or device including the element.

Each embodiment in this specification is described in a related manner, and the same or similar parts between the embodiments can be referenced to each other, and each embodiment focuses on the differences from other embodiments.

The above description is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims

A method for embedding a database watermark, characterized in that the method comprises:

Converting the data to be processed into preliminary coded data according to a preset coding rule;

Converting each character in the preliminary coded data into a binary number of a preset length and obtaining binary data;

Mapping the binary data into zero-width character string data according to a preset mapping relationship table;

Adding preset zero-width character strings before and after the zero-width character string data to obtain final encoded data;

Embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed;

Wherein the preset length is not less than the length of the original binary number corresponding to the character, and the preset mapping relationship table is determined according to the mapping relationship between different binary numbers with a preset number of bits and different zero-width character strings.
The method according to claim 1, characterized in that before embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed, the method further comprises:

Determining the embedding position according to the tag mark of the data to be processed;

The tag mark is determined in advance according to the type, length, position and attribute of the data to be processed.
The method according to claim 1, characterized in that mapping the binary data into zero-width character string data according to the preset mapping relationship table specifically comprises:

Dividing the binary data into a plurality of groups of sub-data according to the preset number of bits;

Mapping each group of the sub-data into a zero-width character string according to the preset mapping relationship table to obtain the zero-width character string data.
The method according to claim 1, characterized in that converting each character in the preliminary coded data into a binary number of a preset length and obtaining binary data specifically comprises:

Performing binary conversion on each character in sequence to obtain the original binary number;

If the length of the original binary number is less than the preset length, padding zeros before the highest bit of the original binary number so that the length of the original binary number reaches the preset length;

Obtaining the binary data according to the binary number of the preset length corresponding to each of the characters.
The method according to claim 1, characterized in that the preset coding rule includes a coding rule corresponding to hexadecimal Unicode encoding, decimal Unicode encoding, hexadecimal GBK encoding, or decimal GBK encoding.
A tracing method for a database watermark embedded by the method according to claim 1, characterized in that the method comprises:

Determining the final encoded data according to the embedding position;

Removing the preset zero-width character strings before and after the final encoded data and obtaining the zero-width character string data;

Mapping the zero-width character string data to the binary data according to the preset mapping relationship table;

Dividing the binary data into multiple groups of binary numbers according to the preset length, and converting each group of binary numbers into each character of the preliminary coded data;

Converting each of the characters into the data to be processed according to the preset coding rule, and using the data to be processed as the tracing result data.
The method according to claim 6, characterized in that the embedding position is determined by a tag mark, and the tag mark is determined in advance according to the type, length, position and attribute of the data to be processed.
A database watermark embedding device, characterized in that the device comprises:

A first conversion module, used to convert the data to be processed into preliminary coded data according to a preset coding rule;

A second conversion module, used for converting each character in the preliminary coded data into a binary number of a preset length and obtaining binary data;

A first mapping module, used for mapping the binary data into zero-width character string data according to a preset mapping relationship table;

An adding module, used for adding a preset zero-width character string before and after the zero-width character string data respectively to obtain final encoded data;

An embedding module, used for embedding the final encoded data as a database watermark into an embedding position corresponding to the data to be processed;

Wherein the preset length is not less than the length of the original binary number corresponding to the character, and the preset mapping relationship table is determined according to the mapping relationship between different binary numbers with a preset number of bits and different zero-width character strings.
A tracing device for a database watermark embedded by the database watermark embedding device according to claim 8, characterized in that the device comprises:

A determination module, used for determining the final encoded data according to the embedding position;

A removal module, used for removing the preset zero-width character string before and after the final encoded data and obtaining the zero-width character string data;

A second mapping module, used for mapping the zero-width character string data into the binary data according to the preset mapping relationship table;

A third conversion module, used for dividing the binary data into a plurality of groups of binary numbers according to the preset length, and converting each group of binary numbers into each character of the preliminary coded data;

A fourth conversion module, used for converting each of the characters into the data to be processed according to the preset coding rule, and using the data to be processed as the tracing result data.
An electronic device, comprising:

A processor; and

A memory, configured to store executable instructions of the processor;

Wherein the processor is configured to execute, by executing the executable instructions, the embedding method according to any one of claims 1 to 5 or the tracing method according to claim 6 or 7.