CN117155576A - Data asset tracing method based on multi-scale pooling hash data fingerprint - Google Patents

Data asset tracing method based on multi-scale pooling hash data fingerprint

Info

Publication number
CN117155576A
Authority
CN
China
Prior art keywords
data
hash
fingerprint
tracing
mph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311017090.0A
Other languages
Chinese (zh)
Inventor
邵国林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang University
Original Assignee
Nanchang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang University
Priority to CN202311017090.0A
Publication of CN117155576A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/12: Applying verification of the received information
    • H04L63/123: Applying verification of the received information received data contents, e.g. message integrity
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643: Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40: Network security protocols
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Power Engineering (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Storage Device Security (AREA)

Abstract

The application discloses a data asset tracing method based on multi-scale pooled hash (MPH) data fingerprints. When a client generates an MPH data fingerprint, it first builds substrings over the data byte sequence with the k-gram method to extract low-order sequence information of the data stream. It then applies multi-scale, multi-type pooling to the substring hash results to extract high-order structural information of the data stream. Finally, the key information extracted by the multi-scale pooling is turned into a highly robust data fingerprint by a locality-sensitive hashing method and reported to a server for tracing. When the server traces a source, it first extracts the data fingerprint by the same method and then traces the data asset based on fingerprint similarity. The multi-scale pooled hash method improves the robustness of the data fingerprint in tracing scenarios where the data asset changes form, so that the continuity of tracing information can be maintained in an adversarial environment of data content tampering and changes in data form.

Description

Data asset tracing method based on multi-scale pooling hash data fingerprint
Technical Field
The application relates to the technical field of data tracing and leakage prevention, in particular to a data asset tracing method based on multi-scale pooling hash data fingerprints.
Background
Enterprises and organizations accumulate large amounts of important, sensitive data in the course of operation, and these data assets are critical to them. Once such assets are stolen, the result can be serious property damage, loss of business benefit, and disruption of normal industrial and economic order. Data leakage has now overtaken data destruction as the largest data security risk, and data tracing is a key technology for data security and privacy protection.
Traditional data asset tracing has the following defect: it depends on a stable data form. Once the data form changes, tracing risks being interrupted, so traditional techniques struggle with the flexible and changeable tracing requirements of real scenarios, such as data tampering, transfer and truncation.
Disclosure of Invention
In view of the above problems, the application aims to provide a data asset tracing method based on multi-scale pooled hash data fingerprints. The method generates highly robust data fingerprints through multi-scale information extraction and pooled hashing, and traces data assets based on the similarity of these fingerprints. The multi-scale pooled hash method improves the robustness of the data fingerprint in tracing scenarios where the data asset changes form, so that the continuity of tracing information can be maintained in an adversarial environment of data content tampering and changes in data form.
To achieve the above purpose, the application adopts the following technical solution:
A data asset tracing method based on multi-scale pooled hash data fingerprints comprises two parts: (1) multi-scale pooled hash (MPH) data fingerprint generation; (2) data asset tracing based on MPH fingerprint similarity.
The multi-scale pooled hash (MPH) data fingerprint generation process includes the steps of:
Step 11: The client monitors the generation of new data. Specifically, the data fingerprint generation client monitors key data operation behaviors on the terminal, including data processing scenarios such as file creation, modification, movement, copying, outgoing transfer and deletion. If a new data file is found, the client enters the data fingerprint generation stage;
Step 12: Extract low-order sequence information of the data stream. Specifically, the data asset is converted into a byte sequence, and the k-gram method slides a window of size k over the sequence one byte at a time. At each position the k bytes in the window are spliced into one unit substring, so that an original byte sequence of length n becomes a sequence of n-k+1 substrings, each of length k. Optionally, the data may first be preprocessed: the data asset is processed by a specific algorithm to extract key information representing its content;
Step 13: Hash each substring into a hash value of fixed length m, so that the substring sequence becomes a hash sequence of length L = n-k+1;
Step 14: Multi-scale extraction of high-order structural information of the data stream. Specifically, information of different granularities and different types is extracted from the hash sequence of length L by a pyramid pooling method. The calculation process is as follows:
Step 141: Pool successively with windows of size $2, 4, 8, \ldots, 2^i$, where i is the number of pooling levels;
Step 142: Compute the maximum hash and the minimum hash of each block, so that 2 values are extracted from each block;
Step 143: Splice these values into a new hash sequence of length $K = 2\,(L/2 + L/4 + \cdots + L/2^i)$;
Step 15: Generate the MPH data fingerprint with a locality-sensitive hashing (LSH) method. Specifically, the LSH method converts the hash sequence of length K into a hash value of fixed length, which is the final data fingerprint of the data asset;
Step 16: Write the generated data fingerprint to a local log and report it to the data tracing server, providing the data basis for tracing. Optionally, to facilitate hash similarity calculation and fast fuzzy matching, the reported fingerprint takes the form of a binary string cut into N segments.
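The length of the spliced sequence in step 143 can be checked by counting blocks per level. The derivation below assumes that L is divisible by $2^i$, so that every level splits into whole blocks; the source does not say how a partial trailing block is handled. Level j then has $L/2^j$ blocks, and each block contributes two values (its maximum and its minimum):

```latex
K \;=\; \sum_{j=1}^{i} 2\cdot\frac{L}{2^{j}}
  \;=\; 2L\left(1 - 2^{-i}\right),
\qquad \text{e.g. } L = 4,\ i = 2:\quad K = 2\,(2 + 1) = 6 .
```

Under this assumption K is always smaller than 2L, so the pooled sequence is never more than twice as long as the hash sequence it summarizes.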
The data asset tracing process based on MPH fingerprint similarity comprises the following steps:
Step 21: Generate the MPH data fingerprint of the leaked file. The MPH data fingerprint of the leaked file to be analyzed is extracted according to steps 12-15, converting the data asset into a fingerprint represented as a binary hash string of fixed length;
Step 22: Compare MPH data fingerprint similarity. The binary hash string is cut into N segments and compared against the records in the data fingerprint database reported to the cloud. Following the pigeonhole principle, each segment is matched exactly, and similarity is calculated only for records that share an identical value with the query on at least one segment;
Step 23: Trace the data asset. Similarity is measured by Hamming distance. If the similarity exceeds a threshold, the two files are considered the same data asset, and the data operation history is traced back on that basis.
The beneficial effects of the application are as follows:
The multi-scale pooled hash method improves the robustness of the data fingerprint in tracing scenarios where the data asset changes form, so that the continuity of tracing information can be maintained in an adversarial environment of data content tampering and changes in data form.
Drawings
FIG. 1 is a flow chart of an embodiment of the present application.
Detailed Description
In order to make the technical solution of the present application better understood by those skilled in the art, the technical solution of the present embodiment will be clearly and completely described in the following description with reference to the accompanying drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, shall fall within the scope of the application. The following examples are only for more clearly illustrating the technical aspects of the present application, and are not intended to limit the scope of the present application.
In this embodiment, as shown in FIG. 1, when a client generates an MPH data fingerprint, it first builds substrings over the data byte sequence with the k-gram method to extract low-order sequence information of the data stream. It then applies multi-scale, multi-type pooling to the substring hash results to extract high-order structural information of the data stream. Finally, the key information extracted by the multi-scale pooling is turned into a highly robust data fingerprint by a locality-sensitive hashing method and reported to a server for tracing. When the server traces a source, it first extracts the data fingerprint by the same method and then traces the data asset based on fingerprint similarity. The application is further described below with reference to the accompanying drawings.
The multi-scale pooled hash (MPH) data fingerprint generation process includes the steps of:
Step 11: The client monitors the generation of new data. Specifically, the data fingerprint generation client uses kernel programming to register system kernel functions that monitor key data operation behaviors on the terminal, including data processing scenarios such as file creation, modification, movement, copying, outgoing transfer and deletion. If a new data file is found, the client enters the data fingerprint generation stage. Local files can be checked by their MD5 hash value: if a fingerprint generation log for a file with the same MD5 already exists locally, the file is not processed; otherwise it enters the fingerprint generation stage. Optionally, interception can be selective according to a configured monitoring policy, for example intercepting file-processing behaviors only for specific file suffixes, file sizes, file types or file operations.
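The MD5-based duplicate check and the optional monitoring policy in step 11 could look roughly like the sketch below. It is an illustration only: the kernel-level monitoring itself is not shown, and `file_md5`, `should_fingerprint`, `seen_md5`, `allowed_suffixes` and `max_bytes` are hypothetical names that do not come from the source.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_fingerprint(path, seen_md5, allowed_suffixes=None, max_bytes=None):
    """Decide whether a newly observed file enters fingerprint generation.

    seen_md5 stands in for the local fingerprint-generation log (hypothetical).
    allowed_suffixes / max_bytes model an optional monitoring policy (hypothetical).
    """
    p = Path(path)
    if allowed_suffixes is not None and p.suffix.lower() not in allowed_suffixes:
        return False  # policy: this file suffix is not monitored
    if max_bytes is not None and p.stat().st_size > max_bytes:
        return False  # policy: file is larger than the configured limit
    md5 = file_md5(p)
    if md5 in seen_md5:
        return False  # a fingerprint log for identical content already exists
    seen_md5.add(md5)
    return True
```

Keeping the seen digests in a set makes the duplicate check constant-time per file; a real client would presumably persist this state alongside its local log.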
Step 12: Extract low-order sequence information of the data stream. The original data asset is preprocessed with the k-gram method to preserve its context information and to enrich the granularity of the information retained in the fingerprint. Specifically, the data asset is converted into a byte sequence, and the k-gram method slides a window over the sequence one byte at a time. At each position the k bytes in the window are spliced into one unit substring, so that an original byte sequence of length n becomes a sequence of n-k+1 substrings, each of length k. For example, if the byte sequence of the original data asset is ABCDEF (length 6) and the sliding window size is 3, the k-gram method converts it into ABC BCD CDE DEF, that is, 6-3+1 = 4 substrings of length 3 each. Optionally, the data may be preprocessed before the k-gram step: the data asset is processed by a specific algorithm to extract key information representing its content, for example by removing stop words from a text sequence, removing redundant information in the file by other statistical methods, or sampling the data of the asset to some extent;
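The sliding-window construction in step 12 can be sketched in a few lines. The function name `k_grams` is an assumption made for illustration; the optional preprocessing mentioned above is not modeled.

```python
def k_grams(data, k):
    """Slide a window of size k over a byte sequence, one byte at a time.

    Returns the n-k+1 substrings of length k (an empty list if len(data) < k).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [data[i:i + k] for i in range(len(data) - k + 1)]


print(k_grams(b"ABCDEF", 3))  # [b'ABC', b'BCD', b'CDE', b'DEF']
```

Because adjacent windows overlap by k-1 bytes, a local edit to the data only disturbs the k substrings that cover the edited byte.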
Step 13: Hash each substring into a hash value of fixed length m, so that the substring sequence becomes a hash sequence of length L = n-k+1. For example, each substring in ABC BCD CDE DEF is converted into a hash value of the same length by the MD5 hash method:
ABC:902FBDD2B1DF0C4F70B4A5D23525E932
BCD:8539EF1FBA74A70F5A77FCC3F25C1659
CDE:F8E054E3416DE72E874492E25C38B3EC
DEF:822DD494B3E14A82AA76BD455E6B6F4B
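Hashing each k-gram with MD5 needs only the standard library. The sketch below is illustrative; `md5_hex` is a hypothetical helper name, and the uppercase formatting simply mirrors the listing above.

```python
import hashlib


def md5_hex(substring):
    """Return the MD5 digest of a byte string as 32 uppercase hex characters."""
    return hashlib.md5(substring).hexdigest().upper()


for gram in [b"ABC", b"BCD", b"CDE", b"DEF"]:
    print(gram.decode(), md5_hex(gram))
```

Every digest has the same length (128 bits) regardless of the substring, which is what lets the substring sequence be treated as a sequence of fixed-width numbers in the following steps.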
To save hash storage space, only the low-order bits of each hash value need be kept as the hash result. For ease of description, only the low 8 bits are kept here as the final hash value:
ABC: 0x32 (50)
BCD: 0x59 (89)
CDE: 0xEC (236)
DEF: 0x4B (75)
The byte sequence is thereby converted into a hash sequence of length 4: 50 89 236 75.
Step 14: Multi-scale extraction of high-order structural information of the data stream. Specifically, the hash sequence is partitioned, and information of different granularities and different types is extracted from it, by a multi-level, multi-granularity pooling method. The calculation process is as follows:
(1) Pool successively with windows of size $2, 4, \ldots, 2^i$, where i is the number of pooling levels. For example, the hash sequence of length 4 is partitioned with windows 2 and 4:
level 1: window 2, giving 2 blocks: (50, 89), (236, 75);
level 2: window 4, giving 1 block: (50, 89, 236, 75);
(2) Compute the maximum hash and the minimum hash of each block, so that 2 values are extracted from each block:
level 1: window 2, 2 blocks: (50, 89), (236, 75);
the maximum hash extracts (89), (236) respectively;
the minimum hash extracts (50), (75) respectively;
level 2: window 4, 1 block: (50, 89, 236, 75);
the maximum hash extracts (236);
the minimum hash extracts (50);
(3) Splice these values into a new hash sequence of length $K = 2\,(L/2 + L/4 + \cdots + L/2^i)$. In this example, multi-level, multi-granularity extraction and splicing yield a hash value sequence of length 6: (89), (236), (50), (75), (236), (50);
Step 15: Generate the MPH data fingerprint with a locality-sensitive hashing (LSH) method. Specifically, the LSH method converts the hash sequence of length K into a hash value of fixed length, which is the final data fingerprint of the data asset. The LSH may use a locality-sensitive hashing scheme such as SimHash or MinHash. Taking SimHash as an example:
First, convert the hash value sequence into binary form:
(89): 01011001
(236): 11101100
(50): 00110010
(75): 01001011
(236): 11101100
(50): 00110010
Second, align the hash values bit by bit and sum them position by position with weights (a bit of 1 contributes +1 and a bit of 0 contributes -1). Assuming all weights are equal to 1, the result is -2 2 2 0 2 -2 0 -2. Each position whose sum is greater than or equal to 0 is set to 1 and every other position is set to 0, giving the hash fingerprint 01111010.
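Steps 13 to 15 can be sketched together as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, MD5 with truncation to the low byte follows the worked example, a trailing partial block is simply dropped (the source does not say how it is handled), and SimHash uses unit weights. The input is the list of low bytes 0x32, 0x59, 0xEC, 0x4B given in the example.

```python
import hashlib


def hash_substrings(substrings, keep_bytes=1):
    """Step 13: MD5 each k-gram and keep only its low-order bytes as an integer."""
    return [
        int.from_bytes(hashlib.md5(s).digest()[-keep_bytes:], "big")
        for s in substrings
    ]


def pyramid_pool(hashes, levels):
    """Step 14: for windows 2, 4, ..., 2**levels take the max and min of each block."""
    pooled = []
    for level in range(1, levels + 1):
        window = 2 ** level
        # Assumption: only whole blocks are pooled; a partial tail is dropped.
        blocks = [
            hashes[start:start + window]
            for start in range(0, len(hashes) - window + 1, window)
        ]
        pooled.extend(max(block) for block in blocks)  # maximum hash per block
        pooled.extend(min(block) for block in blocks)  # minimum hash per block
    return pooled


def simhash(values, bits=8):
    """Step 15: unit-weight SimHash over fixed-width integers, as a bit string."""
    sums = [0] * bits
    for value in values:
        for pos in range(bits):
            bit = (value >> (bits - 1 - pos)) & 1  # pos 0 is the most significant bit
            sums[pos] += 1 if bit else -1
    return "".join("1" if total >= 0 else "0" for total in sums)


hashes = [0x32, 0x59, 0xEC, 0x4B]  # low bytes of the four MD5 values in the example
pooled = pyramid_pool(hashes, levels=2)
print(len(pooled))      # 6
print(simhash(pooled))  # 01111010
```

Ties (a column sum of exactly 0) are resolved to 1 here because the text sets positions with a sum greater than or equal to 0 to 1.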
Step 16: Write the generated MPH data fingerprint to a local log and report it to the data tracing server, providing the data basis for tracing. The data fingerprint log uploaded to the cloud can include information such as the operation timestamp, terminal information (IP, operating system, etc.), user identity (ID), file name, file attributes, file ID and the data fingerprint. Optionally, to facilitate hash similarity calculation and fast fuzzy matching, the reported fingerprint takes the form of a binary string cut into N segments. Assuming the data fingerprint is 01111010 and the number of segments N is 4, the fingerprint is stored as N1 = 01, N2 = 11, N3 = 10, N4 = 10.
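Cutting a fingerprint into N equal segments and packaging it with log metadata might look like the sketch below. The record layout, field names and sample values are hypothetical; the source lists the kinds of information a log may carry but does not define a format.

```python
import json
import time


def split_segments(fingerprint, n):
    """Cut a binary fingerprint string into n equal-length segments."""
    if n <= 0 or len(fingerprint) % n != 0:
        raise ValueError("fingerprint length must be a positive multiple of n")
    size = len(fingerprint) // n
    return [fingerprint[i:i + size] for i in range(0, len(fingerprint), size)]


def build_log_record(fingerprint, n, file_name, user_id, terminal_ip):
    """Assemble one fingerprint log entry (hypothetical field names)."""
    return {
        "timestamp": int(time.time()),
        "terminal_ip": terminal_ip,
        "user_id": user_id,
        "file_name": file_name,
        "fingerprint": fingerprint,
        "segments": split_segments(fingerprint, n),
    }


record = build_log_record("01111010", 4, "report.docx", "u-001", "10.0.0.5")
print(json.dumps(record["segments"]))  # ["01", "11", "10", "10"]
```

Storing the segments next to the full fingerprint lets the server index each segment separately without re-splitting every record at query time.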
The data asset tracing process based on MPH fingerprint similarity comprises the following steps:
Step 21: Generate the MPH data fingerprint of the leaked file. The MPH data fingerprint of the leaked file to be analyzed is extracted according to steps 12-15, converting the data asset into a fingerprint represented as a binary hash string of fixed length;
Step 22: Compare MPH data fingerprint similarity. The binary hash string is cut into N segments; assuming the extracted data fingerprint is 01101010, then N1 = 01, N2 = 10, N3 = 10, N4 = 10. These are matched against the records in the data fingerprint database reported to the cloud. To speed up the calculation and avoid meaningless similarity computations, the N1, N2, N3 and N4 segments can each be matched exactly according to the pigeonhole principle, so that only records sharing at least one identical segment with the query go on to the similarity calculation;
Step 23: Trace the data asset. Similarity can be measured by Hamming distance. If the similarity exceeds a given threshold, the two files are considered the same data asset, and the data operation history is traced back on that basis. Suppose the record with fingerprint 01111010 is selected as a candidate by the matching of segments N1, N2, N3 and N4; its similarity to the leaked file's fingerprint (01101010) is then calculated as 7/8 = 87.5%.
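The similarity in step 23 is the fraction of bit positions on which the two fingerprints agree. A minimal sketch follows; the function names and the 0.8 threshold are assumptions, since the source does not state a threshold value.

```python
def hamming_similarity(a, b):
    """Return 1 - (Hamming distance / length) for two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    distance = sum(x != y for x, y in zip(a, b))
    return 1 - distance / len(a)


def is_same_asset(a, b, threshold=0.8):
    """Treat two fingerprints as the same asset when similarity exceeds the threshold."""
    return hamming_similarity(a, b) > threshold


print(hamming_similarity("01111010", "01101010"))  # 0.875
```

The two example fingerprints differ in a single bit out of eight, which is where the 7/8 in the text comes from.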
The foregoing describes only preferred embodiments of the present application. Although the description is specific and detailed, it is not to be construed as limiting the scope of the application. Those skilled in the art can make modifications, improvements and substitutions without departing from the spirit of the application, and all of these fall within its scope. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (4)

1. A data asset tracing method based on multi-scale pooled hash data fingerprints, characterized by comprising the following two parts: multi-scale pooled hash (MPH) data fingerprint generation; and data asset tracing based on MPH fingerprint similarity;
the multi-scale pooled hash MPH data fingerprint generation process comprises the following steps:
step 11: the client monitors the generation of new data;
step 12: extracting low-order sequence information of the data stream: converting the data asset into a byte sequence, sliding a window of size k over the byte sequence one byte at a time by the k-gram method, splicing the k bytes in the window into one unit substring at each position, and finally converting the original byte sequence of length n into a sequence of n-k+1 substrings, each of length k;
step 13: hashing each substring into a hash value of fixed length m, and finally converting the substring sequence into a hash sequence of length L = n-k+1;
step 14: multi-scale extraction of high-order structural information of the data stream: extracting information of different scales and different types from the hash sequence of length L by a pyramid pooling method;
step 15: generating the MPH data fingerprint with a locality-sensitive hashing method;
step 16: writing the generated MPH data fingerprint to a local log and reporting it to the data tracing server, providing the data basis for tracing;
the data asset tracing process based on MPH fingerprint similarity comprises the following steps:
step 21: generating the MPH data fingerprint of the leaked file: extracting the MPH data fingerprint of the leaked file to be analyzed according to steps 12-15, and converting the data asset into a fingerprint represented as a binary hash string of fixed length;
step 22: MPH data fingerprint similarity comparison: cutting the binary hash string into N segments and comparing it against the records in the data fingerprint database reported to the cloud, matching each segment exactly according to the pigeonhole principle, and calculating similarity only for records that share an identical value with the query on at least one segment;
step 23: tracing the data asset: similarity is measured by Hamming distance; if the similarity exceeds a threshold, the two are considered the same data asset, and the data operation history is traced back on that basis.
2. The data asset tracing system based on data fingerprint similarity according to claim 1, wherein the step 11 is specifically: the data fingerprint generation client monitors key data operation behaviors in the terminal, including file creation, modification, movement, copying, outgoing and deleting data processing scenes, and if a new data file is found to be generated, the data fingerprint generation client enters a data fingerprint generation link.
3. The data asset tracing technology based on multi-scale pooled hash data fingerprint according to claim 1, wherein the calculation process of step 14 is:
step 141: pooling successively with windows of size $2, 4, 8, \ldots, 2^i$, where i is the number of pooling levels;
step 142: computing the maximum hash and the minimum hash of each block, so that 2 values are extracted from each block;
step 143: splicing these values into a new hash sequence of length $K = 2\,(L/2 + L/4 + \cdots + L/2^i)$.
4. A data asset tracing technique based on multi-scale pooled hash data fingerprints as recited in claim 3, wherein the step 15 specifically comprises: converting the hash sequence of length K into a hash value of fixed length by a locality-sensitive hashing method, the hash value being the final data fingerprint of the data asset.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311017090.0A CN117155576A (en) 2023-08-14 2023-08-14 Data asset tracing method based on multi-scale pooling hash data fingerprint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311017090.0A CN117155576A (en) 2023-08-14 2023-08-14 Data asset tracing method based on multi-scale pooling hash data fingerprint

Publications (1)

Publication Number Publication Date
CN117155576A (en) 2023-12-01

Family

ID=88897843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311017090.0A Pending CN117155576A (en) 2023-08-14 2023-08-14 Data asset tracing method based on multi-scale pooling hash data fingerprint

Country Status (1)

Country Link
CN (1) CN117155576A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020143196A1 (en) * 2019-01-11 2020-07-16 平安科技(深圳)有限公司 Communication method and device between blockchain nodes, storage medium and electronic apparatus
CN111683117A (en) * 2020-05-11 2020-09-18 厦门潭宏信息科技有限公司 Method, equipment and storage medium
WO2023020491A1 (en) * 2021-08-19 2023-02-23 西门子(中国)有限公司 Product traceability management method and system
CN114666063A (en) * 2022-03-21 2022-06-24 矩阵时光数字科技有限公司 Traditional Hash algorithm-based digital asset tracing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
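Zhang Tao: "Research on the application of data labels in shared data tracing" (数据标签在共享数据溯源中的应用研究), Communications Technology (通信技术), no. 01, 10 January 2020 (2020-01-10) *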

Similar Documents

Publication Publication Date Title
CN105868305B (en) A kind of cloud storage data deduplication method for supporting fuzzy matching
CN108446363B (en) Data processing method and device of KV engine
US9514312B1 (en) Low-memory footprint fingerprinting and indexing for efficiently measuring document similarity and containment
Wang et al. Loguad: log unsupervised anomaly detection based on word2vec
US20190173897A1 (en) Malicious communication log detection device, malicious communication log detection method, and malicious communication log detection program
US10339124B2 (en) Data fingerprint strengthening
US11989161B2 (en) Generating readable, compressed event trace logs from raw event trace logs
CN115114599A (en) Method, device and equipment for processing database watermark and storage medium
CN114764557A (en) Data processing method and device, electronic equipment and storage medium
CN108229162A (en) A kind of implementation method of cloud platform virtual machine completeness check
Alves et al. Leveraging BERT's Power to Classify TTP from Unstructured Text
CN112529759A (en) Document processing method, device, equipment, storage medium and computer program product
CN117155576A (en) Data asset tracing method based on multi-scale pooling hash data fingerprint
CN110399464B (en) Similar news judgment method and system and electronic equipment
Al-Sharif et al. Carving and clustering files in ram for memory forensics
CN113821630A (en) Data clustering method and device
Singhal et al. A Novel approach of data deduplication for distributed storage
Vikraman et al. A study on various data de-duplication systems
CN113572860A (en) Method and device for tracking leaked data, storage system, equipment and storage medium
CN115883111A (en) Phishing website identification method and device, electronic equipment and storage medium
CN113407495A (en) SIMHASH-based file similarity determination method and system
CN117056133B (en) Data backup method, device and medium based on distributed Internet of things architecture
CN116451211B (en) File information processing method and system based on digital security
CN117061254B (en) Abnormal flow detection method, device and computer equipment
CN113591440B (en) Text processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Shao Guolin

Inventor after: Lu Yi

Inventor after: Zeng Xuemei

Inventor before: Shao Guolin