CN114077722A - Data leakage tracking method and device, electronic equipment and computer storage medium - Google Patents

Info

Publication number
CN114077722A
Authority
CN
China
Prior art keywords
data
access
leaked
candidate
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111223255.0A
Other languages
Chinese (zh)
Inventor
刘余
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd filed Critical Sangfor Technologies Co Ltd
Priority to CN202111223255.0A priority Critical patent/CN114077722A/en
Publication of CN114077722A publication Critical patent/CN114077722A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
    • G06F21/16Program or content traceability, e.g. by watermarking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Technology Law (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Storage Device Security (AREA)

Abstract

An embodiment of the invention provides a data leakage tracking method and device, electronic equipment, and a computer storage medium. The method comprises: acquiring information of at least one piece of data, wherein the information of each piece of data comprises the content of the data, an access account of the data, and the access behavior of the access account; acquiring leaked data, matching the leaked data against the content of the at least one piece of data, and determining a matching result, wherein the matching result comprises the matching degree between the leaked data and the content of each piece of data; determining at least one piece of candidate leaked data from the at least one piece of data according to the matching result; analyzing the information of the at least one piece of candidate leaked data to obtain an analysis result; and determining, according to the analysis result, the access account that leaked the data.

Description

Data leakage tracking method and device, electronic equipment and computer storage medium
Technical Field
The present invention relates to data leakage tracking technologies, and in particular, to a data leakage tracking method and apparatus, an electronic device, and a computer storage medium.
Background
In the big data era, data is highly valuable, and it is generally stored on a big data platform for ease of use and management. A big data platform facilitates data sharing, but because different users can use the data on the platform, data leakage is an unavoidable security problem, especially for sensitive data such as identity card numbers, bank card numbers, and telephone numbers. Effectively tracing leaked data helps the big data platform manage its data. At present there is no good method for tracing leaked data, so how to trace leaked data is a technical problem that urgently needs to be solved.
Disclosure of Invention
The embodiment of the invention provides a data leakage tracking method and device, electronic equipment and a computer storage medium.
The embodiment of the invention provides a data leakage tracking method, which comprises the following steps:
acquiring information of at least one piece of data, wherein the information of each piece of data in the at least one piece of data comprises the content of the data, an access account number of the data and an access behavior of the access account number;
acquiring leaked data, matching the leaked data with the content of the at least one piece of data, and determining a matching result, wherein the matching result comprises the matching degree of the leaked data and the content of each piece of data of the at least one piece of data; determining at least one piece of candidate leaked data from the at least one data according to the matching result;
analyzing the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data;
and determining an access account number of the leaked data according to the analysis result.
In the foregoing solution, determining at least one piece of candidate leaked data from the at least one data according to the matching result includes:
taking, among the at least one data, the data that satisfies a set condition as the at least one candidate leaked data, where the set condition includes: the matching degree between the leaked data and the content of the data is greater than a set threshold.
In the foregoing solution, analyzing the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data includes:
analyzing the information of the at least one candidate leaked data to determine an access account number of the at least one candidate leaked data;
determining the access behavior of each access account in the access accounts of the at least one candidate leaked data according to the access accounts of the at least one candidate leaked data;
and performing anomaly detection on the access behaviors of the access accounts of the at least one candidate leaked data, and determining an analysis result of the at least one candidate leaked data, wherein the analysis result of the candidate leaked data comprises the abnormal access behaviors of the access accounts of the data.
In the foregoing solution, when the number of the at least one candidate leaked data is two or more, the determining, according to the analysis result, an access account number of the leaked data includes:
determining a corresponding weight value for each abnormal access behavior of the access accounts of the candidate leaked data;
determining an ordering of the access accounts of the candidate leaked data according to the analysis result and the weight values corresponding to the abnormal access behaviors;
and determining the access account of the leaked data according to the ordering of the access accounts of the candidate leaked data.
In the above scheme, the matching degree between the leaked data and the content of each of the at least one data is a ratio of the content of each data to the content of the leaked data.
In the above scheme, the acquiring information of at least one data includes:
decrypting encrypted information of at least one piece of data acquired from a big data center and/or a data sharing platform to obtain original information of the at least one piece of data;
and obtaining the information of the at least one data according to the original information of the at least one data.
In the above scheme, obtaining the information of the at least one data according to the original information of the at least one data includes:
determining the content and the log information of the at least one data according to the original information of the at least one data;
determining a session identifier of at least one data according to the log information of the at least one data;
determining an access account of at least one data according to the session identifier of the at least one data;
and determining the access behavior of the access account according to the access account of the at least one datum.
In the above scheme, the access behavior of the access account includes at least one of: the access source Internet Protocol (IP) address, the access time, and the access frequency.
The embodiment of the invention also provides a data leakage tracking device, which comprises: an acquisition module, a first determination module, an analysis module, and a second determination module, wherein,
the acquisition module is used for acquiring information of at least one piece of data, wherein the information of each piece of data in the at least one piece of data comprises the content of the data, an access account number of the data and an access behavior of the access account number;
the first determining module is used for acquiring the leaked data, matching the leaked data with the content of the at least one piece of data and determining a matching result, wherein the matching result comprises the matching degree of the leaked data and the content of each piece of data of the at least one piece of data; and for determining at least one piece of candidate leaked data from the at least one data according to the matching result;
the analysis module is used for analyzing the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data;
and the second determination module is used for determining the access account number of the leaked data according to the analysis result.
In one implementation, the first determining module, configured to determine at least one piece of candidate leaked data from the at least one data according to the matching result, includes:
taking, among the at least one data, the data that satisfies a set condition as the at least one candidate leaked data, where the set condition includes: the matching degree between the leaked data and the content of the data is greater than a set threshold.
In one implementation, the analyzing module is configured to analyze the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data, and includes:
analyzing the information of the at least one candidate leaked data to determine an access account number of the at least one candidate leaked data;
determining the access behavior of each access account in the access accounts of the at least one candidate leaked data according to the access accounts of the at least one candidate leaked data;
and performing anomaly detection on the access behaviors of the access accounts of the at least one candidate leaked data, and determining an analysis result of the at least one candidate leaked data, wherein the analysis result of the candidate leaked data comprises the abnormal access behaviors of the access accounts of the data.
In one implementation manner, when the number of the at least one candidate leaked data is two or more, the second determining module is configured to determine, according to the analysis result, an access account number of the leaked data, and includes:
determining a corresponding weight value for each abnormal access behavior of the access accounts of the candidate leaked data;
determining an ordering of the access accounts of the candidate leaked data according to the analysis result and the weight values corresponding to the abnormal access behaviors;
and determining the access account of the leaked data according to the ordering of the access accounts of the candidate leaked data.
In one implementation, the matching degree of the leaked data and the content of each data in the at least one data is a ratio of the content of each data to the content of the leaked data.
In one implementation manner, the obtaining module is configured to obtain information of at least one data, and includes:
decrypting encrypted information of at least one piece of data acquired from a big data center and/or a data sharing platform to obtain original information of the at least one piece of data;
and obtaining the information of the at least one data according to the original information of the at least one data.
In an implementation manner, the obtaining module is configured to obtain information of the at least one data according to original information of the at least one data, and includes:
determining the content and the log information of the at least one data according to the original information of the at least one data;
determining a session identifier of at least one data according to the log information of the at least one data;
determining an access account of at least one data according to the session identifier of the at least one data;
and determining the access behavior of the access account according to the access account of the at least one datum.
In one implementation, the access behavior of the access account includes at least one of: the access source Internet Protocol (IP) address, the access time, and the access frequency.
The embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program that is stored in the memory and can be run on the processor, and when the processor executes the program, the processor implements any one of the above data leakage tracking methods.
An embodiment of the present invention further provides a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements any one of the above data leakage tracking methods.
Based on the data leakage tracking method, the data leakage tracking device, the electronic equipment, and the computer storage medium provided by the embodiment of the invention, information of at least one piece of data is obtained, wherein the information of each piece of data comprises the content of the data, the access account of the data, and the access behavior of the access account; leaked data is acquired and matched with the content of the at least one piece of data, and a matching result is determined, wherein the matching result comprises the matching degree of the leaked data and the content of each piece of data; at least one piece of candidate leaked data is determined from the at least one piece of data according to the matching result; the information of the at least one piece of candidate leaked data is analyzed to obtain an analysis result; and the access account of the leaked data is determined according to the analysis result. In this way, candidate leaked data is determined based on the matching degree between the data content and the leaked data, the access accounts of the candidate leaked data are determined, and anomaly detection is performed on the access behavior of those accounts; the access account of the leaked data can then be determined effectively from the matching degree of the candidate leaked data together with the abnormal access behavior of its access accounts.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention;
fig. 1 is an application scenario diagram of a data leakage tracking method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a data leakage tracking method according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a data leakage tracking platform according to the related art;
fig. 4 is a schematic flowchart of a specific implementation of determining the content and the log information of at least one piece of data according to an embodiment of the present invention;
fig. 5 is a schematic flowchart of a specific implementation of determining an access account of at least one piece of data according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a data leakage tracking apparatus according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In the related art, in the big data era, data storage has moved from scattered to centralized, and data is generally stored in a big data center and/or a data sharing platform. For example, government affairs data, civil data, and the like may be stored in the big data centers of provinces and cities; teaching and research data, scientific research project data, and the like of scientific research institutions are stored in the big data platforms of those institutions' information centers. In a data sharing scenario, different users may use the data in the big data center and/or the data sharing platform, so there is a security problem of data leakage, especially for sensitive data such as identity card numbers, bank card numbers, and telephone numbers. To improve the security of data in the big data center and/or the data sharing platform, it is necessary to find out information such as which user leaked the data and when it was leaked.
In view of the above technical problems, the technical solutions of the embodiments of the present disclosure are provided. The embodiments of the present invention will be described in further detail below with reference to the drawings and the embodiments. It should be understood that the examples provided herein are merely illustrative of the present invention and are not intended to limit the present invention. In addition, the following embodiments are provided as partial embodiments for implementing the present invention, not all embodiments for implementing the present invention, and the technical solutions described in the embodiments of the present invention may be implemented in any combination without conflict.
It should be noted that, in the embodiments of the present invention, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, so that a method or apparatus including a series of elements includes not only the explicitly recited elements but also other elements not explicitly listed, or elements inherent to the method or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other related elements (e.g., steps in a method or units in a device, such as portions of circuitry, processors, programs, software, etc.) in the method or device that includes the element.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set consisting of A, B, and C.
For example, the data leakage tracing method provided in the embodiment of the present invention includes a series of steps, but the data leakage tracing method provided in the embodiment of the present invention is not limited to the described steps, and similarly, the data leakage tracing apparatus provided in the embodiment of the present invention includes a series of modules, but the data leakage tracing apparatus provided in the embodiment of the present invention is not limited to include the explicitly described modules, and may also include modules that are required to obtain relevant information or perform processing based on the information.
Embodiments of the invention may be implemented on a terminal and/or a server, where the terminal may be a thin client, a thick client, a hand-held or laptop device, a microprocessor-based system, a set-top box, a programmable consumer electronics, a network personal computer, a small computer system, and so forth. The server may be a small computer system, a mainframe computer system, a distributed cloud computing environment including any of the systems described above, and so forth.
An electronic device such as a server may include program modules that execute computer instructions. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
The embodiment of the invention provides a technical scheme for data leakage tracking, which can be applied to tracking leaked data in a data sharing scene.
Fig. 1 shows an application scenario diagram of a data leakage tracking method according to an embodiment of the present invention. Referring to fig. 1, a data provider stores data in a big data center and/or a data sharing platform. Data user 1 and data user 2 can access a Web server through Application Programming Interface (API) access to obtain the data; data user 3 can obtain the data through database access. When a data user accesses the shared data, the data is automatically transmitted to the data leakage tracking platform. When a data leak occurs, the leaked data is imported into the data leakage tracking platform, and the access account of the leaked data is determined through log association analysis, account association analysis, and data access anomaly association analysis.
Based on the application scenario shown in fig. 1, the technical solution of the embodiment of the present invention is provided.
Fig. 2 is a schematic flow chart of a data leakage tracking method according to an embodiment of the present invention, and as shown in fig. 2, the flow chart may include:
step A201: the method comprises the steps of obtaining information of at least one piece of data, wherein the information of each piece of data in the at least one piece of data comprises the content of the data, an access account number of the data and an access behavior of the access account number.
In the embodiment of the invention, the at least one piece of data refers to data in a big data center and/or a data sharing platform. The data in the big data center and/or the data sharing platform is encrypted, and the encrypted data is transmitted to the data leakage tracking platform, as shown in fig. 3.
In some embodiments, the encrypted information of at least one data obtained from the big data center and/or the data sharing platform is decrypted to obtain the original information of the at least one data;
and obtaining the information of the at least one data according to the original information of the at least one data.
In the embodiment of the invention, a data provider stores data in a big data center and/or a data sharing platform; a data user can obtain the right to access the data by applying for it, and then acquire the data through database access or API access.
In the embodiment of the invention, an API is a set of definitions, programs, and protocols through which computer software components communicate with each other. It gives applications and developers the ability to access a set of routines, based on certain software or hardware, without having to access the source code or understand the details of the internal working mechanism.
In the embodiment of the invention, at least one piece of data in the big data center and/or the data sharing platform is encrypted, and the encrypted data is transmitted to the data leakage tracking platform, which ensures the security of the data during transmission. The encrypted data is then decrypted and its data packets are reassembled to obtain the original information of the at least one piece of data. The original information is analyzed to obtain the information of the at least one piece of data.
It should be noted that a data packet is the basic unit of communication transmission. In general, a data stream is divided into multiple data packets for transmission, and each data packet includes the following log information: a source IP address, a destination IP address, and a packet length.
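As an illustration only, the reassembly of data packets into a data stream might be sketched as follows. The text names only the source IP address, the destination IP address, and the packet length; the `sequence` field used for ordering, the `Packet` class, and the `reassemble` function are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    source_ip: str
    destination_ip: str
    sequence: int    # hypothetical ordering field, not named in the text
    payload: bytes

    @property
    def length(self) -> int:
        # Packet length, derived here from the payload size.
        return len(self.payload)


def reassemble(packets: List[Packet]) -> bytes:
    """Concatenate payloads in sequence order to rebuild the data stream."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)


packets = [
    Packet("10.0.0.5", "10.0.0.9", 2, b"world"),
    Packet("10.0.0.5", "10.0.0.9", 1, b"hello "),
]
stream = reassemble(packets)
```

A real capture would also have to group packets by connection and handle loss and retransmission, which this sketch leaves out.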
In the embodiment of the present invention, the content of the data refers to the substantive information the data contains. For example, if a company's information center stores employee information, the content of the data includes: name, telephone number, identity card number, email address, and address.
In the embodiment of the present invention, the access account of the data is used to implement a function of data access, which is convenient for operations in a large data center and/or a data sharing platform. For example, the employee has an account number of a company information center, and can log in the information center to perform operations of inquiring, modifying and downloading personal information and inquiring information of others.
In the embodiment of the present invention, the access behavior of the access account refers to attribute information of the access account when it accesses the big data center and/or the data sharing platform. For example, the access behavior includes at least one of the following: the access source Internet Protocol (IP) address, the access time, and the access frequency.
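The information acquired in step A201 can be pictured as a simple record. The following is a minimal sketch, not the structure used by the embodiment: the class names, the field names, and the choice of one access account per record are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class AccessBehavior:
    """Attribute information recorded when an account accesses the platform."""
    source_ip: str
    access_time: datetime
    access_frequency: int  # assumed unit: accesses within the observation window


@dataclass
class DataInfo:
    """Information of one piece of data: content, access account, access behavior."""
    content: List[str]       # e.g. name, telephone number, identity card number
    access_account: str      # assumption: one account per record, for simplicity
    behaviors: List[AccessBehavior] = field(default_factory=list)


record = DataInfo(
    content=["Zhang San", "13800000000"],
    access_account="employee_001",
    behaviors=[AccessBehavior("10.0.0.5", datetime(2021, 10, 20, 9, 30), 3)],
)
```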
Step A202: acquiring leaked data, matching the leaked data with the content of at least one piece of data, and determining a matching result, wherein the matching result comprises the matching degree of the leaked data and the content of each piece of data of the at least one piece of data; and determining at least one leakage candidate data in the at least one data according to the matching result.
In the embodiment of the present invention, leaked data may be acquired from a network. The information of the leaked data includes the content of the data, and the content of the leaked data includes at least one of the following: the content of a single piece of data in the big data center and/or the data sharing platform; a combination of the content of multiple pieces of data in the big data center and/or the data sharing platform; and a combination of at least one piece of data in the big data center and/or the data sharing platform with the content of other data.
In this embodiment of the present invention, the at least one piece of candidate leaked data refers to the data, among the at least one piece of data, that satisfies a set condition, where the set condition includes: the matching degree with the content of the leaked data is greater than a set threshold. The matching degree is the ratio of the content of each piece of data to the content of the leaked data, each taken after removal.
In the embodiment of the present invention, the threshold may be set according to historical thresholds or existing experience; for example, the threshold may be set to 60%, 65%, or 70%.
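The matching and threshold selection of step A202 might be sketched as follows. This is an assumption-laden illustration: the content of each piece of data is modelled as a set of record strings, and the matching degree is taken to be the share of the leaked records that also appear in that piece of data. The text does not fix the exact computation, and the function names `matching_degree` and `select_candidates` are hypothetical.

```python
from typing import Dict, Set


def matching_degree(data_content: Set[str], leaked_content: Set[str]) -> float:
    """Share of leaked records also found in this piece of data (assumed definition)."""
    if not leaked_content:
        return 0.0
    return len(data_content & leaked_content) / len(leaked_content)


def select_candidates(
    datasets: Dict[str, Set[str]], leaked: Set[str], threshold: float = 0.6
) -> Dict[str, float]:
    """Keep the data whose matching degree is strictly greater than the threshold."""
    degrees = {name: matching_degree(content, leaked) for name, content in datasets.items()}
    return {name: degree for name, degree in degrees.items() if degree > threshold}


datasets = {
    "employees": {"a", "b", "c", "d"},
    "customers": {"c", "x", "y"},
}
leaked = {"a", "b", "c", "z"}
candidates = select_candidates(datasets, leaked, threshold=0.6)
```

With a threshold of 60%, only `employees` (three of the four leaked records) is kept as candidate leaked data; `customers` matches one record in four and is discarded.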
In the embodiment of the invention, the leaked data is matched with the content of each piece of data in the at least one piece of data to obtain the matching degree between the leaked data and the content of each piece of data, from which the matching result is determined. According to the matching result, at least one piece of candidate leaked data can be determined from the at least one piece of data, and the information of the candidate leaked data is stored in an Elasticsearch (ES) system.
In the embodiment of the invention, the ES system is a distributed, highly scalable, near-real-time search and data analysis engine that can quickly store, query, and analyze the massive data in a big data center and/or a data sharing platform.
Step A203: and analyzing the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data.
In this embodiment of the present invention, the analysis result of the at least one piece of candidate leaked data refers to the result of analyzing the abnormal access behavior of the access accounts of that data. The analysis result includes at least one of the following: a detection result for the access source IP address, a detection result for the access time, and a detection result for the access frequency.
In the embodiment of the invention, the access accounts of the at least one piece of candidate leaked data are determined from that data, and the abnormal access behavior of these access accounts is analyzed to obtain the analysis result of the at least one piece of candidate leaked data.
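The anomaly detection of step A203 could, for example, be approximated with simple rules over the three access behaviors named above. The rules below (a set of known source IP addresses, a working-hours window, and a frequency limit) are hypothetical and chosen purely for illustration; the embodiment does not specify which detection technique is used.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Set, Tuple


@dataclass
class Access:
    account: str
    source_ip: str
    time: datetime
    frequency: int  # accesses in the observation window (assumed unit)


def detect_anomalies(
    access: Access,
    known_ips: Set[str],
    work_hours: Tuple[int, int] = (8, 20),   # hypothetical normal access window
    max_frequency: int = 100,                # hypothetical frequency limit
) -> List[str]:
    """Return the names of the access behaviors that look abnormal."""
    anomalies = []
    if access.source_ip not in known_ips:
        anomalies.append("source_ip")
    if not (work_hours[0] <= access.time.hour < work_hours[1]):
        anomalies.append("access_time")
    if access.frequency > max_frequency:
        anomalies.append("access_frequency")
    return anomalies


known = {"10.0.0.5"}
a = Access("user_a", "203.0.113.7", datetime(2021, 10, 20, 2, 15), 500)
b = Access("user_b", "10.0.0.5", datetime(2021, 10, 20, 10, 0), 12)
result_a = detect_anomalies(a, known)
result_b = detect_anomalies(b, known)
```

Here `user_a` is flagged on all three behaviors, while `user_b` shows no abnormal access behavior.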
Step A204: determining, according to the analysis result, the access account that leaked the data.
In the embodiment of the invention, a corresponding weight value is set for each abnormal access behavior according to the analysis result of the abnormal access behavior of the access accounts of the at least one piece of candidate leaked data, and a total abnormal access behavior value is determined for each such access account. The access accounts are then ordered according to this total value, combined with the matching degree of the content of the corresponding candidate leaked data, so that the access account of the leaked data can be determined.
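A minimal sketch of this weighting and ordering is given below. The weight values are hypothetical, and multiplying the total abnormal value by the matching degree is only one possible way of "combining" the two; the text does not state the formula.

```python
from typing import Dict, List, Tuple

# Hypothetical weight values for each kind of abnormal access behavior.
WEIGHTS = {"source_ip": 5, "access_time": 2, "access_frequency": 3}


def rank_accounts(
    anomalies: Dict[str, List[str]],
    matching: Dict[str, float],
    weights: Dict[str, int] = WEIGHTS,
) -> List[Tuple[str, float]]:
    """Order accounts by (sum of anomaly weights) x (matching degree), highest first."""
    scores = {}
    for account, kinds in anomalies.items():
        total = sum(weights[kind] for kind in kinds)
        # Assumption: the matching degree scales the total abnormal value.
        scores[account] = total * matching[account]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


anomalies = {
    "user_a": ["source_ip", "access_frequency"],
    "user_b": ["access_time"],
}
matching = {"user_a": 0.75, "user_b": 0.9}
ranking = rank_accounts(anomalies, matching)
suspect = ranking[0][0]
```

The account at the head of the ordering (`user_a` in this example) is the one most likely to have leaked the data under these assumed weights.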
In practical applications, steps A201 to A204 may be implemented by a processor of an electronic device, and the processor may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.
It can be seen that, in the embodiment of the present invention, the leaked data is matched against the content of each data in the at least one data to obtain a matching degree for each data, and the at least one candidate leaked data is determined according to these matching degrees and a set threshold. When the matching degree of a data is greater than the set threshold, that data may have been leaked; determining the matching degree between the leaked data and the content of each data therefore allows potentially leaked data to be found comprehensively and accurately. Anomaly detection is then performed on the access behavior of the access account of the at least one candidate leaked data to obtain an analysis result of the access behavior, and the access account that leaked the data can be determined by combining the matching degree of the content of the data with the analysis result of the access behavior.
In some embodiments, the content and the log information of the at least one data are determined according to the original information of the at least one data;
the session identifier of the at least one data is determined according to the log information of the at least one data;
the access account of the at least one data is determined according to the session identifier of the at least one data;
and the access behavior of the access account is determined according to the access account of the at least one data.
In the embodiment of the invention, according to the original information of the at least one data, the Hypertext Transfer Protocol (HTTP) traffic that accesses the at least one data is parsed in both directions, that is, both the HTTP request direction and the HTTP response direction are parsed. The content and the log information of the at least one data are thereby obtained and stored in the ES system.
It should be noted that HTTP is a request-response protocol. An HTTP transaction proceeds as follows: (1) the client establishes a connection with the server; (2) the client sends a request to the server; (3) the server receives the request and returns the corresponding file as a response; (4) the client closes the connection with the server. The HTTP connection between the client and the server described here is a one-time connection: each connection handles only one request, the connection is closed after the server returns the response to that request, and a new connection is established for the next request.
In the embodiment of the present invention, the parsing of the HTTP request direction includes extracting the Uniform Resource Identifier (URI), the Host, the Token, and the user credential (Authorization) of the request direction, and acquiring the source IP address, the destination IP address, and the destination port. It should be noted that a URI is a character string used to identify the name of an Internet resource; resources in a network, such as a document, a graphic, or a program, can be located by a URI, and here the URI is the interface through which data is acquired. Host is a readable and writable string giving the IP address or domain name of the destination server. The Token is used to authenticate access requests: after a request in the request direction is authenticated by the server, the server returns the Token, which can then be used to judge whether an access request has the required permission. Authorization is the user credential: permissions are generated according to the identity credential provided by the user and granted to the user.
In the embodiment of the present invention, the parsing of the HTTP response direction includes parsing response data in Extensible Markup Language (xml) and JavaScript Object Notation (JSON) formats. It should be noted that xml and JSON are data exchange formats that can be used to describe and exchange data, and the data returned by the server generally adopts the xml or JSON format.
Illustratively, the data of one server response is in JSON format: [{"name": "test1", "age": 10, "tel": 13155262731}, {"name": "test2", "age": 20, "tel": 13155262712}]. Parsing the response data and extracting the values yields: test1, 10, 13155262731; test2, 20, 13155262712.
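The value extraction in the example above can be sketched as follows. This is a minimal illustration, assuming the response body is a JSON array of flat objects; the function name `extract_values` is hypothetical and does not come from the embodiment.

```python
import json


def extract_values(response_body: str) -> list:
    """Parse a JSON array of flat objects and return each record's values as strings."""
    records = json.loads(response_body)
    return [[str(value) for value in record.values()] for record in records]


body = (
    '[{"name": "test1", "age": 10, "tel": 13155262731}, '
    '{"name": "test2", "age": 20, "tel": 13155262712}]'
)
rows = extract_values(body)
for row in rows:
    print(", ".join(row))
```

Only the values are kept because the later matching step compares field values, not field names, against the leaked data.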
In the embodiment of the present invention, the parsing result of the HTTP access direction of the at least one data can be obtained from the log information of the at least one data, where the HTTP access direction includes a request direction and a response direction. The URI and the session identifier Token returned by the server can thus be obtained; the session identifier Token may be present in both the request direction and the response direction. The access account of the at least one data can be determined by obtaining the session identifier Token of the request direction and the response direction.
In the embodiment of the present invention, a hash table of the access account of the at least one data and the corresponding session identifier Token is established, which can be used when subsequently querying the parsing results of the HTTP access direction of the at least one data. To query the access account of the at least one data, the corresponding access account is looked up in the hash table by the session identifier Token in the request direction or the response direction.
Illustratively, the access account of one data is "test1", and the session identifier Token of the request direction or the response direction of the data is: "zkWyIg + htfillstya 5xjCCGe4c1W106FG9riLC ═". It should be noted that the session identifier Token returned by the server is Base64-encoded.
A hash table is established for the access account and the session identifier Token, in which each session identifier Token is stored together with its corresponding access account. When the access account of a data needs to be queried, it can be determined simply by obtaining the session identifier Token of the request direction or the response direction of the data.
In the embodiment of the invention, the access account of the at least one data, together with the source IP address, the destination IP address, the number of accesses, and the access time of the access account, can be obtained from the parsing result of the HTTP access direction of the at least one data, so that the access behavior of the access account can be determined.
In one example, the content and the log information of the at least one data may be determined by the following process.
Fig. 4 is a schematic flowchart of a specific implementation of determining the content and log information of at least one data according to an embodiment of the present invention. As shown in Fig. 4, the flow may include:
Step 41: obtaining raw information of at least one data.
In an embodiment of the present invention, the original information of the at least one data represents decrypted data obtained from the big data center and/or the data sharing platform.
Step 42: judging whether the HTTP access direction of the at least one data is the response direction; if so, executing step 43; otherwise, executing step 45.
In the embodiment of the invention, the HTTP access direction of the at least one data is parsed in both directions, including parsing of the HTTP request direction and parsing of the HTTP response direction.
Step 43: judging whether the response-direction data is in xml or JSON format; if so, executing step 44; if not, ending the flow.
Here, the data of the server response is generally transmitted in xml or JSON format.
Step 44: parsing the response-direction data to obtain the values of the response-direction data.
Here, by parsing the data in xml or JSON format, the values of the corresponding data can be obtained.
Step 45: parsing the HTTP request direction and extracting the URI, the Host, the Token, the Authorization, the source IP address, the destination IP address, and the destination port.
In the embodiment of the invention, when the HTTP access direction is the request direction, the HTTP request direction is parsed; here, the access account is stored in the URI or in Authorization.
Step 46: the parsed result is stored in the ES system.
Here, the analysis result is a value obtained by analyzing a request direction and a response direction of at least one data HTTP, and the analysis result of the request direction of the HTTP includes: URI, Host, Token, Authorization, source IP address, destination IP address and destination port; the result of parsing the response direction of HTTP includes a value obtained by parsing data in xml or JSON format.
In the embodiment of the invention, the analysis result of the HTTP access direction of at least one data is stored in the ES system, and the ES system stores the content and the log information of at least one data.
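The flow of Fig. 4 can be sketched as below. This is only an illustration under stated assumptions: each decrypted HTTP message is represented as a dictionary whose field names (`direction`, `content_type`, `body`, `headers`, `uri`, `src_ip`, `dst_ip`, `dst_port`) are hypothetical, the JSON body is assumed to hold flat objects, and a plain list stands in for the ES system.

```python
import json
import xml.etree.ElementTree as ET
from typing import Optional


def parse_message(msg: dict) -> Optional[dict]:
    """Parse one decrypted HTTP message; return None where the flow ends."""
    if msg["direction"] == "response":                      # step 42
        content_type = msg.get("content_type", "")
        if "json" in content_type:                          # steps 43 and 44
            data = json.loads(msg["body"])
            records = data if isinstance(data, list) else [data]
            values = [str(v) for record in records for v in record.values()]
        elif "xml" in content_type:
            root = ET.fromstring(msg["body"])
            values = [el.text.strip() for el in root.iter()
                      if el.text and el.text.strip()]
        else:
            return None                                     # neither xml nor JSON
        return {"direction": "response", "values": values}
    headers = msg.get("headers", {})                        # step 45
    return {
        "direction": "request",
        "uri": msg["uri"],
        "host": headers.get("Host"),
        "token": headers.get("Token"),
        "authorization": headers.get("Authorization"),
        "src_ip": msg["src_ip"],
        "dst_ip": msg["dst_ip"],
        "dst_port": msg["dst_port"],
    }


request = {
    "direction": "request",
    "uri": "/api/staff?account=test1",
    "headers": {"Host": "data.example.com", "Token": "tok-A"},
    "src_ip": "192.0.2.10",
    "dst_ip": "198.51.100.5",
    "dst_port": 80,
}
response = {
    "direction": "response",
    "content_type": "application/json",
    "body": '[{"name": "test1", "age": 10}]',
}

store = []  # stand-in for the ES system (step 46)
for message in (request, response):
    result = parse_message(message)
    if result is not None:
        store.append(result)
```

Returning `None` for an unsupported response format mirrors the "ending the flow" branch of step 43, so nothing is stored for such messages.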
In one example, the access account of the at least one data may be determined by the following process.
Fig. 5 is a schematic flowchart of a specific implementation of determining an access account of at least one data according to an embodiment of the present invention. As shown in Fig. 5, the flow may include:
Step 51: obtaining raw information of at least one data.
In an embodiment of the present invention, the original information of the at least one data represents decrypted data obtained from the big data center and/or the data sharing platform.
Step 52: judging the HTTP access direction of the at least one data; if it is the HTTP response direction, executing step 53; if it is the HTTP request direction, executing step 55.
Step 53: judging whether the HTTP response direction contains the session identifier Token; if so, executing step 54; if not, ending the flow.
Step 54: extracting the session identifier Token and executing step 59.
Here, the session identifier Token returned by the server is obtained.
Step 55: judging whether the URI contains the access account; if so, executing step 56; if not, executing step 58.
Here, when the HTTP access direction of the at least one data is the request direction, the URI can be obtained, and the access account may be present in the URI.
Step 56: extracting the access account.
Here, the access account of the at least one data is obtained.
Step 57: saving the access account in the HTTP information and executing step 59.
Step 58: judging whether the HTTP request direction contains the access account; if so, executing step 56; otherwise, ending the flow.
In the embodiment of the invention, the HTTP request direction is parsed, and the access account is generally stored in the URI or in Authorization.
Step 59: establishing a hash table of the access account and the session identifier Token.
In the embodiment of the present invention, a hash table of the access account of the at least one data and the corresponding session identifier Token is established, so that it can be consulted when HTTP accesses to the at least one data are subsequently queried.
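The flow of Fig. 5 can be sketched as follows, with a Python dictionary serving as the hash table. Several details are assumptions made only for illustration: the account is taken from a URI query parameter named `account`, or from an `Authorization` header using the HTTP Basic scheme, and a request is paired with its response through a hypothetical `conn_id` field. The embodiment itself does not specify these details.

```python
import base64
from typing import Optional
from urllib.parse import parse_qs, urlparse


def account_from_request(uri: str, authorization: Optional[str]) -> Optional[str]:
    """Steps 55 to 58: look for the account in the URI first, then in Authorization."""
    query = parse_qs(urlparse(uri).query)
    if "account" in query:                       # hypothetical parameter name
        return query["account"][0]
    if authorization and authorization.startswith("Basic "):
        decoded = base64.b64decode(authorization[len("Basic "):]).decode("utf-8")
        return decoded.split(":", 1)[0]          # Basic credentials are "user:password"
    return None


def build_token_table(messages: list) -> dict:
    """Step 59: map each session Token to the account seen on the same connection."""
    pending = {}                                 # conn_id -> account saved at step 57
    table = {}                                   # Token -> access account
    for msg in messages:
        if msg["direction"] == "request":
            account = account_from_request(msg["uri"], msg.get("authorization"))
            if account is not None:
                pending[msg["conn_id"]] = account
        else:
            token = msg.get("token")             # steps 53 and 54
            if token is not None and msg["conn_id"] in pending:
                table[token] = pending[msg["conn_id"]]
    return table


basic = "Basic " + base64.b64encode(b"user2:secret").decode("ascii")
messages = [
    {"direction": "request", "conn_id": 1, "uri": "/api/staff?account=test1"},
    {"direction": "response", "conn_id": 1, "token": "tok-A"},
    {"direction": "request", "conn_id": 2, "uri": "/api/staff", "authorization": basic},
    {"direction": "response", "conn_id": 2, "token": "tok-B"},
    {"direction": "response", "conn_id": 3, "token": "tok-C"},  # no account known
]
token_table = build_token_table(messages)
```

Once the table is built, a later message that carries only a Token can be attributed to an account with a single lookup such as `token_table.get("tok-A")`.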
In some embodiments, the information of the at least one candidate leaked data is analyzed to determine the access account of the at least one candidate leaked data;
the access behavior of each access account among the access accounts of the at least one candidate leaked data is determined according to the access accounts of the at least one candidate leaked data;
and anomaly detection is performed on the access behaviors of the access accounts of the at least one candidate leaked data to determine the analysis result of the at least one candidate leaked data, wherein the analysis result of the candidate leaked data comprises the abnormal access behaviors of the access accounts of the data.
In the embodiment of the present invention, the at least one candidate leaked data is queried in the ES system to obtain the information of the at least one candidate leaked data, which includes the content of the data, the access account of the data, and the access behavior of the access account, so that the access account of the at least one candidate leaked data can be determined. For each access account among the access accounts of the at least one candidate leaked data, the access behavior of that account when accessing the candidate leaked data can then be determined.
In the embodiment of the invention, anomaly detection is performed on the access behavior of each access account among the access accounts of the at least one candidate leaked data, wherein the anomaly detection comprises at least one of the following: anomaly detection of the access source IP address, anomaly detection of the access time, and anomaly detection of the access frequency.
In the embodiment of the invention, anomaly detection of the access source IP address means judging whether the access behavior is abnormal based on a change in geographic location. An IP geolocation library is built into the data leakage tracking platform, and the geographic location of the access source IP is determined by resolving the access source IP. When the access source IP address changes suddenly, the access behavior is abnormal, and the access account may have leaked data in the big data center and/or the data sharing platform.
Illustratively, an employee owns an access account of a company information center. Under normal conditions, the source IP address of the employee's access account corresponds to a fixed geographic location; when the source IP address of the access account is detected to correspond to another geographic location, the access behavior is abnormal.
In the embodiment of the invention, anomaly detection of the access time means judging whether the access behavior is abnormal based on the time dimension. The times at which the access account accesses the big data center and/or the data sharing platform are stored in the data leakage tracking platform, and when the access time changes suddenly, the access behavior is abnormal.
Illustratively, an employee uses an access account to access a company information center. Under normal conditions, the employee accesses the information center during working hours (8:00-18:00); when the employee is detected accessing the information center at 23:00, the access behavior is abnormal.
In the embodiment of the invention, anomaly detection of the access frequency means judging whether the access behavior is abnormal based on the number of accesses. Based on machine learning, a reference model of the access frequency and access volume of each account can be established from the daily and monthly frequency and data volume of the account's accesses. When the access frequency and the access volume of the access account change suddenly, the access behavior is abnormal.
Illustratively, an employee uses an access account to access a company information center. Under normal conditions, the employee accesses the company information center 100 times per day; when the employee is detected accessing the company information center 500 times in a day, the access behavior is abnormal.
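The three checks illustrated above can be sketched as follows. This is a simplified stand-in, not the embodiment's method: the region is assumed to have been resolved from the source IP beforehand (the IP geolocation library is not modeled), and a fixed multiple of the usual daily count replaces the machine-learned reference model. The names `Baseline`, `detect_anomalies`, and `spike_factor` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    usual_region: str        # region resolved from the account's usual source IP
    work_start_hour: int     # start of normal access hours, e.g. 8
    work_end_hour: int       # end of normal access hours, e.g. 18
    usual_daily_count: int   # typical number of accesses per day, e.g. 100


def detect_anomalies(baseline: Baseline, region: str, access_hour: int,
                     daily_count: int, spike_factor: float = 3.0) -> dict:
    """Return one boolean flag per abnormal access behavior."""
    in_hours = baseline.work_start_hour <= access_hour < baseline.work_end_hour
    return {
        "source_ip": region != baseline.usual_region,
        "access_time": not in_hours,
        # Simplification: a fixed multiple instead of a learned reference model.
        "access_frequency": daily_count > spike_factor * baseline.usual_daily_count,
    }


baseline = Baseline("region-A", 8, 18, 100)
normal = detect_anomalies(baseline, "region-A", access_hour=10, daily_count=120)
suspicious = detect_anomalies(baseline, "region-B", access_hour=23, daily_count=500)
```

With the figures from the examples above (access at 23:00, 500 accesses against a usual 100, and a different region), all three flags are raised for `suspicious` and none for `normal`.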
In some embodiments, when the number of the at least one candidate leaked data is two or more, a corresponding weight value is determined for the abnormal access behavior of the access account of the candidate leaked data;
the ranking of the access accounts of the candidate leaked data is determined according to the analysis result and the weight value corresponding to the abnormal access behavior;
and the access account that leaked the data is determined according to the ranking of the access accounts of the candidate leaked data.
In the embodiment of the present invention, when the number of the at least one candidate leaked data is one, it can be determined without further analysis that the candidate leaked data is the leaked data, so the access account of the candidate leaked data is the account that leaked the data. When the number of the at least one candidate leaked data is two or more, the abnormal access behavior of the access accounts of the candidate leaked data needs to be analyzed to determine the account that leaked the data.
In the embodiment of the invention, a corresponding weight value can be set for each abnormal access behavior of the access account of the candidate leaked data according to existing experience or the information of the candidate leaked data; the total value of the abnormal access behaviors of the access account of the candidate leaked data is then determined according to the analysis result and the weight values corresponding to the abnormal access behaviors.
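The total value described above can be sketched as a weighted sum over the abnormal behaviors that were detected. The weights below (0.5, 0.25, 0.25) are the ones used in the worked example later in this description; the flag names and the function `anomaly_total` are hypothetical.

```python
WEIGHTS = {"source_ip": 0.5, "access_time": 0.25, "access_frequency": 0.25}


def anomaly_total(flags: dict, weights: dict = WEIGHTS) -> float:
    """Sum the weights of every abnormal access behavior that was detected."""
    return sum(weights[name] for name, abnormal in flags.items() if abnormal)


flags = {"source_ip": True, "access_time": False, "access_frequency": True}
total = anomaly_total(flags)
print(total)
```

An account with no detected anomaly gets a total of 0, and an account showing all three anomalies gets the sum of all weights.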
In the embodiment of the invention, the ranking of the content matching degree of the candidate leaked data is determined according to the content matching degree of the candidate leaked data; it should be noted that the higher the matching degree, the higher the rank. Where candidate leaked data have the same rank by content matching degree, the total values of the abnormal access behaviors of their access accounts are compared to determine the ranking of the access accounts of the candidate leaked data.
In the embodiment of the invention, at least one top-ranked access account can be selected as the access account that leaked the data according to the ranking of the access accounts of the candidate leaked data.
In a specific example, employee information data stored in a company information center is taken as an example. The employee information data includes: name, telephone number, identification number, mailbox, address, and gender. The employee information data is internal company data that may not be published externally; when a data leak occurs, the account that leaked the data needs to be tracked so that the employee information data can be better managed.
Assume that a batch of employee information data is revealed, for example, as shown in table 1 below, table 1 is the revealed employee information data.
Name | Telephone | ID number | Mailbox | Address | Gender
Zhang San | 13112345601 | 3306xxxx01 | abc@mail.com | addr1 | Male
Li Si | 13112345602 | 3306xxxx02 | abc2@mail.com | addr2 | Male
Wang San | 13112345603 | 3306xxxx03 | abc3@mail.com | addr3 | Female
Sun Liu | 13112345604 | 3306xxxx04 | abc4@mail.com | addr4 | Female
TABLE 1
The leaked employee information data is matched against the employee information data of the information center to obtain the matching degree between the employee information data of the information center and the leaked employee information data, and data with a matching degree greater than 0 is taken as candidate leaked data. It should be noted that the matching degree is the basis for this selection: when the matching degree is greater than 0, the data may have been leaked.
The information of the data of the candidate leakage can be obtained by querying the data of the candidate leakage in the ES system, for example, as shown in table 2 below, table 2 is the information of the data of the candidate leakage.
Data accessed | Access account
Zhang San, 13112345601, abc@mail.com | User1
Li Si, 13112345602 | User1
Wang San, 13112345603 | User2
Zhang San, addr1 | User2
Sun Liu, Female | User3
Li Si, 13112345602, Male | User3
Wang San, 3306xxxx03 | User4
TABLE 2
According to the information of the candidate leaked data, the matching degree of the candidate leaked data can be obtained. The matching degree of the content accessed by User1 is 20.8%; that of User2 is 16.7%; that of User3 is 20.8%; and that of User4 is 8.3%. For example, the data accessed by User1 covers 5 of the 24 fields in Table 1, giving 5/24 ≈ 20.8%.
According to the matching degree, the ranking of the matching degree of the candidate leaked data can be obtained: User1 = User3 > User2 > User4. Therefore, from the perspective of matching degree alone, User1 and User3 are the most likely to have leaked the data, followed by User2, and User4 is the least likely.
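The percentages above can be reproduced with the sketch below. The matching rule is an assumption chosen because it yields the stated figures: each accessed record is attributed to the leaked row it overlaps most, and the matching degree is the number of distinct leaked fields covered divided by the total number of leaked fields (24). The name spellings are normalized between the two tables, the gender values are written as Male and Female, and the function name `matching_degree` is hypothetical.

```python
LEAKED = [
    ["Zhang San", "13112345601", "3306xxxx01", "abc@mail.com", "addr1", "Male"],
    ["Li Si", "13112345602", "3306xxxx02", "abc2@mail.com", "addr2", "Male"],
    ["Wang San", "13112345603", "3306xxxx03", "abc3@mail.com", "addr3", "Female"],
    ["Sun Liu", "13112345604", "3306xxxx04", "abc4@mail.com", "addr4", "Female"],
]

ACCESSED = {
    "User1": [["Zhang San", "13112345601", "abc@mail.com"], ["Li Si", "13112345602"]],
    "User2": [["Wang San", "13112345603"], ["Zhang San", "addr1"]],
    "User3": [["Sun Liu", "Female"], ["Li Si", "13112345602", "Male"]],
    "User4": [["Wang San", "3306xxxx03"]],
}


def matching_degree(accessed_records: list, leaked_rows: list) -> float:
    """Fraction of leaked fields that appear in the records an account accessed."""
    matched = set()  # (row index, column index) of covered leaked fields
    for record in accessed_records:
        values = set(record)
        # Attribute the record to the leaked row it shares the most values with.
        best = max(range(len(leaked_rows)),
                   key=lambda i: len(values & set(leaked_rows[i])))
        for col, cell in enumerate(leaked_rows[best]):
            if cell in values:
                matched.add((best, col))
    total = sum(len(row) for row in leaked_rows)
    return len(matched) / total


degrees = {user: round(100 * matching_degree(records, LEAKED), 1)
           for user, records in ACCESSED.items()}
```

Counting distinct (row, column) positions means that a field accessed twice by the same account is only counted once.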
Anomaly detection is performed on the access behavior of the access accounts of the candidate leaked data, including anomaly detection of the access source IP address, the access time, and the access frequency, to obtain the abnormal access behavior results of the access accounts of the candidate leaked data. Weight values are set for the three abnormal access behaviors: 0.5 for an access source IP address anomaly, 0.25 for an access time anomaly, and 0.25 for an access frequency anomaly. The abnormal access situation of the access accounts of the candidate leaked data can thus be obtained, for example, as shown in Table 3 below.
[Table 3 appears as an image in the original publication and is not reproduced here.]
TABLE 3
Calculating the total value of the abnormal access behaviors of the access accounts of the candidate leaked data gives: User1, 0.25; User2, 0.5; User3, 0.75; User4, 1. Starting from the ranking by matching degree, User1 = User3 > User2 > User4, and using the total value to break the tie between User1 and User3, the ranking of the access accounts of the candidate leaked data is: User3 > User1 > User2 > User4.
Thus, it may be determined that User3 is the access account that leaked the data.
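The ranking in this example can be sketched as a sort on two keys, using the matching degrees and total values stated above. The dictionary layout and the function name `rank_accounts` are hypothetical.

```python
candidates = {
    "User1": {"match": 20.8, "anomaly_total": 0.25},
    "User2": {"match": 16.7, "anomaly_total": 0.5},
    "User3": {"match": 20.8, "anomaly_total": 0.75},
    "User4": {"match": 8.3, "anomaly_total": 1.0},
}


def rank_accounts(accounts: dict) -> list:
    """Rank by matching degree first; break ties by the anomaly total value."""
    return sorted(
        accounts,
        key=lambda name: (accounts[name]["match"], accounts[name]["anomaly_total"]),
        reverse=True,
    )


ranking = rank_accounts(candidates)
print(" > ".join(ranking))
```

Because the matching degree is the primary key, User4 stays last despite having the highest anomaly total; the total value only decides between accounts whose matching degrees are equal.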
Based on the same technical concept as the foregoing embodiment, referring to fig. 6, the data leakage tracking apparatus provided in the embodiment of the present invention may include:
an obtaining module 601, configured to obtain information of at least one piece of data, where the information of each piece of data in the at least one piece of data includes content of the data, an access account of the data, and an access behavior of the access account;
a first determining module 602, configured to obtain leaked data, match the leaked data with the content of the at least one data, and determine a matching result, where the matching result includes a matching degree between the leaked data and the content of each of the at least one data; and determine at least one candidate leaked data from the at least one data according to the matching result;
an analysis module 603, configured to analyze the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data;
and a second determining module 604, configured to determine, according to the analysis result, the access account that leaked the data.
In one implementation, the first determining module 602 being configured to determine, according to the matching result, at least one candidate leaked data from the at least one data includes:
taking, among the at least one data, data satisfying a set condition as the at least one candidate leaked data, where the set condition includes: the matching degree between the leaked data and the content of the data is greater than a set threshold.
In one implementation, the analysis module 603 is configured to analyze the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data, and includes:
analyzing the information of the at least one candidate leaked data to determine an access account number of the at least one candidate leaked data;
determining the access behavior of each access account in the access accounts of the at least one candidate leaked data according to the access accounts of the at least one candidate leaked data;
and performing anomaly detection on the access behaviors of the access accounts of the at least one candidate leaked data, and determining an analysis result of the at least one candidate leaked data, wherein the analysis result of the candidate leaked data comprises the abnormal access behaviors of the access accounts of the data.
In one implementation, when the number of the at least one candidate leaked data is two or more, the second determining module 604 being configured to determine, according to the analysis result, the access account that leaked the data includes:
determining a corresponding weight value for the abnormal access behavior of the access account of the candidate leaked data;
determining the ranking of the access accounts of the candidate leaked data according to the analysis result and the weight value corresponding to the abnormal access behavior;
and determining the access account that leaked the data according to the ranking of the access accounts of the candidate leaked data.
In one implementation, the matching degree of the leaked data and the content of each data in the at least one data is a ratio of the content of each data to the content of the leaked data.
In an implementation manner, the obtaining module 601 is configured to obtain information of at least one data, including:
decrypting encrypted information of at least one piece of data acquired from a big data center and/or a data sharing platform to obtain original information of the at least one piece of data;
and obtaining the information of the at least one data according to the original information of the at least one data.
In an implementation manner, the obtaining module 601 is configured to obtain information of the at least one data according to original information of the at least one data, and includes:
determining the content and the log information of the at least one data according to the original information of the at least one data;
determining a session identifier of at least one data according to the log information of the at least one data;
determining an access account of at least one data according to the session identifier of the at least one data;
and determining the access behavior of the access account according to the access account of the at least one datum.
In one implementation, the access behavior of the access account includes at least one of: the access source Internet Protocol (IP) address, the access time, and the access frequency.
In practical applications, the obtaining module 601, the first determining module 602, the analyzing module 603, and the second determining module 604 may all be implemented by a processor of an electronic device, where the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor, and the embodiment of the present invention is not limited thereto.
It should be noted that the above description of the embodiment of the apparatus, similar to the above description of the embodiment of the method, has similar beneficial effects as the embodiment of the method. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
It should be noted that, in the embodiment of the present invention, if the method is implemented in the form of a software functional module and sold or used as a standalone product, the method may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a terminal, a server, etc.) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, the embodiment of the present invention further provides a computer program product, where the computer program product includes computer-executable instructions, and the computer-executable instructions are used to implement any one of the data leakage tracking methods provided by the embodiment of the present invention.
Accordingly, an embodiment of the present invention further provides a computer storage medium, where computer-executable instructions are stored on the computer storage medium, and the computer-executable instructions are used to implement any one of the data leakage tracking methods provided in the foregoing embodiments.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present invention may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
Based on the same technical concept as the foregoing embodiment, referring to fig. 7, an electronic device 700 provided in an embodiment of the present invention may include: a memory 710 and a processor 720; wherein,
a memory 710 for storing computer programs and data;
a processor 720, configured to execute the computer program stored in the memory to implement any one of the data leakage tracking methods in the foregoing embodiments.
The foregoing description of the various embodiments is intended to highlight various differences between the embodiments, and the same or similar components may be referred to one another, and for brevity, are not repeated herein.
The methods disclosed in the method embodiments provided by the present application can be combined arbitrarily without conflict to obtain new method embodiments.
Features disclosed in various product embodiments provided by the application can be combined arbitrarily to obtain new product embodiments without conflict.
The features disclosed in the various method or apparatus embodiments provided herein may be combined in any combination to arrive at new method or apparatus embodiments without conflict.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the scheme of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing module, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that all or part of the steps for implementing the method embodiments may be carried out by program instructions directing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments.
The above description is only exemplary of the present invention and should not be taken as limiting its scope; any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A data leakage tracking method, the method comprising:
acquiring information of at least one piece of data, wherein the information of each piece of data in the at least one piece of data comprises the content of the data, an access account number of the data and an access behavior of the access account number;
acquiring leaked data, matching the leaked data with the content of the at least one piece of data, and determining a matching result, wherein the matching result comprises the matching degree of the leaked data and the content of each piece of data of the at least one piece of data; determining at least one candidate leaked data from the at least one piece of data according to the matching result;
analyzing the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data;
and determining an access account number of the leaked data according to the analysis result.
2. The method of claim 1, wherein determining at least one candidate leaked data in the at least one data according to the matching result comprises:
taking, from the at least one data, data satisfying a set condition as the at least one candidate leaked data, wherein the set condition comprises: the matching degree between the leaked data and the content of the data is greater than a set threshold.
3. The method of claim 1, wherein analyzing the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data comprises:
analyzing the information of the at least one candidate leaked data to determine an access account number of the at least one candidate leaked data;
determining the access behavior of each access account in the access accounts of the at least one candidate leaked data according to the access accounts of the at least one candidate leaked data;
and performing anomaly detection on the access behaviors of the access accounts of the at least one candidate leaked data, and determining an analysis result of the at least one candidate leaked data, wherein the analysis result of the candidate leaked data comprises the abnormal access behaviors of the access accounts of the data.
4. The method of claim 3, wherein when the number of the at least one candidate leaked data is two or more, the determining an access account number of the leaked data according to the analysis result comprises:
determining a corresponding weight value for the abnormal access behavior of the access account of the candidate leaked data;
determining an ordering of the access accounts of the candidate leaked data according to the analysis result and the weight value corresponding to the abnormal access behavior;
and determining the access account number of the leaked data according to the ordering of the access accounts of the candidate leaked data.
5. The method of claim 1, wherein the matching degree of the leaked data with the content of each of the at least one data is a ratio of the content of each data to the content of the leaked data.
6. The method of claim 1, wherein the obtaining information of at least one data comprises:
decrypting encrypted information of at least one piece of data acquired from a big data center and/or a data sharing platform to obtain original information of the at least one piece of data;
and obtaining the information of the at least one data according to the original information of the at least one data.
7. The method according to claim 6, wherein the obtaining information of the at least one data according to the original information of the at least one data comprises:
determining the content and the log information of the at least one data according to the original information of the at least one data;
determining a session identifier of at least one data according to the log information of the at least one data;
determining an access account of at least one data according to the session identifier of the at least one data;
and determining the access behavior of the access account according to the access account of the at least one datum.
8. The method of any of claims 1 to 7, wherein the access behavior of the access account comprises at least one of: an access source Internet Protocol (IP) address, an access time, and an access frequency.
9. A data leakage tracking apparatus, the apparatus comprising at least:
the acquisition module is used for acquiring information of at least one piece of data, wherein the information of each piece of data in the at least one piece of data comprises the content of the data, an access account number of the data and an access behavior of the access account number;
the first determining module is used for acquiring the leaked data, matching the leaked data with the content of the at least one piece of data and determining a matching result, wherein the matching result comprises the matching degree of the leaked data and the content of each piece of data of the at least one piece of data; determining at least one data of the candidate leakage from the at least one data according to the matching result;
the analysis module is used for analyzing the information of the at least one candidate leaked data to obtain an analysis result of the at least one candidate leaked data;
and the second determination module is used for determining the access account number of the leaked data according to the analysis result.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the data leakage tracking method of any one of claims 1-8 when executing the program.
11. A computer storage medium storing a computer program, wherein the computer program, when executed, implements the data leakage tracking method of any one of claims 1 to 8.
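The weighting and ordering steps recited in claims 3 and 4 can be pictured with a short Python sketch. It assumes that anomaly detection has already produced, for each access account, a list of abnormal access behaviors; the behavior labels, the integer weights, and the function name `rank_accounts` are hypothetical and are not taken from the claims.

```python
# Hypothetical weights for each kind of abnormal access behavior.
WEIGHTS = {
    "unusual_source_ip": 5,
    "unusual_access_time": 3,
    "high_access_frequency": 2,
}


def rank_accounts(abnormal_behaviors, weights=WEIGHTS):
    """Order access accounts by the summed weight of their abnormal behaviors.

    abnormal_behaviors maps an access account to the list of abnormal
    behavior kinds detected for it across the candidate leaked data.
    Kinds without a configured weight contribute nothing to the score.
    """
    scores = {
        account: sum(weights.get(kind, 0) for kind in kinds)
        for account, kinds in abnormal_behaviors.items()
    }
    # Highest score first; the first entry is the most suspicious account.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, the account at the head of the returned list would be reported as the access account of the leaked data; a real deployment would need weights tuned to its own access patterns.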

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111223255.0A CN114077722A (en) 2021-10-20 2021-10-20 Data leakage tracking method and device, electronic equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN114077722A true CN114077722A (en) 2022-02-22

Family

ID=80283420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111223255.0A Pending CN114077722A (en) 2021-10-20 2021-10-20 Data leakage tracking method and device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN114077722A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination