WO2023088249A1 - Data processing compliance detection method, device and related equipment

Publication number
WO2023088249A1
Authority
WO
WIPO (PCT)
Prior art keywords
compliance
data
data processing
judgment conditions
relationship
Prior art date
Application number
PCT/CN2022/132004
Other languages
English (en)
French (fr)
Inventor
喻鹏
丰雷
阎钰洁
陈成
赵明宇
严学强
吴建军
汪洋
李文璟
周凡钦
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023088249A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models

Definitions

  • The present application relates to the field of computer technology, and in particular to a data processing compliance detection method, device and related equipment.
  • Existing GDPR compliance verification methods describe GDPR compliance for data processing only from a high level and lack a specific judgment process for the GDPR provisions. As a result, they can hardly help enterprises control compliance costs accurately, and can hardly help regulators supervise.
  • In addition, their applicable scenarios are narrow and their scope of application is limited, so they cannot meet the complex and changing needs of real detection scenarios. Therefore, how to provide a compliance detection solution that refines the specific judgment process of laws and regulations, covers the needs of diverse scenarios, effectively promotes the development and utilization of data, and protects the rights and interests of all parties is an urgent problem to be solved.
  • The embodiments of the present application provide a data processing compliance detection method, device and related equipment that can refine the specific judgment process of laws and regulations, cover the requirements of diverse scenarios, and improve the interpretability of compliance detection, thereby promoting the development and utilization of data and protecting the rights and interests of all parties.
  • An embodiment of the present application provides a compliance detection method for data processing, the method comprising:
  • obtaining rule information and performing knowledge extraction on the rule information, where the rule information includes one or more of the corpus of China's Personal Information Protection Law, the General Data Protection Regulation (GDPR) corpus, or the corpus of China's Data Security Law;
  • constructing one or more knowledge graph entities based on the result of the knowledge extraction, and establishing the relationships between the one or more knowledge graph entities to generate a knowledge graph;
  • where the one or more knowledge graph entities include one or more of the following: one or more compliance judgment conditions, and one or more compliance statuses. The compliance judgment conditions are the conditions for judging compliance with one or more of China's Personal Information Protection Law, GDPR, or China's Data Security Law;
  • and the compliance statuses are the possible results of judging compliance with one or more of China's Personal Information Protection Law, GDPR, or China's Data Security Law;
  • obtaining the to-be-detected data processing record of a data processor or data controller and inputting it into the knowledge graph to determine the compliance of the data processing record;
  • where the data processing record includes one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type.
  • In the embodiments of the present application, a knowledge graph for compliance detection is generated first, and the to-be-detected data processing record is then input into the knowledge graph to determine the compliance of the record.
  • Because a knowledge graph expresses data as a graph, the embodiments of the present application can intuitively refine the specific judgment process of the regulations.
  • Specifically, knowledge is extracted from one or more of the corpus of China's Personal Information Protection Law, the GDPR corpus, or the corpus of China's Data Security Law; knowledge graph entities are constructed from the extraction results and the relationships between them are established, generating a knowledge graph for compliance detection. The to-be-detected data processing record of a data processor or data controller is then obtained and used as the input of the knowledge graph, and the compliance of the record is finally determined.
  • The embodiments of this application address two shortcomings of the prior art. First, the prior art lacks a specific judgment process for the GDPR provisions, which makes methods such as compliance supervision based on consortium chains and smart contracts hard to implement. Second, its applicable scenarios are narrow and its scope of application is not wide enough, as with the compliance detection method based on the Monkey program, so it can hardly meet the complex and changing requirements of real detection scenarios.
  • By generating a knowledge graph, knowledge extraction and knowledge reasoning over the relevant laws and regulations are realized.
  • The judgment process of the laws and regulations is presented in detail as a graph data structure.
  • The data processing record used as input to the knowledge graph can include one or more of the processor (such as a data controller or data processor), the processing time, the specific operation type (such as acquisition, storage, or transmission), and the specific data object type (such as private data or non-private data). The embodiments of the present application are therefore not limited to particular data types or operation types, and the applicable scenarios are richer.
  • The embodiments of the present invention can therefore refine the specific judgment process of laws and regulations, cover the requirements of diverse scenarios, and improve the interpretability of compliance detection, thereby promoting the development and utilization of data and protecting the rights and interests of all parties.
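The data processing record described above can be pictured as a small typed structure. The following is a minimal sketch, not the applicant's implementation: the class names, the enum members, and the `present_fields` helper are all hypothetical, and every field is optional only because the text says a record carries "one or more of" the four attributes.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List, Optional


class Operation(Enum):
    # Operation types named in the text; the set is illustrative, not exhaustive.
    ACQUISITION = "acquisition"
    STORAGE = "storage"
    TRANSMISSION = "transmission"


class DataObject(Enum):
    # Data object types named in the text.
    PRIVATE = "private"
    NON_PRIVATE = "non_private"


@dataclass(frozen=True)
class DataProcessingRecord:
    # Hypothetical field names; each is optional because a record
    # may carry any non-empty subset of the four attributes.
    processor: Optional[str] = None  # e.g. "data_controller" or "data_processor"
    processing_time: Optional[datetime] = None
    operation: Optional[Operation] = None
    data_object: Optional[DataObject] = None

    def present_fields(self) -> List[str]:
        """Names of the attributes this record actually carries."""
        return [name for name, value in vars(self).items() if value is not None]


record = DataProcessingRecord(
    processor="data_controller",
    operation=Operation.STORAGE,
    data_object=DataObject.PRIVATE,
)
print(record.present_fields())
```

Keeping the fields optional mirrors the claim language; a real system would likely validate that at least one field is present before querying the knowledge graph.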
  • Establishing the relationships between the one or more knowledge graph entities and generating the knowledge graph includes: establishing the relationships between the one or more knowledge graph entities through a decision tree to generate the knowledge graph. The decision tree includes one or more of the following: one or more root nodes, one or more internal nodes, and one or more leaf nodes. The root node receives the data processing record; the internal nodes store one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes store the one or more knowledge graph entities.
  • In the embodiments of the present application, the decision tree is used to classify and organize the knowledge graph entities while the relationships between them are established, so that different classifications (for example by processor, processing time, specific operation type, or specific data object type) form different sub-knowledge-graphs. When a data processing record is received from a data controller or data processor, the sub-knowledge-graph corresponding to its classification can be located precisely, and only that sub-knowledge-graph needs to be evaluated to determine the compliance of the record.
  • Because the relationships in the knowledge graph are established through the decision tree, determining the compliance of a data processing record does not require traversing and evaluating the entire knowledge graph, and the compliance detection result of the record can be determined quickly and accurately.
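The routing role of the decision tree can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Node`, `Leaf`, and `route` names are hypothetical, records are simplified to plain dictionaries, and the condition names in the leaves are placeholders rather than conditions taken from any regulation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Leaf:
    # A leaf stores the knowledge graph entities of one sub-knowledge-graph.
    entities: List[str]


@dataclass
class Node:
    # An internal node tests one record attribute and branches on its value.
    attribute: str
    children: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def route(tree: Union[Node, Leaf], record: Dict[str, str]) -> List[str]:
    """Walk from the root to a leaf and return that leaf's entities."""
    node = tree
    while isinstance(node, Node):
        value = record[node.attribute]
        node = node.children[value]
    return node.entities


# Hypothetical tree: the condition names are placeholders.
tree = Node("operation", {
    "storage": Node("data_object", {
        "private": Leaf(["storage_limit_defined", "encryption_applied"]),
        "non_private": Leaf(["retention_policy_documented"]),
    }),
    "transmission": Leaf(["transfer_basis_exists"]),
})

record = {"operation": "storage", "data_object": "private"}
print(route(tree, record))
```

Only the entities in the selected leaf are evaluated afterwards, which is the point the text makes about avoiding a traversal of the whole knowledge graph.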
  • The one or more knowledge graph entities include the one or more compliance judgment conditions and the one or more compliance statuses. The relationships between the compliance judgment conditions in the knowledge graph entities include one or more of an AND relationship, an OR relationship, or an inclusion relationship; the relationship between the compliance judgment conditions and the compliance statuses includes a belonging relationship.
  • In the embodiments of the present application, the compliance judgment conditions and compliance statuses among the knowledge graph entities are classified and organized, which provides a basis for further improving the efficiency of compliance detection.
  • The relationship between multiple compliance judgment conditions can be classified as an AND relationship, an OR relationship, or an inclusion relationship, and the relationship between a compliance judgment condition and a compliance status can be classified as a belonging relationship.
  • During compliance detection, a detection strategy matching the relationships between the entities is adopted. For example, if one of the compliance judgment conditions in an AND relationship is non-compliant, the result of the compliance detection can be considered non-compliant without checking the remaining conditions, which improves detection efficiency.
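One way to picture the four relationship types is as typed edges between entities. The sketch below is an assumption-laden illustration: the triple layout, the entity names, the `neighbors` helper, and the direction chosen for the belonging relationship are all hypothetical, since the text does not specify a storage format.

```python
from enum import Enum
from typing import List, Tuple


class Relation(Enum):
    AND = "and"                # between compliance judgment conditions
    OR = "or"                  # between compliance judgment conditions
    INCLUDES = "includes"      # a condition includes sub-conditions
    BELONGS_TO = "belongs_to"  # links a status and a condition (direction assumed)


Triple = Tuple[str, Relation, str]

# Hypothetical entities; none of these names come from a regulation.
triples: List[Triple] = [
    ("consent_obtained", Relation.AND, "purpose_stated"),
    ("lawful_basis", Relation.INCLUDES, "consent_obtained"),
    ("compliant", Relation.BELONGS_TO, "consent_obtained"),
    ("non_compliant", Relation.BELONGS_TO, "consent_obtained"),
]


def neighbors(entity: str, relation: Relation) -> List[str]:
    """Entities linked to `entity` by `relation`, in either direction."""
    linked = []
    for head, rel, tail in triples:
        if rel is not relation:
            continue
        if head == entity:
            linked.append(tail)
        elif tail == entity:
            linked.append(head)
    return linked


print(neighbors("consent_obtained", Relation.AND))
print(neighbors("consent_obtained", Relation.BELONGS_TO))
```

With edges typed this way, a detector can choose its strategy per relation, for example stopping early on an AND edge once one condition fails.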
  • The one or more compliance judgment conditions include one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions.
  • Determining the compliance of the data processing record includes: when the relationship between the one or more first compliance judgment conditions is an AND relationship, if every one of the first compliance judgment conditions is compliant, determining that the data processing record is compliant.
  • In the embodiments of the present application, when compliance detection is performed on the data processing record of a data controller or data processor and the record involves multiple compliance judgment conditions (the one or more first compliance judgment conditions) whose relationship is an AND relationship, the record is compliant only if all of those conditions are compliant. In other words, as soon as one of them is non-compliant, the record is non-compliant. Therefore, when checking multiple compliance judgment conditions in an AND relationship, detection of the remaining conditions can stop as soon as one condition is found non-compliant, and the record is determined to be non-compliant, which improves the efficiency of compliance detection.
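The early-stop behaviour for conditions in an AND relationship can be sketched as below. This is a minimal illustration, not the patented procedure: `evaluate_and`, the `Check` alias, and the three toy conditions are hypothetical, and records are simplified to dictionaries.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[Dict[str, str]], bool]


def evaluate_and(
    conditions: List[Tuple[str, Check]], record: Dict[str, str]
) -> Tuple[bool, List[str]]:
    """Return (compliant, names of the conditions actually evaluated)."""
    evaluated = []
    for name, check in conditions:
        evaluated.append(name)
        if not check(record):
            # One failing condition decides the result; skip the rest.
            return False, evaluated
    return True, evaluated


# Hypothetical conditions that only test whether a field is recorded.
conditions = [
    ("has_processor", lambda r: "processor" in r),
    ("has_time", lambda r: "processing_time" in r),
    ("has_operation", lambda r: "operation" in r),
]

record = {"processor": "data_controller", "operation": "storage"}
compliant, evaluated = evaluate_and(conditions, record)
print(compliant, evaluated)
```

Here the third condition is never evaluated because the second one already fails, which is the efficiency gain the paragraph above describes.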
  • The one or more compliance judgment conditions include one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions.
  • Determining the compliance of the data processing record includes: when the relationship between the one or more second compliance judgment conditions is an OR relationship, if any one of the second compliance judgment conditions is compliant, determining that the data processing record is compliant.
  • In the embodiments of the present application, when compliance detection is performed on the data processing record of a data controller or data processor and the record involves multiple compliance judgment conditions (that is, the one or more second compliance judgment conditions) whose relationship is an OR relationship, the record is compliant if any one of those conditions is compliant. In other words, the record is non-compliant only if all of the conditions are non-compliant. Therefore, when checking multiple compliance judgment conditions in an OR relationship, detection of the remaining conditions can stop as soon as one condition is found compliant, and the record is determined to be compliant, which improves the efficiency of compliance detection.
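The OR case mirrors the AND case with the stopping rule inverted. The following sketch makes the same simplifying assumptions: `evaluate_or` and the toy conditions are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[Dict[str, str]], bool]


def evaluate_or(
    conditions: List[Tuple[str, Check]], record: Dict[str, str]
) -> Tuple[bool, List[str]]:
    """Return (compliant, names of the conditions actually evaluated)."""
    evaluated = []
    for name, check in conditions:
        evaluated.append(name)
        if check(record):
            # One passing condition decides the result; skip the rest.
            return True, evaluated
    return False, evaluated


# Hypothetical conditions, any one of which is taken to be sufficient.
conditions = [
    ("is_non_private", lambda r: r.get("data_object") == "non_private"),
    ("is_storage", lambda r: r.get("operation") == "storage"),
    ("is_controller", lambda r: r.get("processor") == "data_controller"),
]

record = {"operation": "storage", "data_object": "private"}
compliant, evaluated = evaluate_or(conditions, record)
print(compliant, evaluated)
```

The record is judged non-compliant only when the loop runs to the end without any condition passing.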
  • The one or more compliance judgment conditions include one third compliance judgment condition and one or more fourth compliance judgment conditions, and the data processing record involves the third compliance judgment condition and the one or more fourth compliance judgment conditions.
  • Determining the compliance of the data processing record includes: when the third compliance judgment condition includes the one or more fourth compliance judgment conditions, if the third compliance judgment condition is non-compliant, determining that the data processing record is non-compliant, and further determining the compliance of the one or more fourth compliance judgment conditions.
  • In the embodiments of the present application, when the data processing record involves multiple compliance judgment conditions (that is, one third compliance judgment condition and one or more fourth compliance judgment conditions) whose relationship is an inclusion relationship, the record is determined to be non-compliant when the third compliance judgment condition is non-compliant, and the compliance of the one or more fourth compliance judgment conditions is then further determined. This makes it possible to pinpoint the fourth compliance judgment condition that causes the third compliance judgment condition to be non-compliant. For enterprises, this helps locate violation points accurately; for regulatory authorities, it makes compliance detection detailed and easy to implement.
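The inclusion case can be sketched as a two-stage check: judge the including condition first, and drill into the included conditions only when it fails. The sketch is illustrative; `evaluate_inclusion` and the condition names are hypothetical, and defining the parent as "all children hold" is an assumption made here for the example, not something the text states.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[Dict[str, str]], bool]


def evaluate_inclusion(
    parent: Check, children: Dict[str, Check], record: Dict[str, str]
) -> Tuple[bool, List[str]]:
    """Return (compliant, names of the violating included conditions)."""
    if parent(record):
        return True, []
    # The parent failed: the record is non-compliant, and the included
    # conditions are checked to locate the violation points.
    violations = [name for name, check in children.items() if not check(record)]
    return False, violations


# Hypothetical included ("fourth") conditions.
children: Dict[str, Check] = {
    "processor_recorded": lambda r: "processor" in r,
    "time_recorded": lambda r: "processing_time" in r,
}


# Assumption for this example only: the including ("third") condition
# holds exactly when every included condition holds.
def parent(r: Dict[str, str]) -> bool:
    return all(check(r) for check in children.values())


record = {"processor": "data_processor"}
print(evaluate_inclusion(parent, children, record))
```

Returning the list of violating sub-conditions is what gives the enterprise a concrete violation point rather than a bare non-compliant verdict.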
  • The method further includes: setting a priority coefficient for the one or more compliance judgment conditions. Determining the compliance of the data processing record includes: judging the compliance judgment conditions involved in the data processing record based on the priority coefficient, to determine the compliance of the data processing record.
  • In the embodiments of the present application, a priority coefficient can be set for the one or more compliance judgment conditions included in the knowledge graph entities while the knowledge graph is generated.
  • When compliance detection is performed, the multiple compliance judgment conditions involved in the data processing record can be judged in sequence according to the priority coefficient.
  • The priority coefficient can be set according to the importance of each compliance judgment condition in the regulations, according to the severity of the penalty for violating different compliance judgment conditions, or according to how frequently a compliance judgment condition is involved. Therefore, when the multiple compliance judgment conditions involved in the data processing record are judged in sequence according to the priority coefficient, the conditions with higher priority can be judged first, which improves the efficiency of compliance detection.
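Priority-ordered judgment can be sketched by sorting the conditions before evaluating them. The sketch rests on two assumptions that the text does not fix: a larger coefficient means higher priority, and the conditions are combined with an AND relationship so that evaluation can stop at the first failure. `Condition` and `evaluate_by_priority` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Condition:
    name: str
    priority: float  # assumed: larger value means judged earlier
    check: Callable[[Dict[str, str]], bool]


def evaluate_by_priority(
    conditions: List[Condition], record: Dict[str, str]
) -> Tuple[bool, List[str]]:
    """Judge conditions from highest to lowest priority (AND semantics assumed)."""
    order = []
    for cond in sorted(conditions, key=lambda c: c.priority, reverse=True):
        order.append(cond.name)
        if not cond.check(record):
            return False, order
    return True, order


# Hypothetical conditions and coefficients.
conditions = [
    Condition("low", 0.2, lambda r: True),
    Condition("high", 0.9, lambda r: r.get("data_object") != "private"),
    Condition("mid", 0.5, lambda r: True),
]

record = {"data_object": "private"}
print(evaluate_by_priority(conditions, record))
```

If the highest-priority conditions are also the ones most likely to be violated, ordering by coefficient lets the detector reach a verdict after fewer checks.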
  • An embodiment of the present application provides a compliance detection device for data processing, which includes: an acquisition module, configured to acquire rule information and perform knowledge extraction on the rule information, where the rule information includes one or more of the corpus of China's Personal Information Protection Law, the General Data Protection Regulation (GDPR) corpus, or the corpus of China's Data Security Law; a processing module, configured to construct one or more knowledge graph entities based on the result of the knowledge extraction, and establish the relationships between the one or more knowledge graph entities to generate a knowledge graph, where the one or more knowledge graph entities include one or more of the following: one or more compliance judgment conditions, and one or more compliance statuses, the compliance judgment conditions being the conditions for judging compliance with one or more of China's Personal Information Protection Law, GDPR, or China's Data Security Law, and the compliance statuses being the possible results of judging compliance with one or more of those laws; and a determination module, configured to obtain the to-be-detected data processing record of a data processor or data controller and input it into the knowledge graph to determine the compliance of the data processing record.
  • In the embodiments of the present application, the acquisition module first performs knowledge extraction on one or more of the corpus of China's Personal Information Protection Law, the General Data Protection Regulation corpus, and the corpus of China's Data Security Law; the processing module generates the knowledge graph for compliance detection; and the determination module then inputs the to-be-detected data processing record into the knowledge graph to determine the compliance of the record.
  • Because a knowledge graph expresses data as a graph, the embodiments of the present application can intuitively refine the specific judgment process of the laws and regulations.
  • Detection is no longer limited to a certain type of data or a certain type of operation and can cover the needs of diverse scenarios, thereby effectively promoting the development and utilization of data and protecting the rights and interests of all parties.
  • Specifically, the acquisition module extracts knowledge from one or more of the corpus of China's Personal Information Protection Law, the GDPR corpus, and/or the corpus of China's Data Security Law. The processing module then constructs the knowledge graph entities from the extraction results, establishes the relationships between the entities, and generates a knowledge graph for compliance detection. The determination module obtains the to-be-detected data processing record of a data processor or data controller, uses the record as the input of the knowledge graph, and finally determines the compliance of the record.
  • The embodiments of this application address two shortcomings of the prior art. First, the prior art lacks a specific judgment process for the GDPR provisions, which makes methods such as compliance supervision based on consortium chains and smart contracts hard to implement. Second, its applicable scenarios are narrow and its scope of application is not wide enough, as with the compliance detection method based on the Monkey program, so it can hardly meet the complex and changing requirements of real detection scenarios.
  • By generating a knowledge graph, knowledge extraction and knowledge reasoning over the relevant laws and regulations are realized.
  • The judgment process of the laws and regulations is presented in detail as a graph data structure.
  • The data processing record used as input to the knowledge graph can include one or more of the processor (such as a data controller or data processor), the processing time, the specific operation type (such as acquisition, storage, or transmission), and the specific data object type (such as private data or non-private data). The embodiments of the present application are therefore not limited to a certain data type or a certain operation type, and the applicable scenarios are richer.
  • The embodiments of the present invention can therefore refine the specific judgment process of laws and regulations, cover the requirements of diverse scenarios, and improve the interpretability of compliance detection, thereby promoting the development and utilization of data and protecting the rights and interests of all parties.
  • The determination module is specifically configured to: establish the relationships between the one or more knowledge graph entities through a decision tree to generate the knowledge graph.
  • The decision tree includes one or more of the following: one or more root nodes, one or more internal nodes, and one or more leaf nodes. The root node receives the data processing record; the internal nodes store one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes store the one or more knowledge graph entities.
  • The one or more knowledge graph entities include the one or more compliance judgment conditions and the one or more compliance statuses. The relationships between the compliance judgment conditions in the knowledge graph entities include one or more of an AND relationship, an OR relationship, or an inclusion relationship; the relationship between the compliance judgment conditions and the compliance statuses includes a belonging relationship.
  • The one or more compliance judgment conditions include one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions.
  • The determination module is specifically configured to: when the relationship between the one or more first compliance judgment conditions is an AND relationship, if every one of the first compliance judgment conditions is compliant, determine that the data processing record is compliant.
  • The one or more compliance judgment conditions include one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions.
  • The determination module is specifically configured to: when the relationship between the one or more second compliance judgment conditions is an OR relationship, if any one of the second compliance judgment conditions is compliant, determine that the data processing record is compliant.
  • The one or more compliance judgment conditions include one third compliance judgment condition and one or more fourth compliance judgment conditions.
  • The data processing record involves the third compliance judgment condition and the one or more fourth compliance judgment conditions.
  • The determination module is specifically configured to: when the third compliance judgment condition includes the one or more fourth compliance judgment conditions, if the third compliance judgment condition is non-compliant, determine that the data processing record is non-compliant, and further determine the compliance of the one or more fourth compliance judgment conditions.
  • The device further includes a configuration module, configured to set a priority coefficient for the one or more compliance judgment conditions. The determination module is specifically configured to: judge the compliance judgment conditions involved in the data processing record based on the priority coefficient, to determine the compliance of the data processing record.
  • An embodiment of the present application provides a terminal device including a processor, an input device, an output device, and a memory that are connected to each other. The memory is used to store a computer program that includes program instructions, and the processor is configured to call the program instructions to execute the compliance detection method for data processing in the first aspect above.
  • An embodiment of the present application provides a computer-readable storage medium that stores a computer program. The computer program includes program instructions, and when the program instructions are executed by a processor, the processor executes the compliance detection method for data processing in the first aspect above.
  • An embodiment of the present application provides a computer program that includes instructions. When the computer program is executed by a terminal device, the terminal device executes the compliance detection method for data processing in the first aspect above.
  • An embodiment of the present application provides a chip system. The chip system includes a processor configured to support the device in implementing the functions involved in the first aspect above, for example, generating or processing the information involved in the above compliance detection method for data processing.
  • The chip system further includes a memory configured to store the program instructions and data necessary for the device.
  • The chip system may consist of chips, or may include chips and other discrete devices.
  • FIG. 1 is a schematic flow diagram of a GDPR compliance supervision method based on the consortium chain in the prior art.
  • FIG. 2 is a schematic flow diagram of a compliance detection method based on the Monkey program in the prior art.
  • FIG. 3 is a schematic flow diagram of a user data migration method in the prior art.
  • FIG. 4 is a schematic diagram of a general data protection rule system for microservices and programming models in the prior art.
  • FIG. 5a is a schematic flow diagram of a data processing compliance detection method provided by an embodiment of the present application.
  • FIG. 5b is a schematic diagram of the overall generation process of a knowledge graph provided by an embodiment of the present application.
  • FIG. 5c is a schematic diagram of a local conversion process of a knowledge graph provided by an embodiment of the present application.
  • FIG. 5d is a schematic diagram of another local conversion process of a knowledge graph provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a decision-tree-based knowledge graph provided by an embodiment of the present application.
  • FIG. 7a is a schematic flow diagram of a compliance judgment based on a knowledge graph provided by an embodiment of the present application.
  • FIG. 7b is a schematic flow diagram of another compliance judgment based on a knowledge graph provided by an embodiment of the present application.
  • FIG. 7c is a schematic flow diagram of yet another compliance judgment based on a knowledge graph provided by an embodiment of the present application.
  • FIG. 7d is a schematic flow diagram of still another compliance judgment based on a knowledge graph provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a compliance detection device for data processing provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of another compliance detection device for data processing provided by an embodiment of the present application.
  • Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application.
  • The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to independent or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
  • A knowledge graph (KG) is a system/technology that can collect, store, and automatically update knowledge. It can display the development process and structural relationships of knowledge as a series of different graphs.
  • The technology describes knowledge resources and their carriers; mines, analyzes, constructs, draws, and displays knowledge and its interrelationships; and has strong explanatory power.
  • Building a knowledge graph generally involves knowledge extraction, knowledge storage, knowledge computation, and knowledge application.
  • Here, knowledge graphs are used to perform data processing compliance detection against GDPR and the "China Data Protection Law"; the prior art has no knowledge-graph-based detection scheme for data processing compliance.
  • A decision tree is a tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a category. Decision trees are often used in classification scenarios.
  • A knowledge graph can be built with the help of a decision tree model.
  • The classification corresponding to each internal node of the decision tree can represent the entry to a sub-knowledge-graph in the knowledge graph.
  • During detection, the corresponding sub-knowledge-graph can be located through these entries, so that detection completes quickly and accurately without traversing the entire knowledge graph, improving detection efficiency.
  • The schemes for GDPR compliance testing include the following schemes 1, 2, 3 and 4:
  • FIG. 1 is a schematic flow diagram of a consortium chain-based GDPR compliance supervision method in the prior art, which may specifically include the following steps S100 to S103:
  • Step S100: Service providers and regulators register under their real names in the consortium chain;
  • Step S101: The permission record of the data subject is encrypted and stored in the consortium blockchain through a smart contract;
  • Step S102: The data subject is granted the right to access the consortium blockchain, and data transfer records are stored through smart contracts;
  • Step S103: During a compliance investigation, the consortium blockchain service network retrieves records retroactively at the request of the regulatory agency.
  • Disadvantage 1: The scheme is high-level and difficult to put into practice.
  • The consortium chain-based GDPR compliance supervision method uses the scalability and tamper resistance of blockchain to improve the efficiency with which users exercise GDPR rights and service providers judge GDPR compliance, and it reduces the compliance costs of enterprise data development and utilization and of regulatory agencies.
  • However, it still regulates the GDPR compliance supervision process at a high level overall, and provides no specific process for judging the GDPR compliance rules.
  • For enterprises, it may be difficult to accurately capture violation points, which makes it difficult to control compliance costs and put compliance into practice. For regulators, the lack of a specific and detailed judgment process increases the difficulty of implementation, makes effective supervision hard to achieve, and leaves the supervision results weakly explainable.
  • FIG. 2 is a schematic flow diagram of a compliance detection method based on the Monkey program in the prior art, which may specifically include the following steps S200 to S203:
  • Step S200: Run the Monkey test program; the Monkey test program is used to test the application program of the first terminal;
  • Step S201: Obtain the communication data of the application program from the Monkey test program;
  • Step S202: Send the communication data to the server, mark the private data, and send it to the second terminal;
  • Step S203: The second terminal generates a detection report of the application program when it determines that the private data contains violation data that does not comply with the GDPR compliance rules.
  • Disadvantage 1: The data type is single, and the scope of application is small.
  • The Monkey-based compliance detection scheme generates the application detection report automatically, and its detection efficiency is high.
  • However, this method can only verify the GDPR compliance of the private data in the communication data, so it covers a single data type.
  • The scope of the verified data is not comprehensive.
  • The compliance detection scheme based on the Monkey program may therefore be unable to support compliance detection of non-private data.
  • FIG. 3 is a schematic flow diagram of a user data migration method in the prior art, which may specifically include the following steps S300 to S303:
  • Step S300: Establish regional databases corresponding to the country codes of different countries;
  • Step S301: Obtain the registration country of the user data to be migrated and the destination country of the migration;
  • Step S302: Determine, according to the regional database, whether the destination country and the registration country belong to the same region;
  • Step S303: If they do not belong to the same region, the data needs to be migrated, and the success or failure of the migration is determined according to whether the region of the destination country and the region of the registration country comply with the GDPR regulations.
  • Disadvantage 1: The operation type is single, and the scope of application is small.
  • The prior-art user data migration method lets user data reside in different regions and meet each region's data compliance standards while keeping the data unique across all regions. However, it can only verify GDPR compliance for the data migration operation; the operation type is single, and GDPR compliance verification is missing for other operations on the data.
  • This scheme may therefore be unable to meet the requirements of compliance testing.
  • Solution 4: a General Data Protection Rules (GDPR) infrastructure for microservices and programming models. Please refer to Figure 4, which is a schematic diagram of a prior-art general data protection rules system for microservices and programming models. Specifically, the system includes a general data privacy regulatory module that retains personal information in data communicated with business applications in accordance with at least one data privacy regulation; a data privacy compliance module, connected to the general data privacy regulatory module, to monitor the data flow controller and report to the client computer; and a data subject privacy request module, connected to the general data privacy regulatory module and the data privacy compliance module, that generates operations based on one or more requests.
  • Disadvantage 1: The scheme is high-level and difficult to put into practice.
  • The system can help applications, cloud computing platforms, and so on meet the requirements of GDPR compliance testing, but it still specifies the GDPR compliance judgment process from a high-level perspective and lacks specific rules for judging GDPR compliance.
  • For enterprises, it may be difficult to accurately capture violation points, which makes it difficult to control compliance costs and put compliance into practice. For regulators, the lack of a specific and detailed judgment process increases the difficulty of implementation, makes effective supervision hard to achieve, and leaves the supervision results weakly explainable.
  • The compliance detection schemes in the prior art lack a specific judgment process for the GDPR regulations, apply to a single scenario, and have a narrow application range, so they cannot meet the higher requirements of detection in actual deployments. The data processing compliance detection method provided by this application therefore generates a knowledge graph for compliance detection from the corpus of the General Data Protection Regulation and/or the Chinese Data Security Law, and then inputs the data processing records to be detected into the knowledge graph to determine their compliance. The method is intuitive to express and strongly explainable, is not limited to a certain type of data or a certain type of operation, and can solve the above technical problems.
  • The GDPR regulations will be taken as an example below, and the technical problems raised in this application will be specifically analyzed and solved by combining the GDPR regulations with the data processing compliance detection method provided in this application.
  • Figure 5a is a schematic flowchart of a data processing compliance detection method provided by the embodiment of the present application. The method is described below through steps S500 to S502:
  • Step S500: Obtain rule information, and perform knowledge extraction on the rule information.
  • The rule information may include one or more of Chinese Personal Information Protection Law data, General Data Protection Regulation (GDPR) data, or Chinese Data Security Law data.
  • Knowledge extraction can use the Resource Description Framework (RDF) to describe one or more kinds of knowledge in the Chinese Personal Information Protection Law data, the GDPR data, or the Chinese Data Security Law data, which can be stored in the knowledge base in the form of triples (entity 1, relationship, entity 2).
  • In the triple, entity 1 can be any of the compliance judgment conditions (that is, one of one or more compliance judgment conditions), and each compliance judgment condition corresponds to a judgment rule in the regulations.
  • The relationship represents the relationship, or the judgment relationship, between entity 1 and entity 2, and can be one of four types: "combined judgment", "continue judgment", "inclusion judgment", and "belongs to".
  • Each relationship has an attribute that indicates whether the compliance requirement of entity 1 is met: a value of true means that the condition represented by entity 1 is met, and a value of false means that it is not met.
  • Entity 2 can be another compliance judgment condition or a compliance state (that is, one of one or more compliance states, including compliance and non-compliance).
  • "Combined judgment" treats entity 1 and entity 2 as a combination, that is, the respective compliance of entity 1 and of entity 2 both affect the compliance result of the triple (entity 1, relationship, entity 2). "Continue judgment" first judges whether entity 1 is complied with, then continues to judge entity 2 according to that result, and then judges the compliance result of the triple. "Inclusion judgment" applies when the compliance requirements of entity 1 are not met: it further judges whether the compliance requirements of entity 2 are met, where entity 1 can contain one or more entity 2. In the "belongs to" relationship, entity 2 represents a compliance state, that is, the relationship directly assigns a compliance state to entity 1.
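  • The triple structure described above can be sketched as a small data model. This is only an illustrative sketch: the class names, the field names, and the two sample triples are assumptions, not part of the application.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Relation(Enum):
    """The four relationship types named in the text."""
    COMBINED = "combined judgment"
    CONTINUE = "continue judgment"
    INCLUSION = "inclusion judgment"
    BELONGS_TO = "belongs to"


@dataclass(frozen=True)
class Triple:
    entity1: str        # a compliance judgment condition
    relation: Relation  # how entity1 relates to entity2
    attribute: bool     # True: entity1's condition is met; False: not met
    entity2: str        # another condition, or a compliance state


# Hypothetical knowledge base holding two sample triples.
knowledge_base: List[Triple] = [
    Triple("condition a", Relation.COMBINED, True, "condition b"),
    Triple("condition b", Relation.BELONGS_TO, False, "non-compliant"),
]


def triples_from(entity: str) -> List[Triple]:
    """Return every stored triple whose first entity is `entity`."""
    return [t for t in knowledge_base if t.entity1 == entity]
```

  • Keeping the attribute on the relationship, rather than on the entity, mirrors the description that each relationship carries the true/false value for entity 1.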
  • Step S501: Construct one or more knowledge graph entities based on the knowledge extraction results, and establish relationships among the one or more knowledge graph entities to generate a knowledge graph.
  • The one or more knowledge graph entities include one or more compliance judgment conditions and/or one or more compliance statuses. The compliance judgment conditions are the criteria for judging whether one or more of China's Personal Information Protection Law, GDPR, or China's Data Security Law is complied with, and the compliance statuses are the possible results of that judgment.
  • Figure 5b is a schematic diagram of the overall generation process of a knowledge graph provided by the embodiment of the present application. After the rule information is obtained, the rules can first be extracted (step S5000 in Figure 5b), then represented visually in graph form (step S5001 in Figure 5b), and finally the graph-form knowledge can be processed programmatically (step S5002 in Figure 5b), so that the knowledge graph can be recognized and used by the device that performs compliance detection.
  • Figure 5c is a schematic diagram of a partial conversion process of a knowledge graph provided by the embodiment of the present application. After the rules are extracted, they can be expressed in the form of triples (step S5100 in Figure 5c), and the triple-form rules can then be converted into graph form using a graph visualization tool (step S5101 in Figure 5c). For the programmatic processing of graph-form knowledge, refer to Figure 5d.
  • Figure 5d is a schematic diagram of another partial conversion process of the knowledge graph provided by the embodiment of the present application, which involves the triples and the different judgment rule graphs (that is, sub-knowledge graphs) in steps S5200 and S5201 of Figure 5d.
  • The relationships among the one or more knowledge graph entities can also be established through a decision tree to generate the knowledge graph.
  • The decision tree includes one or more root nodes, one or more internal nodes, and one or more leaf nodes. The root node is used to receive the data processing records; the internal nodes are used to store one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes are used to store the one or more knowledge graph entities.
  • Figure 6 is a schematic diagram of a decision tree-based knowledge graph provided by the embodiment of the present application.
  • The root node of the decision tree receives the data processing record of the data controller or data processor, and the specific operation type of the processing is the main judgment condition.
  • The internal nodes of the decision tree store the specific operation types of processing, such as the storage operation, data migration operation, data acquisition operation, and data deletion operation.
  • The leaf nodes of the decision tree store the one or more knowledge graph entities (including one or more compliance judgment conditions and one or more compliance statuses) that correspond to the specific operation type. If the specific operation type in the data processing record is data storage, the compliance detection process jumps from the root node to the internal node corresponding to the data storage operation and then traverses the leaf nodes under that internal node to determine the compliance of the data processing record.
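  • The routing from root node to internal node to leaf checks can be sketched as a dispatch table. This is a minimal sketch under assumptions: the record keys ("operation", "encrypted", "deletion_confirmed") and the two leaf checks are hypothetical placeholders for the sub-knowledge graphs.

```python
from typing import Callable, Dict

Record = Dict[str, object]


def check_storage(record: Record) -> bool:
    # Placeholder leaf check for the storage sub-knowledge graph.
    return bool(record.get("encrypted", False))


def check_deletion(record: Record) -> bool:
    # Placeholder leaf check for the deletion sub-knowledge graph.
    return bool(record.get("deletion_confirmed", False))


# Internal nodes: one entry per specific operation type.
INTERNAL_NODES: Dict[str, Callable[[Record], bool]] = {
    "storage": check_storage,
    "deletion": check_deletion,
}


def detect(record: Record) -> bool:
    """Root node: route the record to the sub-graph for its operation type."""
    operation = str(record["operation"])
    leaf_check = INTERNAL_NODES.get(operation)
    if leaf_check is None:
        raise ValueError(f"no sub-knowledge graph for operation {operation!r}")
    return leaf_check(record)
```

  • Only the sub-graph selected by the operation type is evaluated, which reflects the point that the whole knowledge graph need not be traversed.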
  • The programmatic representation of the knowledge graph can refer to the following rules. First, define the variable compliance to indicate GDPR compliance, with an initial value of true, indicating compliance; define the variable d to receive the processed personal data; and define the variable act to identify the type of operation currently in progress (storage, deletion, access, and so on). Then, map each compliance judgment condition in the graph to one or more Boolean variables that indicate whether the corresponding condition is met: true if it is met and false if it is not, and these values are used in assigning the compliance variable. Finally, because the attributes of the relationships in the graph already indicate whether the compliance judgment conditions are met, the entities and relationships of the entire graph can be traversed, values can be assigned to the variables mapped to each entity, the relationship "combined judgment" can be mapped to the keyword and, the relationship "continue judgment" to the keyword or, and the relationship "inclusion judgment" to an if statement inside the outer judgment statement.
  • The structure of the decision tree can also be established based on factors such as the processor, the processing time, or the specific data object type, which are not specifically limited here. Understandably, the programmatic representation of the knowledge graph above is only an example and does not constitute a specific limitation of the present application.
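  • A minimal sketch of the variable mapping described above follows. The variables compliance, d, and act come from the text; the condition names cond_a, cond_b, and cond_c and the way they are combined are purely hypothetical.

```python
from typing import Dict


def evaluate(act: str, d: Dict[str, bool]) -> bool:
    """Evaluate compliance for operation type `act` on processed data `d`."""
    compliance = True  # initial value True indicates compliance

    if act == "storage":
        # Each compliance judgment condition is mapped to a Boolean variable.
        cond_a = d.get("cond_a", False)
        cond_b = d.get("cond_b", False)
        cond_c = d.get("cond_c", False)
        # "combined judgment" maps to `and`; "continue judgment" maps to `or`.
        compliance = cond_a and (cond_b or cond_c)

    # Operation types not modelled in this sketch keep the initial value.
    return compliance
```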
  • A priority coefficient can be set for the one or more compliance judgment conditions included in the knowledge graph entities. When the data processing record of the data controller or data processor involves multiple compliance judgment conditions, they may be judged in sequence according to the priority coefficient.
  • The priority coefficient can be set according to the importance of each compliance judgment condition in the regulations, the severity of the penalty for violating it, or how frequently the condition is involved. Therefore, in the embodiment of the present application, when multiple compliance judgment conditions involved in the data processing record are judged in sequence according to the priority coefficient, the conditions with higher priority can be judged first, which improves the efficiency of compliance detection.
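  • Fixing the judgment order from the priority coefficients can be sketched as a simple sort. The condition names and coefficient values below are hypothetical; the text does not specify how the coefficients are computed.

```python
from typing import Dict, List


def order_by_priority(priority: Dict[str, float]) -> List[str]:
    """Return condition names sorted from highest to lowest priority coefficient."""
    return sorted(priority, key=lambda name: priority[name], reverse=True)


# Hypothetical coefficients for three conditions.
coefficients = {"inform users": 0.2, "obtain consent": 0.9, "encrypt storage": 0.5}
judgment_order = order_by_priority(coefficients)
```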
  • The relationship between the compliance judgment conditions in the one or more knowledge graph entities can be determined as one or more of an AND relationship, an OR relationship, or an inclusion relationship; the relationship between the one or more compliance judgment conditions and the one or more compliance statuses can be determined as a belonging relationship.
  • If the data processing record involves multiple compliance judgment conditions (that is, one or more first compliance judgment conditions) and the relationship between them is an AND relationship, all of them must be compliant before the data processing record can be considered compliant. In other words, under an AND relationship, as soon as one of the compliance judgment conditions is non-compliant, compliance detection of the other involved conditions can stop and the data processing record can be considered non-compliant, which improves the efficiency of compliance detection.
  • For example, compliance judgment condition 1, compliance judgment condition 2, ..., and compliance judgment condition n are in an AND relationship (that is, failing any one of the conditions leads to non-compliance).
  • The judgment order among the compliance judgment conditions can be fixed in advance according to their priority coefficients, with higher-priority conditions judged first, and compliance judgment condition n denotes the last compliance judgment condition. The judgment process between the conditions can then refer to the following triple forms (compliance judgment conditions are abbreviated to conditions):
  • (condition a, combined judgment (condition a is met), condition b)
  • Condition a and condition b are in an AND relationship, and the combined judgment indicates that when the data processing record meets condition a, it is then judged whether it meets condition b. Under an AND relationship, a data processing record is considered compliant only when it meets both condition a and condition b.
  • (condition c, belongs to (condition c is not met), non-compliant)
  • If the data processing record meets the first n-1 conditions that are in an AND relationship with condition n (that is, the last compliance judgment condition), then when it also meets condition n, all the conditions in the AND relationship are met and the data processing record can be considered compliant.
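  • The early-stop behaviour of the AND relationship can be sketched as follows. The function name, the record layout (condition name mapped to a Boolean), and the returned tuple are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def judge_and(record: Dict[str, bool],
              ordered_conditions: List[str]) -> Tuple[bool, Optional[str]]:
    """Judge conditions in priority order; stop at the first one not met.

    Returns (compliant, name_of_first_failed_condition_or_None).
    """
    for name in ordered_conditions:
        if not record.get(name, False):
            # One unmet condition decides the result; the rest are skipped.
            return False, name
    return True, None
```

  • Returning the name of the failed condition is one way to keep the result explainable; a missing key is treated as "not met" in this sketch.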
  • If the data processing record involves multiple compliance judgment conditions whose relationship is an OR relationship, then as soon as any one of the conditions is compliant, compliance detection of the other involved conditions can stop and the data processing record can be considered compliant, which improves the efficiency of compliance detection. In other words, under an OR relationship the data processing record is considered non-compliant only when all of the compliance judgment conditions are non-compliant.
  • For example, compliance judgment condition 1, compliance judgment condition 2, ..., and compliance judgment condition n are in an OR relationship (that is, non-compliance results only when the data processing record meets none of the conditions).
  • The judgment order among the compliance judgment conditions can be fixed in advance according to their priority coefficients, with higher-priority conditions judged first, and condition n is the last compliance judgment condition that needs to be judged. The judgment process between the conditions can then refer to the following triple forms (compliance judgment conditions are abbreviated to conditions):
  • (condition a, continue judgment (condition a is not met), condition b)
  • Condition a and condition b are in an OR relationship, and continuing to judge means that when the data processing record does not meet condition a, judgment continues with condition b before compliance is decided, because under an OR relationship a data processing record is considered non-compliant only if it meets none of the conditions in the OR relationship.
  • (condition c, belongs to (condition c is met), compliant)
  • When the data processing record meets any one condition, it can be considered compliant.
  • (condition n, belongs to (condition n is not met), non-compliant)
  • If the data processing record meets none of the first n-1 conditions that are in an OR relationship with condition n (that is, the last compliance judgment condition) and does not meet condition n either, it can be considered non-compliant.
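  • The OR relationship can be sketched in the same style: judgment continues only while conditions are not met. As before, the function name and record layout are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple


def judge_or(record: Dict[str, bool],
             ordered_conditions: List[str]) -> Tuple[bool, Optional[str]]:
    """Judge conditions in priority order; stop at the first one that is met.

    Returns (compliant, name_of_first_met_condition_or_None).
    """
    for name in ordered_conditions:
        if record.get(name, False):
            # One met condition is enough; the rest are skipped.
            return True, name
    # Reached only when every condition, including the last one, is not met.
    return False, None
```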
  • When the relationship between multiple compliance judgment conditions (that is, a third compliance judgment condition and one or more fourth compliance judgment conditions) is an inclusion relationship and the third compliance judgment condition is non-compliant, the data processing record is determined to be non-compliant, and the compliance of the one or more fourth compliance judgment conditions is further determined, so that the specific fourth compliance judgment condition that caused the third compliance judgment condition to be non-compliant can be found accurately.
  • For enterprises, this helps locate violation points accurately; for regulatory authorities, it refines compliance detection and makes it easy to implement.
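  • Locating violation points under an inclusion relationship can be sketched as a nested check. The parent and child condition names used in the usage example are hypothetical.

```python
from typing import Dict, List


def find_violation_points(record: Dict[str, bool],
                          parent: str,
                          children: List[str]) -> List[str]:
    """Return the contained conditions that are not met when `parent` is not met.

    An empty list means the parent condition is met, so the contained
    conditions do not need to be examined.
    """
    if record.get(parent, False):
        return []
    # Parent not met: inspect each contained condition to locate the cause.
    return [child for child in children if not record.get(child, False)]


# Hypothetical record: the parent condition fails because encryption is missing.
points = find_violation_points(
    {"secure storage": False, "encrypted": False, "access controlled": True},
    "secure storage",
    ["encrypted", "access controlled"],
)
```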
  • Step S502: Obtain the data processing records to be detected from the data processor or data controller and input them into the knowledge graph to determine the compliance of the data processing records.
  • The data processing record includes one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type.
  • Each data controller (or data processor) and, if applicable, the representative of the data controller (or data processor) shall maintain records of processing activities under its responsibility, and the records shall include all of the following information: the name and contact information of the data controller and, if applicable, of joint controllers, the controller's representative, and the data protection officer; the purpose of the processing and the type of operation; a description of the categories of data subjects and of the categories of personal data; the categories of recipients to whom the personal data have been or will be disclosed, including recipients in third countries or international organizations; if applicable, transfers of personal data to third countries or international organizations, including the identification of the third country or international organization and, for such transfers, documentation of appropriate safeguards; if possible, the time limits set for the erasure of different categories of data; a general description of the technical and organizational security measures in place; and other relevant information.
  • The data processing records of the data controller can be obtained by default and used as the input data of the knowledge graph, so as to complete the judgment of GDPR compliance.
  • It may also be required to identify the type of data collected when collecting personal data from the data subject (such as whether the data is sensitive data, which type of sensitive data it is, and so on).
  • The data processing records can be mapped to Boolean variables and their assignments through natural language processing. For example, word segmentation and keyword extraction are performed on the statement "implement encryption protection" in the data processing record to obtain the keywords "implementation" and "encryption"; then the keyword "encryption" can be mapped to the variable encrypt, and the keyword "implementation" can be mapped to true and assigned to the variable encrypt. Finally, the data processing records after natural language processing are input into the knowledge graph as the input data for compliance detection. Understandably, the above natural language processing of data processing records is only an example and does not constitute a specific limitation of the present application.
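  • The keyword-to-variable mapping in the example above can be sketched with plain string matching. This is a deliberately simplified stand-in: real word segmentation and keyword extraction are replaced by a regular expression and prefix matching, and the keyword tables are assumptions.

```python
import re
from typing import Dict

# Assumed lookup tables: keyword stem -> Boolean variable name.
VARIABLE_KEYWORDS = {"encrypt": "encrypt"}
POSITIVE_STEMS = ("implement",)        # words that assign True
NEGATION_WORDS = ("not", "no", "without")


def statement_to_variables(statement: str) -> Dict[str, bool]:
    """Map one record statement to Boolean variable assignments."""
    words = re.findall(r"[a-z]+", statement.lower())
    affirmed = any(w.startswith(POSITIVE_STEMS) for w in words)
    negated = any(w in NEGATION_WORDS for w in words)
    result: Dict[str, bool] = {}
    for stem, variable in VARIABLE_KEYWORDS.items():
        if any(w.startswith(stem) for w in words):
            result[variable] = affirmed and not negated
    return result
```

  • With these tables, the statement "implement encryption protection" yields the variable encrypt assigned true, matching the example in the text.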
  • The overall basic rules can be judged first; that is, the priority coefficient of the overall basic rules can be set to the highest. If the data processing record does not comply with the basic rules of GDPR, it can be considered non-compliant; if it complies with the basic rules of GDPR, compliance can then be judged according to the type of data operation.
  • FIG. 7a is a schematic flow diagram of a compliance judgment based on the knowledge graph provided in the embodiment of the present application, in which the data processing records of the data controller or data processor are used as input to judge whether they comply with the above six basic rules.
  • The data processing record can then be considered to comply with the above six basic rules, so that compliance checks for specific data types or specific operation types can be carried out.
  • The programmatic representation of the knowledge graph of these six basic rules can be referred to as follows:
  • conditional entity "restrict processing right” is mapped to the variable restrict
  • conditional entity “refusal right” is mapped to the variable reject
  • conditional entity “inform users of all rights” is mapped to the variable inform
  • conditional entity "provide default privacy” is mapped to the variable privacy
  • conditional entity "obtaining user consent” is mapped to the variable agreement
  • conditional entity “collection purpose” is mapped to the variable collect
  • conditional entity “processing purpose” is mapped to the variable done
  • //Input data includes the type of operation, the data processing records of the data controller (including technical description information to ensure data security, the category of data subjects and the classification description information of personal data, the purpose of processing and other related record information)
  • The data processing record involves a storage operation.
  • The following three data storage rules are obtained through knowledge extraction: 1) whether the data storage is secure (whether storage encryption is used, and so on; derived from the obligations of the data controller (processor) in the GDPR regulations); 2) whether the user is allowed to delete the data in the original device (derived from the "deletion right" among the data subject's rights in the GDPR regulations); principle of transformation").
  • The three data storage rules above are in an AND relationship; that is, when the data processing record fails any one of the data storage rules, it can be considered non-compliant.
  • The three data storage rules above can be judged by referring to the following triple example:
  • (the storage time is greater than the processing time, belongs to (false), principle non-compliant);
  • The triples of the three data storage rules above can be represented visually in graph form as a sub-knowledge graph (if the complete knowledge graph is established through a decision tree, the input node of the sub-knowledge graph can be one of the internal nodes). Please refer to Figure 7b.
  • Figure 7b is another schematic flowchart of compliance judgment based on the knowledge graph provided in the embodiment of the present application, in which the data processing records of the data controller or data processor are used as input to judge whether they comply with the three data storage rules above.
  • The data processing record can then be considered to comply with the three data storage rules above.
  • The programmatic representation of the knowledge graph of these three data storage rules can be referred to as follows:
  • the input data includes the type of operation, the record information of the data controller processing the data (including the security protection technology used in the data processing process, encryption, and the time limit for erasing different types of data, etc.)
  • Rules 1), 2), and 3) are in an AND relationship; rules 1), 2), and 4) are in an AND relationship; and rules 3) and 4) are in an OR relationship. That is, when the data processing record complies with rules 1) and 2) and with either of rules 3) and 4), it can be considered compliant.
  • The four data acquisition rules above can be judged by referring to the following triple examples:
  • (the data provision method conforms to the method by which the user requested the data, belongs to (true), compliant);
  • (the data provision method conforms to the method by which the user requested the data, continue judgment (false), the data provision method conforms to the method specified by the user);
  • (the data provision method conforms to the method specified by the user, belongs to (false), data provision non-compliant);
  • The triples of the four data acquisition rules above can be represented visually in graph form as a sub-knowledge graph (if the complete knowledge graph is established through a decision tree, the input node of the sub-knowledge graph can be one of the internal nodes). Please refer to Figure 7c, which is another schematic flowchart of compliance judgment based on the knowledge graph provided in the embodiment of this application, in which the data processing record of the data controller or data processor is used as input to judge whether it complies with the four data acquisition rules above.
  • The data processing record can then be considered to comply with the data acquisition rules.
  • The programmatic representation of the knowledge graph of these four data acquisition rules can be referred to as follows:
  • //Input data includes the type of operation, the record information of the data controller processing the data (including the description of the category of the data subject and the classification of personal data; the rights information owned by the data subject; the way the data subject requests data and other related information)
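  • The stated combination of the four data acquisition rules (rules 1 and 2 required, plus either rule 3 or rule 4) can be sketched as a single Boolean expression. The function and parameter names are hypothetical, and the rules are referred to only by number.

```python
def acquisition_compliant(rule_1: bool, rule_2: bool,
                          rule_3: bool, rule_4: bool) -> bool:
    """Rules 1 and 2 must hold, together with at least one of rules 3 and 4."""
    # AND between rules 1, 2 and the pair; OR between rules 3 and 4.
    return rule_1 and rule_2 and (rule_3 or rule_4)
```

  • This single expression is logically equivalent to "(rule 1 and rule 2 and rule 3) or (rule 1 and rule 2 and rule 4)", which is how the relationships are listed in the text.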
  • the transmission operation on data involves the transmission operation.
  • the following four data transmission rules are obtained through knowledge extraction: 1) whether the third party receiving the data is in the EU; if not, rule 2) must be judged (derived from "data portability" in the GDPR); 2) whether the location of the third party receiving the data is BCR-certified; 3) whether the transmission is encrypted (derived from the obligations of the data controller during data transmission in the GDPR); 4) whether integrity verification is performed during transmission (derived from the obligations of the data controller during data transmission in the GDPR).
  • rules 1) and 2) are in an OR relationship; rules 1), 3), and 4) are in an AND relationship; and rules 2), 3), and 4) are in an AND relationship. That is, when the data processing record complies with rules 3) and 4) and with either of rules 1) and 2), it can be considered compliant.
  • the above four data transmission rules can be judged by referring to the following triplet example:
  • the triples of the above four data transmission rules can be represented visually as a graph, forming a sub-knowledge graph (if the complete knowledge graph is built with a decision tree, the input node of the sub-knowledge graph can be one of the internal nodes of the complete knowledge graph). Please refer to Figure 7d, which is another schematic flowchart of compliance judgment based on a knowledge graph provided in an embodiment of this application, where the data processing record of the data controller or data processor is used as input to judge whether the record complies with the above four data transmission rules. For example, if the data processing record does not meet the requirement of being in the European Union (that is, the value is false), it is necessary to judge whether the record meets the requirement that the location is BCR-certified.
  • the programmatic representation of the knowledge graph of these four data transmission rules can be referred to as follows:
  • the input data includes the type of operation and the record information of the data controller processing the data (including the categories of recipients to whom personal data has been or will be disclosed, including recipients in third countries or international organizations; the identification of the third country or international organization; and, where data is transmitted, the appropriate security measures for documents)
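A minimal sketch of how the four data transmission rules might be expressed in code, assuming the record has already been mapped to boolean fields. The function name `check_transmission` and the keys `in_eu`, `bcr_certified`, `encrypted`, and `integrity_verified` are hypothetical and are not taken from the application.

```python
from typing import Dict


def check_transmission(record: Dict[str, bool]) -> bool:
    """Return True if the record satisfies the four transmission rules.

    Assumed encoding (hypothetical keys):
      in_eu              rule 1): recipient is located in the EU
      bcr_certified      rule 2): recipient's location is BCR-certified
      encrypted          rule 3): the transmission is encrypted
      integrity_verified rule 4): integrity verification is performed
    """
    # Rules 1) and 2) are in an OR relationship. Python's `or` short-circuits,
    # so rule 2) is only consulted when rule 1) is not satisfied.
    location_ok = record["in_eu"] or record["bcr_certified"]
    # The location result is in an AND relationship with rules 3) and 4).
    return bool(location_ok and record["encrypted"] and record["integrity_verified"])
```

A record for a recipient outside the EU is therefore compliant only if the location is BCR-certified and the transmission is both encrypted and integrity-checked.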
  • the above compliance detection method for data processing is not limited to a particular type of data or a particular type of operation, so it applies to a wider range of scenarios and covers diverse scenario requirements. It can effectively solve the problems of the prior art, namely that the applicable scenarios are limited, the scope of application is not wide enough, and it is difficult to meet the complex and changeable requirements of real detection scenarios.
  • FIG. 8 is a schematic structural diagram of a compliance detection device for data processing provided by an embodiment of the present application.
  • the compliance detection device 10 may include an acquisition module 101, a processing module 102, a determination module 103, and optionally a configuration module 104. The modules are described in detail as follows:
  • the acquisition module 101 is used to obtain rule information and perform knowledge extraction on the rule information;
  • the rule information includes one or more of the China Personal Information Protection Law corpus, the General Data Protection Regulation (GDPR) corpus, or the China Data Security Law corpus;
  • the processing module 102 is configured to construct one or more knowledge graph entities based on the knowledge extraction results, and to establish relationships between the one or more knowledge graph entities to generate a knowledge graph;
  • the one or more knowledge graph entities include one or more of one or more compliance judgment conditions and one or more compliance states; the one or more compliance judgment conditions are one or more conditions for judging compliance with one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law; the one or more compliance states are the possible judgment results of compliance with one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law;
  • the determination module 103 is used to obtain the to-be-detected data processing record of the data processor or data controller, input it into the knowledge graph, and determine the compliance of the data processing record; the data processing record includes one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type.
  • the determination module 103 is specifically configured to establish the relationships between the one or more knowledge graph entities through a decision tree to generate the knowledge graph:
  • the decision tree includes one or more root nodes, one or more internal nodes, and one or more leaf nodes
  • the root node is used to receive the data processing record
  • the internal nodes are used to store one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type
  • the leaf nodes of the decision tree are used to store the one or more knowledge graph entities.
  • the one or more knowledge graph entities include the one or more compliance judgment conditions and the one or more compliance states; the relationships between the compliance judgment conditions in the one or more knowledge graph entities include one or more of an AND relationship, an OR relationship, or a containment relationship; the relationship between the one or more compliance judgment conditions and the one or more compliance states includes a "belongs to" relationship.
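  • the one or more compliance judgment conditions include one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions;
  • the determination module 103 is specifically used for: when the relationship between the one or more first compliance judgment conditions is an AND relationship, determining that the data processing record is compliant if every one of the one or more first compliance judgment conditions belongs to compliance.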
  • the one or more compliance judgment conditions include one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions;
  • the determination module 103 is specifically used for: when the relationship between the one or more second compliance judgment conditions is an OR relationship, determining that the data processing record is compliant if any one of the one or more second compliance judgment conditions belongs to compliance.
  • the one or more compliance judgment conditions include one third compliance judgment condition and one or more fourth compliance judgment conditions, and the data processing record involves the one third compliance judgment condition and the one or more fourth compliance judgment conditions;
  • the determination module 103 is specifically used for: when the one third compliance judgment condition contains the one or more fourth compliance judgment conditions, determining that the data processing record is non-compliant if the one third compliance judgment condition belongs to non-compliance, and further determining the compliance of the one or more fourth compliance judgment conditions.
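  • the device further includes:
  • a configuration module 104 configured to set a priority coefficient for the one or more compliance judgment conditions
  • the determination module 103 is specifically used for: judging the compliance judgment conditions involved in the data processing record based on the priority coefficient, and determining the compliance of the data processing record.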
  • for the functions of each functional unit in the compliance detection device 10 described in the embodiment of the present application, refer to the description of steps S500 to S502 in the method embodiment shown in FIG. 5a; details are not repeated here.
  • FIG. 9 is a schematic structural diagram of another data processing compliance detection device provided by an embodiment of the present application.
  • the apparatus 20 may include one or more processors 601, one or more input devices 602, one or more output devices 603, and a memory 604.
  • the processor 601, input device 602, output device 603, and memory 604 are connected through a bus 605.
  • the memory 604 is used to store a computer program that includes program instructions, and the processor 601 is used to execute the program instructions stored in the memory 604.
  • the processor 601 is configured to call the program instructions to: obtain rule information and perform knowledge extraction on the rule information, where the rule information includes one or more of the China Personal Information Protection Law corpus, the GDPR corpus, or the China Data Security Law corpus; construct one or more knowledge graph entities based on the results of the knowledge extraction, and establish relationships between the one or more knowledge graph entities to generate a knowledge graph, where the one or more knowledge graph entities include one or more of one or more compliance judgment conditions and one or more compliance states, the one or more compliance judgment conditions are one or more conditions for judging compliance with one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law, and the one or more compliance states are the possible judgment results of that compliance; and obtain the to-be-detected data processing record of the data processor or data controller, input it into the knowledge graph, and determine the compliance of the data processing record, where the data processing record includes one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type.
  • the processor 601 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the input device 602 may include a touch panel, a microphone, etc.
  • the output device 603 may include a display (LCD, etc.), a speaker, and the like.
  • the memory 604 may include read-only memory and random-access memory, and provides instructions and data to the processor 601 .
  • a portion of memory 604 may also include non-volatile random access memory.
  • memory 604 may also store device type information.
  • the scope of the compliance detection device described in this application is not limited thereto, and the structure of the compliance detection device may not be limited by FIG. 9 .
  • the device may be a stand-alone device or may be part of a larger device.
  • the device may be:
  • a set of one or more ICs, which may also include storage components for storing data and computer programs;
  • ASIC such as modem (Modem);
  • the processor 601, input device 602, and output device 603 described in the embodiment of this application can carry out the implementations described for the compliance detection method for data processing provided in the embodiment of this application, and can also carry out the implementations of the compliance detection device for data processing described in the embodiment of this application; details are not repeated here.
  • the device described in the embodiment of the present application may be implemented by a general-purpose processor. It should be understood that the above-mentioned devices in various product forms have any function of the compliance detection method for data processing in the above-mentioned method embodiments, which will not be repeated here.
  • the embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • the computer program includes program instructions.
  • when the program instructions are executed by a processor, the compliance detection method for data processing shown in FIG. 5a is implemented; for details, please refer to the description of the embodiment shown in FIG. 5a, which is not repeated here.
  • the above-mentioned computer-readable storage medium may be the compliance detection device described in any of the foregoing embodiments or an internal storage unit of the electronic device, such as a hard disk or a memory of the electronic device.
  • the computer-readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk equipped on the electronic device, a smart memory card (smart media card, SMC), a secure digital (secure digital, SD) card, Flash card (flash card), etc.
  • the computer-readable storage medium may also include both an internal storage unit of the electronic device and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the electronic device.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
  • An embodiment of the present application further provides a computer program product, which, when the computer program product is run on a computer, causes the computer to execute the method in any one of the preceding embodiments.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

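The present invention discloses a compliance detection method and apparatus for data processing, and a related device. The method includes: obtaining rule information and performing knowledge extraction on the rule information, where the rule information may include information related to one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law; constructing one or more knowledge graph entities based on the result of the knowledge extraction, and establishing relationships between the one or more knowledge graph entities to generate a knowledge graph; and obtaining a to-be-detected data processing record of a data processor or data controller, inputting it into the knowledge graph, and determining whether the data processing record meets the requirements of one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law. The embodiments of this application can refine the concrete judgment flow of the regulations, cover diverse scenario requirements, and improve the interpretability of compliance detection, thereby promoting data development and use and protecting the rights and interests of all parties.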

Description

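A compliance detection method and apparatus for data processing, and related device
This application claims priority to Chinese patent application No. 202111373056.8, filed with the China National Intellectual Property Administration on November 18, 2021 and entitled "A compliance detection method and apparatus for data processing, and related device", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of computer technologies, and in particular to a compliance detection method and apparatus for data processing, and a related device.
Background
With the development of the mobile Internet, terminal electronic devices are widely used, and massive amounts of data are generated on networks at every moment. The rise of big-data industries built on collecting, using, processing, and transmitting personal data has made data privacy and data protection a matter of public concern. The European Union (EU) adopted the General Data Protection Regulation (GDPR) in 2016. To regulate data processing activities, ensure data security, promote data development and use, protect the lawful rights and interests of individuals and organizations, and safeguard national sovereignty, security, and development interests, China brought the China Data Security Law and the China Personal Information Protection Law into force in September and November 2021, respectively. This comprehensively strengthened the protection of personal data, a key factor of production in the digital economy. Compared with other industries, Internet companies are more likely to come into contact with data subjects' personal data in the course of business and are therefore subject to the GDPR, the China Data Security Law, and the China Personal Information Protection Law; a company whose data processing operations violate these regulations faces heavy fines. A company that wants to derive large economic benefits from personal data should therefore build on the legislative intent of these regulations and look for strategies that minimize compliance cost.
However, existing GDPR compliance verification methods mostly describe the GDPR compliance of data processing from a high level and lack a concrete judgment flow for the GDPR, which makes it hard for companies to control compliance cost accurately and hard for regulators to put effective supervision into practice. In addition, some existing solutions perform compliance detection only for a particular data type (such as private data) or a particular operation type (such as data migration); they apply to a single scenario, their scope of application is narrow, and they cannot meet the complex and changing needs of real detection scenarios. How to provide a compliance detection solution that refines the concrete judgment flow of the regulations, covers diverse scenario requirements, effectively promotes data development and use, and protects the rights and interests of all parties is therefore a problem to be solved urgently.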
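Summary
The embodiments of this application provide a compliance detection method and apparatus for data processing, and a related device, which can refine the concrete judgment flow of the regulations, cover diverse scenario requirements, and improve the interpretability of compliance detection, thereby promoting data development and use and protecting the rights and interests of all parties.
In a first aspect, an embodiment of this application provides a compliance detection method for data processing, the method including:
obtaining rule information and performing knowledge extraction on the rule information, where the rule information includes one or more of the China Personal Information Protection Law corpus, the General Data Protection Regulation (GDPR) corpus, or the China Data Security Law corpus; constructing one or more knowledge graph entities based on the result of the knowledge extraction, and establishing relationships between the one or more knowledge graph entities to generate a knowledge graph, where the one or more knowledge graph entities include one or more of one or more compliance judgment conditions and one or more compliance states, the one or more compliance judgment conditions are one or more conditions for judging compliance with one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law, and the one or more compliance states are the possible judgment results of that compliance; and obtaining a to-be-detected data processing record of a data processor or data controller, inputting it into the knowledge graph, and determining the compliance of the data processing record, where the data processing record includes one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type.
In this embodiment, a knowledge graph for compliance detection is first generated from the corpus of one or more of the China Personal Information Protection Law, the GDPR, and the China Data Security Law, and the to-be-detected data processing record is then input into the knowledge graph to determine its compliance. Because a knowledge graph expresses data as a graph, the concrete judgment flow of the regulations can be refined in an intuitive way. In addition, because knowledge is extracted from the rules that the regulations set out for all data types and operation types, and detection is carried out against the specific operation type or data object type in the record, detection is no longer limited to one data type or one operation type; it can cover diverse scenario requirements, effectively promote data development and use, and protect the rights and interests of all parties. Specifically, when the knowledge graph is generated, knowledge is extracted from one or more of the three corpora, the entities of the knowledge graph are constructed from the extraction result, and the relationships between entities are established, producing a knowledge graph for compliance detection. The to-be-detected data processing record of the data processor or data controller is then obtained and used as the input of the knowledge graph, and the compliance of the record is finally determined. The prior art lacks a concrete judgment flow for the GDPR and is hard to put into practice (for example, the compliance supervision method based on a consortium blockchain and smart contracts), and it applies to a single scenario with a narrow scope (for example, the compliance detection of private data using a Monkey program, and the method for migrating data between countries and regions), so it cannot meet the complex and changing needs of real detection scenarios. To address these problems, this embodiment generates a knowledge graph to perform knowledge extraction and knowledge reasoning over the relevant regulations and presents their judgment flow in detail as a graph data structure. The data processing record used as input may include one or more of the processor (such as a data controller or data processor), the processing time, the specific operation type (such as acquisition, storage, or transmission), and the specific data object type (such as private or non-private data), so the embodiment is not limited by data type or operation type and applies to more scenarios. Moreover, because knowledge graphs are highly interpretable, a compliance detection solution based on a knowledge graph is highly interpretable as well. Compared with the prior art, the embodiments of the present invention can therefore refine the concrete judgment flow of the regulations, cover diverse scenario requirements, and improve the interpretability of compliance detection, thereby promoting data development and use and protecting the rights and interests of all parties.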
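In a possible implementation, establishing the relationships between the one or more knowledge graph entities to generate the knowledge graph includes: establishing the relationships between the one or more knowledge graph entities through a decision tree to generate the knowledge graph, where the decision tree includes one or more of one or more root nodes, one or more internal nodes, and one or more leaf nodes; the root node is used to receive the data processing record; the internal nodes are used to store one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes of the decision tree are used to store the one or more knowledge graph entities.
In this embodiment, when the relationships between knowledge graph entities are established, a decision tree is used to classify and organize the entities, so that different sub-knowledge graphs can be formed for different categories (for example, categories of processor, processing time, specific operation type, or specific data object type). When a data processing record of a data controller or data processor is received, the sub-knowledge graph of the matching category can be located precisely from the processor, processing time, specific operation type, or specific data object type, and only that sub-knowledge graph needs to be evaluated to determine the compliance of the record. In summary, by establishing the relationships in the knowledge graph through a decision tree, this embodiment avoids traversing the entire knowledge graph when determining the compliance of a data processing record, and obtains the detection result quickly and accurately.
In a possible implementation, the one or more knowledge graph entities include the one or more compliance judgment conditions and the one or more compliance states, where the relationships between the compliance judgment conditions in the one or more knowledge graph entities include one or more of an AND relationship, an OR relationship, or a containment relationship, and the relationship between the one or more compliance judgment conditions and the one or more compliance states includes a "belongs to" relationship.
In this embodiment, the compliance judgment conditions and compliance states among the knowledge graph entities are classified and organized, which provides a basis for further improving detection efficiency. Specifically, the relationships among multiple compliance judgment conditions can be divided into AND, OR, and containment relationships, and the relationship between a compliance judgment condition and a compliance state is a "belongs to" relationship. During compliance detection, a matching detection strategy is adopted according to the relationships between entities (for example, among multiple compliance judgment conditions in an AND relationship, as soon as one condition belongs to non-compliance, the detection result can be taken as non-compliant without checking the remaining conditions), which improves detection efficiency.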
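In a possible implementation, the one or more compliance judgment conditions include one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions; determining the compliance of the data processing record includes: when the relationship between the one or more first compliance judgment conditions is an AND relationship, determining that the data processing record is compliant if every one of the one or more first compliance judgment conditions belongs to compliance.
In this embodiment, when compliance detection is performed on a data processing record of a data controller or data processor, it is established that the record involves multiple compliance judgment conditions (the one or more first compliance judgment conditions). If these conditions are in an AND relationship, all of them must belong to compliance for the record to be compliant; in other words, as soon as one of them belongs to non-compliance, the record is non-compliant. Therefore, when checking multiple conditions in an AND relationship, detection of the remaining conditions can stop as soon as one condition is found to be non-compliant and the record can be determined to be non-compliant, which improves detection efficiency.
In a possible implementation, the one or more compliance judgment conditions include one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions; determining the compliance of the data processing record includes: when the relationship between the one or more second compliance judgment conditions is an OR relationship, determining that the data processing record is compliant if any one of the one or more second compliance judgment conditions belongs to compliance.
In this embodiment, when compliance detection is performed on a data processing record of a data controller or data processor, it is established that the record involves multiple compliance judgment conditions (that is, the one or more second compliance judgment conditions). If these conditions are in an OR relationship, the record is compliant as soon as any one of them belongs to compliance; in other words, the record is non-compliant only if all of the conditions belong to non-compliance. Therefore, when checking multiple conditions in an OR relationship, detection of the remaining conditions can stop as soon as one condition is found to be compliant and the record can be determined to be compliant, which improves detection efficiency.
In a possible implementation, the one or more compliance judgment conditions include one third compliance judgment condition and one or more fourth compliance judgment conditions, and the data processing record involves the one third compliance judgment condition and the one or more fourth compliance judgment conditions; determining the compliance of the data processing record includes: when the one third compliance judgment condition contains the one or more fourth compliance judgment conditions, determining that the data processing record is non-compliant if the one third compliance judgment condition belongs to non-compliance, and further determining the compliance of the one or more fourth compliance judgment conditions.
In this embodiment, when compliance detection is performed on a data processing record of a data controller or data processor, it is established that the record involves multiple compliance judgment conditions (that is, one third compliance judgment condition and one or more fourth compliance judgment conditions). If these conditions are in a containment relationship, then when the third condition belongs to non-compliance, the record is determined to be non-compliant and the compliance of the one or more fourth conditions is determined further, so that the particular fourth condition that caused the third condition to fail can be located precisely. For a company, this helps pinpoint the violation; for a regulator, it refines compliance detection and makes it easier to put into practice.
In a possible implementation, the method further includes: setting a priority coefficient for the one or more compliance judgment conditions; determining the compliance of the data processing record includes: judging the compliance judgment conditions involved in the data processing record based on the priority coefficient, and determining the compliance of the data processing record.
In this embodiment, when the knowledge graph is generated, a priority coefficient can be set for the one or more compliance judgment conditions included in the knowledge graph entities; when compliance detection is performed on a data processing record, the conditions involved in the record can be judged one after another according to the priority coefficient. The priority coefficient can be set according to how much weight the regulation places on each condition, according to the severity of the penalty for violating each condition, or according to how frequently each condition is involved. Judging the higher-priority conditions first in this way improves detection efficiency.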
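In a second aspect, an embodiment of this application provides a compliance detection apparatus for data processing, the apparatus including: an acquisition module, configured to obtain rule information and perform knowledge extraction on the rule information, where the rule information includes one or more of the China Personal Information Protection Law corpus, the General Data Protection Regulation (GDPR) corpus, or the China Data Security Law corpus; a processing module, configured to construct one or more knowledge graph entities based on the result of the knowledge extraction and establish relationships between the one or more knowledge graph entities to generate a knowledge graph, where the one or more knowledge graph entities include one or more of one or more compliance judgment conditions and one or more compliance states, the one or more compliance judgment conditions are one or more conditions for judging compliance with one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law, and the one or more compliance states are the possible judgment results of that compliance; and a determination module, configured to obtain a to-be-detected data processing record of a data processor or data controller, input it into the knowledge graph, and determine the compliance of the data processing record, where the data processing record includes one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type.
In this embodiment, the acquisition module first performs knowledge extraction on the corpus of one or more of the China Personal Information Protection Law, the GDPR, and the China Data Security Law; the processing module generates a knowledge graph for compliance detection; and the determination module then inputs the to-be-detected data processing record into the knowledge graph to determine its compliance. Because a knowledge graph expresses data as a graph, the concrete judgment flow of the regulations can be refined in an intuitive way. In addition, because knowledge is extracted from the rules that the regulations set out for all data types and operation types, and detection is carried out against the specific operation type or data object type in the record, detection is no longer limited to one data type or one operation type; it can cover diverse scenario requirements, effectively promote data development and use, and protect the rights and interests of all parties. Specifically, when the knowledge graph is generated, the acquisition module extracts knowledge from one or more of the three corpora, and the processing module constructs the entities of the knowledge graph from the extraction result and establishes the relationships between entities, producing a knowledge graph for compliance detection. The determination module then obtains the to-be-detected data processing record of the data processor or data controller and uses it as the input of the knowledge graph, and the compliance of the record is finally determined. The prior art lacks a concrete judgment flow for the GDPR and is hard to put into practice (for example, the compliance supervision method based on a consortium blockchain and smart contracts), and it applies to a single scenario with a narrow scope (for example, the compliance detection of private data using a Monkey program, and the method for migrating data between countries and regions), so it cannot meet the complex and changing needs of real detection scenarios. To address these problems, this embodiment generates a knowledge graph to perform knowledge extraction and knowledge reasoning over the relevant regulations and presents their judgment flow in detail as a graph data structure. The data processing record used as input may include one or more of the processor (such as a data controller or data processor), the processing time, the specific operation type (such as acquisition, storage, or transmission), and the specific data object type (such as private or non-private data), so the embodiment is not limited to one data type or one operation type and applies to more scenarios. Moreover, because knowledge graphs are highly interpretable, a compliance detection solution based on a knowledge graph is highly interpretable as well. Compared with the prior art, the embodiments of the present invention can therefore refine the concrete judgment flow of the regulations, cover diverse scenario requirements, and improve the interpretability of compliance detection, thereby promoting data development and use and protecting the rights and interests of all parties.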
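In a possible implementation, the determination module is specifically configured to: establish the relationships between the one or more knowledge graph entities through a decision tree to generate the knowledge graph, where the decision tree includes one or more of one or more root nodes, one or more internal nodes, and one or more leaf nodes; the root node is used to receive the data processing record; the internal nodes are used to store one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes of the decision tree are used to store the one or more knowledge graph entities.
In a possible implementation, the one or more knowledge graph entities include the one or more compliance judgment conditions and the one or more compliance states, where the relationships between the compliance judgment conditions in the one or more knowledge graph entities include one or more of an AND relationship, an OR relationship, or a containment relationship, and the relationship between the one or more compliance judgment conditions and the one or more compliance states includes a "belongs to" relationship.
In a possible implementation, the one or more compliance judgment conditions include one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions; the determination module is specifically configured to: when the relationship between the one or more first compliance judgment conditions is an AND relationship, determine that the data processing record is compliant if every one of the one or more first compliance judgment conditions belongs to compliance.
In a possible implementation, the one or more compliance judgment conditions include one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions; the determination module is specifically configured to: when the relationship between the one or more second compliance judgment conditions is an OR relationship, determine that the data processing record is compliant if any one of the one or more second compliance judgment conditions belongs to compliance.
In a possible implementation, the one or more compliance judgment conditions include one third compliance judgment condition and one or more fourth compliance judgment conditions, and the data processing record involves the one third compliance judgment condition and the one or more fourth compliance judgment conditions; the determination module is specifically configured to: when the one third compliance judgment condition contains the one or more fourth compliance judgment conditions, determine that the data processing record is non-compliant if the one third compliance judgment condition belongs to non-compliance, and further determine the compliance of the one or more fourth compliance judgment conditions.
In a possible implementation, the apparatus further includes: a configuration module, configured to set a priority coefficient for the one or more compliance judgment conditions; the determination module is specifically configured to: judge the compliance judgment conditions involved in the data processing record based on the priority coefficient, and determine the compliance of the data processing record.
In a third aspect, an embodiment of this application provides a terminal device, including a processor, an input device, an output device, and a memory that are connected to one another, where the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the compliance detection method for data processing of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium that stores a computer program, where the computer program includes program instructions that, when executed by a processor, cause the processor to execute the compliance detection method for data processing of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer program, where the computer program includes instructions that, when the computer program is executed by the terminal device, cause the terminal device to execute the compliance detection method for data processing of the first aspect.
In a sixth aspect, an embodiment of this application provides a chip system, where the chip system includes a processor configured to support a device in implementing the functions involved in the first aspect, for example, generating or processing the information involved in the compliance detection method for data processing. In a possible design, the chip system further includes a memory configured to store the program instructions and data necessary for the device. The chip system may consist of a chip, or may include a chip and other discrete components.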
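Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a schematic flowchart of a prior-art GDPR compliance supervision method based on a consortium blockchain;
Figure 2 is a schematic flowchart of a prior-art compliance detection method based on a Monkey program;
Figure 3 is a schematic flowchart of a prior-art user data migration method;
Figure 4 is a schematic diagram of a prior-art general data protection regulation system for microservices and programming models;
Figure 5a is a schematic flowchart of a compliance detection method for data processing provided by an embodiment of this application;
Figure 5b is a schematic flowchart of the overall generation of a knowledge graph provided by an embodiment of this application;
Figure 5c is a schematic flowchart of a partial transformation of a knowledge graph provided by an embodiment of this application;
Figure 5d is a schematic flowchart of another partial transformation of a knowledge graph provided by an embodiment of this application;
Figure 6 is a schematic diagram of a knowledge graph based on a decision tree provided by an embodiment of this application;
Figure 7a is a schematic flowchart of compliance judgment based on a knowledge graph provided in an embodiment of this application;
Figure 7b is another schematic flowchart of compliance judgment based on a knowledge graph provided in an embodiment of this application;
Figure 7c is another schematic flowchart of compliance judgment based on a knowledge graph provided in an embodiment of this application;
Figure 7d is another schematic flowchart of compliance judgment based on a knowledge graph provided in an embodiment of this application;
Figure 8 is a schematic structural diagram of a compliance detection apparatus for data processing provided by an embodiment of this application;
Figure 9 is a schematic structural diagram of another compliance detection apparatus for data processing provided by an embodiment of this application.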
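Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
It should be understood that the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, and optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.
It should also be understood that a reference to an "embodiment" in this document means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of this application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
It should be further understood that the term "and/or" used in the specification and the appended claims of this application refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
First, some terms used in this application are explained to help a person skilled in the art understand them.
(1) A knowledge graph (KG) is a system or technology that can collect, store, and automatically update knowledge. It can display the development and structural relationships of knowledge as a series of graphs, use visualization to describe knowledge resources and their carriers, and mine, analyze, construct, draw, and display knowledge and the interconnections within it; it is highly interpretable. Building a knowledge graph generally involves knowledge extraction, knowledge storage, knowledge computation, and knowledge application. The embodiments of this application use a knowledge graph to detect the compliance of data processing with the GDPR and the China Data Security Law; the prior art contains no solution that detects data processing compliance based on a knowledge graph.
(2) A decision tree (DT) is a tree structure in which each internal node represents a test on an attribute, each branch represents a test outcome, and each leaf node represents a category. Decision trees are often used in classification scenarios. The embodiments of this application can build the knowledge graph with a decision tree model: the category corresponding to each internal node of the decision tree can represent the entry to one sub-knowledge graph, and during compliance detection the matching sub-knowledge graph can be found through the internal node, so that detection is completed quickly and accurately without traversing the entire knowledge graph, which improves detection efficiency.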
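First, the specific technical problems to be solved by this application are analyzed and stated. In the prior art, solutions for GDPR compliance detection include the following Solution 1, Solution 2, Solution 3, and Solution 4:
Solution 1: a GDPR compliance supervision solution based on a consortium blockchain. Referring to Figure 1, which is a schematic flowchart of a prior-art GDPR compliance supervision method based on a consortium blockchain, the method may include the following steps S100 to S103:
Step S100: the service provider and the regulator register under their real names in the consortium blockchain;
Step S101: the data subject's consent records are encrypted and stored in the consortium blockchain through smart contracts;
Step S102: the data subject is granted permission to access the consortium blockchain, and data flow records are stored through smart contracts;
Step S103: during a compliance investigation, the consortium blockchain service network traces and obtains the records in response to the regulator's request.
Solution 1 has the following disadvantage:
Disadvantage 1: it is high-level and hard to put into practice. The GDPR compliance supervision method based on a consortium blockchain uses the scalability and immutability of the blockchain to make it more efficient for users to exercise their GDPR rights and for service providers' GDPR compliance to be judged, and it lowers the compliance cost of data development and use for companies and the supervision cost for regulators. However, it still specifies the GDPR compliance supervision flow at a high level and lacks the concrete process of the GDPR compliance judgment rules. For companies, this can make it hard to locate violations precisely, so compliance cost is hard to control and the method is hard to implement. For regulators, the lack of a detailed judgment process makes implementation harder and effective supervision difficult to achieve, and the supervision results are not very interpretable.
Solution 2: a compliance detection solution based on a Monkey program. Referring to Figure 2, which is a schematic flowchart of a prior-art compliance detection method based on a Monkey program, the method may include the following steps S200 to S203:
Step S200: run the Monkey test program, which is used to test an application on the first terminal;
Step S201: obtain the communication data of the application from the Monkey test program;
Step S202: send the communication data to a server, mark the private data, and send it to a second terminal;
Step S203: the second terminal generates a detection report for the application when it determines that the private data contains data that violates the GDPR compliance rules.
Solution 2 has the following disadvantage:
Disadvantage 1: it handles a single data type and has a narrow scope of application. The Monkey-based compliance detection solution generates application detection reports automatically and is efficient, but it only verifies the GDPR compliance of the private data within the communication data. It targets a single data type and does not cover the full range of data: when the communication data involves non-private personal data, the Monkey-based solution may be unable to detect the compliance of that non-private data.
Solution 3: a user data migration method. Referring to Figure 3, which is a schematic flowchart of a prior-art user data migration method, the method may include the following steps S300 to S303:
Step S300: build a region database that maps the country codes of different countries to their regions;
Step S301: obtain the registration country of the user data to be migrated and the destination country of the migration;
Step S302: determine from the region database whether the destination country and the registration country belong to the same region;
Step S303: if they do not belong to the same region, the data must be migrated, and whether the migration succeeds or fails is decided according to whether the region of the destination country and the region of the registration country meet the GDPR requirements.
Solution 3 has the following disadvantage:
Disadvantage 1: it handles a single operation type and has a narrow scope of application. The prior-art user data migration method allows user data to be placed in different regions in line with each region's data compliance standards while keeping the data unique across regions, but it only verifies the GDPR compliance of the migration operation. It targets a single operation type and lacks GDPR compliance verification for other operations on data: when the operation performed on communication data is something other than migration (such as copying, storing, or acquiring), this solution may be unable to meet the need for compliance detection.
Solution 4: a General Data Protection Regulation (GDPR) infrastructure for microservices and programming models. Referring to Figure 4, which is a schematic diagram of a prior-art general data protection regulation system for microservices and programming models, the system includes a general data privacy supervision module that retains the personal information in data communicated with business applications in accordance with at least one data privacy regulation; a data privacy compliance module connected to the general data privacy supervision module to monitor the data flow controller and report to client computers; and a data subject privacy request module connected to the general data privacy supervision module and the data privacy compliance module, which generates operations based on one or more requests.
Solution 4 has the following disadvantage:
Disadvantage 1: it is high-level and hard to put into practice. The system can help applications, cloud computing platforms, and the like meet the need for GDPR compliance detection, but it still specifies the GDPR compliance judgment flow at a high level, lacks the concrete process of the GDPR compliance judgment rules, and gives no clear definition of how the GDPR is converted into GDPR compliance judgment rules. For companies, this can make it hard to locate violations precisely, so compliance cost is hard to control and the system is hard to implement. For regulators, the lack of a detailed judgment process makes implementation harder and effective supervision difficult to achieve, and the supervision results are not very interpretable.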
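To solve the problems of the prior art, namely the lack of a concrete judgment flow for the GDPR, the single applicable scenario, and the narrow scope of application, and to refine the concrete judgment flow of the regulations, cover diverse scenario requirements, effectively promote data development and use, and protect the rights and interests of all parties, the technical problems that this application actually needs to solve, taking the disadvantages of the prior art into account, are as follows:
1. Use a knowledge graph, which presents knowledge relationships as a graph and is highly interpretable (Disadvantage 1 of Solution 1 and Disadvantage 1 of Solution 4). In the prior art, the GDPR compliance supervision solution based on a consortium blockchain and the GDPR infrastructure for microservices and programming models partly meet the need for data processing compliance detection, but both specify the compliance judgment flow at a high level without refining the concrete judgment steps. As a result they are hard to implement, are not very interpretable, and cannot meet the strict requirements of real deployments, where companies need precise cost control and regulators need precise supervision. A solution is therefore needed that refines the concrete judgment flow of the regulations, is highly interpretable, is easy to implement, effectively promotes data development and use, and protects the rights and interests of all parties.
2. Support a rich set of detectable data types and operation types (Disadvantage 1 of Solution 2 and Disadvantage 1 of Solution 3). In the prior art, the Monkey-based compliance detection solution and the user data migration method partly meet the need for data processing compliance detection, but because they target a single data type or operation type, the range of applicable scenarios is incomplete and they cannot meet the complex compliance detection needs of the varied scenarios found in real deployments. A compliance detection solution is therefore needed that covers diverse scenario requirements and effectively promotes data development and use.
In summary, prior-art compliance detection solutions lack a concrete judgment flow for the GDPR, apply to a single scenario, and have a narrow scope of application, so they cannot meet the higher requirements of real deployments. The compliance detection method for data processing provided by this application generates a knowledge graph for compliance detection from the corpus of the GDPR and/or the China Data Security Law, and then inputs the to-be-detected data processing record into the knowledge graph to determine its compliance. The method is intuitive and highly interpretable, and it is not limited to one data type or one operation type, so it can solve the technical problems above.
For ease of understanding, the GDPR is used as an example below: the technical problems raised in this application are analyzed and solved by combining the GDPR with the compliance detection method for data processing provided in this application.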
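Referring to Figure 5a, which is a schematic flowchart of a compliance detection method for data processing provided by an embodiment of this application, the method is described below in terms of steps S500 to S502:
Step S500: obtain rule information and perform knowledge extraction on the rule information.
Specifically, the rule information may include one or more of the China Personal Information Protection Law corpus, the GDPR corpus, or the China Data Security Law corpus. Knowledge extraction may use the Resource Description Framework (RDF) to describe the knowledge in one or more of these corpora, and the knowledge may be saved in a knowledge base as triples of the form (entity 1, relation, entity 2). Entity 1 may be any of the compliance judgment conditions (that is, one of the one or more compliance judgment conditions), and each compliance judgment condition corresponds to one judgment rule in the regulation. The relation in a triple expresses the relationship, or the judgment relationship, between entity 1 and entity 2 and may be one of four kinds: "combined judgment", "continued judgment", "containment judgment", and "belongs to". Each relation has an attribute that indicates whether the compliance requirement of entity 1 is met: for example, true means the condition represented by entity 1 is satisfied, and false means it is not. Entity 2 may be another compliance judgment condition or a compliance state (that is, one of the one or more compliance states, namely compliant or non-compliant). "Combined judgment" may mean that entity 1 and entity 2 are judged as a combination, so that the compliance of each affects the compliance result of the triple (entity 1, relation, entity 2). "Continued judgment" may mean that compliance with entity 1 is judged first and, depending on that result, entity 2 is then judged to obtain the compliance result of the triple. "Containment judgment" may mean that when the compliance requirement of entity 1 is not met, compliance with entity 2 is judged further, where entity 1 may contain one or more instances of entity 2. In a "belongs to" relation, entity 2 is a compliance state and directly marks whether entity 1 is compliant or non-compliant. It should be noted that these four relations are examples given for ease of understanding; the relation in a triple may also be expressed in other ways, which is not specifically limited here. It should also be noted that, in addition to the three corpora above, the rule information may include other rules, regulations, or laws for ensuring data security, which is not specifically limited here. Knowledge extraction may also store knowledge in the knowledge base as other tuples, such as 4-tuples or 5-tuples, which is not specifically limited here.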
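The triple form (entity 1, relation, entity 2) with a true/false attribute on the relation can be sketched as a small in-memory knowledge base. This is only an illustration under stated assumptions: the class `Triple`, the relation labels, the condition names, and the helper `outgoing` are hypothetical, and a real system would more likely store RDF in a dedicated triple store.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    head: str        # entity 1: a compliance judgment condition
    relation: str    # "combined", "continued", "containment", or "belongs_to"
    attribute: bool  # True if the record satisfies entity 1, False otherwise
    tail: str        # entity 2: another condition or a compliance state


# Hypothetical knowledge base for two conditions in an OR relationship.
KNOWLEDGE_BASE: List[Triple] = [
    Triple("condition_1", "belongs_to", True, "compliant"),
    Triple("condition_1", "continued", False, "condition_2"),
    Triple("condition_2", "belongs_to", True, "compliant"),
    Triple("condition_2", "belongs_to", False, "non_compliant"),
]


def outgoing(kb: List[Triple], head: str, satisfied: bool) -> List[Triple]:
    """Return the triples that apply once `head` has been judged."""
    return [t for t in kb if t.head == head and t.attribute == satisfied]
```

Looking up `outgoing(KNOWLEDGE_BASE, "condition_1", False)` yields the single "continued" triple that points to `condition_2`, which is the edge a judgment procedure would follow next.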
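Step S501: construct one or more knowledge graph entities based on the result of the knowledge extraction, and establish relationships between the one or more knowledge graph entities to generate a knowledge graph.
Specifically, the one or more knowledge graph entities include one or more of one or more compliance judgment conditions and one or more compliance states; the one or more compliance judgment conditions are one or more conditions for judging compliance with one or more of the China Personal Information Protection Law, the GDPR, or the China Data Security Law, and the one or more compliance states are the possible judgment results of that compliance. For the overall approach to generating the knowledge graph, refer to Figure 5b, which is a schematic flowchart of the overall generation of a knowledge graph provided by an embodiment of this application: after the rule information is obtained, the rules are first extracted (step S5000 in Figure 5b), the rules are then represented visually as a graph (step S5001 in Figure 5b), and finally the graph-form knowledge is turned into a program (step S5002 in Figure 5b), so that the knowledge graph for compliance detection can be recognized and used by a device. For how the extracted rules are turned into a graph, refer to Figure 5c, which is a schematic flowchart of a partial transformation of a knowledge graph provided by an embodiment of this application: after the rules are extracted, they can be expressed as triples (step S5100 in Figure 5c), and a graph visualization tool can then convert the triples into a graph representation (step S5101 in Figure 5c). For how the graph-form knowledge is turned into a program, refer to Figure 5d, which is a schematic flowchart of another partial transformation of a knowledge graph provided by an embodiment of this application: when the graph visualization tool converts the triples into a graph representation, a separate judgment rule graph (that is, a sub-knowledge graph) can be generated for each operation type (step S5200 in Figure 5d), and each judgment rule graph is then turned into a program (step S5201 in Figure 5d).
In a possible implementation, the relationships between the one or more knowledge graph entities may also be established through a decision tree to generate the knowledge graph. The decision tree includes one or more of one or more root nodes, one or more internal nodes, and one or more leaf nodes; the root node is used to receive the data processing record; the internal nodes are used to store one or more of the processor, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes are used to store the one or more knowledge graph entities. Referring to Figure 6, which is a schematic diagram of a knowledge graph based on a decision tree provided by an embodiment of this application, and taking the specific operation type as an example: the root node receives the data processing record of the data controller or data processor and uses the specific operation type as the main judgment criterion; the internal nodes store the specific operation types, such as storage, data migration, data acquisition, and data deletion; and the leaf nodes store the one or more knowledge graph entities (including one or more compliance judgment conditions and one or more compliance states) that correspond to each operation type. If the specific operation type in the data processing record is data storage, the detection flow jumps from the root node to the internal node for data storage and then traverses the leaf nodes under that internal node to determine the compliance of the record. The programmatic representation of the knowledge graph can follow rules such as these. First, define a variable compliance to represent GDPR compliance, with the initial value true meaning compliant; define a variable d to receive the personal data being processed; and define a variable act to identify the type of operation currently being performed (storage, deletion, access, and so on). Then map each compliance judgment condition in the graph to one or more Boolean variables that indicate whether the condition is satisfied (true if satisfied, false if not) and that serve as the value assigned to compliance. Finally, because the attributes of the relations in the graph already indicate whether each condition is satisfied, traverse the entities and relations of the whole graph, assign values to the variables mapped from each entity, and map the relation "combined judgment" to the keyword and, the relation "continued judgment" to the keyword or, the relation "containment judgment" to an if statement nested inside the outer judgment statement, and the relation "belongs to" to an assignment to the variable compliance, which yields the corresponding program statements. It should be noted that the decision tree can also be built around other elements, such as the processor, the processing time, or the specific data object type, which is not specifically limited here. The programmatic representation above is only an example and does not limit this application.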
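The routing idea above, where the operation type selects one sub-knowledge graph so that the whole graph need not be traversed, can be sketched as follows. This is a simplified illustration under stated assumptions: the tree is flattened into a dictionary, and the operation names, the record fields (`encrypt`, `consent`, `erased_on_request`), and the two checker functions are hypothetical stand-ins for real sub-knowledge graphs.

```python
from typing import Callable, Dict

Record = Dict[str, object]


def check_storage(record: Record) -> bool:
    # "combined judgment" maps to `and`: both conditions must hold.
    return bool(record.get("encrypt", False) and record.get("consent", False))


def check_deletion(record: Record) -> bool:
    return bool(record.get("erased_on_request", False))


# Internal nodes of the decision tree: operation type -> sub-knowledge graph.
SUB_GRAPHS: Dict[str, Callable[[Record], bool]] = {
    "storage": check_storage,
    "deletion": check_deletion,
}


def detect(record: Record) -> bool:
    """Root node: receive the record and dispatch on its operation type."""
    act = str(record["act"])  # type of operation being performed
    if act not in SUB_GRAPHS:
        raise KeyError(f"no sub-knowledge graph for operation type {act!r}")
    compliance = True  # initial value: compliant
    # "belongs to" maps to an assignment to `compliance`.
    compliance = SUB_GRAPHS[act](record)
    return compliance
```

Only the checker selected by `act` runs, which mirrors the claim that a matching sub-knowledge graph is evaluated instead of the full graph.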
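In a possible implementation, when the knowledge graph is generated, a priority coefficient can be set for the one or more compliance judgment conditions included in the knowledge graph entities; when compliance detection is performed on a data processing record of a data controller or data processor, the conditions involved in the record can be judged one after another according to the priority coefficient. The priority coefficient can be set according to how much weight the regulation places on each condition, according to the severity of the penalty for violating each condition, or according to how frequently each condition is involved. Judging the higher-priority conditions first in this way improves detection efficiency.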
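A minimal sketch of ordering conditions by priority coefficient before they are judged. The function name `order_by_priority`, the condition names, and the coefficient values are hypothetical; how the coefficients are derived (regulatory weight, penalty severity, or frequency) is left open, as in the description.

```python
from typing import Dict, List


def order_by_priority(coefficients: Dict[str, float]) -> List[str]:
    """Return condition names sorted from highest to lowest priority.

    `sorted` is stable, so conditions with equal coefficients keep
    their original insertion order.
    """
    return sorted(coefficients, key=lambda name: coefficients[name], reverse=True)


# Hypothetical coefficients for three compliance judgment conditions.
PRIORITIES = {"integrity_verified": 0.5, "encrypted": 0.9, "in_eu": 0.2}
```

With these values, `order_by_priority(PRIORITIES)` places `encrypted` first, so a short-circuiting evaluator would check it before the other conditions.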
在一种可能的实现方式中,在建立一个或多个知识图谱实体之间的关系时,可以将一个或多个知识图谱实体中的各个合规判断条件之间的关系确定为相与关系、相或关系或包含关系中的一种或多种;可以将一个或多个知识图谱实体中的所述一个或多个合规判断条件与所述一个或多个合规状态之间的关系确定为属于关系。
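Specifically, when compliance detection is performed on the data processing record of a data controller or data processor, it is determined that the record involves multiple compliance judgment conditions (that is, one or more first compliance judgment conditions). If the relationship among these conditions is an AND relationship, the record can be considered compliant only if every one of the conditions is judged compliant. In other words, as soon as any one condition is judged non-compliant, detection of the remaining conditions can stop and the record can be considered non-compliant, which improves the efficiency of compliance detection. For example, compliance judgment condition 1, compliance judgment condition 2, ..., compliance judgment condition n are in an AND relationship (that is, failing any one condition makes the data processing record non-compliant). The judgment order may be fixed in advance according to the priority coefficients, with higher-priority conditions judged first, and compliance judgment condition n denotes the last condition. The judgment process can then be expressed in the following triple forms (compliance judgment condition is abbreviated to condition):
For example, 1) (condition a, combined judgment (condition a met), condition b)
a = 1, 2, ..., (n-1); b = 2, 3, ..., n
In this form, condition a and condition b are in an AND relationship. Combined judgment means that when the data processing record meets condition a, whether it meets condition b is judged next; under an AND relationship, the record can be considered compliant only when it meets both condition a and condition b.
For example, 2) (condition c, belongs to (condition c not met), condition c non-compliant)
c = 1, 2, ..., n
In this form, among all conditions in the AND relationship, when the data processing record fails any one condition, the record can be considered non-compliant.
For example, 3) (condition n, belongs to (condition n met), compliant)
In this form, the data processing record already meets the first n-1 conditions that are in an AND relationship with condition n (the last compliance judgment condition); when it also meets condition n, it meets all conditions in the AND relationship and can be considered compliant.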
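The three triple forms for the AND relationship can be generated and evaluated mechanically. The sketch below is one possible reading, not the applicant's implementation: it assumes that condition b is the condition immediately following condition a, stores each triple as (head, relation, attribute, tail), and follows the triples from the first condition until a "belongs to" triple yields a compliance state. The function names and_chain and walk are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, bool, str]  # (head, relation, attribute, tail)


def and_chain(names: List[str]) -> List[Triple]:
    """Build forms 1) to 3) for conditions that are in an AND relationship."""
    triples: List[Triple] = []
    for a, b in zip(names, names[1:]):
        triples.append((a, "combined judgment", True, b))            # form 1)
    for c in names:
        triples.append((c, "belongs to", False, f"{c} non-compliant"))  # form 2)
    triples.append((names[-1], "belongs to", True, "compliant"))     # form 3)
    return triples


def walk(triples: List[Triple], start: str, values: Dict[str, bool]) -> str:
    """Follow the triples from `start` until a compliance state is reached."""
    index = {(head, attr): (rel, tail) for head, rel, attr, tail in triples}
    current = start
    while True:
        rel, tail = index[(current, values[current])]
        if rel == "belongs to":
            return tail
        current = tail
```

Because walk only reads the value of a condition when it reaches it, conditions after the first failure are never evaluated, which is the early stop described above.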
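Specifically, if the relationship among the multiple compliance judgment conditions (that is, one or more second compliance judgment conditions) is an OR relationship, then as soon as any one of the conditions is judged compliant, detection of the remaining conditions can stop and the data processing record can be considered compliant, which improves the efficiency of compliance detection. In other words, under an OR relationship the record can be considered non-compliant only if all of the conditions are judged non-compliant. For example, compliance judgment condition 1, compliance judgment condition 2, ..., compliance judgment condition n are in an OR relationship (that is, the data processing record is non-compliant only when it fails every condition). The judgment order may be fixed in advance according to the priority coefficients, with higher-priority conditions judged first, and condition n is the last condition to be judged. The judgment process can then be expressed in the following triple forms (compliance judgment condition is abbreviated to condition):
For example, 1) (condition a, continued judgment (condition a not met), condition b)
a = 1, 2, ..., (n-1); b = 2, 3, ..., n
In this form, condition a and condition b are in an OR relationship. Continued judgment means that when the data processing record does not meet condition a, whether it meets condition b is judged next in order to decide compliance, because under an OR relationship the record can be considered non-compliant only when it fails every condition in the OR relationship.
For example, 2) (condition c, belongs to (condition c met), compliant)
c = 1, 2, ..., n
In this form, when the data processing record meets any one condition, the record can be considered compliant.
For example, 3) (condition n, belongs to (condition n not met), non-compliant)
In this form, the data processing record has already failed the first n-1 conditions that are in an OR relationship with condition n (the last compliance judgment condition); if it also fails condition n, the record can be considered non-compliant.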
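For the OR relationship, the early stop corresponds to short-circuit evaluation. The following Python sketch uses the built-in any over lazily evaluated checks; the helper probe, which records which conditions were actually evaluated, is a hypothetical device for demonstration only.

```python
from typing import Callable, Iterable, List


def or_compliant(checks: Iterable[Callable[[], bool]]) -> bool:
    """Compliant as soon as one check passes; later checks are not evaluated."""
    return any(check() for check in checks)


def probe(name: str, result: bool, log: List[str]) -> Callable[[], bool]:
    """Wrap a fixed result so that each evaluation is recorded in `log`."""
    def run() -> bool:
        log.append(name)
        return result
    return run
```

Passing the checks in priority order keeps the behaviour consistent with the fixed judgment order described above.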
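Specifically, if the relationship among the multiple compliance judgment conditions (that is, one third compliance judgment condition and one or more fourth compliance judgment conditions) is a containment relationship, then when the third compliance judgment condition is judged non-compliant, the data processing record is determined to be non-compliant, and the compliance of the one or more fourth compliance judgment conditions is further determined. This makes it possible to pinpoint which fourth compliance judgment condition caused the third one to be non-compliant. For an enterprise, this helps locate the exact point of violation; for a regulator, it refines compliance detection and makes it easier to put into practice.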
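The containment relationship can be sketched as a nested check: the contained conditions are examined only after the containing condition has failed. In the Python sketch below, the function name locate_violations and the sample condition names used in any call are hypothetical, and the result of the containing (third) condition is taken as an input because the text does not state how it is derived.

```python
from typing import Dict, List, Tuple


def locate_violations(parent_ok: bool,
                      children: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Return (record compliant?, names of contained conditions that failed)."""
    if parent_ok:
        return True, []
    # "Containment judgment": an if statement nested inside the outer judgment.
    failed = [name for name, ok in children.items() if not ok]
    return False, failed
```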
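Step S502: Obtain a to-be-detected data processing record of a data processor or data controller, input it to the knowledge graph, and determine the compliance of the data processing record.
Specifically, after the knowledge graph for compliance detection is generated, the data processing record of the data processor or data controller is obtained and input to the knowledge graph for compliance detection, so as to determine the compliance of the record. The data processing record includes one or more of the processing person, the processing time, the specific operation type of the processing, and the specific data object type. It should be noted that, under the GDPR, each data controller (or data processor) and, where applicable, its representative shall maintain a record of processing activities under its responsibility, and the record shall contain all of the following information: the name and contact details of the data controller and, where applicable, the joint controller, the controller's representative, and the data protection officer; the purposes of the processing and the types of operation; a description of the categories of data subjects and of the categories of personal data; the categories of recipients to whom the personal data have been or will be disclosed, including recipients in third countries or international organisations; where applicable, transfers of personal data to a third country or an international organisation, including the identification of that third country or international organisation and, for such transfers, the documentation of suitable safeguards; where possible, the time limits set for erasure of the different categories of data; where possible, a general description of the technical and organisational security measures used to ensure data security; and other relevant information. In addition to maintaining the record of processing activities, the controller is required to make the record available to the supervisory authority on request so that GDPR compliance can be judged. Therefore, the embodiments of this application assume by default that the data processing record of the data controller (or data processor) can be obtained and used as the input data of the knowledge graph to complete the GDPR compliance judgment. It may also be required that, when personal data of a data subject is collected, the type of the collected data be labelled (for example, whether the data is sensitive and, if so, which category of sensitive data it belongs to). It should be further noted that natural language processing (such as word segmentation and keyword extraction) first needs to be performed on the obtained data processing record, so that the record can be mapped to Boolean variables and their values. For example, word segmentation and keyword extraction are performed on the statement "实现加密保护" ("encryption protection is implemented") in the data processing record to obtain the keywords "实现" ("implemented") and "加密" ("encryption"); the keyword "加密" may then be mapped to the variable encrypt, and the keyword "实现" may be mapped to true and assigned to encrypt. Finally, the data processing record after natural language processing is input to the knowledge graph as the input data for compliance detection. It may be understood that the foregoing natural language processing of the data processing record is merely an example and does not constitute a specific limitation on this application.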
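The keyword-to-variable mapping in the example can be sketched as follows. This is a deliberately simplified stand-in: plain substring matching replaces real word segmentation, and the tables KEYWORD_TO_VARIABLE and STATUS_TO_VALUE are hypothetical. Only the mapping of "加密" to encrypt and of "实现" to true comes from the text; the entries for "删除" and "未实现" are assumptions added for illustration.

```python
from typing import Dict, Optional

# Keyword -> Boolean variable name ("加密" -> encrypt is from the text).
KEYWORD_TO_VARIABLE: Dict[str, str] = {"加密": "encrypt", "删除": "erase"}

# Status keyword -> value. "未实现" (not implemented) is listed first because it
# contains "实现" (implemented); dicts preserve insertion order in Python 3.7+.
STATUS_TO_VALUE: Dict[str, bool] = {"未实现": False, "实现": True}


def to_variables(statement: str) -> Dict[str, bool]:
    """Map one record statement to Boolean variables and their values."""
    value: Optional[bool] = None
    for status, mapped in STATUS_TO_VALUE.items():
        if status in statement:
            value = mapped
            break
    if value is None:
        return {}
    return {var: value
            for keyword, var in KEYWORD_TO_VARIABLE.items()
            if keyword in statement}
```

A production system would use a proper segmenter and handle negation more carefully; the sketch only shows the shape of the mapping.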
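The foregoing gives an overall description of the knowledge graph generation flow and the compliance detection flow for data processing records. To make the embodiments of this application easier to understand, the compliance detection method for data processing provided in the embodiments is briefly illustrated below through several examples.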
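(1) Overall basic rules are used as an example. Knowledge extraction from the GDPR shows that, besides the GDPR provisions that apply to each type of data operation, some provisions are basic rules that must be followed regardless of whether data has been processed. Therefore, before each data processing record is judged against specific GDPR provisions, the overall basic rules may be judged first; that is, the priority coefficient of the overall basic rules may be set to the highest value. If the basic rules are not met, the record can be considered non-compliant; if they are met, the compliance judgment continues according to the data operation type. For example, there are the following six basic rules: 1) whether the data subject can restrict the data controller's processing of personal data (from the "right to restriction of processing" among the data subject rights in the GDPR); 2) whether the data subject can object at any time to the data controller's processing of personal data (from the "right to object" among the data subject rights in the GDPR); 3) whether the data subject was informed, explicitly and separately, of the rights it holds at the time of the first communication with the data subject (from the obligations of the data controller in the GDPR); 4) whether the data controller provides privacy by default (from the obligations of the data controller in the GDPR); 5) whether the data subject's consent to the processing of personal data has been obtained (from the principles the data controller must follow in the GDPR); 6) whether the purpose of data collection is consistent with the purpose of the final processing (from the "purpose limitation" principle the data controller must follow in the GDPR). These basic rules are in an AND relationship, that is, the data processing record is considered non-compliant as soon as it fails any one of them. The six basic rules can be judged with reference to the following example triples:
(right to restriction of processing, continued judgment (true), right to object);
(right to object, continued judgment (true), user informed of all rights);
(user informed of all rights, continued judgment (true), privacy by default provided);
(privacy by default provided, continued judgment (true), user consent obtained);
(user consent obtained, continued judgment (true), collection purpose matches processing purpose);
(collection purpose matches processing purpose, belongs to (true), GDPR compliant);
(right to restriction of processing, belongs to (false), rights non-compliant);
(right to object, belongs to (false), rights non-compliant);
(user informed of all rights, belongs to (false), obligations non-compliant);
(privacy by default provided, belongs to (false), obligations non-compliant);
(user consent obtained, belongs to (false), principles non-compliant);
(collection purpose matches processing purpose, belongs to (false), principles non-compliant);
(principles non-compliant, belongs to, GDPR non-compliant);
(obligations non-compliant, belongs to, GDPR non-compliant);
(rights non-compliant, belongs to, GDPR non-compliant).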
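The triples for the six basic rules can be represented intuitively in graph form. Refer to FIG. 7a, which is a schematic flowchart of compliance judgment based on a knowledge graph according to an embodiment of this application. The data processing record of a data controller or data processor is taken as input, and whether the record meets the six basic rules is judged. For example, if the data processing record does not meet the requirement of the right to restriction of processing (that is, the value is false), the record is non-compliant and the non-compliance is attributed to rights non-compliance. If the record meets that requirement (that is, the value is true), whether it meets the requirement of the right to object is judged next using the same logic, and so on, until the rule that the collection purpose matches the processing purpose is also judged compliant. The record can then be considered to meet the six basic rules, and compliance detection for the specific data type or the specific operation type can proceed. For example, the programmatic representation of the knowledge graph for the six basic rules may be as follows:
compliance = true; // define the variable compliance and assign its initial value
// The condition entity "right to restriction of processing" is mapped to the variable restrict, "right to object" to the variable reject, "user informed of all rights" to the variable inform, "privacy by default provided" to the variable privacy, "user consent obtained" to the variable agreement, "collection purpose" to the variable collect, and "processing purpose" to the variable done
// The input data includes the operation type and the data controller's data processing record (including a description of the techniques used to ensure data security, a description of the categories of data subjects and of the categories of personal data, the purposes of the processing, and other related record information)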
[Program listing for the six basic rules: present in the source only as images, Figure PCTCN2022132004-appb-000001 and Figure PCTCN2022132004-appb-000002.]
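It may be understood that the foregoing programmatic representation of the knowledge graph is merely an example and does not constitute a specific limitation on this application.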
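Because the original listing for the basic rules is available only as an image, the following Python sketch is a reconstruction from the triples and the variable mapping, not the applicant's code. The variable names restrict, reject, inform, privacy, agreement, collect, and done follow the comments above; the function name check_basic_rules, the use of strings for the two purposes, and the returned reason strings are assumptions.

```python
from typing import Optional, Tuple


def check_basic_rules(restrict: bool, reject: bool, inform: bool, privacy: bool,
                      agreement: bool, collect: str,
                      done: str) -> Tuple[bool, Optional[str]]:
    """Judge the six basic rules in order; return (compliance, reason if failed)."""
    compliance = True  # initial value, as in the source comments
    reason: Optional[str] = None
    if not (restrict and reject):
        compliance, reason = False, "rights non-compliant"
    elif not (inform and privacy):
        compliance, reason = False, "obligations non-compliant"
    elif not (agreement and collect == done):
        compliance, reason = False, "principles non-compliant"
    return compliance, reason
```

The intermediate states (rights, obligations, principles) are kept as the reason so that the result stays traceable to the triples that end in "GDPR non-compliant".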
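(2) A storage operation on data is used as another example (that is, the data processing record involves a storage operation). First, the following three data storage rules are obtained through knowledge extraction: 1) whether the data is stored securely (for example, whether storage encryption is used; from the obligations of the data controller (processor) in the GDPR); 2) whether the user is allowed to delete the data on the user's original device (from the "right to erasure" among the data subject rights in the GDPR); 3) whether the data storage time is less than or equal to the data processing time (from the "data minimisation" principle among the GDPR principles for processing personal data). These three data storage rules are in an AND relationship, that is, the data processing record is considered non-compliant as soon as it fails any one of them. The three data storage rules can be judged with reference to the following example triples:
(storage encryption status, combined judgment (true), user right to erasure);
(user right to erasure, combined judgment (true), storage time less than or equal to processing time);
(storage time less than or equal to processing time, belongs to (true), GDPR compliant);
(storage encryption status, belongs to (false), encryption non-compliant);
(user right to erasure, belongs to (false), rights non-compliant);
(storage time less than or equal to processing time, belongs to (false), principles non-compliant);
(encryption non-compliant, belongs to, GDPR non-compliant);
(rights non-compliant, belongs to, GDPR non-compliant);
(principles non-compliant, belongs to, GDPR non-compliant).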
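The triples for the three data storage rules can be represented intuitively in graph form as a sub-knowledge graph (if the complete knowledge graph is built by using a decision tree, the input node of this sub-knowledge graph may be one of the internal nodes of the complete knowledge graph). Refer to FIG. 7b, which is a schematic flowchart of another compliance judgment based on a knowledge graph according to an embodiment of this application. The data processing record of a data controller or data processor is taken as input, and whether the record meets the three data storage rules is judged. For example, if the data processing record does not meet the storage encryption requirement (that is, the value is false), the record is non-compliant and the non-compliance is attributed to encryption non-compliance. If the record meets the storage encryption requirement (that is, the value is true), whether it meets the requirement of the user right to erasure is judged next by combined judgment using the same logic, and so on, until the rule that the storage time is less than or equal to the processing time is also judged compliant. The record can then be considered to meet the three data storage rules. For example, the programmatic representation of the knowledge graph for the three data storage rules may be as follows:
compliance = true; // define the variable compliance and assign its initial value
// The condition entity "storage encryption status" is mapped to the variable encrypt, "user right to erasure" to the variable erase, "storage time" to the variable Ts, and "processing time" to the variable Tt
// The input data includes the operation type and the data controller's record of processing the data (including the security protection techniques and encryption used during data processing, the time limits set for erasure of the different categories of data, and other related information)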
[Program listing for the three data storage rules: present in the source only as an image, Figure PCTCN2022132004-appb-000003.]
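It may be understood that the foregoing programmatic representation of the knowledge graph is merely an example and does not constitute a specific limitation on this application.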
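The storage rules can be sketched in Python in the same hedged way, since the original listing is available only as an image. The variable names encrypt, erase, Ts, and Tt follow the comments above; the function name check_storage, the use of timedelta for the two durations, and the reason strings are assumptions.

```python
from datetime import timedelta
from typing import Optional, Tuple


def check_storage(encrypt: bool, erase: bool, Ts: timedelta,
                  Tt: timedelta) -> Tuple[bool, Optional[str]]:
    """Judge the three storage rules (AND relationship) in the order of the triples."""
    if not encrypt:
        return False, "encryption non-compliant"
    if not erase:
        return False, "rights non-compliant"
    if Ts > Tt:  # the rule requires storage time <= processing time
        return False, "principles non-compliant"
    return True, None
```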
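(3) An acquisition operation on data is used as another example (that is, the data processing record involves an acquisition operation). First, the following four data acquisition rules are obtained through knowledge extraction: 1) whether the data subject can obtain the processing status and processing purposes of the personal data and related additional information (from the data subject's right of access to personal data in the GDPR); 2) whether identity authentication has been passed (from the obligations of the data controller in the GDPR); 3) whether the way the data controller provides the data is consistent with the way the data subject requested the data (from the GDPR provisions on how the data controller provides data); 4) if 3) is not met, whether the way the data controller provides the data is consistent with the way specified by the data subject. Among these four rules, rules 1), 2), and 3) are in an AND relationship, rules 1), 2), and 4) are in an AND relationship, and rules 3) and 4) are in an OR relationship. That is, the data processing record can be considered compliant when it meets both rules 1) and 2) and also meets either of rules 3) and 4). The four data acquisition rules can be judged with reference to the following example triples:
(user right of access to data, combined judgment (true), identity verification passed);
(identity verification passed, combined judgment (true), data provision method matches user's request method);
(data provision method matches user's request method, belongs to (true), compliant);
(data provision method matches user's request method, continued judgment (false), data provision method matches user-specified method);
(user right of access to data, belongs to (false), rights non-compliant);
(identity verification passed, belongs to (false), verification non-compliant);
(data provision method matches user-specified method, belongs to (true), compliant);
(data provision method matches user-specified method, belongs to (false), data provision non-compliant);
(rights non-compliant, belongs to, non-compliant);
(verification non-compliant, belongs to, non-compliant);
(data provision non-compliant, belongs to, non-compliant).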
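The triples for the four data acquisition rules can be represented intuitively in graph form as a sub-knowledge graph (if the complete knowledge graph is built by using a decision tree, the input node of this sub-knowledge graph may be one of the internal nodes of the complete knowledge graph). Refer to FIG. 7c, which is a schematic flowchart of another compliance judgment based on a knowledge graph according to an embodiment of this application. The data processing record of a data controller or data processor is taken as input, and whether the record meets the four data acquisition rules is judged. For example, if the data processing record does not meet the requirement of the user right of access to data (that is, the value is false), the record is non-compliant and the non-compliance is attributed to rights non-compliance. If the record meets that requirement (that is, the value is true), whether it meets the identity verification requirement is judged next by combined judgment using the same logic, and so on, until either of the data provision rules is also judged compliant. The record can then be considered to meet the data acquisition rules. For example, the programmatic representation of the knowledge graph for the four data acquisition rules may be as follows:
Each condition entity in the graph is mapped to one or more variables, and by traversing the entities and relationships of the whole graph, the graph is converted according to the mapping rules into the following programmatic representation:
compliance = true; // define the variable compliance and assign its initial value
// The condition entity "user right of access to data" is mapped to the variable get, "identity verification passed" to the variable identity, "data provision method" to the variable provide_method, "user's request method" to the variable request_method, and "user-specified method" to the variable specified_method
// The input data includes the operation type and the data controller's record of processing the data (including a description of the categories of data subjects and of the categories of personal data, information on the rights held by the data subject, the way the data subject requested the data, and other related information)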
[Program listing for the four data acquisition rules: present in the source only as an image, Figure PCTCN2022132004-appb-000004.]
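It may be understood that the foregoing programmatic representation of the knowledge graph is merely an example and does not constitute a specific limitation on this application.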
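The acquisition rules combine AND and OR relationships, which the following Python sketch makes explicit. It is a reconstruction from the triples, since the original listing is available only as an image. The variable names get, identity, provide_method, request_method, and specified_method follow the comments above; the function name check_access, the use of strings for the methods, and the reason strings are assumptions.

```python
from typing import Optional, Tuple


def check_access(get: bool, identity: bool, provide_method: str,
                 request_method: str,
                 specified_method: str) -> Tuple[bool, Optional[str]]:
    """Rules 1) and 2) must both hold; then either rule 3) or rule 4) suffices."""
    if not get:
        return False, "rights non-compliant"
    if not identity:
        return False, "verification non-compliant"
    # "Continued judgment" maps to `or`: rule 4) is tried only if rule 3) fails.
    if provide_method == request_method or provide_method == specified_method:
        return True, None
    return False, "data provision non-compliant"
```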
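(4) A transmission operation on data is used as another example (that is, the data processing record involves a transmission operation). First, the following four data transmission rules are obtained through knowledge extraction: 1) whether the third party receiving the data is located in the EU; if not, rule 2) needs to be judged (from the "right to data portability" in the GDPR); 2) whether the location of the third party receiving the data has obtained BCR certification; 3) whether the transmission is encrypted (from the obligations the data controller must fulfil during data transmission under the GDPR); 4) whether an integrity check is performed on the transmission (from the obligations the data controller must fulfil during data transmission under the GDPR). Among these four rules, rules 1) and 2) are in an OR relationship, rules 1), 3), and 4) are in an AND relationship, and rules 2), 3), and 4) are in an AND relationship. That is, the data processing record can be considered compliant when it meets both rules 3) and 4) and also meets either of rules 1) and 2). The four data transmission rules can be judged with reference to the following example triples:
(third party located in the EU, continued judgment (false), third party's location holds BCR certification);
(third party located in the EU, combined judgment (true), transmission encrypted);
(third party's location holds BCR certification, combined judgment (true), transmission encrypted);
(transmission encrypted, combined judgment (true), transmission integrity check);
(transmission integrity check, belongs to (true), compliant);
(third party's location holds BCR certification, belongs to (false), obligations non-compliant);
(transmission encrypted, belongs to (false), principles non-compliant);
(transmission integrity check, belongs to (false), principles non-compliant);
(principles non-compliant, belongs to, non-compliant);
(obligations non-compliant, belongs to, non-compliant).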
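The triples for the four data transmission rules can be represented intuitively in graph form as a sub-knowledge graph (if the complete knowledge graph is built by using a decision tree, the input node of this sub-knowledge graph may be one of the internal nodes of the complete knowledge graph). Refer to FIG. 7d, which is a schematic flowchart of another compliance judgment based on a knowledge graph according to an embodiment of this application. The data processing record of a data controller or data processor is taken as input, and whether the record meets the four data transmission rules is judged. For example, if the data processing record does not meet the requirement of being located in the EU (that is, the value is false), whether it meets the requirement that the location holds BCR certification needs to be judged; if it does not meet that requirement either, the record is non-compliant and the non-compliance is attributed to obligations non-compliance. If the record meets the location requirement (located in the EU, or the location holds BCR certification, that is, the value is true), whether it meets the transmission encryption requirement is judged next by combined judgment using the same logic, and so on, until the transmission integrity check is also judged compliant. The record can then be considered to meet the data transmission rules. For example, the programmatic representation of the knowledge graph for the four data transmission rules may be as follows:
compliance = true; // define the variable compliance and assign its initial value
// The condition entity "third party's location" is mapped to the variable loc, "transmission encrypted" to the variable trans_encrypt, and "transmission integrity check" to the variable check; all EU countries are stored in the list EU, and all countries holding BCR certification are stored in the list BCR
// The input data includes the operation type and the data controller's record of processing the data (including the categories of recipients to whom the personal data have been or will be disclosed, including recipients in third countries or international organisations; the identification of that third country or international organisation and the documentation of suitable safeguards for the transfer; and other related information)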
[Program listing for the four data transmission rules: present in the source only as images, Figure PCTCN2022132004-appb-000005 and Figure PCTCN2022132004-appb-000006.]
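It may be understood that the foregoing programmatic representation of the knowledge graph is merely an example and does not constitute a specific limitation on this application.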
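The transmission rules can be sketched in the same way; again this is a reconstruction from the triples because the original listing is available only as images. The variable names loc, trans_encrypt, and check and the list names EU and BCR follow the comments above. The function name check_transfer, the reason strings, and the contents of both lists are assumptions: EU holds only an illustrative subset of member-state codes, and BCR holds a placeholder code.

```python
from typing import FrozenSet, Optional, Tuple

EU: FrozenSet[str] = frozenset({"DE", "FR", "IT"})  # illustrative subset only
BCR: FrozenSet[str] = frozenset({"XX"})             # placeholder, not real data


def check_transfer(loc: str, trans_encrypt: bool, check: bool,
                   eu: FrozenSet[str] = EU,
                   bcr: FrozenSet[str] = BCR) -> Tuple[bool, Optional[str]]:
    """Rule 1) or rule 2) must hold, and rules 3) and 4) must both hold."""
    if not (loc in eu or loc in bcr):
        return False, "obligations non-compliant"
    if not (trans_encrypt and check):
        return False, "principles non-compliant"
    return True, None
```

Passing the two lists as parameters keeps the rule logic separate from the reference data, which would need to be maintained independently of the graph.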
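In the foregoing, based on some of the GDPR requirements for the overall basic rules, data storage operations, data acquisition operations, and data transmission operations, a knowledge graph and the sub-knowledge graphs corresponding to those rules are generated through knowledge extraction and construction of entity relationships. After the data processing record of a data processor or data controller is obtained, the record is input to the knowledge graph, and the compliance of the record is determined based on the knowledge graph and the sub-knowledge graphs that the record involves. It should be noted that, for ease of understanding, the embodiments of this application use only the GDPR as an example. It may be understood that the embodiments also apply to the Data Security Law of China, the Personal Information Protection Law of China, and other laws and regulations that safeguard data security; this is not specifically limited herein.
It may be understood that, when the foregoing compliance detection method for data processing is applied to an actual compliance detection scenario, the use of a knowledge graph, which presents knowledge relationships in graph form and is highly interpretable, allows the concrete judgment flow of a regulation to be refined and displayed intuitively. This effectively addresses problems in the prior art, where the judgment flow for data processing compliance is specified only at a high level, so that solutions are hard to put into practice, interpretability is weak, enterprises cannot control costs precisely, and regulators cannot supervise precisely. In addition, the foregoing method is not limited to a particular data type or operation type. It applies to a wider range of scenarios and covers diverse needs, which effectively addresses the prior-art problems of a single applicable scenario, a narrow range of application, and difficulty in meeting the complex and changing needs of real detection scenarios.
In summary, the solution in this application for detecting the compliance of data processing based on a knowledge graph is highly feasible. It overcomes the prior-art problems of being hard to put into practice and having a narrow range of application, and it improves the interpretability of compliance detection, thereby promoting the development and use of data and protecting the rights and interests of all parties.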
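The methods in the embodiments of this application are described in detail above. Related apparatuses in the embodiments of this application are provided below.
Refer to FIG. 8, which is a schematic structural diagram of a compliance detection apparatus for data processing according to an embodiment of this application. The compliance detection apparatus 10 may include an obtaining module 101, a processing module 102, and a determining module 103, and optionally further includes a configuration module 104. The modules are described in detail as follows:
The obtaining module 101 is configured to obtain rule information and perform knowledge extraction on the rule information, where the rule information includes one or more of a corpus of the Personal Information Protection Law of China, a corpus of the General Data Protection Regulation (GDPR), or a corpus of the Data Security Law of China.
The processing module 102 is configured to construct one or more knowledge graph entities based on a result of the knowledge extraction, establish relationships among the one or more knowledge graph entities, and generate a knowledge graph, where the one or more knowledge graph entities include one or more of the following: one or more compliance judgment conditions and one or more compliance states; the one or more compliance judgment conditions are conditions for judging compliance with one or more of the Personal Information Protection Law of China, the GDPR, or the Data Security Law of China; and the one or more compliance states are the possible results of that judgment.
The determining module 103 is configured to obtain a to-be-detected data processing record of a data processor or data controller, input it to the knowledge graph, and determine the compliance of the data processing record, where the data processing record includes one or more of the processing person, the processing time, the specific operation type of the processing, and the specific data object type.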
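In a possible implementation, the determining module 103 is specifically configured to:
establish the relationships among the one or more knowledge graph entities by using a decision tree, and generate the knowledge graph, where the decision tree includes one or more of the following: one or more root nodes, one or more internal nodes, and one or more leaf nodes; the root node is used to receive the data processing record; the internal nodes are used to store one or more of the processing person, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes of the decision tree are used to store the one or more knowledge graph entities.
In a possible implementation, the one or more knowledge graph entities include the one or more compliance judgment conditions and the one or more compliance states, where the relationship between the compliance judgment conditions in the one or more knowledge graph entities includes one or more of an AND relationship, an OR relationship, or a containment relationship, and the relationship between the one or more compliance judgment conditions and the one or more compliance states includes a belonging relationship.
In a possible implementation, the one or more compliance judgment conditions include one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions.
The determining module 103 is specifically configured to:
when the relationship among the one or more first compliance judgment conditions is an AND relationship, determine that the data processing record is compliant if every one of the one or more first compliance judgment conditions is judged compliant.
In a possible implementation, the one or more compliance judgment conditions include one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions.
The determining module 103 is specifically configured to:
when the relationship among the one or more second compliance judgment conditions is an OR relationship, determine that the data processing record is compliant if any one of the one or more second compliance judgment conditions is judged compliant.
In a possible implementation, the one or more compliance judgment conditions include one third compliance judgment condition and one or more fourth compliance judgment conditions, and the data processing record involves the third compliance judgment condition and the one or more fourth compliance judgment conditions.
The determining module 103 is specifically configured to:
when the third compliance judgment condition contains the one or more fourth compliance judgment conditions, determine that the data processing record is non-compliant if the third compliance judgment condition is judged non-compliant, and further determine the compliance of the one or more fourth compliance judgment conditions.
In a possible implementation, the apparatus further includes:
the configuration module 104, configured to set priority coefficients for the one or more compliance judgment conditions.
The determining module 103 is specifically configured to:
judge the compliance judgment conditions involved in the data processing record based on the priority coefficients, and determine the compliance of the data processing record.
It should be noted that, for the functions of the functional units in the compliance detection apparatus 10 described in this embodiment of this application, refer to the related descriptions of steps S500 to S502 in the method embodiment shown in FIG. 5a. Details are not described herein again.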
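Refer to FIG. 9, which is a schematic structural diagram of another compliance detection apparatus for data processing according to an embodiment of this application. As shown in FIG. 9, the apparatus 20 may include one or more processors 601, one or more input devices 602, one or more output devices 603, and a memory 604. The processor 601, the input device 602, the output device 603, and the memory 604 are connected through a bus 605. The memory 604 is configured to store a computer program that includes program instructions, and the processor 601 is configured to execute the program instructions stored in the memory 604.
The processor 601 is configured to invoke the program instructions to perform the following: obtaining rule information and performing knowledge extraction on the rule information, where the rule information includes one or more of a corpus of the Personal Information Protection Law of China, a corpus of the General Data Protection Regulation (GDPR), or a corpus of the Data Security Law of China; constructing one or more knowledge graph entities based on a result of the knowledge extraction, establishing relationships among the one or more knowledge graph entities, and generating a knowledge graph, where the one or more knowledge graph entities include one or more of the following: one or more compliance judgment conditions and one or more compliance states, the one or more compliance judgment conditions are conditions for judging compliance with one or more of the Personal Information Protection Law of China, the GDPR, or the Data Security Law of China, and the one or more compliance states are the possible results of that judgment; and obtaining a to-be-detected data processing record of a data processor or data controller, inputting it to the knowledge graph, and determining the compliance of the data processing record, where the data processing record includes one or more of the processing person, the processing time, the specific operation type of the processing, and the specific data object type.
It should be understood that, in this embodiment of this application, the processor 601 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The input device 602 may include a touchpad, a microphone, and the like, and the output device 603 may include a display (such as an LCD), a speaker, and the like.
The memory 604 may include a read-only memory and a random access memory, and provides instructions and data to the processor 601. A part of the memory 604 may further include a non-volatile random access memory. For example, the memory 604 may further store information about the device type.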
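The scope of the compliance detection apparatus described in this application is not limited thereto, and the structure of the compliance detection apparatus is not limited by FIG. 9. The apparatus may be a stand-alone device or may be part of a larger device. For example, the apparatus may be:
(1) a stand-alone integrated circuit (IC), a chip, a chip system, or a subsystem;
(2) a set of one or more ICs, where optionally the IC set may also include a storage component for storing data and a computer program;
(3) an ASIC, for example, a modem (Modem);
(4) a module that can be embedded in another device;
(5) a receiver, a terminal, an intelligent terminal, a cellular phone, a wireless device, a handset, a mobile unit, a vehicle-mounted device, a network device, a cloud device, an artificial intelligence device, or the like;
(6) others.
In a specific implementation, the processor 601, the input device 602, and the output device 603 described in this embodiment of this application may perform the implementations described in the compliance detection method for data processing provided in the embodiments of this application, and may also perform the implementations of the compliance detection apparatus for data processing described in the embodiments of this application. Details are not described herein again. As a possible product form, the apparatus described in the embodiments of this application may be implemented by a general-purpose processor. It should be understood that the apparatuses in the foregoing product forms have any function of the compliance detection method for data processing in the foregoing method embodiments. Details are not described herein again.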
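An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium stores a computer program that includes program instructions, and when the program instructions are executed by a processor, the compliance detection method for data processing shown in FIG. 5a is implemented. For details, refer to the description of the embodiment shown in FIG. 5a. Details are not described herein again.
The computer-readable storage medium may be an internal storage unit of the compliance detection apparatus or electronic device described in any of the foregoing embodiments, for example, a hard disk or memory of the electronic device. The computer-readable storage medium may alternatively be an external storage device of the electronic device, for example, a plug-in hard disk, a smart media card (smart media card, SMC), a secure digital (secure digital, SD) card, or a flash card (flash card) provided on the electronic device. Further, the computer-readable storage medium may include both the internal storage unit and the external storage device of the electronic device. The computer-readable storage medium is configured to store the computer program and other programs and data required by the electronic device, and may also be configured to temporarily store data that has been output or is to be output.
An embodiment of this application further provides a computer program product. When the computer program product runs on a computer, the computer is enabled to perform the method in any of the foregoing embodiments.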
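A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of this application.
This application is described with reference to flowcharts and/or block diagrams of the methods, apparatuses, and computer program products according to the embodiments of this application. It should be understood that computer program instructions may be used to implement each procedure and/or block in the flowcharts and/or block diagrams and any combination of procedures and/or blocks in the flowcharts and/or block diagrams. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable processing device produce an apparatus for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, and the instruction apparatus implements the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
Although this application has been described with reference to specific features and embodiments thereof, it is evident that various modifications and combinations may be made without departing from the spirit and scope of this application. Accordingly, this specification and the accompanying drawings are merely illustrative of this application as defined by the appended claims and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of this application. Clearly, a person skilled in the art may make various changes and variations to this application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of this application and their equivalent technologies, this application is intended to cover them as well.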

Claims (18)

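  1. A compliance detection method for data processing, comprising:
    obtaining rule information, and performing knowledge extraction on the rule information, wherein the rule information comprises one or more of a corpus of the Personal Information Protection Law of China, a corpus of the General Data Protection Regulation (GDPR), or a corpus of the Data Security Law of China;
    constructing one or more knowledge graph entities based on a result of the knowledge extraction, establishing relationships among the one or more knowledge graph entities, and generating a knowledge graph, wherein the one or more knowledge graph entities comprise one or more of the following: one or more compliance judgment conditions and one or more compliance states; the one or more compliance judgment conditions are one or more conditions for judging compliance with one or more of the Personal Information Protection Law of China, the GDPR, or the Data Security Law of China; and the one or more compliance states are possible results of judging compliance with one or more of the Personal Information Protection Law of China, the GDPR, or the Data Security Law of China; and
    obtaining a to-be-detected data processing record of a data processor or data controller, inputting the data processing record to the knowledge graph, and determining compliance of the data processing record, wherein the data processing record comprises one or more of a processing person, a processing time, a specific operation type of the processing, and a specific data object type.
  2. The method according to claim 1, wherein the establishing relationships among the one or more knowledge graph entities and generating a knowledge graph comprises:
    establishing the relationships among the one or more knowledge graph entities by using a decision tree, and generating the knowledge graph, wherein the decision tree comprises one or more of the following: one or more root nodes, one or more internal nodes, and one or more leaf nodes; the root node is used to receive the data processing record; the internal nodes are used to store one or more of the processing person, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes of the decision tree are used to store the one or more knowledge graph entities.
  3. The method according to any one of claims 1 to 2, wherein the one or more knowledge graph entities comprise the one or more compliance judgment conditions and the one or more compliance states; the relationship between the compliance judgment conditions in the one or more knowledge graph entities comprises one or more of an AND relationship, an OR relationship, or a containment relationship; and the relationship between the one or more compliance judgment conditions and the one or more compliance states in the one or more knowledge graph entities comprises a belonging relationship.
  4. The method according to claim 3, wherein the one or more compliance judgment conditions comprise one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions; and
    the determining compliance of the data processing record comprises:
    when the relationship among the one or more first compliance judgment conditions is an AND relationship, determining that the data processing record is compliant if every one of the one or more first compliance judgment conditions is judged compliant.
  5. The method according to claim 3, wherein the one or more compliance judgment conditions comprise one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions; and
    the determining compliance of the data processing record comprises:
    when the relationship among the one or more second compliance judgment conditions is an OR relationship, determining that the data processing record is compliant if any one of the one or more second compliance judgment conditions is judged compliant.
  6. The method according to claim 3, wherein the one or more compliance judgment conditions comprise one third compliance judgment condition and one or more fourth compliance judgment conditions, and the data processing record involves the third compliance judgment condition and the one or more fourth compliance judgment conditions; and
    the determining compliance of the data processing record comprises:
    when the third compliance judgment condition contains the one or more fourth compliance judgment conditions, determining that the data processing record is non-compliant if the third compliance judgment condition is judged non-compliant, and further determining compliance of the one or more fourth compliance judgment conditions.
  7. The method according to any one of claims 1 to 6, wherein the method further comprises:
    setting priority coefficients for the one or more compliance judgment conditions; and
    the determining compliance of the data processing record comprises:
    judging, based on the priority coefficients, the compliance judgment conditions involved in the data processing record, and determining the compliance of the data processing record.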
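  8. A compliance detection apparatus for data processing, comprising:
    an obtaining module, configured to obtain rule information and perform knowledge extraction on the rule information, wherein the rule information comprises one or more of a corpus of the Personal Information Protection Law of China, a corpus of the General Data Protection Regulation (GDPR), or a corpus of the Data Security Law of China;
    a processing module, configured to construct one or more knowledge graph entities based on a result of the knowledge extraction, establish relationships among the one or more knowledge graph entities, and generate a knowledge graph, wherein the one or more knowledge graph entities comprise one or more of the following: one or more compliance judgment conditions and one or more compliance states; the one or more compliance judgment conditions are one or more conditions for judging compliance with one or more of the Personal Information Protection Law of China, the GDPR, or the Data Security Law of China; and the one or more compliance states are possible results of judging compliance with one or more of the Personal Information Protection Law of China, the GDPR, or the Data Security Law of China; and
    a determining module, configured to obtain a to-be-detected data processing record of a data processor or data controller, input the data processing record to the knowledge graph, and determine compliance of the data processing record, wherein the data processing record comprises one or more of a processing person, a processing time, a specific operation type of the processing, and a specific data object type.
  9. The apparatus according to claim 8, wherein the determining module is specifically configured to:
    establish the relationships among the one or more knowledge graph entities by using a decision tree, and generate the knowledge graph, wherein the decision tree comprises one or more of the following: one or more root nodes, one or more internal nodes, and one or more leaf nodes; the root node is used to receive the data processing record; the internal nodes are used to store one or more of the processing person, the processing time, the specific operation type of the processing, and the specific data object type; and the leaf nodes of the decision tree are used to store the one or more knowledge graph entities.
  10. The apparatus according to any one of claims 8 to 9, wherein the one or more knowledge graph entities comprise the one or more compliance judgment conditions and the one or more compliance states; the relationship between the compliance judgment conditions in the one or more knowledge graph entities comprises one or more of an AND relationship, an OR relationship, or a containment relationship; and the relationship between the one or more compliance judgment conditions and the one or more compliance states in the one or more knowledge graph entities comprises a belonging relationship.
  11. The apparatus according to claim 10, wherein the one or more compliance judgment conditions comprise one or more first compliance judgment conditions, and the data processing record involves the one or more first compliance judgment conditions; and
    the determining module is specifically configured to:
    when the relationship among the one or more first compliance judgment conditions is an AND relationship, determine that the data processing record is compliant if every one of the one or more first compliance judgment conditions is judged compliant.
  12. The apparatus according to claim 10, wherein the one or more compliance judgment conditions comprise one or more second compliance judgment conditions, and the data processing record involves the one or more second compliance judgment conditions; and
    the determining module is specifically configured to:
    when the relationship among the one or more second compliance judgment conditions is an OR relationship, determine that the data processing record is compliant if any one of the one or more second compliance judgment conditions is judged compliant.
  13. The apparatus according to claim 10, wherein the one or more compliance judgment conditions comprise one third compliance judgment condition and one or more fourth compliance judgment conditions, and the data processing record involves the third compliance judgment condition and the one or more fourth compliance judgment conditions; and
    the determining module is specifically configured to:
    when the third compliance judgment condition contains the one or more fourth compliance judgment conditions, determine that the data processing record is non-compliant if the third compliance judgment condition is judged non-compliant, and further determine compliance of the one or more fourth compliance judgment conditions.
  14. The apparatus according to any one of claims 8 to 13, wherein the apparatus further comprises:
    a configuration module, configured to set priority coefficients for the one or more compliance judgment conditions; and
    the determining module is specifically configured to:
    judge, based on the priority coefficients, the compliance judgment conditions involved in the data processing record, and determine the compliance of the data processing record.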
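  15. A terminal device, comprising a processor, an input device, an output device, and a memory, wherein the processor, the input device, the output device, and the memory are connected to each other, the memory is configured to store a computer program, the computer program comprises program instructions, and the processor is configured to invoke the program instructions to perform the method according to any one of claims 1 to 7.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and when the program instructions are executed by a processor, the processor is enabled to perform the method according to any one of claims 1 to 7.
  17. A computer program, wherein the computer program comprises instructions, and when the computer program is executed by the terminal device, the terminal device is enabled to perform the method according to any one of claims 1 to 7.
  18. A chip system, wherein the chip system comprises at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected through lines; the memory stores instructions; and when the instructions are executed by the processor, the method according to any one of claims 1 to 7 is implemented.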
PCT/CN2022/132004 2021-11-18 2022-11-15 Compliance detection method and apparatus for data processing, and related device WO2023088249A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111373056.8A CN116150384A (zh) 2021-11-18 2021-11-18 一种数据处理的合规性检测方法、装置和相关设备
CN202111373056.8 2021-11-18

Publications (1)

Publication Number Publication Date
WO2023088249A1 true WO2023088249A1 (zh) 2023-05-25

Family

ID=86354844

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/132004 WO2023088249A1 (zh) 2021-11-18 2022-11-15 一种数据处理的合规性检测方法、装置和相关设备

Country Status (2)

Country Link
CN (1) CN116150384A (zh)
WO (1) WO2023088249A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111782825A (zh) * 2020-08-20 2020-10-16 支付宝(杭州)信息技术有限公司 知识库构建方法及装置
EP3764265A1 (en) * 2019-07-12 2021-01-13 Commissariat à l'Energie Atomique et aux Energies Alternatives System, method and computer program product for monitoring compliance to legal requirements
CN112860872A (zh) * 2021-03-17 2021-05-28 广东电网有限责任公司 基于自学习的配电网操作票语义合规性的校验方法及系统
CN113128231A (zh) * 2021-04-25 2021-07-16 深圳市慧择时代科技有限公司 一种数据质检方法、装置、存储介质和电子设备
WO2021196520A1 (zh) * 2020-03-30 2021-10-07 西安交通大学 一种面向税务领域知识图谱的构建方法及系统


Also Published As

Publication number Publication date
CN116150384A (zh) 2023-05-23

Similar Documents

Publication Publication Date Title
US10416966B2 (en) Data processing systems for identity validation of data subject access requests and related methods
CN116506217B (zh) 业务数据流安全风险的分析方法、系统、存储介质及终端
TWI734466B (zh) 針對隱私資料洩漏的風險評估方法及裝置
CN112241543A (zh) 一种基于数据中台的敏感数据梳理方法
CN111783045B (zh) 基于分级分类的数据授权方法和装置
CN112417492A (zh) 基于数据分类分级的服务提供方法
US20220005126A1 (en) Virtual assistant for recommendations on whether to arbitrate claims
CN111488594B (zh) 一种基于云服务器的权限检查方法、装置、存储介质及终端
US10192262B2 (en) System for periodically updating backings for resource requests
Mantha et al. Assessment of the cybersecurity vulnerability of construction networks
US9058470B1 (en) Actual usage analysis for advanced privilege management
CN111931239A (zh) 一种数据库安全防护用数据防泄漏系统
US10013237B2 (en) Automated approval
CN115238247A (zh) 基于零信任数据访问控制系统的数据处理方法
WO2023088249A1 (zh) 一种数据处理的合规性检测方法、装置和相关设备
Chang et al. Risk factors of enterprise internal control: Governance refers to internet of things (iot) environment
WO2023031938A1 (en) System and method for managing data access requests
CN113064731B (zh) 基于云边端架构的大数据处理终端设备、处理方法和介质
CN112000727B (zh) 一种动态配置业务数据脱敏显示方法
WO2020228564A1 (zh) 一种应用服务方法与装置
CN116611093B (zh) 一种数据库资源的使用授权方法及设备
Liang et al. Modeling and global conflict analysis of firewall policy
CN116910651A (zh) 一种基于分级分类的数据安全治理方法、装置及可读介质
CN115567461A (zh) 一种基于分级的api动态防护方法
Zhou et al. A Method of Dynamically Associating Behavior Risks based on Time Thread in Smartphones

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22894782

Country of ref document: EP

Kind code of ref document: A1