CN115935421A - Data product publishing method, system and storage medium - Google Patents

Data product publishing method, system and storage medium

Info

Publication number
CN115935421A
Authority
CN
China
Prior art keywords
data
product
determining
information
compliance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211664319.5A
Other languages
Chinese (zh)
Other versions
CN115935421B (en)
Inventor
顾逸圣
刘汪根
陆懿庭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transwarp Technology Shanghai Co Ltd
Original Assignee
Transwarp Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Transwarp Technology Shanghai Co Ltd filed Critical Transwarp Technology Shanghai Co Ltd
Priority to CN202211664319.5A priority Critical patent/CN115935421B/en
Publication of CN115935421A publication Critical patent/CN115935421A/en
Application granted granted Critical
Publication of CN115935421B publication Critical patent/CN115935421B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Storage Device Security (AREA)

Abstract

The invention discloses a data product publishing method, a data product publishing system and a storage medium. The method comprises the following steps: performing static desensitization and security policy configuration on the data to be issued according to the acquired metadata information of the data to be issued and the acquired compliance requirements of the data product, and determining and storing the compliance data to be released; determining data communication information according to the storage mode of the compliance data to be released, and determining version information according to the number of times the compliance data to be released has been modified; packaging the metadata information, the security policy, the data communication information and the version information as a data product to be released, and defining a consumption mode of the data product to be released according to the data communication information; and determining the product number of the data product to be released according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and releasing the numbered data product to be released to a data market in a data product publishing system. The technical scheme provided by the embodiments of the invention keeps the whole data transaction process compliant.

Description

Data product publishing method, system and storage medium
Technical Field
The present invention relates to the field of data security technologies, and in particular, to a method, a system, and a storage medium for publishing a data product.
Background
Data has become a new factor of production and a key element in the high-quality development and comprehensive digitalization of the economy and society. Data resources are increasingly production factors and strategic assets of human society, and the opening and circulation of data are the premise and basis for realizing its value. Under the requirements of laws and regulations on information and data security protection, how to open and share data safely has become an urgent problem that must be solved before the value of data elements in various industries can be realized.
In data transactions, national laws and regulations impose strict security compliance requirements on every step. Most existing schemes for data transaction and circulation, however, focus on a single isolated process in the data life cycle, for example adding noise to the personal information in the traded data, or recording the application, approval and transaction processes with blockchain smart contract technology so that information published on the chain can be accessed and frozen in real time when a risk is found.
However, the existing schemes do not systematically constrain the compliance of data products according to the laws and regulations in China, and the improvements to individual isolated processes are difficult to combine into a better overall result. For example, when personal information in the data is processed with differential privacy, the probabilistic constraints imposed on the data are large and user-defined query analysis cannot be supported, and blockchain technology cannot meet the requirements of high throughput, low storage cost and ease of deployment. The existing data transaction process is therefore heavily restricted, and a data product improved only in an isolated process may introduce security risks during data transaction and use because of insufficient security checks.
Disclosure of Invention
The invention provides a data product publishing method, a data product publishing system and a storage medium, which ensure the compliance of data publishing and the safety in the data publishing process and reduce the risk of data circulation.
In a first aspect, an embodiment of the present invention provides a data product publishing method, which is applied to a data providing end of a data product publishing system, and includes:
extracting data information of the acquired data to be issued, and determining corresponding metadata information;
performing static desensitization and security policy configuration on data to be issued according to the metadata information and the acquired compliance requirements of the data product, and determining and storing the compliance data to be issued;
determining data communication information according to a storage mode of the compliance data to be released, and determining version information according to the modification times of the compliance data to be released;
packaging and determining metadata information, a security policy, data communication information and version information as a data product to be released, and defining a consumption mode of the data product to be released according to the data communication information;
and determining the product number of the data product to be released according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and releasing the numbered data product to be released to a data market in a data product release system.
In a second aspect, an embodiment of the present invention further provides a data product publishing method, which is applied in a data market of a data product publishing system, and includes:
after a data product application request of a data demand end of a data product publishing system is received, checking a data consumption environment of the data demand end;
if the data consumption environment is consistent with the consumption mode of the data product corresponding to the data product application request and comprises an encrypted database and a development tool corresponding to the consumption mode, pushing the security policy of the data product to the encrypted database;
and sending the connection information of the encryption database and the data communication information of the data product to a data providing end corresponding to the data product so that the encryption database receives compliance data corresponding to the data product.
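The market-side checks in the second aspect can be sketched as follows. This is a minimal illustration only: the class `ConsumptionEnvironment`, the `REQUIRED_TOOL` mapping, the action names and the dictionary keys are all hypothetical and are not defined by the source.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumptionEnvironment:
    """Hypothetical description of what a data demand end has deployed."""
    consumption_mode: str                 # e.g. "api" or "federated_learning"
    has_encrypted_database: bool
    development_tools: set = field(default_factory=set)


# Assumed mapping from a consumption mode to the development tool it needs.
REQUIRED_TOOL = {"api": "sql_editor", "federated_learning": "fl_client"}


def environment_is_acceptable(env, product_mode):
    """True only if the environment matches the product's consumption mode and
    contains both an encrypted database and the matching development tool."""
    required = REQUIRED_TOOL.get(product_mode)
    return (
        required is not None
        and env.consumption_mode == product_mode
        and env.has_encrypted_database
        and required in env.development_tools
    )


def handle_application(env, product):
    """Return the ordered actions the market would take, or [] on rejection."""
    if not environment_is_acceptable(env, product["consumption_mode"]):
        return []
    return [
        # 1. The security policy goes to the demand end's encrypted database.
        ("push_security_policy_to_encrypted_db", product["security_policy"]),
        # 2. Connection details go to the providing end, which pushes the data.
        ("send_connection_info_to_provider", product["data_communication"]),
    ]
```

In this sketch the market never handles the compliance data itself; it only forwards the policy and the connection details, which matches the division of roles described above.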
In a third aspect, an embodiment of the present invention further provides a data product publishing system, including at least one data providing end, at least one data demanding end, and a data market, where each data providing end and each data demanding end are located in domains isolated from each other;
the data providing terminal is used for extracting data information, performing data compliance processing, determining data communication information, determining version information and determining a consumption mode of the acquired data to be released, generating a data product to be released containing a product number and releasing the data product to be released to a data market;
the data market is used for determining a target data product according to the data product application request and checking the data consumption environment of the data demand end according to the consumption mode corresponding to the target data product when the data product application request of the data demand end is received;
the data demand end is used for receiving the security policy of a target data product pushed by a data market when the data demand end has a data consumption environment, and storing the security policy into the encryption database;
the data market is used for sending the connection information of the encrypted database and the data communication information of the target data product to the data providing end;
the data providing end is used for pushing compliance data corresponding to the target data product to the encryption database according to the connection information of the encryption database and the data communication information of the target data product;
the product to be released comprises metadata information, a security policy, data communication information and version information, and the product number comprises the metadata information, the security policy, the data communication information, the version information and a consumption mode.
In a fourth aspect, the embodiments of the present invention further provide a computer-readable storage medium, where computer instructions are stored, and the computer instructions are configured to enable a processor to implement the data product issuing method according to any one of the embodiments of the present invention when the computer instructions are executed.
According to the data product release method, the data product release system and the storage medium, the data information of the acquired data to be released is extracted, and the corresponding metadata information is determined; performing static desensitization and security policy configuration on data to be issued according to the metadata information and the acquired compliance requirements of the data product, and determining and storing the compliance data to be issued; determining data communication information according to a storage mode of the compliance data to be issued, and determining version information according to the number of times of modification of the compliance data to be issued; packaging and determining metadata information, a security policy, data communication information and version information as a data product to be released, and defining a consumption mode of the data product to be released according to the data communication information; and determining the product number of the data product to be released according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and releasing the numbered data product to be released to a data market in a data product release system. 
With this technical scheme, after a data source is developed at the data providing end to obtain the data to be issued, basic information of that data is extracted, and static desensitization and security policy configuration are performed on it according to the extracted basic information and the pre-obtained compliance requirements of the data product. As a result, the configured compliance data to be released meets the requirements of national laws and regulations, and on that basis the corresponding security policies can be configured dynamically for different types of data demand ends. The data communication information determined from the storage mode of the compliance data to be released, the version information determined from the number of modifications, the consumption mode defined from the data communication information, and the other information are then combined to number the data product to be released, which is published to the data market. The published data product thus contains the full set of parameters for keeping the data compliant throughout its life cycle, which addresses the problems that security controls for data circulation are isolated from one another and that the security of data product publication is difficult to guarantee. Compliance data that corresponds to the data product and carries the security policy is sent to the data demand end requesting the data product, which improves the precision with which the data providing end controls how the data demand end uses the data, keeps the whole data transaction process compliant, and guards the published data product against security risks during use.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a data product publishing method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a data product publishing method according to a second embodiment of the present invention;
Fig. 3 is a schematic flowchart of performing de-identification processing on the data to be issued to determine static desensitization data, according to the second embodiment of the present invention;
Fig. 4 is a flowchart of a data product publishing method according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a data product publishing system according to a fourth embodiment of the present invention;
Fig. 6 is an example structural diagram of a data product publishing system according to the fourth embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Fig. 1 is a flowchart of a data product publishing method according to an embodiment of the present invention, where the embodiment of the present invention is applicable to a situation where a data providing end constructs a data product and completes publishing the data product, and the method may be applied to a data providing end of a data product publishing system, where the data providing end may be implemented by software and/or hardware, and the data providing end may be configured in a private domain logically isolated from a public network, or in an environment where internal data privacy of the data providing end can be guaranteed, which is not limited in this embodiment of the present invention.
As shown in fig. 1, a method for publishing a data product provided by an embodiment of the present invention specifically includes the following steps:
S101, extracting data information of the acquired data to be issued, and determining corresponding metadata information.
In this embodiment, the data providing end may be specifically understood as the entity in the data product publishing system that generates and publishes the data product and provides externally the actual data corresponding to the data product. The data demand end may be specifically understood as the entity in the data product publishing system that needs to acquire the actual data corresponding to the data product and to develop with the acquired data. The data product may be specifically understood as a carrier that conveys, from the data providing end to the data demand end, the basic information of the actual data together with the modes and requirements for providing it. The data to be issued may be specifically understood as data that the data providing end obtains by developing the acquired original data and that, after processing, can be used to generate a data product and be provided to the data demand end. The metadata information may be specifically understood as information extracted from the data to be issued that describes its features. For example, the metadata information may be descriptive information such as the names of databases, tables and data fields, a file name, size, creation time, creator, column names, and sample data, and it may also describe the basic information of the data product corresponding to the data to be issued, which is not limited in this embodiment of the present invention.
Specifically, after the obtained original data is developed by using a development tool in the data providing end, to-be-released data which can be used for generating a data product to be released to the outside is obtained, and the content which can describe the characteristic information of the to-be-released data is extracted through a preset data identification rule or a manual calibration mode, so that metadata information corresponding to the to-be-released data is obtained.
For example, the data product publishing system may be deployed as a private cloud, and a single data providing end may create a logically isolated enterprise tenant on the cloud as its environment for business execution. When data to be issued is generated, a database may be created directly under the tenant of the data providing end and used as the data source storing the original data, or the original data may be imported into the database of the data providing end through a data import tool inside or outside the tenant. A developer at the data providing end then develops the original data in the database with a development tool deployed in the data providing end, and the data obtained after development is determined as the data to be issued. Optionally, the development tool in the data providing end may be an Extract-Transform-Load (ETL) tool, a Structured Query Language (SQL) editor, or the like, which is not limited in the embodiment of the present invention. Internally, the development tool relies on strongly isolated work areas to control data visibility, so that different developers can attach data sources located inside or remote from the data providing end to the development tool within their own work areas and carry out development operations.
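The rule-based metadata extraction of S101 can be illustrated with a small sketch. It assumes, purely for illustration, that the data to be issued sits in a SQLite table; the function name `extract_metadata` and the shape of the returned dictionary are hypothetical and are not prescribed by the source.

```python
import sqlite3


def extract_metadata(conn, table, sample_rows=2):
    """Collect descriptive information about one table: column names and
    types, row count and a few sample rows.

    The table name is interpolated into the SQL text, so it must come from a
    trusted source (it cannot be bound as a query parameter).
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    columns = [{"name": row[1], "type": row[2]} for row in info]
    row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    sample = conn.execute(
        f'SELECT * FROM "{table}" LIMIT ?', (sample_rows,)
    ).fetchall()
    return {
        "table": table,
        "columns": columns,
        "row_count": row_count,
        "sample": sample,
    }
```

Fields such as creator or creation time, which the description also lists, would come from the storage system's own catalog and are left out of this sketch.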
S102, performing static desensitization and security policy configuration on the to-be-issued data according to the metadata information and the acquired compliance requirements of the data product, and determining and storing the to-be-issued compliance data.
In this embodiment, the compliance requirement of the data product may be specifically understood as a requirement, set according to national laws and regulations or adapted to the actual situation, that the published data product does not reveal private information and that a specific natural person cannot be identified without additional information. Static desensitization may be specifically understood as a desensitization scheme that masks and transforms sensitive data and lowers its sensitivity level by means of a preset model, desensitization algorithm or other processing method; it is suited to data that has a general desensitization rule, such as personal information. The security policy may be specifically understood as a desensitization and protection policy set according to national laws and regulations, industry guidelines and the actual circumstances of different data demand ends, and applied when a data demand end accesses the data corresponding to a data product. It can be adjusted dynamically for different parties so that, at access time, access to fields of a given security level or field type is rejected or desensitized. The compliance data to be released may be specifically understood as data that has undergone compliance processing, can be stored in the data providing end as the actual data corresponding to the data product, and can be provided to the data demand end.
Specifically, the sensitive states of various types of data in the data to be issued are distinguished by using metadata information, the data which needs to be subjected to static desensitization in the data to be issued are determined according to the compliance requirements of data products, the operations such as shielding and deformation are performed on the data, the sensitive states of the data to be issued which meet the static desensitization conditions are reduced, meanwhile, the security policies which meet different application conditions are determined according to the compliance requirements of the data products, the security policies are configured in the data to be issued which complete the static desensitization, the compliance data to be issued are obtained, and the compliance data to be issued are stored in a database of a data providing end.
In the embodiment of the invention, the data to be issued is respectively subjected to static desensitization and security policy configuration according to the compliance requirements of the data product, and desensitization treatment on the data to be issued is realized according to different conditions, so that the compliance and adaptability of the data corresponding to the data product to be issued are ensured, the control strength of the data providing end on the data demand end for data use is enhanced, and the data use security is ensured.
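The two kinds of processing in S102 can be contrasted in a short sketch: static desensitization changes the stored data once, while the security policy is evaluated per data demand end at access time. The sensitivity labels, the masking rule and the policy layout below are assumptions made for illustration; the source does not fix any of them.

```python
# Hypothetical sensitivity labels, as they might be derived from metadata.
FIELD_SENSITIVITY = {"name": "personal", "phone": "personal", "city": "public"}

# Hypothetical security policy: per consumer type, fields rejected on access.
SECURITY_POLICY = {
    "research": {"deny_fields": {"phone"}},
    "partner": {"deny_fields": set()},
}


def mask_value(value, keep=3):
    """Keep the first `keep` characters and replace the rest with '*'."""
    text = str(value)
    return text[:keep] + "*" * max(len(text) - keep, 0)


def static_desensitize(rows, sensitivity):
    """Return copies of the rows with every 'personal' field masked."""
    return [
        {
            key: mask_value(val) if sensitivity.get(key) == "personal" else val
            for key, val in row.items()
        }
        for row in rows
    ]


def apply_policy(row, consumer_type, policy):
    """Drop the fields the policy denies for this consumer type.
    Unknown consumer types receive nothing."""
    rule = policy.get(consumer_type)
    if rule is None:
        return {}
    return {k: v for k, v in row.items() if k not in rule["deny_fields"]}
```

Masking is only one possible static desensitization method; the description equally allows a preset model or another algorithm.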
S103, determining data communication information according to the storage mode of the compliance data to be released, and determining version information according to the modification times of the compliance data to be released.
In this embodiment, the data communication information, also referred to as data connectivity information, may be specifically understood as connection information that describes where in the data providing end the actual data referred to by the data product is located. It may take different forms for different delivery forms; in other words, it reflects the storage form of the compliance data to be released in the data providing end under each delivery form. It should be noted that the data communication information is not exposed to the data demand end and is invoked only after the delivery form has been determined; it is encrypted by default to avoid data leakage. The version information may be specifically understood as information that records the versions of the data product and the actual content corresponding to each version, and it is updated with each modification or parameter change of the compliance data to be released.
Specifically, because the compliance data to be released is stored in the database of the data providing end, the storage mode used to access its storage location is defined differently for different data delivery modes, such as API delivery or federated learning delivery. The set of storage modes corresponding to the different delivery modes is determined as the data communication information of the compliance data to be released. Meanwhile, the version information of the current compliance data to be released is determined according to the number of times it has been modified after development was completed. That is, the compliance data to be released obtained when desensitization and security policy configuration are completed for the first time is the first version, and the version information is incremented for each subsequent modification.
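The bookkeeping described in S103 is simple enough to sketch directly. The class `CompliantDataset`, its method names and the example location string are hypothetical; only the rules (one storage mode per delivery mode, first compliant version is version 1, one increment per modification) come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class CompliantDataset:
    # Delivery mode -> how the stored data is reached for that mode.
    # Together these entries form the data communication information.
    communication: dict = field(default_factory=dict)
    modification_count: int = 0

    def register_storage(self, delivery_mode, location):
        """Record the storage mode used for one delivery mode."""
        self.communication[delivery_mode] = location

    def record_modification(self):
        """Count one modification of the compliance data."""
        self.modification_count += 1

    @property
    def version(self):
        # The first compliant version is 1; each modification adds one.
        return 1 + self.modification_count
```

A real system would also keep the content of each version and encrypt the communication information by default, as the description requires; both are omitted here.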
And S104, packaging and determining the metadata information, the security policy, the data communication information and the version information as a data product to be released, and defining a consumption mode of the data product to be released according to the data communication information.
In this embodiment, the data product to be distributed may be specifically understood as a carrier that is not yet distributed to the public network for other entities to access, and includes basic information of compliance data stored in the data providing terminal and providing modes and requirements. The consumption mode can be specifically understood as a form that the data providing end delivers the corresponding data of the data product to the data requiring end.
Specifically, the metadata information, security policy, data communication information and version information corresponding to the compliance data to be released are packaged into a set, and this set is the data product to be released. The data product to be released thus contains information on the content of the compliance data to be released and on the ways in which it can be delivered. Because the data communication information contains several different storage modes, the data product to be released must pass a release approval before it is published to the data market. For this purpose, the personnel at the data providing end who take part in the sharing approval process register in the data market and enter their organizational relationships there. In this way, for the same data product to be released, different consumption modes can be defined for different data demand ends according to the importance of the data product and the nature of the data demand ends that may access its data.
Illustratively, the same data product to be released can be delivered through different delivery modes, if all delivery modes corresponding to the data communication information are embodied in the consumption mode of the same data product to be released, all data demand terminals applying for the data product to be released can consume the data corresponding to the data product to be released through the consumption mode with the lowest security, so that in order to guarantee the consumption security of different types of data demand terminals, a plurality of different consumption modes can be determined for the same data product to be released according to different delivery modes.
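The packaging step and the per-consumer consumption modes of S104 can be sketched as below. The function names, the dictionary keys and the idea of an "approved modes" table produced by the approval process are assumptions for illustration only.

```python
def package_product(metadata, security_policy, communication, version):
    """Bundle the four parts that make up a data product to be released."""
    return {
        "metadata": metadata,
        "security_policy": security_policy,
        "data_communication": communication,
        "version": version,
    }


def define_consumption_modes(product, approved_modes_by_consumer):
    """For each consumer type, keep only the delivery modes that are both
    approved for that consumer and actually present in the product's data
    communication information."""
    available = set(product["data_communication"])
    return {
        consumer: sorted(available & set(modes))
        for consumer, modes in approved_modes_by_consumer.items()
    }
```

Restricting each consumer type to its own subset of delivery modes is what prevents every data demand end from falling back to the least secure mode the product supports.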
And S105, determining the product number of the data product to be released according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and releasing the numbered data product to be released to a data market in a data product release system.
In this embodiment, the product number may be specifically understood as a number used for uniquely identifying the data product to be released at the data market and the data providing end. The data market may be specifically understood as a body deployed in a public network to carry a plurality of data products for access by a data consumer accessing the data products.
Specifically, by presetting a product number rule, different types corresponding to metadata information, different types of security policies, different storage modes corresponding to data communication information, version numbers and different consumption modes are numbered and defined to obtain a product number uniquely identifying a data product to be released, the product number can be used as an abstract of the data product to be released, basic information of the data product to be released is summarized, and the data product to be released is retrieved by each data demand terminal after being released to a data market. Marking the data products to be issued through the product numbers, namely, after the product numbers and the data products to be issued form a one-to-one correspondence relationship, issuing the numbered data products to be issued to a data market in a data product issuing system, so that a data demand end accessing the data market can access the data market.
For example, assuming that data connectivity information corresponding to a data product to be released is denoted as CI, a security policy is denoted as PS, version information is denoted as V, a consumption mode is denoted as CM, and metadata information is denoted as AI, a product number PC generated corresponding to the data product to be released may be denoted as:
PC = digest(CI, AI, PS, V, CM).
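The source does not say which digest function or serialization is used. The following sketch assumes, for illustration only, SHA-256 over a canonical JSON encoding of the five inputs; the function name `product_number` is hypothetical.

```python
import hashlib
import json


def product_number(ci, ai, ps, v, cm):
    """Compute PC = digest(CI, AI, PS, V, CM).

    Assumption: the five inputs are JSON-serializable (dicts, lists, strings,
    numbers). Keys are sorted so that the same content always yields the same
    product number regardless of dictionary ordering.
    """
    payload = json.dumps(
        {"CI": ci, "AI": ai, "PS": ps, "V": v, "CM": cm},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because the version information and the consumption mode are inputs, a new version of the same data, or the same data offered under a different consumption mode, receives a different product number under this scheme.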
According to the technical scheme of this embodiment, data information is extracted from the acquired data to be issued, and the corresponding metadata information is determined; static desensitization and security policy configuration are performed on the data to be issued according to the metadata information and the acquired compliance requirements of the data product, and the compliance data to be released is determined and stored; data communication information is determined according to the storage mode of the compliance data to be released, and version information is determined according to the number of times the compliance data to be released has been modified; the metadata information, security policy, data communication information and version information are packaged as the data product to be released, and a consumption mode of the data product to be released is defined according to the data communication information; and the product number of the data product to be released is determined according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and the numbered data product to be released is released to the data market in the data product publishing system.
By adopting the technical scheme, after a data source is developed at a data providing end to obtain data to be published, basic information of the data to be published is extracted, static desensitization and security policy configuration are carried out on the data to be published according to the extracted basic information and the compliance requirements of pre-obtained data products, so that the corresponding security policies are dynamically configured for different types of data demand ends on the basis that the configured compliance data to be published meets the national legal and regulatory requirements, data communication information determined according to a storage mode of the compliance data to be published, version information determined according to the number of modification times, consumption mode information defined according to the data communication information and other information are combined, the data products to be published are numbered, and the data products are published to a data market. The issued data product completely comprises parameters for keeping data compliance in the whole data life cycle, the problems that data circulation safety control is isolated and the data product issuing safety is difficult to guarantee are solved, the compliance data which are corresponding to the data product and contain the safety strategy are sent to the data demand end of the request data product, the control precision of the data supply end on the data use mode of the data demand end is improved, the compliance of the whole data transaction process is guaranteed, the issued data product is guaranteed, and no safety risk exists in the use process.
Example two
Fig. 2 is a flowchart of a data product publishing method according to the second embodiment of the present invention. The technical solution of this embodiment further refines the optional technical solutions described above. It specifies how the data type and security level of the data to be published are determined according to the metadata information and the compliance requirements of the data product, how the data to be published is then de-identified according to the determined data type and security level to achieve static desensitization, and how a security policy is formulated for the data to be published according to the compliance requirements of the data product. It also specifies how the consumption mode of the data product to be issued is defined according to different data communication information. As a result, compliance processing during data product development becomes more systematic, the precision with which the data providing end controls how the data demand end uses the data is improved, the compliance of the whole data transaction process is ensured, and the issued data product carries no security risk during use.
As shown in fig. 2, a method for publishing a data product provided by the second embodiment of the present invention specifically includes the following steps:
S201, extracting data information of the acquired data to be issued, and determining corresponding metadata information.
S202, determining at least one data type of the data to be issued according to the metadata information.
In this embodiment, the data type may be specifically understood as a preset, predefined description of the basic data classifications used in different industries. For example, the data type may be a name, a gender, a customer name, a face, an ID, and the like. The data types determined for structured data to be published and for unstructured data to be published may differ, which is not limited in this embodiment of the present invention.
Specifically, the data to be published are classified according to the data basic information included in the metadata information corresponding to the data to be published, and the data to be published belonging to the same classification are classified into the same data type.
S203, performing identification processing on the data, among the data to be issued, whose data type belongs to a preset personal information type.
The identification processing comprises direct identifier identification processing and quasi-identifier identification processing.
In the present embodiment, the preset personal information type may be specifically understood as information that is preset according to actual conditions and can point, directly or indirectly, to a specific natural person. A direct identifier may be specifically understood as an identifier that marks data containing personal information from which a natural person can be located directly. A quasi-identifier may be specifically understood as an identifier that marks data containing personal information from which a natural person cannot be located directly.
Specifically, the data in the data to be released whose data type belongs to the preset personal information type is determined, and it is further determined whether that data type is one from which a natural person can be located directly. If so, a direct identifier is added to the data of the corresponding data type; if not, a quasi-identifier is added to the data of the corresponding data type. For example, data of the direct identifier type may be a name, an identification number, and the like, which directly indicate the identity of a natural person or point to only a small number of natural persons; data of the quasi-identifier type may be data such as age, which does not directly indicate the identity of a natural person or which points to a larger number of subjects.
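The identification step above can be sketched as a simple lookup. The two sets of data types below are hypothetical examples chosen to match the names mentioned in the text; the embodiment leaves the actual preset personal information types to be configured according to actual conditions.

```python
# Hypothetical presets: which data types count as personal information,
# and which of those can directly locate a natural person.
PERSONAL_INFO_TYPES = {"name", "id_number", "age"}
DIRECT_TYPES = {"name", "id_number"}


def tag_identifier(data_type):
    """Return 'direct', 'quasi', or None for a given data type."""
    if data_type not in PERSONAL_INFO_TYPES:
        return None  # not personal information: no identifier is added
    return "direct" if data_type in DIRECT_TYPES else "quasi"


tags = {t: tag_identifier(t) for t in ["name", "id_number", "age", "order_total"]}
```

Here `order_total` stands for any non-personal data type and receives no identifier.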
S204, determining the security level of the data corresponding to each data type in the data to be issued according to each data type and the compliance requirement of the data product.
In the present embodiment, the security level is specifically understood as a level which is defined in advance according to actual conditions or laws and regulations and indicates the importance of data.
Specifically, security levels are defined in advance for the different data types according to the compliance requirements of the data product. After the data types present in the data to be published are determined, the security level defined for each data type is taken as the security level of the corresponding data in the data to be published. Furthermore, after the security levels have been matched by data type, the security level of some data can be adjusted manually. The manually adjusted level must still meet the requirements of laws and regulations, and a security level can only be raised, never lowered.
In the embodiment of the invention, the security levels of the different data types in the data to be issued are determined by combining data type matching with manual adjustment, so that the determined security levels better fit the needs of producing actual data products. Because manual adjustment can only raise a security level and never lower it, the security of processing the data according to its security level is improved.
It should be noted that S203 and S204 have no required order of execution; they may be executed simultaneously or in either order.
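The combination of type matching and raise-only manual adjustment can be sketched as follows. The level table and the function names `match_levels` and `raise_level` are hypothetical; the actual levels would come from the data product compliance requirements.

```python
# Hypothetical table of security levels predefined per data type.
LEVEL_BY_TYPE = {"name": 2, "id_number": 3, "age": 1}


def match_levels(data_types):
    """Assign each data type its predefined security level."""
    return {t: LEVEL_BY_TYPE[t] for t in data_types}


def raise_level(levels, data_type, new_level):
    """Manually adjust one level; lowering a level is refused."""
    if new_level < levels[data_type]:
        raise ValueError("security level may only be increased")
    updated = dict(levels)
    updated[data_type] = new_level
    return updated


levels = match_levels(["name", "age"])
levels = raise_level(levels, "age", 2)  # allowed: 1 -> 2
```

An attempt such as `raise_level(levels, "name", 1)` raises `ValueError` in this sketch, mirroring the rule that a level cannot be decreased.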
S205, carrying out de-identification processing on the data to be issued, and determining static desensitization data.
In the present embodiment, de-identification processing may be specifically understood as a technique that deletes or transforms direct identifiers and quasi-identifiers so that an attacker cannot identify a specific subject from the de-identified personal data. The static desensitization data may be specifically understood as the data to be issued after de-identification of its personal information type data has been completed.
Specifically, the data marked by the direct identifier or the quasi-identifier in the data to be issued is subjected to de-identification processing, and the data to be issued which is subjected to de-identification is determined as the static desensitization data.
Further, fig. 3 is a schematic flowchart of a process for performing de-identification processing on data to be distributed and determining static desensitization data according to a second embodiment of the present invention, and as shown in fig. 3, the process specifically includes the following steps:
S2051, inputting the to-be-processed release data, that is, the data in the data to be issued that carries a direct identifier or a quasi-identifier, into a preset de-identification model.
In this embodiment, the to-be-processed release data may be specifically understood as the data in the data to be issued that has been marked by an identifier. The preset de-identification model may be specifically understood as a preset or pre-trained model that can perform de-identification processing on data marked by different types of identifiers. Optionally, the preset de-identification model may be a k-anonymity model, an l-diversity model, or the like, which is not limited in this embodiment of the present invention.
S2052, determining the de-identification data and the de-identification rating corresponding to the de-identification data according to the output result.
Specifically, the output result of the preset de-identification model contains the de-identification data obtained by de-identifying the input data. While the data is being de-identified, the output de-identification data is also given a de-identification rating according to pre-input data sharing type information, so that the rating indicates whether the de-identification purpose has been achieved for the input data. Optionally, the pre-input data sharing type information may be the data types in the data to be issued that, according to actual conditions, are allowed to be provided to the data demand end to a greater extent. The rating standard for the de-identification rating may be preset according to a national standard or adjusted according to actual conditions, which is not limited in the embodiment of the present invention.
S2053, determining static desensitization data according to the de-identification data, the de-identification rating and the to-be-issued data.
Specifically, whether the de-identification data achieves the de-identification purpose is determined according to the de-identification rating. If it does, the corresponding data in the data to be issued is replaced with the de-identification data, and the data to be issued after replacement is determined as the static desensitization data. If it does not, the de-identification data is adjusted again until the de-identification purpose is achieved; the corresponding data in the data to be issued is then replaced with the de-identification data, and the data to be issued after replacement is determined as the static desensitization data.
Further, the static desensitization data is determined according to the de-identification data, the de-identification rating and the to-be-issued data, and the method can be specifically realized by the following steps:
If the de-identification rating does not meet the preset rating condition, the de-identification data is taken as new to-be-processed release data, the parameters of the preset de-identification model are adjusted, and execution returns to the step of inputting the to-be-processed release data carrying a direct identifier or a quasi-identifier into the preset de-identification model. Otherwise, the to-be-processed release data is replaced with the de-identification data, and the data to be issued after replacement is determined as the static desensitization data.
In this embodiment, the preset rating condition may be specifically understood as a condition preset according to an actual situation and used for evaluating whether the output de-identification data achieves the de-identification purpose.
Specifically, if the de-identification rating does not satisfy the preset rating condition, the de-identification data output by the preset de-identification model may be considered not to have achieved the desired de-identification purpose. To avoid wasting the de-identification work already done, the currently obtained de-identification data is taken as the new to-be-processed release data, and the parameters of the preset de-identification model are adjusted so that its de-identification granularity changes accordingly. The new to-be-processed release data is input into the model with the adjusted parameters, and this is repeated until the de-identification rating output by the model satisfies the preset rating condition. At that point, the corresponding data in the data to be issued is replaced with the de-identification data, and the data to be issued after replacement is determined as the static desensitization data.
In the embodiment of the invention, the de-identification result for the data to be issued is evaluated through the de-identification rating. If the de-identification purpose has not been achieved, de-identification continues on the basis of the previous de-identification result, which avoids wasting data processing resources, reduces the sensitivity of the output static desensitization data, and ensures the compliance of the data.
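The iterative loop of S2051 to S2053 can be sketched with a toy k-anonymity-style generalization over a single quasi-identifier column. This is only an illustration under stated assumptions: the rating is taken to be the smallest group size after generalization, the adjusted parameter is the bucket width, and the preset rating condition is a minimum group size. None of these choices is specified by the embodiment, and the function names are hypothetical.

```python
from collections import Counter


def generalize(ages, width):
    """Replace each age with the lower bound of its width-sized bucket."""
    return [(a // width) * width for a in ages]


def rating(values):
    """Smallest group size, i.e. the k for which the column is k-anonymous."""
    return min(Counter(values).values())


def deidentify(ages, k_required, width=5, max_rounds=10):
    pending = list(ages)  # to-be-processed release data
    for _ in range(max_rounds):
        out = generalize(pending, width)
        if rating(out) >= k_required:  # preset rating condition met
            return out, width
        pending = out  # reuse the previous output as the new input
        width *= 2     # adjust the model parameter: coarser buckets
    raise RuntimeError("rating condition not met within max_rounds")


result, final_width = deidentify([21, 23, 26, 28, 34, 36], k_required=2)
# result == [20, 20, 20, 20, 30, 30], final_width == 10
```

Because each new width is a multiple of the previous one, generalizing the previous output gives the same buckets as generalizing the original data, so the earlier round is not wasted.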
S206, determining the security policy of the static desensitization data according to the security level and the compliance requirement of the data product, and correspondingly configuring the security policy and the static desensitization data to obtain compliance data to be issued.
Specifically, static desensitization only desensitizes the part of the data to be issued that contains personal information type data, so data in the static desensitization data that has a higher security level, that is, data with higher sensitivity, has not been desensitized. To ensure that, after the data product is provided to a data demand end, that demand end cannot access highly sensitive data it is not permitted to use, the security policies required by the different data demand ends are customized individually according to the security levels of the other data and the compliance requirements of the data product. After the security policy is configured for the static desensitization data, the configured static desensitization data is determined as the compliance data to be issued.
Further, determining a security policy for static desensitization data according to the security level and the data product compliance requirements may specifically include the following:
a. A security access level is determined according to the data product compliance requirements, and the security access level is determined as a first security policy for the static desensitization data.
In this embodiment, the security access level may be specifically understood as a security level that a data demand side preset according to a regulation can access data. It should be clear that the security access level is the most basic security policy making requirement, i.e. the data product compliance requirement must include the security access level.
Specifically, the security access level prevents sensitive data of a high security level from being accessed at the most basic layer. A corresponding security access level can therefore be determined for each data demand end according to the data product compliance requirements, and access to the data in the static desensitization data whose security level equals the security access level is determined as the first security policy. That is, a data demand end subject to the first security policy can access only the data in the static desensitization data that is at the security access level, which fundamentally ensures the security of the sensitive data.
b. If the field access control requirement is included in the data product compliance requirements, the security access level and the field access control requirement are determined to be a second security policy for the static desensitized data.
In this embodiment, the field access control requirement may be specifically understood as a requirement for performing access control on a certain field in specific data.
Specifically, when the data product compliance requirements include a field access control requirement, a higher requirement may be considered to have been placed on data compliance. This is a special case built on the first security policy: in addition to satisfying the first security policy, the data providing end considers that access to a certain field of the static desensitization data must also be controlled, so the security access level and the field access control requirement are determined as the second security policy for the static desensitization data. That is, a data demand end subject to the second security policy can access only the data in the static desensitization data that is at the security access level, and within that data, access to the specified field is blocked or permitted as the field access control requirement dictates.
c. If the user field access control requirement is included in the data product compliance requirement, the security access level and the user field access control requirement are determined to be a third security policy for the static desensitization data.
In this embodiment, the user field access control requirement may be specifically understood as a requirement for a specific user to enable the user to perform access control on a field in specific data.
Specifically, when the data product compliance requirements include a user field access control requirement, a higher requirement may be considered to have been placed on data compliance. This is a special case built on the first security policy: in addition to satisfying the first security policy, the data providing end does not consider that access to a certain field of the static desensitization data needs to be controlled in general, but access to that field must still be controlled when the data is provided to a specific user. The security access level and the user field access control requirement are therefore determined as the third security policy for the static desensitization data. That is, a data demand end subject to the third security policy can access only the data in the static desensitization data that is at the security access level, and within that data, access by the specified user to the specified field is blocked or permitted as the user field access control requirement dictates.
d. If the data product compliance requirements include both a field access control requirement and a user field access control requirement, the security access level, the field access control requirement and the user field access control requirement are determined as a fourth security policy for the static desensitization data.
Specifically, when the data product compliance requirements include both a field access control requirement and a user field access control requirement, a higher requirement may be considered to have been placed on data compliance. This is a special case built on the second security policy: in addition to satisfying the second security policy, access to a certain field must still be controlled when the static desensitization data is provided to a specific user. The security access level, the field access control requirement and the user field access control requirement are therefore determined as the fourth security policy for the static desensitization data. That is, a data demand end subject to the fourth security policy can access only the data in the static desensitization data that is at the security access level, is subject to access control on the specified field, and, for a specific user, is additionally subject to access control on the field specified for that user.
For example, assume that the security access levels are 1, 2, and 3; that name, identity card, and gender are data types of security access level 2; that the field access control requirement is to block access to the identity card field; and that the user field access control requirement is to block access to the gender information. Under these assumptions, a data demand end executing the first security policy can access data with security levels 1 to 3 in the static desensitization data; a data demand end executing the second security policy can access data with security levels 1 to 3 in the static desensitization data, but cannot access the identity card information; a data demand end executing the third security policy can access data with security levels 1 to 3 in the static desensitization data, but cannot access the gender information; and a data demand end executing the fourth security policy can access data with security levels 1 to 3 in the static desensitization data, but cannot access the identity card information or the gender information.
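The four policies in this example can be sketched as data plus one filter function. The `Policy` class, its attribute names, and the field names `name`, `id_card`, and `gender` are hypothetical; the sketch models each policy as the set of accessible security levels, the fields blocked for every user, and the fields blocked for the specific requesting user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allowed_levels: frozenset                    # security access levels
    blocked_fields: frozenset = frozenset()      # field access control
    user_blocked_fields: frozenset = frozenset()  # user field access control


def visible_fields(policy, field_levels):
    """Fields a demand end may read under the given policy."""
    return sorted(
        f for f, level in field_levels.items()
        if level in policy.allowed_levels
        and f not in policy.blocked_fields
        and f not in policy.user_blocked_fields
    )


FIELD_LEVELS = {"name": 2, "id_card": 2, "gender": 2}
LEVELS = frozenset({1, 2, 3})

first = Policy(LEVELS)
second = Policy(LEVELS, blocked_fields=frozenset({"id_card"}))
third = Policy(LEVELS, user_blocked_fields=frozenset({"gender"}))
fourth = Policy(LEVELS, frozenset({"id_card"}), frozenset({"gender"}))

# visible_fields(fourth, FIELD_LEVELS) == ["name"]
```

Under these assumptions the first policy exposes all three fields, the second hides `id_card`, the third hides `gender`, and the fourth hides both.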
S207, storing the compliance data to be issued in a database in the data providing end.
S208, determining data communication information according to the storage mode of the compliance data to be issued, and determining version information according to the number of times of modification of the compliance data to be issued.
S209, packaging and determining the metadata information, the security policy, the data communication information and the version information as a data product to be released.
S210, determining at least one data delivery mode of the data product to be issued according to the data communication information.
In this embodiment, the delivery mode may be specifically understood as information describing how the data product is delivered to the data demand end. For example, the delivery mode may include gateway calls, API access, privacy computing, federated learning, and the like, which is not limited in this embodiment of the present invention.
Specifically, at least one feasible data delivery mode of the data product to be issued is determined according to the different storage forms contained in the data communication information. Illustratively, one data delivery mode supports direct access to the encrypted database of the data providing end through an SQL gateway. Another data delivery mode supports data access through an API (application programming interface); when data access through an API is supported, the exposed API path, the access parameters, the corresponding database query statements and the like need to be further configured. The data delivery modes also include privacy computing, federated learning and the like.
S211, according to the property of the target data demand side corresponding to the data product to be issued, checking each data delivery mode, and determining the data delivery mode passing the checking as the consumption mode of the data product to be issued.
In this embodiment, the target data demand side may be specifically understood as a data demand side that can apply for a to-be-issued data product issued to a data market.
Specifically, the data demand end that is expected to apply for the data product to be released after it is released to the data market is determined as the target data demand end. The security requirements of the target data demand end for the data transmission process and the data use process are then determined according to the nature of the target data demand end, the different data delivery modes of the data product to be released are audited against those security requirements, and the data delivery modes that meet the security requirements are determined as the consumption modes of the data product to be released.
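Steps S210 and S211 can be sketched as a lookup followed by a filter. The mapping from storage form to delivery modes and the set of modes permitted for a target data demand end are hypothetical; the embodiment does not specify how the audit is carried out, so a simple membership check stands in for it here.

```python
# Hypothetical mapping from storage form to feasible delivery modes.
MODES_BY_STORAGE = {
    "encrypted_database": ["sql_gateway", "api"],
    "file_store": ["api", "federated_learning"],
}


def delivery_modes(storage_forms):
    """S210: collect feasible delivery modes, keeping first-seen order."""
    modes = []
    for form in storage_forms:
        for mode in MODES_BY_STORAGE.get(form, []):
            if mode not in modes:
                modes.append(mode)
    return modes


def audit(modes, permitted):
    """S211: keep only the modes that pass the audit for the target end."""
    return [m for m in modes if m in permitted]


candidates = delivery_modes(["encrypted_database", "file_store"])
consumption = audit(candidates, permitted={"api", "federated_learning"})
# consumption == ["api", "federated_learning"]
```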
S212, determining the product number of the data product to be released according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and releasing the numbered data product to be released to a data market in a data product release system.
According to this technical scheme, the data to be issued is classified and graded by security level, so that the sensitivity of the data of each data type is determined. The data belonging to a preset personal information type is identified and de-identified to obtain the corresponding static desensitization data. Security policies with different priorities are then configured for the static desensitization data according to the security level and according to the security access level, the field access control requirement and/or the user field access control requirement in the data product compliance requirements, which achieves dynamic desensitization after the data product is issued. At the same time, the consumption mode of the data product to be issued is defined according to different data communication information. As a result, compliance processing during data product development becomes more systematic, the precision with which the data providing end controls how the data demand end uses the data is improved, the compliance of the whole data transaction process is ensured, and the issued data product carries no security risk during use.
Example three
Fig. 4 is a flowchart of a data product publishing method provided by the third embodiment of the present invention. This embodiment is applicable to the situation in which a data market provides a data product to a data demand end after receiving that demand end's request for the data product. The method may be applied to the data market of a data product publishing system, and the data market may be implemented by software and/or hardware. The data market may be deployed in a public network environment isolated from the data providing end and the data demand end; alternatively, a data market publishing end and a data market subscribing end may each be deployed in a public network, in which case the data providing end publishes data products through the data market publishing end, and the data demand end browses and applies for data products by accessing the data market subscribing end. The third embodiment of the present invention is not limited in this respect.
As shown in fig. 4, a method for publishing a data product provided by the third embodiment of the present invention specifically includes the following steps:
S301, after receiving a data product application request from a data demand end of the data product publishing system, checking the data consumption environment of the data demand end.
In this embodiment, the data product application request may be specifically understood as information that a data demand end connected to the data market sends to the data market in order to request a certain data product published in the data market. The data consumption environment may be specifically understood as the configuration that the data demand end provides to support consumption of the data product.
Specifically, each data demand end accessed to the data market can browse a plurality of data products published on the data market, after the required data products are determined, the data demand end sends data product application requests to the corresponding data market aiming at the required data products, and at the moment, the data market carries out data consumption environment inspection on the data demand end according to the configuration condition required by delivery of the data products corresponding to the data product application requests.
S302, if the data consumption environment is consistent with the consumption mode of the data product corresponding to the data product application request, and the data consumption environment comprises an encrypted database and a development tool corresponding to the consumption mode, pushing the security policy of the data product to the encrypted database.
Specifically, when the data consumption environment of the data demand end is consistent with the consumption mode of the data product corresponding to the data product application request, the data product may be considered deliverable to the data demand end for development. Because delivery requires transmitting the compliance data corresponding to the data product to the data demand end for development and use, the data demand end needs to have an encrypted database and a development tool corresponding to the consumption mode; otherwise the data product cannot be delivered to it. When the data demand end has the encrypted database and the development tool corresponding to the consumption mode, the data market pushes the security policy of the data product to the encrypted database of the data demand end, so that the data demand end can store and configure the security policy accordingly. After the connection mode of the encrypted database is subsequently registered in the workspaces of the development tools, each development tool can obtain data from the encrypted database for development in accordance with the security policy.
S303, sending the connection information of the encrypted database and the data communication information of the data product to the data providing end corresponding to the data product, so that the encrypted database receives the compliance data corresponding to the data product.
In the present embodiment, the compliance data may be specifically understood as the actual data that corresponds to the data product, has undergone static desensitization and security policy configuration, and is stored in a database at the data providing end.
Specifically, the data market obtains the connection information of the encrypted database in the data demand end during the check. The data communication information of a data product, in turn, is visible only to the data providing end corresponding to that data product. The data market provides the connection information and the data communication information to the data providing end, so that the data providing end can establish a secure data link between its own database and the encrypted database of the data demand end. The encrypted database of the data demand end then receives the compliance data corresponding to the data product over this secure data link.
Further, after checking the data consumption environment of the data demand side, the method further comprises:
If the consumption mode of the data product corresponding to the data product application request is inconsistent with the data consumption environment, rejecting the data product application request.
Specifically, if the consumption mode of the data product corresponding to the data product application request is inconsistent with the data consumption environment of the data demand end, the data demand end may be considered unable to apply for the data product; even if the application were accepted, the compliance data corresponding to the data product could not be delivered. The data market therefore rejects the data product application request of the data demand end, which reduces the data processing pressure that mistaken applications would place on the data providing end.
Further, after checking the data consumption environment of the data demand side, the method further comprises:
and if the data consumption environment is consistent with the consumption mode of the data product corresponding to the data product application request and does not comprise the encrypted database and the development tool corresponding to the consumption mode, constructing or prompting to construct the data consumption environment at the data demand end.
Specifically, if the data consumption environment in the data demand end is consistent with the consumption mode of the data product corresponding to the data product application request, the data product can in principle be delivered to the data demand end for development. Because delivery transmits the compliance data corresponding to the data product to the data demand end for development and use, the data demand end must also be provided with an encryption database and a development tool corresponding to the consumption mode; otherwise, the data product cannot be delivered. When the data demand end lacks the encryption database and the development tool corresponding to the consumption mode, the data market, depending on the deployment position of the data demand end, either notifies the data product publishing system to construct the encryption database and the development tool in the data demand end, or prompts the data demand end to construct its data consumption environment itself.
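The three outcomes described above (continue delivery, reject the request, or build or prompt to build the environment) can be sketched as a small decision function. This is only an illustration under stated assumptions, not the implementation of the embodiment: the names `ConsumptionEnvironment`, `CheckResult` and `check_environment` are hypothetical, and "consistent with the consumption mode" is modeled simply as membership in a set of supported modes.

```python
from dataclasses import dataclass
from enum import Enum


class CheckResult(Enum):
    PUSH_POLICY = "push_policy"          # environment is ready; delivery can continue
    REJECT = "reject"                    # environment cannot consume this product
    BUILD_OR_PROMPT = "build_or_prompt"  # compatible, but components are missing


@dataclass
class ConsumptionEnvironment:
    # Hypothetical summary of what the data market learns from its check.
    supported_modes: set
    has_encrypted_db: bool = False
    has_dev_tools: bool = False


def check_environment(env: ConsumptionEnvironment, product_mode: str) -> CheckResult:
    """Map the result of the environment check to the market's next action."""
    # Assumption: consistency means the product's mode is one the environment supports.
    if product_mode not in env.supported_modes:
        return CheckResult.REJECT
    if env.has_encrypted_db and env.has_dev_tools:
        return CheckResult.PUSH_POLICY
    return CheckResult.BUILD_OR_PROMPT
```

Checking consistency first means an incompatible request is rejected before any environment is built, which matches the stated goal of sparing the data providing end from incorrect applications.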
Illustratively, when the data demand end and the data providing end are deployed in the same enterprise cloud as two mutually isolated enterprise tenants, the data market is deployed in that enterprise cloud as well. After receiving a data product application request, the data market first detects whether a data sandbox exists at the data demand end that sent the request; if not, it notifies the data product publishing system to establish a sandbox instance in the data demand end. The data market then checks, according to the consumption mode of the data product corresponding to the data product application request, whether the sandbox of the data demand end contains an encryption database and a development tool that conform to the consumption mode. If not, the encryption database and the development tool are established in the sandbox through the data product publishing system and associated with the data sandbox, and the data sandbox automatically registers the connection mode of the encryption database to the working areas of all the development tools. In subsequent data development at the data demand end, the development tools can therefore directly extract and use the compliance data stored in the sandbox according to the security policy configured in the encryption database.
Illustratively, when the data demand end is not deployed in the same enterprise cloud as the data providing end but is deployed independently on a customized all-in-one machine, the data market is divided into a publishing end data market and a subscribing end data market. The publishing end data market is arranged in the same enterprise cloud as the data providing end and receives the data products published by the data providing end. The subscribing end data market is associated with the data demand end and receives the data product application requests of the data demand end. The two markets are connected through an encrypted private line or VPN, over which data products and request information are synchronized. When the data market detects that the data consumption environment of the data demand end does not meet the delivery requirements of the data product, it cannot directly notify the data product publishing system to construct a data consumption environment at the data demand end. Instead, the data market sends prompt information to the data demand end, so that a worker at the data demand end can construct the data consumption environment after receiving the prompt information, after which the subsequent secure delivery of the data is executed.
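The two deployment cases above differ only in who builds the missing environment. The following minimal sketch contrasts them; all names (`Deployment`, `Sandbox`, `prepare_sandbox`, `remediate`) and the placeholder connection strings are assumptions made for illustration and do not come from the source.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Deployment(Enum):
    SAME_CLOUD = "same_cloud"  # demand end is a tenant in the market's enterprise cloud
    ALL_IN_ONE = "all_in_one"  # demand end runs on an independent all-in-one machine


@dataclass
class Sandbox:
    encrypted_db: Optional[str] = None              # connection mode of the encryption database
    dev_tools: list = field(default_factory=list)
    workspaces: dict = field(default_factory=dict)  # tool name -> registered connection


def prepare_sandbox(sandbox: Optional[Sandbox], mode: str) -> Sandbox:
    """Create whatever is missing, then register the database with every tool."""
    if sandbox is None:
        sandbox = Sandbox()
    if sandbox.encrypted_db is None:
        sandbox.encrypted_db = f"encdb://{mode}"  # placeholder connection string
    if not sandbox.dev_tools:
        sandbox.dev_tools.append(f"{mode}-tool")  # placeholder tool name
    for tool in sandbox.dev_tools:
        sandbox.workspaces[tool] = sandbox.encrypted_db
    return sandbox


def remediate(deployment: Deployment, sandbox: Optional[Sandbox], mode: str):
    """Build the environment when the market can reach it; otherwise return a prompt."""
    if deployment is Deployment.SAME_CLOUD:
        return prepare_sandbox(sandbox, mode)
    return f"Please build a data consumption environment for mode '{mode}'."
```

For example, `remediate(Deployment.SAME_CLOUD, None, "sql")` returns a sandbox whose single tool workspace is registered to the new database, while the all-in-one case returns only the prompt text for the demand end's staff.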
According to the technical scheme of this embodiment, after a data product application request of a data demand end is received, the data market checks and configures the data consumption environment of the data demand end and, after the check succeeds, pushes the security policy of the data product to the encryption database of the data demand end. After the security policy configuration of the data demand end is completed, the connection information of the encryption database and the data communication information are both sent to the data providing end to achieve delivery of the compliance data, which ensures the delivery success rate of the data product.
Example four
Fig. 5 is a schematic structural diagram of a data product publishing system according to a fourth embodiment of the present invention, and as shown in fig. 5, the data product publishing system includes: at least one data supplier 41, at least one data consumer 42 and a data market 43, wherein each data supplier 41 and each data consumer 42 are located in isolated domains, and in the embodiment of the present invention, one data supplier 41 and one data consumer 42 are taken as an example.
The data providing terminal 41 is configured to perform data information extraction, data compliance processing, data connectivity information determination, version information determination, and consumption mode determination on the acquired data to be published, generate a data product to be published, which includes a product number, and publish the data product to be published to the data market 43;
the data market 43 is configured to, when receiving a data product application request from the data demand side 42, determine a target data product according to the data product application request, and check a data consumption environment of the data demand side 42 according to a consumption mode corresponding to the target data product;
the data demand end 42 is used for receiving the security policy of the target data product pushed by the data market when the data consumption environment is available, and storing the security policy into the encryption database;
a data market 43, configured to send the connection information of the encrypted database and the data communication information of the target data product to the data providing terminal 41;
the data providing end 41 is used for pushing compliance data corresponding to the target data product to the encrypted database according to the connection information of the encrypted database and the data communication information of the target data product;
the product number comprises metadata information, a security policy, data communication information and version information.
Further, the data market 43 is further configured to reject the data product application request of the data consumer 42 if the data consumption environment is inconsistent with the consumption mode after checking the data consumption environment of the data consumer 42 according to the consumption mode corresponding to the target data product.
Further, the data market 43 is further configured to, after checking the data consumption environment of the data consumer 42 according to the consumption pattern corresponding to the target data product, build or prompt to build the data consumption environment at the data consumer 42 if the data consumption environment is consistent with the consumption pattern and the data consumer 42 does not include the encrypted database and the development tool corresponding to the consumption pattern.
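The division of work among the three roles can be sketched in memory as follows. This is an illustrative sketch only: the class and method names are hypothetical, the secure data link is replaced by a plain dictionary lookup, and the environment check is assumed to have already succeeded.

```python
from dataclasses import dataclass, field


@dataclass
class EncryptedDatabase:
    """Stand-in for the encrypted database at the data demand end."""
    connection_info: str
    security_policy: dict = field(default_factory=dict)
    rows: list = field(default_factory=list)


@dataclass
class DataProduct:
    number: str
    security_policy: dict
    connectivity_info: str  # data communication information, used only by the provider
    consumption_mode: str


class DataProvider:
    def __init__(self, storage: dict, reachable: dict):
        self.storage = storage      # connectivity info -> compliance rows
        self.reachable = reachable  # connection info -> database (stand-in for the secure link)

    def push(self, connection_info: str, connectivity_info: str) -> None:
        # Locate the target through its connection info and copy the compliance data.
        target = self.reachable[connection_info]
        target.rows.extend(self.storage[connectivity_info])


class DataMarket:
    def __init__(self, provider: DataProvider):
        self.provider = provider
        self.catalog = {}

    def publish(self, product: DataProduct) -> None:
        self.catalog[product.number] = product

    def deliver(self, product_number: str, target: EncryptedDatabase) -> None:
        product = self.catalog[product_number]
        # Step 1: the security policy reaches the demand end before any data does.
        target.security_policy = dict(product.security_policy)
        # Step 2: the provider receives both pieces of information and pushes the data.
        self.provider.push(target.connection_info, product.connectivity_info)
```

Note that the market in this sketch never touches the compliance rows themselves; it only relays the policy and the two pieces of addressing information.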
According to the technical scheme of the embodiment of the invention, the basic information of the data to be issued is extracted, and static desensitization and security policy configuration are then performed on the data to be issued according to the extracted basic information and the compliance requirements of the data product acquired in advance. In this way, the configured compliance data to be issued meets the requirements of national laws and regulations, and on that basis a corresponding security policy is dynamically configured for different types of data demand ends. The data communication information determined according to the storage mode of the compliance data to be issued, the version information determined according to the number of modifications, the consumption mode defined according to the data communication information, and other information are then combined to number the data product to be issued, and the numbered data product is issued to the data market.
The issued data product thus completely comprises the parameters for keeping the data compliant over the whole data life cycle. After receiving the data product application request of the data demand end, the data market checks and configures the data consumption environment of the data demand end and, after the check succeeds, pushes the security policy of the data product to the encrypted database of the data demand end. After the security policy configuration of the data demand end is completed, the connection information of the encrypted database and the data communication information are both sent to the data providing end to achieve delivery of the compliance data, which ensures the delivery success rate of the data product. Meanwhile, because the security policy of the issued data product is configured at the data demand end, the compliance data corresponding to the data product remains compliant in subsequent use, so that the whole data transaction process is compliant and the issued data product does not introduce security risks during use.
Fig. 6 is an example structural diagram of a data product publishing system according to the fourth embodiment of the present invention. As shown in fig. 6, the data product publishing system includes a data providing end 51, a publishing end data market 52, and a data demand end 531 that are disposed in the same enterprise cloud, a data demand end 532 that is deployed independently of the enterprise cloud on a customized all-in-one machine, and a subscribing end data market 54 that corresponds to the data demand end 532. The subscribing end data market 54 can be disposed in the data demand end 532 or outside the data demand end 532, which is not limited in the embodiment of the present invention.
The data providing end 51 comprises a development tool 511, a compliance tool 512 and a pushing tool 513. A worker of the data providing end 51 performs development operations on data obtained from a data source through the development tool 511, and performs static desensitization and security policy configuration on the developed data to be published through the compliance tool 512 to obtain and store the compliance data to be published. The corresponding metadata information, security policy, data communication information and version information are determined according to the data to be published and encapsulated to obtain a data product to be published. After the consumption mode of the data product to be published is audited, the product number of the data product to be published is determined according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and the numbered data product is published to the publishing end data market 52. The publishing end data market 52 transmits the data products published therein to the subscribing end data market 54 over an encrypted private line or VPN, so that the data products can be accessed both by the data demand end 531 connected to the publishing end data market 52 and by the data demand end 532 connected to the subscribing end data market 54.
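The text states that the product number is determined from the metadata information, security policy, data communication information, version information and consumption mode, but does not give a numbering scheme. One hypothetical way to obtain a number that changes whenever any of the five inputs changes is to hash a canonical serialization of them. The function name `product_number`, the `DP-` prefix and the use of SHA-256 are all assumptions made for this sketch.

```python
import hashlib
import json


def product_number(metadata: dict, security_policy: dict, connectivity_info: str,
                   version: str, consumption_mode: str) -> str:
    """Derive a deterministic product number from the five inputs named in the text."""
    payload = json.dumps(
        {
            "metadata": metadata,
            "security_policy": security_policy,
            "connectivity_info": connectivity_info,
            "version": version,
            "consumption_mode": consumption_mode,
        },
        sort_keys=True,       # canonical key order, so equal inputs hash equally
        ensure_ascii=False,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"DP-{digest[:16]}"  # hypothetical prefix and truncation length
```

With such a scheme, republishing the same data under a new version yields a new number, so the data market can tell successive versions of a product apart.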
After receiving the data product application request, the publishing terminal data market 52 or the subscribing terminal data market 54 performs data consumption environment check on the corresponding data demand terminal 531 or 532, issues the security policy corresponding to the data product to the encryption database or the secure storage computing environment therein after the check is successful, and simultaneously sends the connection information corresponding to the data demand terminal 531 or 532 and the data communication information of the data product to the pushing tool 513 of the data providing terminal 51, so that the data providing terminal 51 can push the compliance data corresponding to the data product to the corresponding data demand terminal 531 or 532 through the pushing tool 513.
The data product release system provided by the embodiment of the invention can execute the data product release method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
In some embodiments, the data product distribution method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as a storage unit. In some embodiments, part or all of the computer program may be loaded and/or installed onto the data product distribution system via ROM and/or the communication unit. When the computer program is loaded into RAM and executed by a processor, it may perform one or more of the steps of the data product distribution method described above. Alternatively, in other embodiments, the processor may be configured to perform the data product distribution method by any other suitable means (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A data product publishing method is applied to a data providing end of a data product publishing system, and comprises the following steps:
extracting data information of the acquired data to be issued, and determining corresponding metadata information;
performing static desensitization and security policy configuration on the to-be-issued data according to the metadata information and the acquired compliance requirements of the data product, and determining and storing the to-be-issued compliance data;
determining data communication information according to the storage mode of the compliance data to be issued, and determining version information according to the modification times of the compliance data to be issued;
packaging and determining the metadata information, the security policy, the data communication information and the version information as a data product to be released, and defining a consumption mode of the data product to be released according to the data communication information;
and determining the product number of the data product to be released according to the metadata information, the security policy, the data communication information, the version information and the consumption mode, and releasing the numbered data product to be released to a data market in the data product release system.
2. The method according to claim 1, wherein the determining and storing compliance data to be published by performing static desensitization and security policy configuration on the data to be published according to the metadata information and the obtained compliance requirements of the data product comprises:
determining at least one data type of the data to be issued according to the metadata information;
determining the security level of data corresponding to each data type in the data to be issued according to each data type and the compliance requirement of the data product;
carrying out de-identification processing on the data to be issued to determine static desensitization data;
determining a security policy of the static desensitization data according to the security level and the compliance requirement of the data product, and correspondingly configuring the security policy and the static desensitization data to obtain compliance data to be issued;
and storing the compliance data to be issued in a database in the data providing terminal.
3. The method according to claim 2, further comprising, after said determining at least one data type of the data to be published according to the metadata information:
identifying the data to be issued, of which the data type belongs to a preset personal information type;
wherein the identification process comprises a direct identifier identification process and a quasi identifier identification process.
4. The method according to claim 3, wherein the de-identifying the data to be distributed and determining static desensitization data comprises:
inputting the to-be-processed issuing data with the direct identifier or the quasi-identifier in the to-be-issued data into a preset de-identification model;
determining de-identification data and de-identification ratings corresponding to the de-identification data according to the output result;
and determining static desensitization data according to the de-identification data, the de-identification rating and the to-be-issued data.
5. The method of claim 4, wherein determining static desensitization data from the de-identified data, the de-identified rating, and the data to be published comprises:
if the de-identification rating does not meet the preset rating condition, taking the de-identification data as new to-be-processed release data, adjusting parameters in a preset de-identification model, and returning to execute the step of inputting the to-be-processed release data with direct identifiers or quasi identifiers in the to-be-released data into the preset de-identification model;
and otherwise, if the de-identification rating meets the preset rating condition, replacing the to-be-processed release data with the de-identification data, and determining the replaced to-be-released data as static desensitization data.
6. The method of claim 2, wherein determining the security policy for the static desensitization data according to the security level and the data product compliance requirements comprises:
determining a security access level according to the data product compliance requirements, and determining the security access level as a first security policy of the static desensitization data;
if the data product compliance requirements include field access control requirements, determining the security access level and the field access control requirements as a second security policy for the static desensitization data;
if the data product compliance requirements comprise user field access control requirements, determining the security access level and the user field access control requirements as a third security policy of the static desensitization data;
if the data product compliance requirements include field access control requirements and user field access control requirements, determining the security access level, the field access control requirements, and the user field access control requirements as a fourth security policy for the static desensitization data.
7. The method according to claim 1, wherein the defining a consumption pattern of the data product to be released according to the data connectivity information comprises:
determining at least one data delivery mode of the data product to be issued according to the data communication information;
and auditing the data delivery modes according to the property of the data product to be issued corresponding to the target data demand side, and determining the data delivery mode passing the auditing as the consumption mode of the data product to be issued.
8. A data product publishing method, applied to a data market of a data product publishing system, the method comprising:
after receiving a data product application request of a data demand end of the data product issuing system, checking a data consumption environment of the data demand end;
if the data consumption environment is consistent with the consumption mode of the data product corresponding to the data product application request, and the data consumption environment comprises an encryption database and a development tool corresponding to the consumption mode, pushing the security policy of the data product to the encryption database;
and sending the connection information of the encryption database and the data communication information of the data product to a data providing end corresponding to the data product so that the encryption database receives compliance data corresponding to the data product.
9. The method of claim 8, further comprising, after said checking a data consumption environment of said data consumer:
if the data consumption environment is inconsistent with the consumption mode of the data product corresponding to the data product application request, rejecting the data product application request;
and if the data consumption environment is consistent with the consumption mode of the data product corresponding to the data product application request and does not comprise an encryption database and a development tool corresponding to the consumption mode, constructing or prompting to construct the data consumption environment at the data demand end.
10. A data product release system is characterized by comprising at least one data providing end, at least one data demand end and a data market, wherein each data providing end and each data demand end are positioned in mutually isolated domains;
the data providing terminal is used for extracting data information, performing data compliance processing, determining data communication information, determining version information and determining a consumption mode of the acquired data to be issued, generating a data product to be issued containing a product number and issuing the data product to be issued to the data market;
the data market is used for determining a target data product according to the data product application request when receiving the data product application request of the data demand end, and checking the data consumption environment of the data demand end according to the consumption mode corresponding to the target data product;
the data demand end is used for receiving the security policy of the target data product pushed by the data market when the data consumption environment is available, and storing the security policy into an encryption database;
the data market is used for sending the connection information of the encrypted database and the data communication information of the target data product to the data providing end;
the data providing end is used for pushing compliance data corresponding to the target data product to the encrypted database according to the connection information of the encrypted database and the data communication information of the target data product;
the product number comprises the metadata information, the security policy, the data communication information, the version information and the consumption mode.
11. The system of claim 10,
the data market is also used for establishing a sandbox at the data demand end through the data product issuing system when the data demand end does not have the data consumption environment and the data consumption environment in the data demand end is arranged through a data sandbox, and establishing an encryption database corresponding to the consumption mode and at least one development tool in the sandbox;
the data demand side is used for registering the connection mode of the encrypted database into each development tool after the encrypted database is created, so that when data development is carried out through each development tool, the data corresponding to the target data product is read based on the security strategy in the encrypted database.
12. The system of claim 10,
the data market is further used for sending data consumption environment construction prompt information to the data demand end when the data demand end does not have the data consumption environment and the data consumption environment of the data demand end is arranged through the data all-in-one machine, so that the data demand end constructs the data consumption environment according to the data consumption environment construction prompt information.
13. A computer-readable storage medium storing computer instructions for causing a processor to perform the data product distribution method of any one of claims 1-9 when executed.
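As an illustration of the iterative de-identification recited in claims 4 and 5, the loop can be sketched as below. The sketch rests on several assumptions that are not in the claims: the "preset de-identification model" is replaced by a toy masking model (`MaskingModel`), the rating is taken to be the fraction of masked characters, the "preset rating condition" is a minimum threshold, and `max_rounds` is added so the loop always terminates.

```python
class MaskingModel:
    """Toy stand-in for the preset de-identification model."""

    def __init__(self, keep: int = 4):
        self.keep = keep  # number of leading characters left unmasked

    def run(self, values):
        """Return the masked values and a rating (fraction of characters masked)."""
        masked = [v[: self.keep] + "*" * max(len(v) - self.keep, 0) for v in values]
        total = sum(len(v) for v in values) or 1
        hidden = sum(m.count("*") for m in masked)
        return masked, hidden / total

    def adjust(self):
        """Adjust the model parameter so that the next round masks more."""
        self.keep = max(self.keep - 1, 0)


def deidentify(values, model, min_rating: float, max_rounds: int = 10):
    """Re-run the model on its own output until the rating condition is met."""
    pending = list(values)
    for _ in range(max_rounds):
        output, rating = model.run(pending)
        if rating >= min_rating:
            return output      # caller substitutes this for the identified data
        pending = output       # the output becomes the new to-be-processed data
        model.adjust()
    raise RuntimeError("rating condition not met within max_rounds")
```

The returned values would then replace the direct-identifier or quasi-identifier fields in the data to be issued, giving the static desensitization data.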
CN202211664319.5A 2022-12-23 2022-12-23 Data product release method, system and storage medium Active CN115935421B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211664319.5A CN115935421B (en) 2022-12-23 2022-12-23 Data product release method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211664319.5A CN115935421B (en) 2022-12-23 2022-12-23 Data product release method, system and storage medium

Publications (2)

Publication Number Publication Date
CN115935421A true CN115935421A (en) 2023-04-07
CN115935421B CN115935421B (en) 2024-01-30

Family

ID=86700738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211664319.5A Active CN115935421B (en) 2022-12-23 2022-12-23 Data product release method, system and storage medium

Country Status (1)

Country Link
CN (1) CN115935421B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170272472A1 (en) * 2016-03-21 2017-09-21 Vireshwar K. Adhar Method and system for digital privacy management
WO2020133346A1 (en) * 2018-12-29 2020-07-02 Nokia Shanghai Bell Co., Ltd. Data sharing
CN112732811A (en) * 2020-12-31 2021-04-30 广西中科曙光云计算有限公司 Data open platform
CN112818390A (en) * 2021-01-26 2021-05-18 支付宝(杭州)信息技术有限公司 Data information publishing method, device and equipment based on privacy protection
CN114021184A (en) * 2021-10-28 2022-02-08 深圳乐信软件技术有限公司 Data management method and device, electronic equipment and storage medium
CN114077610A (en) * 2020-08-14 2022-02-22 深信服科技股份有限公司 Data publishing method and related device
CN115129716A (en) * 2022-06-27 2022-09-30 浪潮工业互联网股份有限公司 Data management method, equipment and storage medium for industrial big data

Also Published As

Publication number Publication date
CN115935421B (en) 2024-01-30

Similar Documents

Publication Publication Date Title
US11755770B2 (en) Dynamic management of data with context-based processing
CN109964228B (en) Method and system for double anonymization of data
CN110727954B (en) Data authorization desensitization automation method, device and storage medium
US8375427B2 (en) Holistic risk-based identity establishment for eligibility determinations in context of an application
US10846644B2 (en) Cognitive process learning
CN111868727B (en) Method and system for data anonymization
US11416874B1 (en) Compliance management system
AU2016422515A1 (en) Tracing objects across different parties
US11362997B2 (en) Real-time policy rule evaluation with multistage processing
AU2014385227A1 (en) System and methods for location based management of cloud platform data
CN111126948A (en) Processing method and device for approval process
US20230145461A1 (en) Receiving and integrating external data into a graphical user interface of an issue tracking system
CN113989058A (en) Service generation method and device
CN116015840B (en) Data operation auditing method, system, equipment and storage medium
CN116340355A (en) Data query method and device
CN112580065A (en) Data query method and device
US9424543B2 (en) Authenticating a response to a change request
CN115935421B (en) Data product release method, system and storage medium
CN112131257B (en) Data query method and device
CN111414591A (en) Workflow management method and device
US11847143B2 (en) Systems and methods for automated data governance
US20230153457A1 (en) Privacy data management in distributed computing systems
US11741409B1 (en) Compliance management system
KR102235775B1 (en) Personal information processing agency and management method and computer program
CN115906131A (en) Data management method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant