CN115310132A - Data identity identification and data fragmentation method and device

Info

Publication number: CN115310132A
Application number: CN202211027229.5A
Authority: CN
Inventors: 龚䶮; 张微; 陈晓; 王志轩; 李志男
Original assignee: Huajiao Lianchuang Xiamen Technology Co ltd; Beijing Huayixin Technology Co ltd
Current assignee: Beijing Huayixin Technology Co ltd; Lin Shaowei
Priority date: 2022-08-25
Filing date: 2022-08-25
Publication date: 2022-11-08
Anticipated expiration: 2042-08-25
Also published as: CN115310132B

Abstract

The invention discloses a data identity identification and data fragmentation method and device. The core of the method is cutting off the association between data owner identity data and the data structure: the association between the data owner identity data and the data structure is fragmented, the data owner identity private key and the data value associated with the association fragments are fragmented, and the association tables of the data owner identity private key fragments, the data value fragments, and the association fragments are placed in offline cold storage. This isolates the data, the data owner, and other third parties from one another and ensures the security and privacy of the data and the data owner.

Description

Data identity identification and data fragmentation method and device

Technical Field

The invention relates to the technical field of data security and user privacy security, in particular to a data identity identification and data fragmentation method and device.

Background

Data security has two main dimensions: one is that data cannot be tampered with; the other is the privacy security of the data. Blockchain inherently solves the tamper-resistance problem but cannot solve the privacy security problem, so it still needs to be combined with cryptographic methods such as asymmetric encryption and secure random numbers. The combination of blockchain and cryptography can help address problems such as data information asymmetry, single data dimension, low-cost acquisition of effective data, and data security and privacy protection; it enables authority management over data acquisition and provides effective privacy protection and data rights confirmation for the data of users in various roles. The blockchain + cryptography approach is therefore the most common and has achieved some success, but it still has limitations.

First, although most approaches employ cryptography, experts still point out that all cryptography-based schemes introduce considerable computational complexity, which is difficult to eliminate by adding hardware, and that in the long run a cryptographic method is always less efficient than a scheme that is not based on cryptography. Many current methods try to minimize decryption time during data transmission, data storage, and data encryption, but they essentially require the user to send data to a third party, which contradicts the data security and privacy requirements. The higher the security level of a cryptographic method, the slower it runs and the more computation it requires.

Secondly, the application of emerging technologies increasingly blurs the boundary of "personal data". The internet has gathered massive amounts of user data, but given its trust crisis it cannot take on the task of rebuilding trust as a data carrier. The difficulty of establishing data trust reflects the data privacy security problem: once data is uploaded, a big data platform can trade it at will, so the data owners' share of the benefit from the transaction cannot be guaranteed. After encryption, the data must be processed as a whole. However, the source of data handled as a whole may still be guessed, so the security and privacy of the data and the data owner cannot be fully guaranteed. The blockchain + cryptography approach therefore also leaves data owners and big data platforms unwilling to share data, which causes large numbers of data islands to emerge.

Thirdly, with the application of new technologies such as big data, blockchain, autonomous driving, facial recognition, wearable devices, smart homes, medical monitoring equipment, and behavioral biometric data, new fields of data-driven innovation keep developing and the scale of data keeps growing. In addition, scenarios such as the evaluation of important projects or key scientific and technological achievements, industrial internet platforms, enterprise core data processing, industrial data circulation, and cross-border data flow keep raising new requirements for strengthening data protection and releasing data value. Furthermore, the new legal, ethical, and governmental requirements arising in data fields that involve personal privacy, including government governance, medical treatment, and credit investigation, have partly exceeded the coverage of traditional information protection mechanisms and at the same time pose new challenges to the availability of data for circulation. However, the blockchain + cryptography approach does not favor data sharing together with security and privacy, so large-scale data value mining is hard to guarantee, data value is hard to release fully, and data circulation based on data value is hard to achieve.

To solve these problems, the invention provides a data identity identification and data fragmentation method and device. The core of the method is cutting off the association between data owner identity data and the data structure: the association between the data owner identity data and the data structure is fragmented, the data owner identity private key and the data value associated with the association fragments are fragmented, and the association tables of the data owner identity private key fragments, the data value fragments, and the association fragments are placed in offline cold storage, so as to isolate the data, the data owner, and other third parties from one another and ensure the security and privacy of the data and the data owner. The data fragmentation and data owner identity private key fragmentation in this method differ from existing data fragmentation and user key fragmentation. Traditional data fragmentation aims to secure the data and its storage and to prevent data leakage; traditional user key fragmentation aims to protect the user password and its backup and to prevent the user password from being cracked. In the invention, data fragmentation and data owner identity private key fragmentation mainly aim to isolate the data from the data owner, preventing a third party from directly looking up or indirectly inferring the data owner through the data. The method centers on fragmenting the association between the data structure and the data owner identity data, and then fragments the data value and the data owner identity private key to cut off the association between the data and the data owner.

Disclosure of Invention

The invention aims to provide a data identity identification and data fragmentation method and device that fundamentally solve the security and privacy problem of data and data owners, in particular the problem of a third party directly looking up or indirectly inferring the data owner through the data. This in turn addresses the unwillingness of data owners and big data platforms to share data and the resulting emergence of large numbers of data islands. On this basis, large-scale data value mining can be carried out, data value can be fully released, and data circulation based on data value can be realized.

The technical scheme adopted by the invention is as follows. The data identity identification and data fragmentation method provided by the invention takes cutting off the association between data owner identity data and the data structure as its core: it fragments the association between the data owner identity data and the data structure, fragments the data owner identity private key and the data value associated with the association fragments, and places the association tables of the data owner identity private key fragments, the data value fragments, and the association fragments in offline cold storage, thereby isolating the data, the data owner, and other third parties from one another and ensuring the security and privacy of the data and the data owner.

In order to achieve the above object, a first aspect of the embodiments of the present invention discloses a technical solution:

A data identity identification and data fragmentation method specifically comprises the following steps:

S1, fragmenting the association between the data structure and the data owner identity data;

S2, fragmenting the data owner identity private key associated with the association fragments, according to the association fragmentation result of the data structure and the data owner identity data;

S3, randomly associating the data owner identity private key fragments with the data owner identity data fragments, and generating an association table A from the random association result;

S4, fragmenting the data values associated with the association fragments, according to the association fragmentation result of the data structure and the data owner identity data;

S5, randomly associating the data value fragments with the data structure fragments, and generating an association table B from the random association result;

S6, storing the association table A and the association table B on a storage medium in offline cold storage, wherein the tables are brought online only for verification and updating when the identity of a data owner needs to be locked, and online downloading is not allowed.
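
The random association of steps S3 and S5 can be illustrated with a minimal Python sketch. The function name `random_association_table`, the fragment labels, and the rule that every target receives at least one fragment are assumptions made for illustration only; the text states just that the association is random and that its result forms the table.

```python
import random


def random_association_table(fragment_ids, target_ids, rng=None):
    """Randomly map each fragment to one target (hypothetical helper).

    fragment_ids: labels of private key fragments (table A) or of data
                  value fragments (table B); labels must be unique.
    target_ids:   labels of identity data fragments (table A) or of data
                  structure fragments (table B).
    """
    # Assumption: an OS-backed random source is used unless a seeded
    # generator is passed in for reproducibility.
    rng = rng or random.SystemRandom()
    fragment_ids = list(fragment_ids)
    target_ids = list(target_ids)
    if not target_ids or len(fragment_ids) < len(target_ids):
        raise ValueError("need at least as many fragments as targets")

    shuffled = fragment_ids[:]
    rng.shuffle(shuffled)
    table = {}
    for i, frag in enumerate(shuffled):
        # The first len(target_ids) fragments cover every target once;
        # the remaining fragments go to targets picked at random.
        target = target_ids[i] if i < len(target_ids) else rng.choice(target_ids)
        table[frag] = target
    return table


# Illustrative labels only: five key fragments, three identity data fragments.
table_a = random_association_table(
    ["ps0", "ps1", "ps2", "ps3", "ps4"], ["id0", "id1", "id2"], random.Random(7)
)
```

In this sketch the resulting dictionary plays the role of association table A; calling the same helper with data value fragments and data structure fragments would give table B. How the tables are then moved to offline cold storage is outside the sketch.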

Preferably, the step S1 specifically includes:

S1.1, establishing a preliminary association set of the data structure set and the data owner identity data set according to the data uploading behavior of the data owner;

S1.2, extracting the association elements of the data structure and the data owner identity data in the preliminary association set, and further constructing an association set of the data structure and the data owner identity data;

S1.3, setting an allowable range for the association degree value of the data structure and the data owner identity data;

S1.4, mining the association degree values of the data structure and the data owner identity data in the association set;

S1.5, for each association degree value exceeding the allowable range, fragmenting the association between the data structure and the data owner identity data, and establishing the corresponding association set of the data structure and the data owner identity data that needs fragmentation;

S1.6, fragmenting that association set.

Preferably, the step S2 specifically includes:

S2.1, determining the fragmentation number of the data owner identity data in the association fragments according to the association fragmentation result of the data structure and the data owner identity data;

S2.2, determining the fragmentation number of the data owner identity private key, which is required to be larger than the fragmentation number of the data owner identity data;

S2.3, setting a lower limit on the length of a data owner identity private key fragment;

S2.4, when the product of the fragmentation number of the data owner identity private key and the lower limit on the fragment length exceeds the length of the data owner identity private key, determining the missing length of the data owner identity private key and filling the missing length with a space private key;

S2.5, fragmenting the data owner identity private key according to the fragmentation number and the fragment length of the data owner identity private key.
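
Steps S2.3 to S2.5 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `fragment_private_key`, the choice of an ASCII space byte as the "space private key", appending the padding at the end, and the near-equal split are all assumptions.

```python
def fragment_private_key(key: bytes, num_fragments: int, min_len: int = 3) -> list[bytes]:
    """Split `key` into `num_fragments` pieces of at least `min_len` bytes.

    Hypothetical helper. If the key is too short (S2.4), it is padded with
    space bytes; appending the padding at the end is an assumption.
    """
    if num_fragments < 1 or min_len < 1:
        raise ValueError("num_fragments and min_len must be positive")

    required = num_fragments * min_len
    if required > len(key):
        # S2.4: fill the missing length with a "space private key".
        key = key + b" " * (required - len(key))

    # S2.5: near-equal split; the first `extra` fragments get one more byte.
    base, extra = divmod(len(key), num_fragments)
    fragments = []
    start = 0
    for i in range(num_fragments):
        size = base + (1 if i < extra else 0)
        fragments.append(key[start:start + size])
        start += size
    return fragments


# A 10-byte toy key split into 4 fragments needs 12 bytes, so 2 are padded.
padded_split = fragment_private_key(b"0123456789", num_fragments=4)
# The same key split into 3 fragments needs no padding.
plain_split = fragment_private_key(b"0123456789", num_fragments=3)
```

A random division, which the text also allows, would replace the near-equal split with randomly chosen cut points that still respect the lower limit on fragment length.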

Preferably, the step S4 specifically includes:

S4.1, determining the fragmentation number of the data structure in the association fragments according to the association fragmentation result of the data structure and the data owner identity data;

S4.2, determining the fragmentation number of the data value, which is required to be larger than the fragmentation number of the data structure;

S4.3, setting a lower limit on the length of the data value, which is required to be greater than 2 minimum data units;

S4.4, when the product of the fragmentation number of the data value and the lower limit on the length of the data value exceeds the length of the data value, determining the missing length of the data value and filling the missing length with space padding;

S4.5, fragmenting the data value according to the fragmentation number and the length of the data value.

Preferably, in step S2.5, the data owner identity private key associated with the association fragments is divided by a random method or an average field method.

Preferably, in step S4.5, the data values are marked, fragmentation is performed on the data values according to each marking result, and the fragmentation results corresponding to the marks are intersected to obtain the final data value fragmentation result.

Preferably, the data values are marked along three data value dimensions, namely time, space, and path, wherein:

the data value time dimension mark comprises the times associated with the data owner's uploading of the data value, the data structure fragmentation result, data storage, data processing, data transaction, and the data owner;

the data value space dimension mark comprises the data structure and the data value;

the data value path dimension mark comprises the credibility grade, the data processing path, and the data transaction path.
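
One way to read the "intersection" of the per-mark fragmentation results is as a common refinement: a cut made under any one mark is kept in the final result. The Python sketch below shows this reading under stated assumptions. The function name `intersect_fragmentations`, the representation of each fragmentation as a set of cut offsets, and the toy offsets are all hypothetical; the text does not specify how the intersection is computed.

```python
def intersect_fragmentations(length, *cut_sets):
    """Return (start, end) spans of the common refinement of several splits.

    length:   length of the data value in minimum data units.
    cut_sets: one collection of cut offsets per mark (time, space, path).
    Offsets outside the open interval (0, length) are ignored.
    """
    cuts = sorted({c for cs in cut_sets for c in cs if 0 < c < length})
    bounds = [0, *cuts, length]
    return list(zip(bounds[:-1], bounds[1:]))


# Toy example: a 10-unit data value with cuts proposed under each mark.
time_cuts, space_cuts, path_cuts = {4}, {4, 7}, {2}
spans = intersect_fragmentations(10, time_cuts, space_cuts, path_cuts)

value = b"ABCDEFGHIJ"
pieces = [value[a:b] for a, b in spans]
```

Each resulting piece lies entirely inside one fragment of every per-mark fragmentation, which is the property this interpretation of "intersected" aims for.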

The second aspect of the invention discloses a data identity and data fragmentation device for isolating data from a data owner, which comprises:

a memory storing executable program code;

a processor coupled with the memory;

the processor calls the executable program code stored in the memory to execute any one of the data identity identification and data fragmentation methods disclosed in the first aspect of the embodiments of the present invention.

A third aspect of the present invention discloses a computer storage medium that stores computer instructions which, when called, execute any one of the data identity identification and data fragmentation methods disclosed in the first aspect of the embodiments of the present invention.

Compared with the prior art, the invention has the beneficial effects that:

(1) In ensuring the security and privacy of data owners and data, the data identity identification and data fragmentation method provided by the invention is superior to the traditional approach of treating user data as a whole and protecting it with cryptography. The key to the method is that, with the isolation of the data from the data owner as the main goal, it carries out association fragmentation between the data structure and the data owner identity data, fragmentation of the data owner identity private key, and fragmentation of the data value. The data and the identity of the data owner are fragmented before entering the platform or the data pool, so the data is isolated from the data owner. This helps the big data platform support anonymous data uploading by data owners: the platform knows the data but not the user, the buyer does not know the data owner, and no third party can directly look up or indirectly infer the data owner from the data, so the privacy of the data owner is not learned by the platform or the buyer.

(2) The data identity identification and data fragmentation method provided by the invention ensures that a data owner's data is not traded directly but is put into a very large data pool for processing, so that a data processor cannot know the specific source of any data. This provides data processors with a more normalized and standardized space for free data processing, encourages them to process data from the perspective of data value, and gives the big data platform the basic data processing conditions for fully mining and releasing data value.

(3) On the basis of ensuring the security and privacy of data owners and data and facilitating data value mining, the data identity identification and data fragmentation method provided by the invention makes data owners more willing to upload data, helps the big data platform it serves gather more user data, makes data processors and big data platforms more willing to carry out data processing and service work, and can also support secure data sharing between one big data platform and others, thereby breaking down the data islands between big data platforms and between data owners.

Drawings

FIG. 1 is a schematic diagram of a data identity and data fragmentation method;

FIG. 2 is a flow diagram of a method for associated fragmentation of data structures and data owner identity data;

FIG. 3 is a flow diagram of a data identity fragmentation method for data isolation from data owners;

FIG. 4 is a flow diagram of a data value fragmentation method for data isolation from data owners;

FIG. 5 is a flow chart of a data value fragmentation method that follows the data value fragmentation machinable processing standard.

Detailed Description

The data identity and data fragmentation methods provided by the present invention are further described in detail below with reference to the accompanying drawings and specific embodiments.

The method is suitable for fields of data application, research, or transaction that have certain requirements on data security and user privacy security. The data in the invention is uploaded by a data owner and includes the various kinds of data generally recognized by society, such as digital products, data assets, and data elements. As a data fragmentation method, the method requires the quantity of data to be at least 2 basic quantity units.

For example, assume a big data platform on which a data owner is labeled DOn(PS-ID), n = 1, 2, ..., where n is the sequence number of the data owner on the big data platform, PS is the identity private key of the data owner on the big data platform, and ID is the identity data set of the data owner.

The data on the big data platform is labeled ADm(DS-DV), m = 1, 2, ..., where m is the sequence number of the data on the big data platform, DS is the data structure, and DV is the data value.
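
The two labels can be pictured as simple records. The Python sketch below is only an illustration of the notation: the class names `DataOwner` and `Data`, the field types, and all sample values are hypothetical and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataOwner:
    """DOn(PS-ID): data owner number n with private key PS and identity data ID."""
    n: int
    ps: bytes            # identity private key PS (toy value below)
    identity: frozenset  # identity data set ID, modeled as attribute names


@dataclass(frozen=True)
class Data:
    """ADm(DS-DV): data item number m with data structure DS and data value DV."""
    m: int
    ds: tuple   # data structure DS, modeled as field names
    dv: tuple   # data value DV, one entry per field in DS


# Illustrative instances only.
owner_1 = DataOwner(n=1, ps=b"toy-private-key", identity=frozenset({"name", "phone"}))
data_1 = Data(m=1, ds=("timestamp", "heart_rate"), dv=("2022-08-25T10:00", 72))
```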

On big data platforms that do not employ the present technology, the association between each data owner and its data is explicit: the platform can determine the owner of any data from the association between the data in the database and the data owner. To protect the security and privacy of the data owner, a big data platform usually chooses to hide the identity data of the data owner and to associate only the data owner's identity private key with the data, namely BR(DOn(PS), ADm(DS-DV)). Inevitably, this way of associating the data owner with the data allows third parties familiar with the data (data processors, data users, big data platforms, and data transaction service parties) to infer the data owner identity data DOn(ID) from the data ADm(DS-DV) and then work out the data owner DOn(PS-ID), so that the data and the data owner are completely exposed and their security and privacy are destroyed. This potential risk in turn makes data owners and big data platforms unwilling to share data in fields of data application, research, or transaction that have certain requirements on data security and user privacy security, so large numbers of data islands emerge. As a result, large-scale data circulation based on data value cannot be realized in these fields, large-scale data value mining cannot be carried out, and data value cannot be fully released.

The present technology includes a data identity fragmentation method for isolating data from the data owner and a data fragmentation method for isolating data from the data owner. The principle is shown in FIG. 1. The data identity fragmentation method for isolating data from the data owner refers to a data owner identity fragmentation method based on the association fragmentation of the data structure and the data owner identity data. The data fragmentation method for isolating data from the data owner refers to a data value fragmentation method, based on the association fragmentation of the data structure and the data owner identity data, whose result still satisfies the data value fragmentation machinable processing standard after the data value is fragmented. This data fragmentation method not only divides the data to avoid disclosure and protect data privacy and security, but also keeps the data processable, which serves the subsequent mining of data value.

The core of the method of the present technology is the association fragmentation of the data structure and the data owner identity data. This means that the association BR(DOn(ID), ADm(DS)) between the data structure ADm(DS) and the data owner identity data DOn(ID) is fragmented so that, after fragmentation, a third party (data processor, data user, big data platform, or data transaction service party) cannot use the data structure ADm(DS) and the association fragments BRs(DOn(ID), ADm(DS)) (association fragment number s = 1, 2, ..., n) to infer the association BR(DOn, ADm(DS)) between the data structure ADm(DS) and the data owner DOn, or the data owner identity data DOn(ID).

1) The basic flow of the associated fragmentation of the data structure ADm (DS) and the data owner identity data DOn (ID) is shown in fig. 2, and the specific method is as follows:

1-1) set up a data structure set { ADm (DS) } and a data owner identity data set { DOn (ID) }.

The data structure set {ADm(DS)} is established from the data structures ADm(DS): the data sequence number m numbers the data in the set, and the data structure DS is the element of the set. Each element of the data structure set is labeled ADm(DS).

The data owner identity data set {DOn(ID)} is established from the data owner identity data DOn(ID): the data owner identity sequence number n numbers the data owner identities in the set, and the identity data set ID of the data owner is the element of the set. Each element of the data owner identity data set is labeled DOn(ID).

1-2) mining the relevance of a data structure set { ADm (DS) } and a data owner identity data set { DOn (ID) }, wherein the specific method comprises the following steps:

1-2-1) establishing a data structure set and a preliminary association set of a data owner identity data set according to data uploading behaviors of the data owner.

The correspondence m-n between the data owner identity sequence number n and the data sequence number m is determined from the data uploading behavior of the data owner. This gives the basic correspondence between an element ADm(DS) of the data structure set {ADm(DS)} and an element DOn(ID) of the data owner identity data set {DOn(ID)}, and these correspondences are combined into the preliminary association set {BR(ADm(DS), DOn(ID), m-n)} of the data structure set and the data owner identity data set.
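
As a minimal sketch of this step, the preliminary association set can be built from an upload log in which each entry records that owner n uploaded data item m. The function name `preliminary_association_set`, the dictionary layout, and the sample field names are assumptions for illustration.

```python
def preliminary_association_set(uploads, structures, identities):
    """Build {BR(ADm(DS), DOn(ID), m-n)} from upload behavior.

    uploads:    iterable of (n, m) pairs, "owner n uploaded data m".
    structures: dict m -> data structure DS of data item m.
    identities: dict n -> identity data set ID of owner n.
    Returns a list of (DS, ID, (m, n)) tuples, one per upload.
    """
    return [(structures[m], identities[n], (m, n)) for n, m in uploads]


# Illustrative data only.
structures = {1: ("timestamp", "heart_rate"), 2: ("timestamp", "gps")}
identities = {1: frozenset({"name", "phone"}), 2: frozenset({"name", "email"})}
uploads = [(1, 1), (2, 2), (1, 2)]

preliminary = preliminary_association_set(uploads, structures, identities)
```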

1-2-2) extracting the associated elements of the data structure and the data owner identity data in the preliminary associated set of the data structure set and the data owner identity data set, and further constructing the preliminary associated set of the data structure and the data owner identity data.

On the basis of the preliminary association set {BR(adm(DS), don(ID), m-n)} of the data structure set and the data owner identity data set, extract the association elements BR(adm(DS), don(ID), m-n) of the data structure and the data owner identity data. According to the preliminary association m-n in each association element, match each data structure adm(DS) with the data owner identity data don(ID): screen out the data adm-n(DS) in the data structure adm(DS) that conforms to the preliminary association m-n, screen out the data don-m(ID) in the data owner identity data don(ID) that conforms to the preliminary association m-n, and integrate adm-n(DS) and don-m(ID) into a preliminary association element BR(adm-n(DS), don-m(ID)). These elements form the preliminary association set {BR(adm-n(DS), don-m(ID))} of the data structure adm(DS) and the data owner identity data don(ID).

1-2-3) mining association degree values of the data structure and the data owner identity data according to the preliminary association set of the data structure and the data owner identity data.

For the preliminary association set {BR(adm-n(DS), don-m(ID))} of the data structure and the data owner identity data, a data mining method is used to mine the association degree values of the data structure and the data owner identity data. The mining method is as follows: arbitrarily combine the data in the data structure adm-n(DS) that conforms to the preliminary association m-n to obtain ac(adm-n(DS)), arbitrarily combine the data in the data owner identity data don-m(ID) that conforms to the preliminary association m-n to obtain ac(don-m(ID)), and then mine and calculate the association degree value VR(ac(adm-n(DS)), ac(don-m(ID))) between ac(adm-n(DS)) and ac(don-m(ID)) using methods such as clustering, grey relational analysis, and the Apriori algorithm.
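
The text leaves the choice of mining method open. As one hedged illustration, the Python sketch below uses the confidence of an association rule, the measure used in Apriori-style rule mining, as a stand-in for the association degree value VR. The function name `association_degree`, the record layout, and the sample items are hypothetical; clustering or grey relational analysis would yield different values.

```python
def association_degree(records, ds_combo, id_combo):
    """Confidence of the rule ds_combo -> id_combo over upload records.

    records:  list of (ds_items, id_items) pairs of sets, one per upload.
    ds_combo: a combination of data structure items, i.e. ac(adm-n(DS)).
    id_combo: a combination of identity data items, i.e. ac(don-m(ID)).
    Returns a value in [0, 1]; 0.0 if ds_combo never occurs.
    """
    ds_combo, id_combo = frozenset(ds_combo), frozenset(id_combo)
    with_ds = [r for r in records if ds_combo <= r[0]]
    if not with_ds:
        return 0.0
    both = sum(1 for r in with_ds if id_combo <= r[1])
    return both / len(with_ds)


# Illustrative records only.
records = [
    (frozenset({"gps", "heart_rate"}), frozenset({"city=Xiamen"})),
    (frozenset({"gps"}), frozenset({"city=Xiamen"})),
    (frozenset({"gps", "heart_rate"}), frozenset({"city=Beijing"})),
    (frozenset({"heart_rate"}), frozenset({"city=Beijing"})),
]

vr_gps = association_degree(records, {"gps"}, {"city=Xiamen"})
vr_both = association_degree(records, {"gps", "heart_rate"}, {"city=Xiamen"})
```

A high value means the identity item can often be guessed from the data structure combination, which is the kind of inference the method seeks to prevent.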

1-2-4) determining the association of data structures that need to be fragmented and data owner identity data.

The higher the association degree value between ac(adm-n(DS)) and ac(don-m(ID)), the stronger the association between the two, the easier it is to infer one from the other, and the higher the risk to data security and data owner privacy. Therefore, the distribution of the calculated association degree values VR(ac(adm-n(DS)), ac(don-m(ID))) is examined and an allowable range is set: when an association degree value falls below PA% of the probability distribution, the association between ac(adm-n(DS)) and ac(don-m(ID)) is considered to meet the data security and data owner privacy requirements, and the association relation m-n(ac(adm-n(DS)), ac(don-m(ID))) does not need to be fragmented; otherwise, it needs to be fragmented.
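
The decision rule can be sketched as follows, reading "below PA% of the probability distribution" as "at or below the PA-th percentile of the observed association degree values". That reading, the nearest-rank percentile, and the function name `select_for_fragmentation` are assumptions; the text does not fix how PA is chosen or how the distribution is estimated.

```python
import math


def select_for_fragmentation(degrees, pa_percent):
    """Flag association relations whose degree exceeds the allowable range.

    degrees:    dict mapping an association relation to its degree value VR.
    pa_percent: PA, in the range (0, 100].
    Returns a dict relation -> True if the relation needs fragmentation.
    """
    if not 0 < pa_percent <= 100:
        raise ValueError("pa_percent must be in (0, 100]")
    if not degrees:
        return {}
    ordered = sorted(degrees.values())
    # Nearest-rank percentile: the smallest value with at least PA% of
    # the observations at or below it.
    rank = max(1, math.ceil(pa_percent / 100 * len(ordered)))
    threshold = ordered[rank - 1]
    return {rel: value > threshold for rel, value in degrees.items()}


# Illustrative values only.
flags = select_for_fragmentation({"a": 0.1, "b": 0.2, "c": 0.5, "d": 0.9}, pa_percent=50)
```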

1-2-5) establishing a correlation set of the data structure and the identity data of the data owner, which needs fragmentation, by taking the correlation relationship between the data structure which needs fragmentation and the identity data of the data owner as a center.

According to step 1-2-4), establish the association set {l-k(ac(adl-k(DS)), ac(dok-l(ID))), ac(adl-k(DS)), ac(dok-l(ID))} of the data structure and the data owner identity data, centered on the association relation l-k(ac(adl-k(DS)), ac(dok-l(ID))) that needs fragmentation. Here the association l-k belongs to the association m-n, and the association k-l belongs to the association n-m.

1-2-6) further mining the association set of the data structure and the data owner identity data which needs fragmentation to complete association fragmentation.

Relabel the association set of the data structure and the data owner identity data that needs fragmentation as {l-k^{j-h}(adl-k^{j-h}(DS), dok-l^{h-j}(ID)), adl-k^{j-h}(DS), dok-l^{h-j}(ID)}, where j-h is the association of adl-k(DS) to dok-l(ID), and h-j is the association of dok-l(ID) to adl-k(DS). Calculate the association degree value VR(adl-k^{j-h}(DS), dok-l^{h-j}(ID)) of the association set according to step 1-2-3).

Collect all association sets of data structures and data owner identity data that need fragmentation, compare the data structures, the data owner identity data, and the association degree values they contain, and use a matrix equation to calculate the association degree values between the data structures and the data owner identity data contained in the different association sets. These association degree values determine how association fragmentation should be carried out.

The specific method is as follows. The fragmentation alternatives for an association set {l-k^{j-h}(adl-k^{j-h}(DS), dok-l^{h-j}(ID))} are the arbitrary partitions of its data structure set {adl-k^{j-h}(DS)} and of its data owner identity data set {dok-l^{h-j}(ID)}. The association set {l-k^{j-h}(adl-k^{j-h}(DS), dok-l^{h-j}(ID))} has the association degree value VR(adl-k^{j-h}(DS), dok-l^{h-j}(ID)). According to step 1-2-3), calculate the series of association degree values for the various partitions of adl-k^{j-h}(DS) and of the data owner identity data set {dok-l^{h-j}(ID)}. Then compare each partitioning scheme of adl-k^{j-h}(DS) and dok-l^{h-j}(ID), together with its series of association degree values, against the unpartitioned adl-k^{j-h}(DS) and dok-l^{h-j}(ID) and their association degree value VR(adl-k^{j-h}(DS), dok-l^{h-j}(ID)).

Among the fragmentation alternatives, choose the scheme in which the maximum, minimum, and average of the association degree values between the data structure sets of the parts formed by dividing adl-k^{j-h}(DS) and the data owner identity data sets of the parts formed by dividing dok-l^{h-j}(ID) are all lower than PA%, and in which these three values are the lowest compared with the other partitioning schemes.
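
The selection rule can be illustrated with a short Python sketch. It assumes that the association degree values between the parts of each candidate scheme have already been computed, treats PA as a direct threshold on those values, and breaks ties by comparing (maximum, average, minimum) in that order. The tie-breaking order and the function name `choose_partition_scheme` are assumptions, since the three values need not be lowest for the same scheme.

```python
from statistics import mean


def choose_partition_scheme(schemes, pa):
    """Pick the partitioning scheme with the lowest association degrees.

    schemes: dict scheme name -> list of association degree values between
             the data structure parts and the identity data parts.
    pa:      allowable upper bound on the association degree value.
    Returns the chosen scheme name, or None if no scheme is admissible.
    """
    admissible = {
        name: (max(values), mean(values), min(values))
        for name, values in schemes.items()
        # If the maximum is below pa, the average and minimum are too.
        if values and max(values) < pa
    }
    if not admissible:
        return None
    return min(admissible, key=admissible.get)


# Illustrative values only.
candidates = {
    "scheme_A": [0.30, 0.10, 0.20],
    "scheme_B": [0.15, 0.10, 0.12],
    "scheme_C": [0.05, 0.60],
}
best = choose_partition_scheme(candidates, pa=0.25)
```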

After all the association sets that need fragmentation have been fragmented, collect all the association fragmentation results and select, as the final scheme, the association fragmentation scheme whose parts have the smallest data structure sets and data owner identity data sets. The association fragmentation result is labeled as the association fragment set {BRs(DOn(ID), ADm(DS)) (association fragment number s = 1, 2, ..., n)}.

2) A data identity fragmentation method for data isolation from data owners.

According to the association fragmentation result, the association fragments in the association fragment set {BRs(DOn(ID), ADm(DS)) (association fragment number s = 1, 2, ..., n)} cut off the association between the data structure and the data owner identity data. In essence, however, the data owner identity data DOn(ID) in the association fragments is still strongly associated with the data owner identity private key DOn(PS) on the big data platform. To prevent third parties (data processors, data users, big data platforms, and data transaction service parties) from inferring the data owner through the association with the data owner identity private key DOn(PS), the association between the private key DOn(PS), which represents the data owner's identity, and the association fragments must be further cut off on the basis of the association fragments. Fragmentation of the data owner identity private key is also called data identity fragmentation.

The basic flow is shown in Figure 3, and the specific method comprises the following steps:

2-1) According to the association fragmentation result of the data structure and the data owner identity data, the fragmentation number NUM (brs (DOn (ID))) of the data owner identity data in the association fragmentation can be determined. The fragmentation number NUM (s (DOn (PS))) of the data owner identity private key should be greater than or equal to the fragmentation number NUM (brs (DOn (ID))) of the data owner identity data. The data owner identity private key can be divided by methods such as fixed, random, or average-field division. The upper limit of the fragmentation number of the data owner identity private key is related to the private key length LIN (DOn (PS)). Assume that a lower limit of 3 bytes is set for the fragment length, that is, each private key fragment must be at least 3 bytes long after fragmentation; the private key can then be divided into at most LIN (DOn (PS)) / 3 fragments, rounded down.
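As a non-limiting illustration of step 2-1), the sketch below divides a private key by an even ("average") method while respecting the assumed 3-byte lower limit. The names `MIN_FRAGMENT_BYTES`, `max_fragment_count`, and `split_evenly`, and the 32-byte sample key, are hypothetical; the embodiment does not prescribe a particular implementation.

```python
MIN_FRAGMENT_BYTES = 3  # lower limit on fragment length assumed in step 2-1)

def max_fragment_count(key: bytes) -> int:
    """Upper limit on the number of fragments for a key of this length."""
    return len(key) // MIN_FRAGMENT_BYTES

def split_evenly(key: bytes, count: int) -> list[bytes]:
    """Divide the key into `count` contiguous fragments of near-equal length."""
    if count < 1 or count > max_fragment_count(key):
        raise ValueError("fragment count violates the 3-byte lower limit")
    base, extra = divmod(len(key), count)
    fragments, pos = [], 0
    for i in range(count):
        size = base + (1 if i < extra else 0)  # the first `extra` fragments get one more byte
        fragments.append(key[pos:pos + size])
        pos += size
    return fragments

key = bytes(range(32))            # stand-in for a 32-byte private key
fragments = split_evenly(key, 5)  # e.g. NUM(brs(DOn(ID))) = 5
print([len(f) for f in fragments])
```

Because the requested count never exceeds the key length divided by 3, every fragment produced this way is at least 3 bytes long, and concatenating the fragments in order restores the original key.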

2-2) When the fragmentation number NUM (brs (DOn (ID))) of the data owner identity data in the association fragmentation is too high, the fragmentation number NUM (s (DOn (PS))) of the data owner identity private key cannot meet the requirement set by NUM (brs (DOn (ID))). According to the lower limit requirement that each private key fragment be at least 3 bytes long after fragmentation, the length by which the data owner identity private key falls short is determined to be (NUM (brs (DOn (ID))) - NUM (s (DOn (PS)))) × 3, that is, 3 bytes for each missing fragment. A space private key of this length is added to the data owner identity private key, either at random or according to a set rule, at specific positions of the private key, forming a new data owner identity private key DOn' (PS). Step 2-1) is then repeated to fragment the data owner identity private key DOn' (PS).
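As a non-limiting illustration of step 2-2), the sketch below pads a private key that is too short for the required number of fragments. It assumes that the shortfall is counted as 3 bytes for each missing fragment and that the space private key consists of space bytes (0x20) inserted at random positions; both are interpretations rather than details confirmed by the text. The names `pad_private_key` and `strip_padding` are hypothetical.

```python
import secrets

MIN_FRAGMENT_BYTES = 3  # assumed lower limit on fragment length

def pad_private_key(key: bytes, required_fragments: int, pad_value: int = 0x20):
    """Return (padded_key, positions) so the key can yield `required_fragments` fragments."""
    available = len(key) // MIN_FRAGMENT_BYTES
    missing = max(0, required_fragments - available)
    shortfall = missing * MIN_FRAGMENT_BYTES   # bytes of space key to add
    padded = bytearray(key)
    positions = []
    for _ in range(shortfall):
        pos = secrets.randbelow(len(padded) + 1)  # random insertion point
        padded.insert(pos, pad_value)
        positions.append(pos)                      # recorded so the padding can be undone
    return bytes(padded), positions

def strip_padding(padded: bytes, positions) -> bytes:
    """Undo pad_private_key by deleting the inserted bytes in reverse order."""
    data = bytearray(padded)
    for pos in reversed(positions):
        del data[pos]
    return bytes(data)

key = bytes(range(32))                              # yields at most 10 fragments
padded, positions = pad_private_key(key, required_fragments=12)
print(len(padded))
```

The insertion positions would have to be kept with the cold-stored association data, since without them the original private key cannot be separated from the padding.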

2-3) The data owner identity private key fragments s (DOn (PS)) or s (DOn' (PS)) obtained in steps 2-1) and 2-2) are randomly associated with the data owner identity data fragments brs (DOn (ID)), and an association table of the data owner identity private key fragments and the data owner identity data fragments is generated from the random association result. To ensure security and privacy, the association table is kept in offline cold storage on a permanent storage medium with large capacity and low performance requirements, so that the association table remains secure and reliable. The association table may be called online only for verification and updating, and only when the identity of the data owner needs to be locked; online downloading is not allowed. When the identity of the data owner needs to be locked, the data owner identity data fragments brs (DOn (ID)) corresponding to the data owner identity private key fragments s (DOn (PS)) can be queried through the data owner identity private key DOn (PS) and the association table, and the data owner identity DOn (PS-ID) is thereby recovered.
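As a non-limiting illustration of step 2-3), the sketch below builds a random association table between private key fragments and identity data fragments and then queries it. The function names, the fragment identifiers "brs1" to "brs3", and the choice to key the table by fragment index are hypothetical. A fixed seed is used only to make the example repeatable; a real system would presumably draw from a cryptographically secure source such as `random.SystemRandom()`.

```python
import random

def build_association_table(key_fragment_count, identity_fragment_ids, rng):
    """Randomly map each private key fragment index to an identity data fragment id."""
    if key_fragment_count < len(identity_fragment_ids):
        raise ValueError("need at least one key fragment per identity data fragment")
    order = list(range(key_fragment_count))
    rng.shuffle(order)
    table = {}
    for n, key_index in enumerate(order):
        if n < len(identity_fragment_ids):
            # Every identity data fragment receives at least one key fragment.
            table[key_index] = identity_fragment_ids[n]
        else:
            # Spare key fragments are attached to a randomly chosen identity fragment.
            table[key_index] = rng.choice(identity_fragment_ids)
    return table

def identity_fragments_for(table, key_fragment_indices):
    """Look up the identity data fragment associated with each key fragment."""
    return [table[i] for i in key_fragment_indices]

rng = random.Random(2022)  # fixed seed only so that the example is repeatable
table = build_association_table(5, ["brs1", "brs2", "brs3"], rng)
print(identity_fragments_for(table, range(5)))
```

Only the holder of both the private key (and hence its fragments) and the cold-stored table can walk from key fragments back to the identity data fragments.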

3) A data fragmentation method for data isolation from data owners.

According to the association fragmentation result, the association fragmentation set { BRs (DOn (ID), ADm (DS)) (association fragmentation number s = 1, 2, ..., N) } cuts off the association between the data structure and the data owner identity data. In essence, however, the data structure ADm (DS) in an association fragment remains strongly associated with the data value ADm (DV) in the big data platform database. To prevent third parties (data processors, data users, big data platforms, data transaction service parties) from inferring which data a data owner owns through the association between the association fragments and the data values ADm (DV), it is necessary, on the basis of the association fragmentation, to further cut off the association between the data values ADm (DV) and the association fragments. This fragmentation method is also called the data fragmentation method for data isolation from the data owner.

The basic flow is shown in Figure 4, and the specific method comprises the following steps:

3-1) According to the association fragmentation result of the data structure and the data owner identity data, the fragmentation number NUM (brs (ADm (DS))) of the data structure in the association fragmentation can be determined. The number of data value fragments NUM (s (ADm (DV))) should be greater than or equal to the number of data structure fragments NUM (brs (ADm (DS))). The division of the data value into fragments follows the standard principle for processable data value fragmentation (see step 4) below); the data value fragmentation result is obtained according to that principle, which yields the number of data value fragments NUM (s (ADm (DV))).

3-2) When the number of data structure fragments NUM (brs (ADm (DS))) in the association fragmentation is too high, the number of data value fragments NUM (s (ADm (DV))) cannot meet the requirement set by NUM (brs (ADm (DS))). By comparing the number of data value fragments with the number of data structure fragments, the length by which the data value falls short can be determined to be NUM (brs (ADm (DS))) - NUM (s (ADm (DV))). Space data values of this length are added to the original data value, either at random or according to a set rule, at specific positions of the original data value, forming a new data value ADm' (DV). Step 3-1) is then repeated to fragment the data value ADm' (DV).

3-3) The data value fragments s (ADm (DV)) or s (ADm' (DV)) obtained in steps 3-1) and 3-2) are randomly associated with the data structure fragments s (ADm (DS)), and an association table of the data value fragments and the data structure fragments is generated from the random association result. To ensure security and privacy, the association table is kept in offline cold storage on a permanent storage medium with large capacity and low performance requirements, so that the association table remains secure and reliable. The association table may be called online only for verification and updating, and only when the data owned by the data owner needs to be determined; online downloading is not allowed. When the data owned by the data owner needs to be determined, the data structure fragments s (ADm (DS)) corresponding to the data value fragments s (ADm (DV)) can be queried through the data value ADm (DV) and the association table, and the data ADm (DS-DV) owned by the data owner is thereby recovered.
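As a non-limiting illustration of step 3-3), the sketch below scatters data value fragments into randomly ordered storage slots and keeps, in a separate table, the data structure fragment each slot belongs to. This is one possible reading of "random association"; the function names, the structure fragment labels, and the sample values are hypothetical.

```python
import random

def scatter_values(values_by_structure, rng):
    """Return (stored, table): scrambled value fragments and the cold-stored association table."""
    pairs = [
        (field, position, value)
        for field, values in values_by_structure.items()
        for position, value in enumerate(values)
    ]
    slots = list(range(len(pairs)))
    rng.shuffle(slots)
    stored, table = {}, {}
    for slot, (field, position, value) in zip(slots, pairs):
        stored[slot] = value              # what the platform holds online
        table[slot] = (field, position)   # kept in offline cold storage
    return stored, table

def recover_values(stored, table):
    """Rebuild the value fragments of each data structure fragment from the table."""
    grouped = {}
    for slot, (field, position) in table.items():
        grouped.setdefault(field, []).append((position, stored[slot]))
    return {field: [v for _, v in sorted(items)] for field, items in grouped.items()}

values = {"s1(DS)": ["Al", "ice"], "s2(DS)": ["Xia", "men", "360"]}
stored, table = scatter_values(values, random.Random(7))  # fixed seed for repeatability only
recovered = recover_values(stored, table)
print(recovered == values)
```

Without the table, the stored dictionary is only a set of numbered value fragments with no indication of the data structure fragment, and therefore of the owner, to which each belongs.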

4) Standard principle for processable data value fragmentation.

First, the data in this technology is uploaded by data owners and comprises the various kinds of data generally recognized by society, such as digital products, data assets, and data elements. Data value fragmentation divides the data values of each type of data into fragments larger than 2 minimum data units, so as to ensure data integrity, processability, and recoverability.

Second, beyond the aim of ensuring the security and privacy of data and data owners, and in particular of preventing a third party from finding the data owner directly or inferring the data owner indirectly through the data, data values are fragmented according to the standard principle for processable data value fragmentation so that data value can be mined at scale, the value of the data can be fully released, and data can circulate on the basis of that value. The specific division method comprises the following steps:

The data values ADm (DV) are marked according to three data value dimensions: time, space, and path. The basic flow is shown in Figure 5, and the marks comprise:

The time dimension marks of the data value ADm (DV) include the data owner upload time UT (ADm (DV)), the data structure fragmentation result time FRT (ADm (DS)), the data storage time ST (ADm (DV)), the data processing time PRT (ADm (DS-DV)), the data transaction time TT (ADm (DS-DV)), and the time of association with the data owner ORT (ADm (DS-DV)). The space dimension marks of the data value ADm (DV) include the data structure DSm and the data value DVm. The path dimension marks of the data value include the confidence level CL (ADm (DS-DV)), the data processing path PR (ADm (DS-DV)), and the data transaction path TC (ADm (DS-DV)). These marking results are the big data platform's routine management data for the data and can be provided directly by the big data platform.

Fragmentation processing is performed on the data value ADm (DV) according to each of its marking results. For example, if the data value ADm (DV) is uploaded by the data owner in batches, its data owner upload time UT (ADm (DV)) consists of a series of upload times, each corresponding to a partial set of the data value ADm (DV). The fragmentation result of the data value ADm (DV) under this mark, namely the partial sets corresponding to this series of upload times, is marked as sUT (ADm (DV)). The fragmentation processing results for the other marks are marked respectively as sFRT (ADm (DS)), sST (ADm (DV)), sPRT (ADm (DS-DV)), sTT (ADm (DS-DV)), sORT (ADm (DS-DV)), sDSm, sDVm, sCL (ADm (DS-DV)), sPR (ADm (DS-DV)), and sTC (ADm (DS-DV)). The intersection of the data value fragmentation processing results corresponding to all the marks is then taken, which gives the final data value fragmentation result s (ADm (DV)).
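As a non-limiting illustration, taking the intersection of the per-mark fragmentation results can be sketched by grouping data values on the combination of all their mark values, since two values share a final fragment only if they share a fragment under every mark. The function name `fragment_by_marks`, the record layout, and the sample marks (only UT and CL are shown) are hypothetical.

```python
from collections import defaultdict

def fragment_by_marks(records, mark_names):
    """Group data values by the tuple of all their mark values.

    Grouping on the full tuple is equivalent to intersecting the
    fragmentation results obtained for each mark separately.
    """
    fragments = defaultdict(list)
    for record in records:
        key = tuple(record[name] for name in mark_names)
        fragments[key].append(record["value"])
    return dict(fragments)

records = [
    {"value": "v1", "UT": "2022-08-01", "CL": "high"},
    {"value": "v2", "UT": "2022-08-01", "CL": "low"},
    {"value": "v3", "UT": "2022-08-02", "CL": "high"},
    {"value": "v4", "UT": "2022-08-01", "CL": "high"},
]
result = fragment_by_marks(records, ["UT", "CL"])
for marks, values in result.items():
    print(marks, values)
```

In this sample, v1 and v4 fall into the same final fragment because they agree on both the upload time and the confidence level, while v2 and v3 each form a fragment of their own.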

The above description covers only the preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical scheme and inventive concept, shall be covered by the scope of protection of the present invention.

Claims

1. A data identity identification and data fragmentation method is characterized by specifically comprising the following steps:

2. The method for data identity identification and data fragmentation according to claim 1, wherein the step S1 specifically comprises:

S11, establishing a preliminary association set of the data structure set and the data owner identity data set according to the data uploading behaviors of the data owner;

S12, extracting the associated elements of the data structure and the data owner identity data in the preliminary association set, and constructing an association set of the data structure and the data owner identity data;

S13, setting an allowable range for the association degree value of the data structure and the data owner identity data;

S14, mining the association degree value of the data structure and the data owner identity data of the association set;

S15, for association degree values exceeding the allowable range, fragmenting the association between the data structure and the data owner identity data, and establishing the corresponding association set of data structures and data owner identity data to be fragmented;

S16, fragmenting the association set.

3. The method for data identity identification and data fragmentation according to claim 1, wherein the step S2 specifically comprises:

S21, determining the fragmentation number of the data owner identity data in the association fragmentation according to the association fragmentation result of the data structure and the data owner identity data;

S22, determining the fragmentation number of the data owner identity private key, wherein the fragmentation number is required to be greater than the fragmentation number of the data owner identity data;

S23, setting a lower limit on the length of the data owner identity private key fragments;

S24, when the product of the fragmentation number of the data owner identity private key and the lower limit on the length of the data owner identity private key fragments exceeds the length of the data owner identity private key, determining the length by which the data owner identity private key falls short, and filling that length with a space private key;

S25, fragmenting the data owner identity private key according to the fragmentation number of the data owner identity private key and the fragment length of the data owner identity private key.

4. The method for data identity identification and data fragmentation according to claim 1, wherein the step S4 specifically comprises:

S41, determining the fragmentation number of the data structure in the association fragmentation according to the association fragmentation result of the data structure and the data owner identity data;

S42, determining the fragmentation number of the data value, wherein the fragmentation number of the data value is required to be greater than the fragmentation number of the data structure;

S43, setting a lower limit on the data value length, wherein the lower limit is required to be greater than 2 minimum data units;

S44, when the product of the fragmentation number of the data structure and the lower limit on the data value length exceeds the length of the data structure, determining the length by which the data value falls short, and filling that length with a space data value;

S45, fragmenting the data value according to the fragmentation number and the length of the data value.

5. The method for data identity identification and data fragmentation according to claim 3, wherein in the step S25, the data owner identity private key associated with the association fragmentation is fragmented, and the data owner identity private key is divided by a random method or an average-field method.

6. The method for data identity identification and data fragmentation according to claim 4, wherein in the step S45, the data values are marked, the corresponding fragmentation processing is performed on the data values according to the data value marking results, and the intersection of the data value fragmentation processing results corresponding to the marks is taken to obtain the final data value fragmentation result.

7. The method for data identity identification and data fragmentation according to claim 6, wherein the marking method marks the data values according to three data value dimensions of time, space and path, including:

the time dimension marks of the data value comprise the data owner upload time of the data value, the data structure fragmentation result time, the data storage time, the data processing time, the data transaction time, and the time of association with the data owner;

8. A data identity and data fragmentation apparatus for data isolation from a data owner, the apparatus comprising:

a memory storing executable program code;

a processor coupled with the memory;

the processor calls the executable program code stored in the memory to execute the data identity and data fragmentation method according to any one of claims 1 to 7.

9. A computer storage medium having stored thereon computer instructions which, when invoked, perform a data identity and data fragmentation method according to any one of claims 1 to 7.