CN117313158A - Data processing method and device - Google Patents



Publication number
CN117313158A
Authority
CN
China
Prior art keywords
data
privacy
target
processed
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311379517.1A
Other languages
Chinese (zh)
Inventor
孙军芳
张海宁
李海龙
宋继红
李生帛
张容福
张广德
马进财
雷晓萍
赵云鹏
马英辉
马静
李晓艳
王钰琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Qinghai Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Qinghai Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Qinghai Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Qinghai Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Qinghai Electric Power Co Ltd, Information and Telecommunication Branch of State Grid Qinghai Electric Power Co Ltd
Priority to CN202311379517.1A
Publication of CN117313158A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Storage Device Security (AREA)

Abstract

The disclosure provides a data processing method and device. The method comprises the following steps: acquiring a first data set to be processed in a trusted environment configured with a first data security level, wherein the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information; sequentially analyzing the data to be processed according to the field information and the target data, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset; and encrypting the privacy data subset through a target encryption method to obtain an encrypted data subset, and sending the encrypted data subset and the non-privacy data subset to a non-trusted environment. The disclosure solves the technical problem of low data processing efficiency in the related art.

Description

Data processing method and device
Technical Field
The invention relates to the field of electric power data security, in particular to a data processing method and device.
Background
With the rapid development of internet technology, especially the mobile internet, a large amount of information is retained on the internet, giving rise to internet-based big data. The production and use of this data greatly facilitate the operation of enterprises and daily life, but they also make it easier for confidential business information and personal privacy information to leak, and have even fostered industrial chains and networks dedicated to stealing, selling, and attacking information that needs protection. This harms enterprises and individuals and increases the cost of social management.
From the perspective of privacy information protection methods, privacy computing has become an important branch and development direction in the field of electric power data security in recent years. In the related art, privacy data is often protected by encrypting the data directly, without any prior processing. This approach is generally applied when only a small amount of data is encrypted. However, with the rapid development of the mobile internet and the growing data security requirements of enterprises, the amount of data to be protected will keep increasing, and the data processing efficiency of the related-art method will become lower and lower.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the invention provides a data processing method and a data processing device, which at least solve the technical problem of low processing efficiency of the existing data processing method.
According to an aspect of an embodiment of the present invention, there is provided a data processing method including: acquiring a first data set to be processed in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information; sequentially analyzing the data to be processed according to the field information and the target data, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level; and carrying out encryption processing on the privacy data subset through a target encryption method to obtain an encrypted data subset, and sending the encrypted data subset and the non-privacy data subset to the non-trusted environment.
According to another aspect of the embodiment of the present invention, there is also provided a data processing apparatus including: an obtaining unit, configured to obtain a first data set to be processed in a trusted environment, where the trusted environment is configured with a first data security level, the first data set includes a plurality of pieces of data to be processed, and each piece of data to be processed includes at least one piece of field information and target data matched with the at least one piece of field information; an analyzing unit, configured to sequentially analyze the data to be processed according to the field information and the target data, determine a privacy part and a non-privacy part from the data to be processed, aggregate the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregate the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, where the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured for the non-trusted environment is lower than the first data security level; and an encryption unit, configured to encrypt the privacy data subset through a target encryption method to obtain an encrypted data subset, and send the encrypted data subset and the non-privacy data subset to the non-trusted environment.
Optionally, the parsing unit includes: a first obtaining unit, configured to obtain a target field library, where the target field library includes a plurality of privacy field information for indicating a privacy part in the data to be processed; a second obtaining unit, configured to obtain a data feature library, where the data feature library includes a plurality of private data feature information that is used to indicate a private part in the data to be processed; a first determining unit, configured to determine a privacy part from the data to be processed according to a first matching result of the field information included in the data to be processed and the target field library, and a second matching result of a target data feature of the target data included in the data to be processed and the data feature library; and a second determining unit configured to determine, as the non-private portion, data excluding the private portion from the data to be processed.
Optionally, the first determining unit includes: a first adding unit configured to add, when reference field information in the field information included in the data to be processed is the privacy field information included in the target field library, the reference field information and the target data to which the reference field information matches to the privacy portion of the data to be processed; a third obtaining unit, configured to obtain reference data matched with reference field information in the field information, and obtain data feature information of the reference data; a second adding unit configured to add the reference field information and the reference data to the private portion of the data to be processed, in a case where the data feature information of the reference data is private data feature information in the data feature library.
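As an illustration only, the two matching steps described above (field information against the target field library, and target data features against the data feature library) can be sketched in Python as follows. The field names in `TARGET_FIELD_LIBRARY`, the regular expressions in `DATA_FEATURE_LIBRARY`, and the function name `split_record` are assumptions made for this sketch; the embodiment does not specify the contents of either library.

```python
import re

# Hypothetical target field library: field names treated as private.
TARGET_FIELD_LIBRARY = {"id_number", "phone", "account_no"}

# Hypothetical data feature library: value patterns treated as private.
DATA_FEATURE_LIBRARY = [
    re.compile(r"^\d{11}$"),        # an 11-digit number, e.g. a phone number
    re.compile(r"^\d{17}[\dXx]$"),  # an 18-character identity number
]


def split_record(record):
    """Split one key-value record into (privacy_part, non_privacy_part)."""
    privacy_part, non_privacy_part = {}, {}
    for field, value in record.items():
        # First match: field information against the target field library.
        name_hit = field in TARGET_FIELD_LIBRARY
        # Second match: data features of the value against the feature library.
        feature_hit = any(p.match(str(value)) for p in DATA_FEATURE_LIBRARY)
        if name_hit or feature_hit:
            privacy_part[field] = value
        else:
            # Everything that is not private forms the non-privacy part.
            non_privacy_part[field] = value
    return privacy_part, non_privacy_part


record = {"id_number": "A-001", "contact": "13800000000", "region": "Qinghai"}
privacy_part, non_privacy_part = split_record(record)
```

In this sketch, `id_number` is caught by its field name even though its value matches no pattern, while `contact` is caught only by the feature of its value, which is the case the data feature library is meant to cover.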
Optionally, the data processing apparatus further includes a response unit configured to perform, before the data to be processed is sequentially analyzed according to the field information and the target data and the privacy part and the non-privacy part are determined from the data to be processed, at least one of the following: adding privacy field information to the target field library in response to a first editing operation on the target field library; deleting the selected privacy field information from the target field library in response to a second editing operation on the target field library; adding privacy data feature information to the data feature library in response to a third editing operation on the data feature library; and deleting the selected privacy data feature information from the data feature library in response to a fourth editing operation on the data feature library.
Optionally, the encryption unit includes: a fourth obtaining unit, configured to obtain the target encryption method from an encryption algorithm packet, where the encryption algorithm packet includes at least one of: a symmetric encryption algorithm, an asymmetric encryption algorithm, and a substitution cipher algorithm; and an encryption subunit, configured to encrypt the privacy data subset according to the target encryption method to obtain the encrypted data subset.
Optionally, the encryption subunit includes at least one of: a first encryption subunit, configured to obtain the privacy part of one piece of the data to be processed from the privacy data subset, encrypt the privacy part according to the target encryption method to obtain a first encrypted part, and add the first encrypted part to the encrypted data subset; a second encryption subunit, configured to, in a case where the target encryption method includes a plurality of target encryption algorithms, obtain the privacy part of one piece of the data to be processed from the privacy data subset, encrypt the target data corresponding to different field information in the privacy part using the target encryption algorithm matched with each piece of field information, and add the resulting second encrypted part to the encrypted data subset; and a third encryption subunit, configured to, in a case where the target encryption method includes a plurality of target encryption algorithms, obtain the privacy part of one piece of the data to be processed, sequentially obtain the target privacy data corresponding to the target privacy field information in the privacy part, encrypt, in a case where the target privacy data is a numerical sequence, a plurality of numerical subsequences in the numerical sequence using the target encryption algorithm matched with each numerical subsequence, and add the resulting third encrypted part to the encrypted data subset.
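The per-field and per-subsequence encryption described for the second and third encryption subunits can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the two digit ciphers are toy algorithms chosen only to keep the example short (they are not secure), and the names `FIELD_ALGORITHMS`, `encrypt_privacy_part`, and `encrypt_subsequences`, as well as the chunk length of 4, are assumptions.

```python
import string

# Hypothetical substitution table: a fixed permutation of the ten digits.
SUBSTITUTION = str.maketrans(string.digits, "3852967041")


def substitute_digits(text):
    """Toy substitution cipher over digits (illustrative, not secure)."""
    return text.translate(SUBSTITUTION)


def shift_digits(text, shift=7):
    """Toy shift cipher: adds `shift` to each digit modulo 10 (not secure)."""
    return "".join(
        str((int(ch) + shift) % 10) if ch.isdigit() else ch for ch in text
    )


# Hypothetical mapping from privacy field name to its matched algorithm.
FIELD_ALGORITHMS = {"phone": substitute_digits, "meter_id": shift_digits}


def encrypt_privacy_part(privacy_part):
    """Second subunit: encrypt each field with the algorithm matched to it."""
    return {
        field: FIELD_ALGORITHMS[field](value)
        for field, value in privacy_part.items()
    }


def encrypt_subsequences(sequence, algorithms, chunk=4):
    """Third subunit: split a numerical sequence and vary the algorithm per piece."""
    pieces = [sequence[i:i + chunk] for i in range(0, len(sequence), chunk)]
    return "".join(
        algorithms[i % len(algorithms)](piece) for i, piece in enumerate(pieces)
    )


encrypted = encrypt_privacy_part({"phone": "13800000000", "meter_id": "0042"})
mixed = encrypt_subsequences("12345678", [substitute_digits, shift_digits])
```

Here `mixed` is built from two subsequences, `1234` and `5678`, each transformed by a different algorithm, so recovering one subsequence does not reveal the rule used for the other.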
Optionally, the data processing apparatus further includes: a fifth obtaining unit, configured to obtain a data type of the target data included in the private data subset before performing encryption processing on the private data subset according to the target encryption method to obtain the encrypted data subset; and the conversion unit is used for converting the target data of the text type into the target data of the numerical value type through a target matching rule when the data type is the text type.
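One possible form of such a conversion is sketched below. The embodiment does not define the target matching rule, so the rule used here (interpreting the UTF-8 bytes of the text as one integer) and the function names `text_to_number` and `number_to_text` are assumptions for illustration only.

```python
def text_to_number(text):
    """Map text-type target data to a non-negative integer (assumed rule)."""
    # A leading 0x01 byte keeps empty strings and leading zero bytes reversible.
    return int.from_bytes(b"\x01" + text.encode("utf-8"), "big")


def number_to_text(number):
    """Invert text_to_number so the original text can be recovered."""
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return raw[1:].decode("utf-8")


value = text_to_number("西宁")
restored = number_to_text(value)
```

A reversible rule is used so that the numerical value can be turned back into text after decryption; a lookup table from text to codes would serve the same purpose.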
Optionally, the data processing apparatus further includes: the processing unit is used for processing the first data set according to a data processing method matched with the service requirement after acquiring the first data set to be processed in the trusted environment to obtain a second data set; the first encryption unit is used for carrying out encryption processing on the privacy processing step in the data processing method according to a first encryption method to obtain privacy step data; and the first sending unit is used for sending the second data set, the privacy step data and the non-privacy processing data corresponding to the non-privacy processing step in the data processing method to the non-trusted environment.
Optionally, the data processing apparatus further includes: a second encryption unit, configured to, after the privacy data subset is encrypted through the target encryption method to obtain the encrypted data subset, encrypt the target encryption method according to a second encryption method to obtain method encrypted data; and a second sending unit, configured to send the method encrypted data to the non-trusted environment.
According to a further aspect of embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above-described data processing method when run.
According to yet another aspect of embodiments of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the above-described data processing method.
According to still another aspect of the embodiments of the present invention, there is also provided an electronic device including a memory in which a computer program is stored, and a processor configured to execute the data processing method by the computer program.
In the embodiment of the invention, a first data set to be processed is acquired in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information; sequentially analyzing the data to be processed according to the field information and the target data, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level; and carrying out encryption processing on the privacy data subset through a target encryption method to obtain an encrypted data subset, and sending the encrypted data subset and the non-privacy data subset to the non-trusted environment, so that high-efficiency data processing on the acquired data is realized.
In the data processing method, firstly, a data set comprising a plurality of pieces of data to be processed is acquired in a trusted environment, then the data to be processed in the data set is sequentially analyzed according to field information of the data to be processed and target data matched with the field information, and non-private data and private data are determined from the data to be processed, so that encryption processing is only carried out on the private data in a targeted manner in the trusted environment, encryption is not needed on the non-private data in the trusted environment, rapid data processing is realized, and the technical problem of low processing efficiency of the conventional data processing method is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a schematic diagram of a hardware environment of an alternative data processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart of an alternative data processing method according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an alternative private data identification function according to an embodiment of the invention;
FIG. 4 is a schematic diagram of an alternative privacy data safety governance function in accordance with an embodiment of the present invention;
FIG. 5 is a schematic diagram of an alternative privacy logic safety remediation function according to an embodiment of the present invention;
FIG. 6 is a flow chart of another alternative data processing method according to an embodiment of the invention;
FIG. 7 is a schematic diagram of an alternative data processing method according to an embodiment of the invention;
FIG. 8 is a schematic diagram of another alternative data processing method according to an embodiment of the invention;
FIG. 9 is a schematic diagram of an alternative data processing apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an alternative electronic device according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiment of the present invention, a data processing method is provided. As an alternative implementation, the data processing method may be applied to, but is not limited to, a hardware environment formed by the terminal device 102, the server 104, the terminal device 106, and the network 110 as shown in FIG. 1. As shown in FIG. 1, the terminal device 102 and the terminal device 106 communicate with the server 104 via the network 110. The network may include, but is not limited to, a wired network and a wireless network, where the wired network includes local area networks, metropolitan area networks, and wide area networks, and the wireless network includes Bluetooth, Wi-Fi, and other networks that enable wireless communication. The terminal device may include, but is not limited to, at least one of: a mobile phone (e.g., an Android mobile phone or an iOS mobile phone), a notebook computer, a tablet computer, a palmtop computer, a MID (Mobile Internet Device), a PAD, a desktop computer, a smart television, a vehicle-mounted device, and the like.
The terminal device 102 and the terminal device 106 are further provided with a display, a processor and a memory, wherein the display can be used for displaying a program interface of the resource platform, and various processing operations can be performed on a plurality of pieces of data to be processed in the program interface; the processor may be configured to parse the data; the memory is used for caching the acquired data.
The server 104 may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The server includes a database and a processing engine. The processing engine is used for clustering and analyzing the data set; the database may be used to store data.
According to an aspect of the embodiment of the present invention, the data processing method specifically includes the following steps. First, the terminal device 102 performs step S102 and sends a first data set, which includes a plurality of pieces of data to be processed in a trusted environment, to the server 104 through the network 110.
Next, the server 104 performs steps S104 to S108: a first data set to be processed is obtained in a trusted environment, where the trusted environment is configured with a first data security level, the first data set includes a plurality of pieces of data to be processed, and each piece of data to be processed includes at least one piece of field information and target data matched with the at least one piece of field information; the data to be processed is sequentially analyzed according to the field information and the target data, a privacy part and a non-privacy part are determined from the data to be processed, the privacy parts corresponding to the data to be processed are aggregated into a privacy data subset, and the non-privacy parts corresponding to the data to be processed are aggregated into a non-privacy data subset, where the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured for the non-trusted environment is lower than the first data security level; and the privacy data subset is encrypted through a target encryption method to obtain an encrypted data subset.
Next, the server 104 performs step S110: the encrypted data subset and the non-private data subset are transmitted to the non-trusted environment in the terminal device 106 via the network 110.
It may be understood that, in order to verify the reliability of data processing, the terminal device 102 and the terminal device 106 may also be the same terminal device, where a trusted environment and an untrusted environment may be set in the same terminal device, and when verifying the reliability of data processing, the server 104 obtains a first data set to be processed in the trusted environment of the terminal device, performs the data processing operations of S106 to S108 on the first data set, and sends the obtained encrypted data subset and the obtained non-private data subset to the untrusted environment in the terminal device, so as to help a developer verify the reliability of data processing using the same terminal device.
It will be appreciated that, in a case where the terminal device has high computing power, steps S104 to S108 may also be completed by the terminal device. For example, after the terminal device 102 acquires the first data set to be processed in the trusted environment of the terminal device 102, it executes steps S106 to S108 to process the acquired first data set, and sends the resulting encrypted data subset and non-privacy data subset to the non-trusted environment of the terminal device 106 through the network 110.
In the embodiment of the invention, a first data set to be processed is acquired in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information; sequentially analyzing the data to be processed according to the field information and the target data, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level; and carrying out encryption processing on the privacy data subset through a target encryption method to obtain an encrypted data subset, and sending the encrypted data subset and the non-privacy data subset to the non-trusted environment, so that high-efficiency data processing on the acquired data is realized.
In the data processing method, firstly, a data set comprising a plurality of pieces of data to be processed is acquired in a trusted environment, then the data to be processed in the data set is sequentially analyzed according to field information of the data to be processed and target data matched with the field information, and non-private data and private data are determined from the data to be processed, so that different data processing is carried out on the non-private data and the private data in different environments, further, the purpose of carrying out targeted processing on the different data in different environments is realized, complex processing means of the private data are avoided, rapid processing of the data is realized, and the technical problem of low processing efficiency of the existing data processing method is solved.
The above is merely an example, and this embodiment is not limited thereto.
As an alternative embodiment, as shown in fig. 2, the above data processing method may include the following steps:
S202, a first data set to be processed is obtained in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information;
S204, the data to be processed is sequentially analyzed according to the field information and the target data, a privacy part and a non-privacy part are determined from the data to be processed, the privacy parts corresponding to the data to be processed are aggregated into a privacy data subset, and the non-privacy parts corresponding to the data to be processed are aggregated into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in a non-trusted environment, and the second data security level configured for the non-trusted environment is lower than the first data security level;
S206, the privacy data subset is encrypted through a target encryption method to obtain an encrypted data subset, and the encrypted data subset and the non-privacy data subset are sent to the non-trusted environment.
It should first be noted that, in the embodiments of the present application, the first data set including the plurality of pieces of data to be processed is acquired in a manner that complies with the relevant regulatory documents, and the authorization of the corresponding object account must be obtained before a first data set related to that object account is acquired.
The trusted environment in step S202 is a secure execution environment created by a combination of software and hardware, which can better guarantee the confidentiality and integrity of computation and data processing. Each piece of data to be processed in the first data set in S202 includes field information and target data matched with the field information. Each piece of data to be processed may be understood as, but is not limited to, key-value pair data; in that case, the field information is the key and the target data is the value. The field information may also be understood as a field name of the data to be processed (the name of a column in a data table), in which case the target data is the field value corresponding to the field name (the actual data stored in the column). The first data security level in S202 and the second data security level in S204 are sensitivity levels assigned to data according to differences in the value of the data, the sensitivity of its content, its impact, its distribution range, and the like.
The parsing in S204 may be, but is not limited to, an operation of specifically classifying the pieces of data to be processed included in the first data set, dividing the pieces of data to be processed into the data to be processed in the private part and the data to be processed in the non-private part, and the aggregating into the data processing operation of aggregating the scattered data together.
The target encryption method in S206 may be a plurality of preset encryption algorithms, for example: symmetric encryption algorithms, asymmetric encryption algorithms, cryptographic substitution algorithms, etc.
In the embodiment of the invention, a first data set to be processed is acquired in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information; sequentially analyzing the data to be processed according to the field information and the target data, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level; and carrying out encryption processing on the privacy data subset through a target encryption method to obtain an encrypted data subset, and sending the encrypted data subset and the non-privacy data subset to the non-trusted environment, so that high-efficiency data processing on the acquired data is realized.
In the data processing method, firstly, a data set comprising a plurality of pieces of data to be processed is acquired in a trusted environment, then the data to be processed in the data set is sequentially analyzed according to field information of the data to be processed and target data matched with the field information, and non-private data and private data are determined from the data to be processed, so that different data processing is carried out on the non-private data and the private data in different environments, further, the purpose of carrying out targeted processing on the different data in different environments is realized, complex processing means of the private data are avoided, rapid processing of the data is realized, and the technical problem of low processing efficiency of the existing data processing method is solved.
In an optional embodiment, the sequentially parsing the data to be processed according to the field information and the target data, and determining the privacy part and the non-privacy part from the data to be processed includes:
S1, acquiring a target field library, wherein the target field library comprises a plurality of pieces of privacy field information used for indicating privacy parts in data to be processed;
S2, acquiring a data feature library, wherein the data feature library comprises a plurality of pieces of private data feature information used for indicating privacy parts in data to be processed;
S3, determining a privacy part from the data to be processed according to a first matching result of field information included in the data to be processed and a target field library and a second matching result of target data characteristics of target data included in the data to be processed and a data characteristic library;
and S4, determining the data which does not comprise the privacy part in the data to be processed as a non-privacy part.
It should be noted that the target field library in S1 may be a preset field library in which all stored entries are sensitive fields. The plurality of pieces of data to be processed included in the first data set may be, but is not limited to, raw data, and the process of acquiring the first data set may be understood as importing the raw data. The raw data (i.e., the plurality of pieces of data to be processed) refers to the data to be processed by the privacy removing method (the data processing method). The raw data may come from different devices, such as a smart meter, a management PC, an enterprise OA system, and the like; data of the same kind may also exist in several different formats, so the raw data generally presents multi-source heterogeneous characteristics. After the first data set is acquired, it is determined whether the field information in the pieces of data to be processed matches the sensitive fields included in the target field library, that is, whether the target field library includes that field information; if the target field library includes the field information of the data to be processed, the field information is determined to match the target field library.
The data feature library in S2 may contain preset data features corresponding to different data types. For example, the features of data corresponding to a mobile phone number are: the first digit is 1, the data length is 11 digits, and the last 10 digits are numbers between 3 and 9 (when feature matching is performed on mobile phone numbers, a regular expression can be used to judge whether the target data conforms to the features of a mobile phone number). The features of data corresponding to a bank card number are: the data length is 12 to 19 digits, etc. (matching a bank card number requires some special verification rules; for example, the Luhn algorithm can be used to judge whether the target data conforms to the features of a bank card number. Because the card number rules of different banks may differ, the bank card number verification features stored in the data feature library can also be rule features matched to different banks, so that when the target data is a bank card number it can be accurately matched against the data feature library). The features corresponding to an identity card number are: the data length is 15 to 18 digits, a 15-digit identity card number consists of digits only, the last character of an 18-digit identity card number may be a digit or a letter, etc. (the rule for matching identity card numbers can also take the form of a regular expression).
The above features are only embodiments of the present application, and specific data features (such as a bank card number and an identity card number) also need to be determined according to actual situations, and if necessary, the data may be matched by using an accurate verification algorithm provided by a corresponding mechanism, so as to ensure the correctness of data matching.
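As a minimal sketch of what entries in such a data feature library might look like, the following Python fragment combines two regular expressions with a Luhn check. The names `MOBILE_RE`, `ID_CARD_RE`, `luhn_valid` and `detect_feature` are hypothetical, and the mobile phone pattern is an assumed, commonly used form (first digit 1, second digit 3 to 9) rather than the exact rule stated above; real deployments would use the verification rules of the relevant institutions.

```python
import re
from typing import Optional

# Assumed pattern: 11 digits, first digit 1, second digit 3-9.
MOBILE_RE = re.compile(r"1[3-9]\d{9}")

# 15 digits, or 17 digits followed by a digit or the letter X.
ID_CARD_RE = re.compile(r"\d{15}|\d{17}[\dXx]")


def luhn_valid(number: str) -> bool:
    """Return True if a 12-19 digit string passes the Luhn checksum."""
    if not (number.isascii() and number.isdigit()):
        return False
    if not 12 <= len(number) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_feature(value: str) -> Optional[str]:
    """Return the name of the first matching private data feature, if any.

    The order of the checks is an arbitrary choice for this sketch.
    """
    if MOBILE_RE.fullmatch(value):
        return "mobile_number"
    if ID_CARD_RE.fullmatch(value):
        return "id_card_number"
    if luhn_valid(value):
        return "bank_card_number"
    return None
```

A value such as a meter reading or a free-text remark matches none of the checks and is therefore left to the field name matching.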
S3 to S4 above are described below taking fig. 3 as an example. For simplicity of description, the original data (i.e., the pieces of data to be processed included in the first data set) is denoted as D. The original data D generally exists in the form of field names (the field information) paired with field values (the target data), and may be denoted as D(k, v), where k represents a field name (key) and v represents a field value (value). In addition, the original data D may be data such as pictures; when the original data D is picture data, the field name may be a picture number, and the field value may be picture position information or a specific attribute of the picture (an attribute that uniquely and accurately identifies the picture).
The original data D, as an integral data set (the above-mentioned first data set), is divided by the F1 function into two different parts, namely non-private data Dn (the above-mentioned non-privacy part) and private data Dy (the above-mentioned privacy part), as shown in fig. 3. The F1 function is a privacy data identification function. Assuming that it consists of two functions, fk and fv, the relationship can be expressed as F1 = fv(fk), where fk is a key identification function (for matching field information included in the data to be processed with the target field library), and fv is a value identification function (for matching target data features of the target data included in the data to be processed with the data feature library). The specific flow of F1 is as follows:
S1, starting and running a fk function;
It should be noted that the fk function is used to identify the k values in D, so as to identify whether the plurality of k values include privacy fields. This process depends on a field word library (the target field library): all k values in D are matched against the word library, and a k value that hits the library is judged to be a privacy field, while a k value that does not hit is a non-privacy field. The field word library is maintainable and can include, for example, a user's name, identity card number, mobile phone number, bank account number, etc.; the user can configure it flexibly for different scenes, removing fields from the word library or adding new ones;
S2, starting and running the fv function.
It will be appreciated that the fv function is used to identify the v values in D, so as to identify whether the plurality of v values include private data. The field value in D(k, v) is not necessarily aligned with the field name, i.e., misplacement may occur: for example, a mobile phone number (whose first three digits are "130", "150", "155", etc.) or an identity card number (in which 6 digits starting from the 7th digit represent a person's date of birth) may appear under the "name" field. If such a misplacement exists and the name is not protected as privacy information in a certain scenario, the mobile phone number or identity card number stored under the name may fall outside the scope of privacy information and go unprotected, causing the privacy removing method to "fail".
Therefore, the fv function in the present application identifies field values, mainly by judging the values themselves. Unlike field name identification, this function does not match against the sensitive word library; instead, it relies on a pluggable algorithm library built on features such as the mobile phone number coding method and the identity card number coding method. This preset data type judging library (the data feature library) is also maintainable and can be configured by the user according to the needs of the scene.
It can be understood that, to avoid the above-mentioned misplacement between field names and field values, the method matches all target data in the pieces of data to be processed included in the first data set when matching field values, so that all private data included in the data to be processed can be accurately identified even when field names and field values are misplaced. The data to be processed in the first data set is thereby accurately classified into a privacy part and a non-privacy part, which improves the reliability of classifying private data and non-private data to a certain extent.
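The F1 flow described above can be sketched roughly as follows. This is an illustrative reading, not the definitive implementation: the names `f1_split` and `looks_like_mobile` are hypothetical, records are assumed to be dictionaries of field name to field value, and a field is treated as private when either the key match (fk) or the value match (fv) hits.

```python
import re
from typing import Callable, Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]  # field name (k) -> field value (v)


def looks_like_mobile(value: str) -> bool:
    """Hypothetical feature check: 11 digits starting with 1."""
    return re.fullmatch(r"1\d{10}", value) is not None


def f1_split(
    records: Iterable[Record],
    field_library: Set[str],
    feature_checks: List[Callable[[str], bool]],
) -> Tuple[List[Record], List[Record]]:
    """Split every record into a privacy part (Dy) and a non-privacy part (Dn)."""
    dy: List[Record] = []
    dn: List[Record] = []
    for record in records:
        privacy_part: Record = {}
        non_privacy_part: Record = {}
        for k, v in record.items():
            key_hit = k in field_library                            # fk: match the field name
            value_hit = any(check(v) for check in feature_checks)   # fv: match the field value
            if key_hit or value_hit:
                privacy_part[k] = v
            else:
                non_privacy_part[k] = v
        dy.append(privacy_part)
        dn.append(non_privacy_part)
    return dy, dn


# A misplaced mobile number stored under "remark" is still caught by fv.
records = [{"name": "Zhang San", "remark": "13800000000", "usage_kwh": "152"}]
dy, dn = f1_split(records, {"name", "id_card_number"}, [looks_like_mobile])
```

Because every value is checked regardless of its field name, a private value stored under a non-private field name still ends up in the privacy part.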
In the embodiment of the invention, a target field library is acquired, wherein the target field library comprises a plurality of pieces of privacy field information for indicating privacy parts in data to be processed; a data feature library is acquired, wherein the data feature library comprises a plurality of pieces of private data feature information for indicating privacy parts in data to be processed; a privacy part is determined from the data to be processed according to a first matching result of the field information included in the data to be processed and the target field library and a second matching result of the target data feature of the target data included in the data to be processed and the data feature library; and the data which does not comprise the privacy part in the data to be processed is determined as a non-privacy part. By applying different matching rules to the field information and the target data in the data to be processed, the privacy part and the non-privacy part are accurately separated, so that data processing modes specific to the privacy part and the non-privacy part can be carried out subsequently, and the data processing efficiency is improved.
In an optional implementation manner, determining the privacy part from the data to be processed according to the first matching result of the field information included in the data to be processed and the target field library and the second matching result of the target data feature of the target data included in the data to be processed and the data feature library includes:
S1, in the case that reference field information in the field information included in the data to be processed is privacy field information included in the target field library, adding the reference field information and the target data matched with the reference field information to the privacy part of the data to be processed;
S2, acquiring reference data matched with reference field information in the field information, and acquiring data feature information of the reference data;
and S3, in the case that the data feature information of the reference data is private data feature information in the data feature library, adding the reference field information and the reference data to the privacy part of the data to be processed.
The processing operation in S1 may be understood as, but is not limited to, the following: the pieces of data to be processed include a plurality of pieces of field information, and each piece of field information is traversed. Specifically, the following operations are repeated until every piece of field information has been traversed: one piece of field information is acquired from the plurality of pieces of field information as the reference field information; the reference field information is matched against the plurality of pieces of privacy field information included in the target field library; and, in the case where the reference field information matches at least one piece of privacy field information in the target field library, the reference field information and the target data matched with it are added to the privacy part of the data to be processed.
The processing operations in S2 to S3 may be understood as, but are not limited to, the following: the data to be processed includes multiple items of target data respectively matched with the pieces of field information, and each item of target data is traversed. Specifically, the following operations are repeated until every item of target data has been traversed: one item of target data is acquired from the multiple items of target data as the reference data; the reference data is matched against the plurality of pieces of private data feature information included in the data feature library; and, in the case where the reference data is determined to match private data feature information in the data feature library, the reference data and the field information corresponding to it are added to the privacy part of the data to be processed.
In the embodiment of the invention, the reference field information in the field information included in the data to be processed is adopted, and the reference field information and the target data matched with the reference field information are added to the privacy part of the data to be processed under the condition that the reference field information is the privacy field information included in the target field library; acquiring reference data matched with reference field information in the field information, and acquiring data characteristic information of the reference data; under the condition that the data characteristic information of the reference data is the private data characteristic information in the data characteristic library, the reference field information and the reference data are added to the private part of the data to be processed, and the accurate matching of the field information and the target field library and the accurate matching of the target data and the private data characteristic information in the data characteristic library ensure that the matching process does not miss data, and the reliability of the classified data is improved.
In an optional embodiment, before sequentially analyzing the data to be processed according to the field information and the target data and determining the privacy part and the non-privacy part from the data to be processed, the method further includes at least one of the following:
adding privacy field information to the target field library in response to a first editing operation on the target field library;
deleting the selected privacy field information from the target field library in response to a second editing operation on the target field library;
adding privacy data feature information to the data feature library in response to a third editing operation on the data feature library;
and deleting the selected privacy data feature information from the data feature library in response to a fourth editing operation on the data feature library.
It may be understood that the first to fourth editing operations flexibly configure (add to and delete from) the preset target field library and data feature library. Different users may configure the two libraries according to different scenarios. For example, in a banking scenario, the user needs to provide a bank card number to bank staff; in this case, the "bank card number" field in the target field library and the private data feature information corresponding to the bank card number in the data feature library may be deleted. As another example, when a user's identity card is lost and needs to be reissued, the user needs to provide the identity card number to the staff handling the reissue; in this case, the "identity card number" field in the target field library and the private data feature information corresponding to the identity card number in the data feature library may be deleted.
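The add and delete operations on the two libraries can be illustrated with a small container class. This is only a sketch under stated assumptions: the class name `PrivacyLibraries`, its method names, and the representation of a data feature as a predicate function are all hypothetical.

```python
from typing import Callable, Dict, Set


class PrivacyLibraries:
    """Holds an editable target field library and data feature library."""

    def __init__(self) -> None:
        self.target_fields: Set[str] = set()
        self.data_features: Dict[str, Callable[[str], bool]] = {}

    def add_field(self, name: str) -> None:
        """First editing operation: add privacy field information."""
        self.target_fields.add(name)

    def delete_field(self, name: str) -> None:
        """Second editing operation: delete selected privacy field information."""
        self.target_fields.discard(name)

    def add_feature(self, name: str, check: Callable[[str], bool]) -> None:
        """Third editing operation: add private data feature information."""
        self.data_features[name] = check

    def delete_feature(self, name: str) -> None:
        """Fourth editing operation: delete selected private data feature information."""
        self.data_features.pop(name, None)


libs = PrivacyLibraries()
libs.add_field("name")
libs.add_field("bank_card_number")
libs.add_feature("bank_card_number", lambda v: v.isdigit() and 12 <= len(v) <= 19)

# Bank counter scenario: the card number must be visible to bank staff,
# so both the field and its value feature are removed from the libraries.
libs.delete_field("bank_card_number")
libs.delete_feature("bank_card_number")
```

Keeping the two libraries as plain, mutable collections means a scenario change only touches configuration, not the matching logic.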
In practice, the scope of private information is "dynamic": some information must be protected in some situations but need not be protected as privacy in others, and may even have to be disclosed. For example, an individual's name and identity card number usually need to be protected as private information, but real information has to be provided when transacting business such as a transfer at a bank counter, where privacy protection measures are impractical. If all private information, or even all information, is protected without distinguishing scenes, unnecessary pressure is placed on the computer system and meaningless computing overhead is added. The same is true in data management: for example, individual users need to provide identity card numbers, mobile phone numbers, etc. to open a power usage account, and this private information must be disclosed to the management department for business purposes. Some enterprise users may apply moderate privacy protection to data in certain financial scenarios; for example, an enterprise user may supply its own power consumption information as part of the credit data for a financial institution's due diligence, and, to optimize computing overhead, may protect this part of the data with a lightweight cipher.
By the above embodiment of the present application, privacy field information is added to the target field library in response to the first editing operation on the target field library; the selected privacy field information is deleted from the target field library in response to a second editing operation on the target field library; privacy data feature information is added to the data feature library in response to a third editing operation on the data feature library; and the selected privacy data feature information is deleted from the data feature library in response to a fourth editing operation on the data feature library. Because the target field library and the data feature library are flexibly configured according to different application scenes, the application scenes of the data processing method are widened, the data accuracy of the data processing method in different application scenes is ensured, and the reliability of data processing is improved.
In an optional embodiment, the encrypting the private data subset by the target encryption method to obtain an encrypted data subset includes:
S1, acquiring the target encryption method from an encryption algorithm package, wherein the encryption algorithm package comprises at least one of the following: a symmetric encryption algorithm, an asymmetric encryption algorithm, and a password substitution algorithm;
S2, carrying out encryption processing on the privacy data subset according to the target encryption method to obtain the encrypted data subset.
It should be noted that any of the above-mentioned target encryption methods may be used to encrypt private data, and a method may be selected according to the type and characteristics of the private data. For example, when a large volume of data needs to be encrypted, such as large files, databases, and communication data, symmetric encryption may be adopted to improve the speed of encryption and decryption; in scenes such as key exchange, digital signature, and identity verification, asymmetric encryption, which has higher complexity (a public key is used for encryption and a private key for decryption), may be adopted to improve the security of the data.
By the above embodiment of the present application, the target encryption method is obtained from an encryption algorithm package, where the encryption algorithm package includes at least one of the following: a symmetric encryption algorithm, an asymmetric encryption algorithm, and a password substitution algorithm; and the privacy data subset is encrypted according to the target encryption method to obtain the encrypted data subset. Because different encryption methods can be flexibly selected for targeted encryption of the privacy data subset, the encryption mode best suited to the application scene can be chosen from a plurality of encryption methods, so that the data processing method matches the requirements of different application scenes.
In an optional embodiment, the encrypting the private data subset by the target encrypting method to obtain an encrypted data subset includes at least one of the following:
in a first mode, acquiring the privacy part of one piece of the data to be processed from the privacy data subset, and encrypting the privacy part according to the target encryption method to obtain a first encrypted part; adding the first encrypted part to the encrypted data subset;
in a second mode, in the case that the target encryption method includes a plurality of target encryption algorithms, acquiring the privacy part of one piece of the data to be processed from the privacy data subset, and encrypting the target data corresponding to different field information in the privacy part with the respectively matched target encryption algorithms; adding the resulting second encrypted part to the encrypted data subset;
in a third mode, in the case that the target encryption method includes a plurality of target encryption algorithms, acquiring the privacy part of one piece of the data to be processed, and sequentially acquiring the target privacy data corresponding to target privacy field information in the privacy part; in the case that the target privacy data is a numerical sequence, encrypting a plurality of numerical subsequences in the numerical sequence with the respectively matched target encryption algorithms; and adding the resulting third encrypted part to the encrypted data subset.
In the first mode, each piece of private data in the privacy data subset is encrypted directly, and the encryption results are added to the encrypted data subset in turn until all private data in the privacy data subset has been encrypted, yielding the final encrypted data subset.
In the second mode, when a plurality of target encryption algorithms are obtained from the encryption algorithm package, different encryption algorithms can be adopted to encrypt according to different field information (field names) corresponding to the target data, and a specific encryption algorithm can be selected according to the actual requirements of scene and data encryption.
In the third mode, when a plurality of target encryption algorithms are obtained from the encryption algorithm package, different encryption algorithms may be used for the target data (field values) of different privacy parts obtained from the privacy data subset, and only part of a numerical sequence may be encrypted. For example, a mobile phone number is encrypted with the algorithm corresponding to mobile phone numbers (for example, the last four digits of the mobile phone number are encrypted), an identity card number is encrypted with the algorithm corresponding to identity card numbers (for example, the 7th to 12th digits of the identity card number are encrypted), a bank card number is encrypted with the algorithm corresponding to bank card numbers (for example, the lower 7-10 digits of the bank card number are encrypted), and so on.
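The idea of encrypting only a sub-sequence of a numerical value, with a rule chosen per field, can be sketched as follows. Everything here is an assumption made for illustration: `substitute_digits` is a toy keyed digit substitution standing in for the password substitution algorithm and is not a secure cipher, and `FIELD_RULES`, `encrypt_field` and the field names are hypothetical. A real system would take the algorithm from the encryption algorithm package.

```python
import hashlib

_DIGITS = "0123456789"


def _shift(key: bytes, position: int) -> int:
    """Derive a per-position shift in 0..9 from the key (illustrative only)."""
    digest = hashlib.sha256(key + position.to_bytes(4, "big")).digest()
    return digest[0] % 10


def substitute_digits(text: str, key: bytes, decrypt: bool = False) -> str:
    """Toy reversible digit substitution; non-digit characters pass through."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        if ch in _DIGITS:
            out.append(str((int(ch) + sign * _shift(key, i)) % 10))
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical per-field rules: which sub-sequence of the value is encrypted.
FIELD_RULES = {
    "mobile_number": slice(-4, None),  # last four digits
    "id_card_number": slice(6, 12),    # 7th to 12th digits
}


def encrypt_field(field: str, value: str, key: bytes, decrypt: bool = False) -> str:
    """Encrypt (or decrypt) only the sub-sequence selected for this field."""
    rule = FIELD_RULES.get(field, slice(None))  # default: the whole value
    start, stop, _ = rule.indices(len(value))
    middle = substitute_digits(value[start:stop], key, decrypt)
    return value[:start] + middle + value[stop:]


key = b"demo-key"
masked = encrypt_field("mobile_number", "13800001234", key)
restored = encrypt_field("mobile_number", masked, key, decrypt=True)
```

Only the selected digits change, so the remainder of the value stays usable for processing in the non-trusted environment while the sensitive sub-sequence is protected.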
Through the above embodiment of the present application, different encryption modes are adopted for different scenes and data, namely: acquiring the privacy part in one piece of the data to be processed from the privacy data subset, and encrypting the privacy part according to the target encryption method to obtain a first encryption part; adding said first encrypted portion to said encrypted data subset; when the target encryption method comprises a plurality of target encryption algorithms, acquiring the privacy part in the data to be processed from the privacy data subset, and encrypting target data corresponding to different field information in the privacy part by adopting the target encryption algorithms matched with the target data; adding the processed second encrypted portion to the encrypted data subset; under the condition that the target encryption method comprises a plurality of target encryption algorithms, acquiring the privacy part in one piece of data to be processed, and sequentially acquiring target privacy data corresponding to target privacy field information in the privacy part; encrypting a plurality of numerical subsequences in the numerical sequence by adopting the target encryption algorithm matched with each other under the condition that the target privacy data is the numerical sequence; and adding the processed third encrypted part to the encrypted data subset. Different encryption modes are carried out on different target data and different field information, so that the reliability of encrypting the data is improved, and the compatibility of a data processing method with different scenes and different data is also improved.
In an optional embodiment, the foregoing encrypting the private data subset according to the target encryption method further includes, before obtaining the encrypted data subset:
S1, acquiring the data type of the target data included in the privacy data subset;
S2, when the data type is a text type, converting the target data of the text type into target data of a numerical type through a target matching rule.
The operations in S1 to S2 described above are described below by taking fig. 3 and 4 as examples:
as shown in fig. 3, the original data D (the plurality of pieces of data to be processed included in the first data set) may be divided into private data Dy (the privacy part) and non-private data Dn (the non-privacy part) after being processed by the F1 function;
next, as shown in fig. 4, the privacy data Dy may be further divided into text type data Du and numerical type data Dm by an F2 function (privacy data protection function). Assuming that the F2 function is composed of the fc and fq functions, it can be expressed as F2 = fq(fc), where fc is a data attribute identification function and fq is an encryption function. The specific execution flow of the F2 function is as follows:
S1, starting and running an fc function;
The values in the privacy data Dy are identified (values rather than keys). The purpose of the identification is to divide the data into qualitative data (data describing properties, features or attributes, usually expressed in text form, i.e., the text type data in S2 above) and numerical data. The main method is to determine whether a v value is numerical type data or text type data (also called character type data); if it is text type data, it is converted into numerical type data. The conversion method and scale are configurable; for example, "excellent, good, medium, bad" may be converted into "1, 2, 3, 4". This mapping usually has to be set by the user according to the user's own needs. Whether the mapping runs in the forward or the reverse direction, the setting logic should be recorded, and once it is determined, the logic should not be changed arbitrarily during subsequent analysis, so as to prevent confusion of the data logic.
S2, starting and running the fq function.
The fq function is an encryption function; the encryption method can be selected from the encryption algorithm package or supplied as a configurable plug-in, such as symmetric encryption, asymmetric encryption, password substitution and the like, so as to encrypt the numerical values. The processing and output of the F2 function can be expressed as:
Dm = F2(Dy(v)) = fq(fc(Dy(v))).
After the data type of the target data is identified in S1, an encryption algorithm matched with the data type may also be obtained directly from the encryption algorithm package, and the target data may be encrypted directly with that algorithm.
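The F2 flow can be sketched as a composition of fc and fq. The names `GRADE_MAPPING`, `fc`, `f2` and `toy_fq` are hypothetical; values are assumed to be strings holding either an integer or a configured text label, and `toy_fq` is only a placeholder for whatever algorithm is taken from the encryption algorithm package, not real encryption.

```python
from typing import Callable, Dict, List

# Assumed user-configured target matching rule; once fixed it should not change.
GRADE_MAPPING: Dict[str, int] = {"excellent": 1, "good": 2, "medium": 3, "bad": 4}


def fc(value: str, mapping: Dict[str, int]) -> int:
    """Data attribute identification: return the value as a number.

    Numerical strings are parsed directly; text values are converted through
    the configured mapping (a KeyError signals that no rule is configured).
    """
    try:
        return int(value)
    except ValueError:
        return mapping[value]


def f2(values: List[str], mapping: Dict[str, int], fq: Callable[[int], int]) -> List[int]:
    """F2 = fq(fc): convert every value to a number, then encrypt it."""
    return [fq(fc(v, mapping)) for v in values]


def toy_fq(n: int) -> int:
    """Placeholder for the encryption function; XOR with a fixed constant."""
    return n ^ 0x5A


dm = f2(["excellent", "42", "bad"], GRADE_MAPPING, toy_fq)
```

Passing fq in as a parameter mirrors the plug-in style described above: the conversion step stays the same while the encryption algorithm can be swapped per scenario.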
By the above embodiment of the present application, the data type of the target data included in the private data subset is acquired; and when the data type is text type, converting the target data of the text type into the target data of numerical value type through a target matching rule. According to the method, the text type data are converted into the numerical value type data, so that an encryption algorithm can be adopted to encrypt all target data, the data encryption efficiency is improved, meanwhile, after the text type data are subjected to data type conversion and encryption, even if the text type data are decrypted by other people, the other people cannot know the specific text meaning represented by the numerical value type data after decryption, the reliability of data encryption is improved, and the reliability of a data processing method is further improved.
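The fc and fq stages described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names GRADE_MAP and SUBSTITUTION are hypothetical, and the digit-substitution table is only a toy stand-in for whichever algorithm would be drawn from the encryption algorithm package.

```python
import re
from typing import Dict, Union

# Hypothetical, user-configured mapping from text grades to numbers.
# Once fixed, it should stay unchanged for the whole analysis.
GRADE_MAP: Dict[str, int] = {"excellent": 1, "good": 2, "medium": 3, "bad": 4}

# Toy digit-substitution table standing in for a real cipher (assumption).
SUBSTITUTION = str.maketrans("0123456789", "7082549316")


def fc(value: Union[str, int, float], mapping: Dict[str, int]) -> Union[int, float]:
    """Identify the value type; convert text-type values to numbers."""
    if isinstance(value, (int, float)):
        return value                      # already numeric-type data
    text = value.strip()
    if re.fullmatch(r"-?\d+", text):      # a number stored as a string
        return int(text)
    return mapping[text]                  # raises KeyError if no rule is configured


def fq(number: Union[int, float]) -> str:
    """Encrypt a numeric value with the toy substitution table."""
    return str(number).translate(SUBSTITUTION)


def f2(value: Union[str, int, float], mapping: Dict[str, int] = GRADE_MAP) -> str:
    """F2 = fq(fc): identify and convert first, then encrypt."""
    return fq(fc(value, mapping))
```

Because the mapping is kept separate from the cipher, a party who reverses the substitution still sees only the number 2, not the word "good", which is the property the paragraph above relies on.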
In an alternative embodiment, after the acquiring the first data set to be processed in the trusted environment, the method further includes:
S1, processing the first data set according to a data processing method matched with service requirements to obtain a second data set;
S2, encrypting the privacy processing step in the data processing method according to a first encryption method to obtain privacy step data;
and S3, transmitting the second data set, the privacy step data and the non-privacy processing data corresponding to the non-privacy processing step in the data processing method to the non-trusted environment.
The privacy step data in S2 above may be understood as, but is not limited to, the specific steps and/or arithmetic logic used to obtain target data of the privacy portion. For example, the target data is 10, and the specific calculation for obtaining 10 is (2+3)×2. This calculation includes two steps: the first step is 2+3, and the second step is 5×2. The arithmetic logic is the concrete calculation logic applied to the data, including addition, subtraction, multiplication, division and so on. According to the rules of the logical operation, operators such as addition (+) and subtraction (-) may be determined to be non-privacy logic operations (the above non-privacy processing steps), and operators such as multiplication (×) and division (÷) may be determined to be privacy logic operations (the above privacy processing steps). In that case, in (2+3)×2 the multiplication (×) is the privacy logic operation and may be encrypted by the first encryption method; alternatively, if necessary, the whole of (2+3)×2 may be treated as the privacy processing step and encrypted by the first encryption method to obtain the privacy step data.
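The division of a calculation into privacy and non-privacy operations can be illustrated with a short sketch. The set PRIVATE_OPS and the function classify_steps are hypothetical names introduced here; which operators count as privacy logic is a per-scenario decision, and the choice below simply mirrors the example in the text.

```python
import re
from typing import List, Tuple

# Assumption for illustration: multiplication and division are privacy logic.
PRIVATE_OPS = {"*", "/"}
TOKEN = re.compile(r"\d+|[-+*/()]")


def classify_steps(expression: str) -> List[Tuple[str, str]]:
    """Label every operator in the expression as 'private' or 'non-private'."""
    labelled = []
    for token in TOKEN.findall(expression):
        if token in {"+", "-", "*", "/"}:
            kind = "private" if token in PRIVATE_OPS else "non-private"
            labelled.append((token, kind))
    return labelled


steps = classify_steps("(2+3)*2")
# steps == [("+", "non-private"), ("*", "private")]
```

Only the operators labelled "private" would then be handed to the first encryption method; the rest can travel to the non-trusted environment in the clear.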
The following description will be given of the above S1 to S3 by taking fig. 5 as an example:
Because the management department needs to manage different users, different scenes and different business types, statistical analysis based on data becomes indispensable, and a large group of models is therefore established and maintained for the business. The logic used when these models are established and operated is confidential information of the management department and the enterprise, and is often an important part of enterprise privacy protection. When the calculation data (Du+Dm) is processed, various mathematical operation rules are used, and some of these rules are privacy information that needs to be protected and requires privacy removal. An F3 function (a function that performs privacy removal on the data operation logic) is therefore introduced in the present application; it is assumed that the above division (identification) of privacy processing steps and non-privacy processing steps, and the encryption process, are implemented by the F3 function.
As shown in fig. 5, the calculation data (Du+Dm) is divided into different calculation parts according to the application scenario of the original data (or according to the subjective judgment of the user, where the user is the user of the data processing method), namely the calculation logic Cu that does not need encryption and the calculation logic Cm that needs encryption. The procedure before the F3 function can be expressed as follows:
Cm + Cu = division of (Du+Dm) according to the application scenario of the original data;
Similar to the F2 function, the F3 function represents an algorithm package. The encryption method used for privacy removal of the operation logic may be called from the algorithm encryption package, or may be provided in a configurable plug-in manner, for example homomorphic encryption, multiplicative homomorphic encryption, additive homomorphic encryption, and the like. The processing and output of the F3 function can be expressed as: Lm = F3(Cm), where Lm is the encrypted arithmetic logic.
According to the embodiment of the application, the first data set is processed according to a data processing method matched with service requirements to obtain a second data set; the privacy processing step in the data processing method is encrypted according to a first encryption method to obtain privacy step data; and transmitting the second data set, the privacy step data and the non-privacy processing data corresponding to the non-privacy processing step in the data processing method to the non-trusted environment. By encrypting the privacy logic, even if the privacy data is revealed, other users cannot derive the original data of the privacy data under the condition of no operation logic, so that the safety of the source of the privacy data is ensured, and the reliability of the data processing method is further improved.
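Additive homomorphism is one of the pluggable options named for the F3 algorithm package. The sketch below shows the property with a textbook Paillier-style scheme. The application does not say which scheme is used, so Paillier is an assumption made here for illustration, and the tiny primes make it insecure by design.

```python
import math
import random


def keygen(p: int = 1009, q: int = 1013):
    """Toy key generation with tiny primes (insecure; illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)        # modular inverse; valid because g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m (0 <= m < n) as (n+1)^m * r^n mod n^2 with random r."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n: int, private_key, c: int) -> int:
    lam, mu = private_key
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


n, private_key = keygen()
total = encrypt(n, 60)
for reading in (45, 78):
    total = add_encrypted(n, total, encrypt(n, reading))
# decrypt(n, private_key, total) == 183
```

The party holding only the ciphertexts can compute the encrypted sum without learning any individual value; only the holder of the private key in the trusted environment can read the result.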
In an optional embodiment, after the encrypting the private data subset by the target encrypting method to obtain the encrypted data subset, the method further includes:
S1, encrypting the target encryption method according to a second encryption method to obtain method encryption data;
S2, sending the method encryption data to the non-trusted environment.
The second encryption method is used for encrypting the target encryption method, and the target encryption method can also comprise the first encryption method, so that the target privacy data and the privacy step data can be encrypted secondarily.
According to the embodiment of the application, the target encryption method is encrypted according to the second encryption method to obtain method encryption data, and the method encryption data is sent to the non-trusted environment. By encrypting the target encryption method again, the security of the privacy part and the privacy step data is improved, and the reliability of the data processing method is further improved.
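A rough sketch of encrypting the description of the target encryption method is given below. Everything in it is an assumption made for illustration: the descriptor layout, the key, and the hash-derived keystream cipher, which is a toy built from the standard library and not a vetted cipher. A real deployment would use an audited cryptographic library.

```python
import hashlib
import json


def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key with SHA-256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with the keystream (the same call decrypts)."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


# Hypothetical descriptor of the target encryption method (field -> algorithm).
target_method = {"name": "pseudonymize", "id_number": "sha256", "phone": "mask_last_4"}
second_key = b"second-method-key"  # hypothetical key kept in the trusted environment

plaintext = json.dumps(target_method, sort_keys=True).encode("utf-8")
method_encrypted = xor_cipher(second_key, plaintext)            # "method encryption data"
recovered = json.loads(xor_cipher(second_key, method_encrypted).decode("utf-8"))
```

Only method_encrypted would be sent to the non-trusted environment, so the receiver cannot tell which algorithm protects which field without the second key.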
A complete embodiment of the present application, covering the operation and results of the power data privacy protection functions, is described below with reference to figs. 6-8.
S602, importing original data;
The original data in S602 is the plurality of pieces of data to be processed included in the first data set. For example, the original data is power data (i.e., D in the foregoing description): a user named "Zhang", with an identification number shown as "32" and "1987", a mobile phone number shown as "192", and an unlabeled value shown as "62", whose power consumption for months 3 to 6 of 2023 is recorded as 60, 45 and 78 kW·h.
The data processing process for the original data is as follows:
S604, running the privacy data identification function (F1);
Running the privacy data identification function is the first step of processing the acquired data in the present application: the data is imported into the packaged system containing the multiple-composite-function privacy protection algorithm, and the F1 function is run. Through its built-in field library, the fk function recognizes that the three fields of name, identification card number and mobile phone number are privacy fields, that is, these three fields are classified as privacy data through the tag system. After the fv function is run, the unlabeled value "62" is judged to be a possible bank card number and is also labeled as privacy data.
The name, identification card number, mobile phone number and bank card number of the user are treated as privacy information and are distinguished from the non-privacy data by labeling through the tag system, giving the privacy data Dy(data), expressed as Dy(data): (privacy data: name, identification card number, mobile phone number and bank card number, as k-v pairs). The power consumption data of the user is not treated as privacy data and does not enter the privacy data set Dy.
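The two-stage identification performed by F1 can be sketched as below. The field names, the regular expression for card-like values and the placeholder record are all assumptions introduced for illustration; the application does not disclose the contents of its field library or feature library.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical field library used by fk (matching on the key).
PRIVACY_FIELDS = {"name", "id_number", "mobile_number", "bank_card_number"}

# Hypothetical feature library used by fv (matching on the value).
PRIVACY_FEATURES = {
    "bank_card_number": re.compile(r"62\d{14,17}"),
}


def fk(key: str) -> bool:
    """Return True when the field name is in the privacy field library."""
    return key in PRIVACY_FIELDS


def fv(value: str) -> Optional[str]:
    """Return the matching feature label, or None when the value looks harmless."""
    for label, pattern in PRIVACY_FEATURES.items():
        if pattern.fullmatch(value):
            return label
    return None


def f1(record: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split one record into its privacy part (Dy) and non-privacy part."""
    private, non_private = {}, {}
    for key, value in record.items():
        if fk(key) or fv(value) is not None:
            private[key] = value
        else:
            non_private[key] = value
    return private, non_private


# Placeholder record with made-up values.
record = {
    "name": "Zhang",
    "id_number": "320000198700000000",
    "mobile_number": "19200000000",
    "unlabeled": "6200000000000000",
    "usage_kwh": "60",
}
dy, rest = f1(record)
```

In this sketch the unlabeled value is caught by fv even though its key is unknown to fk, which is the reason for matching on values as well as keys.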
S606, privacy data security management function (F2);
After the privacy data identification function has been run, the privacy data security management function (F2) is run. The F2 function handles Dy(data) in two steps. The first step identifies the data properties by running fc, giving Dy(data): (privacy data: name - qualitative data; identification card number - numerical data; mobile phone number - numerical data; bank card number - numerical data). The second step runs fq to obtain the encryption method and encryption result for each type of data, giving Dy(data): (privacy data: name - qualitative data - pseudonymized name; identification card number - numerical data - hashed value; mobile phone number - numerical data - value with the last 4 digits covered by symbols; bank card number - numerical data - value with the last 3 digits hidden).
Pseudonymization, hashing, covering the last four digits with symbols and hiding the last three digits are the privacy removal methods adopted for the different types of privacy data.
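The four privacy removal methods listed above can be sketched with the standard library alone. The function names, the salt and the output formats are assumptions made for illustration; the application names the methods but does not specify how they are implemented.

```python
import hashlib


def pseudonymize(name: str, salt: str = "demo-salt") -> str:
    """Replace a name with a stable pseudonym derived from a salted hash."""
    digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
    return "user_" + digest[:8]


def hash_value(value: str) -> str:
    """Replace a value with its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def mask_last(value: str, count: int, symbol: str = "*") -> str:
    """Cover the last `count` characters with a symbol, keeping the length."""
    if count <= 0:
        return value
    return value[:-count] + symbol * min(count, len(value))


def hide_last(value: str, count: int) -> str:
    """Drop the last `count` characters entirely."""
    return value[:-count] if count > 0 else value


masked_phone = mask_last("19200000000", 4)      # "1920000****"
hidden_card = hide_last("6200000000000000", 3)  # "6200000000000"
```

Masking keeps the original length visible while hiding drops it, so the two methods leak different amounts of information and suit different protection degrees.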
S608, privacy logic safety control function (F3);
After the privacy data security management function has been run, the privacy logic safety control function is run. The electricity consumption data of the user is not recognized as privacy information by the privacy algorithm described herein (it corresponds to Du above, the non-privacy data). Assume that the power management department needs to perform statistical analysis on part of the data and provide the calculation result to the statistics office as a reference for resident consumption data.
The 60, 45 and 78 kW·h electricity consumption data are processed by adding up the three monthly values and then calculating the average, and the result is recorded as Q. The process is expressed as: Q = (60+45+78)/3 = 61.
The arithmetic logic used in this process is "+" and "/". Assume that in this scenario "+" does not require privacy removal and "/" does. The subjective judgment of the statistical analyst is Cm: (/), that is, the division rule is treated as privacy. The F3 function is then started to obtain Lm = F3(/: PHE "/"), where PHE "/" represents the division rule processed by the partially homomorphic encryption (PHE) method.
S610, outputting a result.
The experimental results after processing the obtained power data by the above steps are as follows. In terms of computing-power overhead, as shown in fig. 7, the abscissa is the service scene type and the ordinate is the CPU utilization rate; red is the CPU utilization of the algorithm EPDPPAMCF presented herein, and blue is the CPU utilization of the conventional privacy preserving algorithm TPPA, which applies a uniform protection principle. In terms of storage resource occupancy, as shown in fig. 8, the abscissa is the service scene type and the ordinate is the storage resource occupancy rate; red is the storage resource occupancy of the algorithm EPDPPAMCF set forth herein, and blue is that of the conventional privacy preserving algorithm TPPA.
As can be seen from fig. 7 and fig. 8, in most experimental service scenarios the algorithm proposed herein has lower CPU utilization and lower storage occupancy than the conventional privacy preserving algorithm.
The reason for this is mainly: in the actual business of experiments, the category of private information is "dynamic", some information needs to be protected in some scenarios, but in other scenarios may not need to be protected as privacy, and even it is necessary to open the information to the user. For example: the person's name and identification number are typically required to be protected as private information, but when transacting a business such as a bank counter transfer, real information may have to be provided, where privacy protection measures are not practical. Because the traditional privacy protection method does not divide specific business scenes, the privacy information is subjected to unified protection processing, unnecessary pressure is caused on a computer system, and meaningless calculation overhead and occupation of storage resources are increased.
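The scene-dependent nature of privacy described above can be pictured as a per-scenario policy table. The scenario names and field sets below are hypothetical and serve only to show how protection could be limited to what a given scenario requires.

```python
from typing import Dict, Set

# Hypothetical policy: which fields must be protected in each business scenario.
SCENARIO_POLICY: Dict[str, Set[str]] = {
    "statistical_analysis": {"name", "id_number", "mobile_number"},
    "bank_counter_transfer": set(),  # real identity information must be provided
}


def fields_to_protect(record: Dict[str, str], scenario: str) -> Set[str]:
    """Return the fields of this record that need protection in the scenario."""
    # Unknown scenarios fall back to protecting every field.
    protected = SCENARIO_POLICY.get(scenario, set(record))
    return protected & set(record)


record = {"name": "Zhang", "id_number": "32", "usage_kwh": "60"}
for_statistics = fields_to_protect(record, "statistical_analysis")
for_transfer = fields_to_protect(record, "bank_counter_transfer")
```

A uniform approach corresponds to always taking the fallback branch, which is where the extra computation and storage cost discussed above would come from.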
By adopting the data processing method, a privacy protection algorithm framework and technical logic are constructed for the privacy information in the original data set using multiple composite functions, and privacy is protected in a targeted manner through three stages: identification, data protection and logic protection. On the basis of integrating existing privacy protection technologies, this can reduce resource occupation, improve algorithm efficiency and achieve a better privacy protection effect, and can provide an effective overall solution and a common technical foundation for privacy protection in the big data era. The key points of the invention are the flows and algorithms for data privacy identification, privacy data protection and model privacy logic protection based on multiple composite functions. The data processing method performs differentiated selection and operation according to scene, purpose and degree, reflecting the algorithm optimization concept of "privacy on demand" and "protection on demand". It operates only on the privacy data that needs to be protected and applies different algorithm combinations to privacy information with different protection degrees, so that unnecessary calculation overhead and occupation of storage resources can be greatly reduced and the overall efficiency of privacy operations is improved.
Meanwhile, the data processing method can carry out differential selection and operation according to scenes, purposes and degrees, and is convenient for users; the method can also reflect the algorithm optimization concepts of 'privacy on demand' and 'protection on demand' from the data privacy identification stage to the final model privacy logic protection, only calculates the privacy data to be protected, adopts different algorithm combinations for privacy information with different protection degrees, and can greatly reduce unnecessary calculation overhead and occupation of storage resources, thereby improving the privacy calculation efficiency on the whole; privacy protection may also be performed according to scene differences.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.
According to another aspect of the embodiment of the present invention, there is also provided a data processing apparatus for implementing the above data processing method. As shown in fig. 9, the apparatus includes:
an obtaining unit 902, configured to obtain a first data set to be processed in a trusted environment, where the trusted environment is configured with a first data security level, where the first data set includes a plurality of pieces of data to be processed, and each piece of data to be processed includes at least one field information and target data matched with at least one piece of field information;
a parsing unit 904, configured to parse the to-be-processed data sequentially according to the field information and the target data, determine a private portion and a non-private portion from the to-be-processed data, aggregate the private portions corresponding to each of the plurality of to-be-processed data into a private data subset, and aggregate the non-private portions corresponding to each of the plurality of to-be-processed data into a non-private data subset, where the private data subset is used for processing in the trusted environment, the non-private data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level;
And an encryption unit 906, configured to encrypt the private data subset by using a target encryption method to obtain an encrypted data subset, and send the encrypted data subset and the non-private data subset to the non-trusted environment.
Alternatively, in this embodiment, the embodiments to be implemented by each unit module may refer to the embodiments of each method described above, which are not described herein again.
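The cooperation of the three units can be sketched as a small pipeline. The class name, its method names and the injected encrypt_value and send callables are hypothetical; the reversal used as the "encryption" in the usage lines is only a placeholder transform so that the flow can be followed end to end.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]


@dataclass
class DataProcessingApparatus:
    privacy_fields: Set[str]              # stands in for the field/feature matching
    encrypt_value: Callable[[str], str]   # stands in for the target encryption method

    def obtain(self, source: Iterable[Record]) -> List[Record]:
        """Obtaining unit: load the first data set inside the trusted environment."""
        return list(source)

    def parse(self, dataset: List[Record]) -> Tuple[List[Record], List[Record]]:
        """Parsing unit: split each record and aggregate the two subsets."""
        private_subset, non_private_subset = [], []
        for record in dataset:
            private_subset.append(
                {k: v for k, v in record.items() if k in self.privacy_fields})
            non_private_subset.append(
                {k: v for k, v in record.items() if k not in self.privacy_fields})
        return private_subset, non_private_subset

    def encrypt(self, private_subset: List[Record]) -> List[Record]:
        """Encryption unit: encrypt every value in the private subset."""
        return [{k: self.encrypt_value(v) for k, v in part.items()}
                for part in private_subset]

    def run(self, source: Iterable[Record],
            send: Callable[[List[Record], List[Record]], None]) -> None:
        private_subset, non_private_subset = self.parse(self.obtain(source))
        send(self.encrypt(private_subset), non_private_subset)


sent = []
apparatus = DataProcessingApparatus({"name"}, lambda v: v[::-1])  # placeholder transform
apparatus.run([{"name": "Zhang", "usage_kwh": "60"}],
              lambda enc, plain: sent.append((enc, plain)))
```

Passing the encryption routine and the transport in as callables keeps the sketch independent of any particular cipher or network layer, in line with the pluggable algorithm package described in the method embodiments.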
According to still another aspect of the embodiment of the present invention, there is also provided an electronic device for implementing the above data processing method, where the electronic device may be a terminal device or a server as shown in fig. 10. The present embodiment is described taking the electronic device as a terminal device as an example. As shown in fig. 10, the electronic device comprises a memory 1002 and a processor 1004, the memory 1002 having stored therein a computer program, the processor 1004 being arranged to perform the steps of any of the method embodiments described above by means of the computer program.
Alternatively, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of the computer network.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
S1, acquiring a first data set to be processed in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information;
S2, analyzing the data to be processed according to the field information and the target data in sequence, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level;
S3, carrying out encryption processing on the privacy data subset through a target encryption method to obtain an encryption data subset, and sending the encryption data subset and the non-privacy data subset to the non-trusted environment.
Alternatively, it will be understood by those skilled in the art that the structure shown in fig. 10 is only illustrative, and the electronic device may be a tablet computer, a palm computer, a mobile internet device (Mobile Internet Devices, MID), a PAD, or other terminal devices. Fig. 10 is not limited to the structure of the electronic device described above. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in FIG. 10, or have a different configuration than shown in FIG. 10.
The memory 1002 may be configured to store software programs and modules, such as program instructions/modules corresponding to the data processing methods and apparatuses in the embodiments of the present invention, and the processor 1004 executes the software programs and modules stored in the memory 1002 to perform various functional applications and data processing, that is, implement the data processing methods described above. The memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may be used to store, but is not limited to, file information such as a target logical file. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the acquisition unit 902, the parsing unit 904, and the encryption unit 906 in the data processing apparatus. In addition, other module units in the data processing apparatus may be included, but are not limited to, and are not described in detail in this example.
Optionally, the transmission device 1006 is configured to receive or transmit data via a network. Specific examples of the network described above may include wired networks and wireless networks. In one example, the transmission means 1006 includes a network adapter (Network Interface Controller, NIC) that can be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 1006 is a Radio Frequency (RF) module for communicating with the internet wirelessly.
In addition, the electronic device further includes: a display 1008, and a connection bus 1010 for connecting the various module components in the electronic device described above.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting the plurality of nodes through a network communication. Among them, the nodes may form a Peer-To-Peer (Peer To Peer) network, and any type of computing device, such as a server, a terminal, etc., may become a node in the blockchain system by joining the Peer-To-Peer network.
According to one aspect of the present application, a computer program product is provided, comprising a computer program/instructions containing program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via a communication portion, and/or installed from a removable medium. When executed by a central processing unit, the computer program performs the various functions provided by the embodiments of the present application.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
According to an aspect of the present application, there is provided a computer-readable storage medium storing computer instructions. A processor of a computer device reads the computer instructions from the storage medium and executes them, causing the computer device to perform the above-described data processing method.
Alternatively, in the present embodiment, the above-described computer-readable storage medium may be configured to store a computer program for performing the steps of:
S1, acquiring a first data set to be processed in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with the at least one piece of field information;
S2, analyzing the data to be processed according to the field information and the target data in sequence, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, and aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level;
S3, carrying out encryption processing on the privacy data subset through a target encryption method to obtain an encryption data subset, and sending the encryption data subset and the non-privacy data subset to the non-trusted environment.
Alternatively, in this embodiment, it will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be performed by a program instructing the hardware of a terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, Read-Only Memory (ROM), Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, comprising several instructions for causing one or more computer devices (which may be personal computers, servers or network devices, etc.) to perform all or part of the steps of the above-described method of the various embodiments of the present invention.
In the foregoing embodiments of the present invention, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, such as the above, is merely a logical function division, and may be implemented in another manner, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.

Claims (13)

1. A method of data processing, comprising:
acquiring a first data set to be processed in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set comprises a plurality of pieces of data to be processed, and each piece of data to be processed comprises at least one piece of field information and target data matched with at least one piece of field information;
Analyzing the data to be processed according to the field information and the target data in sequence, determining a privacy part and a non-privacy part from the data to be processed, aggregating the privacy parts corresponding to the data to be processed into a privacy data subset, aggregating the non-privacy parts corresponding to the data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in the non-trusted environment, and the second data security level configured by the non-trusted environment is lower than the first data security level;
and carrying out encryption processing on the privacy data subset through a target encryption method to obtain an encrypted data subset, and sending the encrypted data subset and the non-privacy data subset to the non-trusted environment.
2. The method according to claim 1, wherein sequentially parsing the data to be processed according to the field information and the target data, and determining the private portion and the non-private portion from the data to be processed includes:
Acquiring a target field library, wherein the target field library comprises a plurality of privacy field information used for indicating privacy parts in the data to be processed;
acquiring a data feature library, wherein the data feature library comprises a plurality of private data feature information used for indicating private parts in the data to be processed;
determining a privacy part from the data to be processed according to a first matching result of the field information included in the data to be processed and the target field library and a second matching result of target data characteristics of the target data and the data characteristic library included in the data to be processed;
and determining the data which does not comprise the privacy part in the data to be processed as the non-privacy part.
3. The method according to claim 2, wherein the determining the privacy portion from the data to be processed based on the first matching result of the field information included in the data to be processed and the target field library, and the second matching result of the target data feature of the target data included in the data to be processed and the data feature library includes:
Adding the reference field information and the target data matched with the reference field information to the privacy part of the data to be processed under the condition that the reference field information in the field information included in the data to be processed is the privacy field information included in the target field library;
acquiring reference data matched with reference field information in the field information, and acquiring data characteristic information of the reference data;
in the case where the data characteristic information of the reference data is private data characteristic information in the data characteristic library, the reference field information and the reference data are added to the private portion of the data to be processed.
4. The method according to claim 2, wherein, before sequentially parsing the data to be processed according to the field information and the target data and determining the privacy part and the non-privacy part from the data to be processed, the method further comprises at least one of:
adding privacy field information to the target field library in response to a first editing operation on the target field library;
deleting selected privacy field information from the target field library in response to a second editing operation on the target field library;
adding privacy data feature information to the data feature library in response to a third editing operation on the data feature library;
and deleting selected privacy data feature information from the data feature library in response to a fourth editing operation on the data feature library.
5. The method according to claim 2, wherein encrypting the privacy data subset by the target encryption method to obtain the encrypted data subset comprises:
obtaining the target encryption method from an encryption algorithm package, wherein the encryption algorithm package comprises at least one of the following: a symmetric encryption algorithm, an asymmetric encryption algorithm, and a password substitution algorithm;
and encrypting the privacy data subset according to the target encryption method to obtain the encrypted data subset.
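One way to picture the encryption algorithm package of claim 5 is as a registry from which the target encryption method is looked up by name. The sketch below is an assumption-laden illustration: the registry keys, the function names, and the digit substitution table are hypothetical, and both ciphers are toy stand-ins that are NOT secure. A real system would plug in vetted symmetric or asymmetric algorithms instead.

```python
import hashlib


def xor_stream_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy symmetric stand-in (NOT secure): XOR data with a SHA-256-derived keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# Hypothetical substitution table: digit d is replaced by the d-th character.
SUBSTITUTION = str.maketrans("0123456789", "7302918564")


def substitution_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy substitution stand-in (NOT secure); the key is unused here."""
    return data.decode("utf-8").translate(SUBSTITUTION).encode("utf-8")


# Hypothetical encryption algorithm package: name -> callable.
ENCRYPTION_ALGORITHM_PACKAGE = {
    "symmetric": xor_stream_encrypt,
    "substitution": substitution_encrypt,
}


def encrypt_subset(privacy_subset, method_name, key):
    """Encrypt every value in every privacy part with the selected method."""
    encrypt = ENCRYPTION_ALGORITHM_PACKAGE[method_name]
    return [
        {field: encrypt(str(value).encode("utf-8"), key) for field, value in part.items()}
        for part in privacy_subset
    ]


encrypted = encrypt_subset([{"phone": "13800000000"}], "substitution", b"")
```

Keeping every algorithm behind the same call signature lets the target encryption method be swapped without touching the code that walks the privacy data subset.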
6. The method according to claim 2, wherein encrypting the privacy data subset by the target encryption method to obtain the encrypted data subset comprises at least one of:
acquiring the privacy part of one piece of data to be processed from the privacy data subset, encrypting the privacy part according to the target encryption method to obtain a first encrypted part, and adding the first encrypted part to the encrypted data subset;
in a case where the target encryption method comprises a plurality of target encryption algorithms, acquiring the privacy part of one piece of data to be processed from the privacy data subset, encrypting the target data corresponding to different field information in the privacy part with the target encryption algorithm matched with that target data, and adding the resulting second encrypted part to the encrypted data subset;
in a case where the target encryption method comprises a plurality of target encryption algorithms, acquiring the privacy part of one piece of data to be processed, and sequentially acquiring target privacy data corresponding to target privacy field information in the privacy part; in a case where the target privacy data is a numerical sequence, encrypting each of a plurality of numerical subsequences in the numerical sequence with the target encryption algorithm matched with that numerical subsequence; and adding the resulting third encrypted part to the encrypted data subset.
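The third option of claim 6, in which different numerical subsequences of one value are encrypted with different algorithms, might look like the sketch below. The split positions, the pairing of subsequences with algorithms, and both digit ciphers are assumptions for illustration. The ciphers are toy examples and NOT secure.

```python
def shift_digits(digits: str, shift: int) -> str:
    """Toy cipher (NOT secure): add `shift` to each digit modulo 10."""
    return "".join(str((int(d) + shift) % 10) for d in digits)


def substitute_digits(digits: str, table: str) -> str:
    """Toy cipher (NOT secure): replace digit d with table[d]."""
    return "".join(table[int(d)] for d in digits)


# Hypothetical plan: (start, end, algorithm) for each numerical subsequence.
SUBSEQUENCE_PLAN = [
    (0, 4, lambda s: shift_digits(s, 3)),
    (4, 8, lambda s: substitute_digits(s, "7302918564")),
    (8, None, lambda s: shift_digits(s, 7)),
]


def encrypt_numeric_sequence(sequence: str) -> str:
    """Encrypt each subsequence with its matched algorithm and rejoin the pieces."""
    if not sequence.isdecimal():
        raise ValueError("expected a numerical sequence")
    return "".join(algo(sequence[start:end]) for start, end, algo in SUBSEQUENCE_PLAN)


third_encrypted_part = encrypt_numeric_sequence("123456789012")
```

Because each subsequence uses its own algorithm, recovering one segment's transform does not directly reveal the others, which appears to be the motivation for splitting the sequence.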
7. The method according to claim 2, wherein, before encrypting the privacy data subset according to the target encryption method to obtain the encrypted data subset, the method further comprises:
acquiring a data type of the target data included in the privacy data subset;
and in a case where the data type is a text type, converting the text-type target data into numerical-type target data through a target matching rule.
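The claim does not spell out the target matching rule. As one hypothetical rule, each character can be mapped to its Unicode code point, zero-padded to a fixed width, so that the text becomes a reversible numerical sequence. The function names and the width constant below are assumptions for illustration.

```python
WIDTH = 7  # decimal digits per character; the largest code point is 1114111


def text_to_numeric(text: str) -> str:
    """Map each character to its zero-padded Unicode code point."""
    return "".join(f"{ord(ch):0{WIDTH}d}" for ch in text)


def numeric_to_text(digits: str) -> str:
    """Invert text_to_numeric by reading fixed-width groups of digits."""
    return "".join(chr(int(digits[i:i + WIDTH])) for i in range(0, len(digits), WIDTH))


def normalize(value):
    """Convert text-type target data to a numerical sequence; leave other types unchanged."""
    if isinstance(value, str):
        return text_to_numeric(value)
    return value


numeric = text_to_numeric("Ab")
```

A fixed width keeps the mapping unambiguous, so the trusted environment can restore the original text after decryption.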
8. The method according to claim 1, wherein, after obtaining the first data set to be processed in the trusted environment, the method further comprises:
processing the first data set according to a data processing method matched with a service requirement to obtain a second data set;
encrypting a privacy processing step in the data processing method according to a first encryption method to obtain privacy step data;
and sending the second data set, the privacy step data, and non-privacy processing data corresponding to a non-privacy processing step in the data processing method to the non-trusted environment.
9. The method according to claim 1, wherein, after encrypting the privacy data subset by the target encryption method to obtain the encrypted data subset, the method further comprises:
encrypting the target encryption method according to a second encryption method to obtain method encryption data;
and sending the method encryption data to the non-trusted environment.
10. A data processing apparatus, comprising:
an obtaining unit, configured to obtain a first data set to be processed in a trusted environment, wherein the trusted environment is configured with a first data security level, the first data set includes a plurality of pieces of data to be processed, and each piece of data to be processed includes at least one piece of field information and target data matched with the at least one piece of field information;
a parsing unit, configured to sequentially parse the data to be processed according to the field information and the target data, determine a privacy part and a non-privacy part from the data to be processed, aggregate the privacy parts corresponding to the pieces of data to be processed into a privacy data subset, and aggregate the non-privacy parts corresponding to the pieces of data to be processed into a non-privacy data subset, wherein the privacy data subset is used for processing in the trusted environment, the non-privacy data subset is used for processing in a non-trusted environment, and a second data security level configured for the non-trusted environment is lower than the first data security level;
and an encryption unit, configured to encrypt the privacy data subset by a target encryption method to obtain an encrypted data subset, and to send the encrypted data subset and the non-privacy data subset to the non-trusted environment.
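The parsing and encryption units of claim 10 can be read as one pipeline over the first data set. The sketch below is a hypothetical arrangement: the function name process and its three callable parameters (is_private, encrypt, send) are assumptions, and the encrypt argument in the usage lines is a visible placeholder rather than a real cipher.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def process(
    first_data_set: List[Record],
    is_private: Callable[[str, object], bool],
    encrypt: Callable[[object], object],
    send: Callable[[List[Record], List[Record]], None],
) -> None:
    """Split each record, aggregate the parts into subsets, encrypt, and send."""
    privacy_subset: List[Record] = []
    non_privacy_subset: List[Record] = []
    # Parsing unit: determine the privacy and non-privacy part of each record.
    for record in first_data_set:
        privacy_subset.append({f: v for f, v in record.items() if is_private(f, v)})
        non_privacy_subset.append({f: v for f, v in record.items() if not is_private(f, v)})
    # Encryption unit: encrypt the privacy data subset before it leaves the trusted environment.
    encrypted_subset = [{f: encrypt(v) for f, v in part.items()} for part in privacy_subset]
    send(encrypted_subset, non_privacy_subset)


sent = []  # stands in for the channel to the non-trusted environment
process(
    [{"phone": "138", "kwh": 1}],
    is_private=lambda field, value: field == "phone",
    encrypt=lambda value: f"enc({value})",  # placeholder, not a real cipher
    send=lambda enc, plain: sent.append((enc, plain)),
)
```

Only the encrypted subset and the non-privacy subset are passed to send. The plaintext privacy subset never leaves the function, mirroring the trusted and non-trusted split in the claim.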
11. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program when run performs the method of any one of claims 1 to 9.
12. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 9.
13. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, and the processor is arranged to execute, by means of the computer program, the method according to any one of claims 1 to 9.
CN202311379517.1A 2023-10-23 2023-10-23 Data processing method and device Pending CN117313158A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311379517.1A CN117313158A (en) 2023-10-23 2023-10-23 Data processing method and device

Publications (1)

Publication Number Publication Date
CN117313158A true CN117313158A (en) 2023-12-29

Family

ID=89281143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311379517.1A Pending CN117313158A (en) 2023-10-23 2023-10-23 Data processing method and device

Country Status (1)

Country Link
CN (1) CN117313158A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112395630A (en) * 2020-11-26 2021-02-23 平安普惠企业管理有限公司 Data encryption method and device based on information security, terminal equipment and medium
CN115021908A (en) * 2022-05-30 2022-09-06 中电长城网际系统应用有限公司 Privacy removing method and device for triple composite function, computer equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination