CN110147680B - Method for optimizing data extraction - Google Patents
- Publication number
- CN110147680B (application CN201910456202.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- desensitization
- field
- encryption
- result set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6227—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Storage Device Security (AREA)
Abstract
The invention discloses a method for optimizing data extraction and relates to the technical field of data processing. The method comprises the following steps: collecting data; judging whether the collected data is non-private data or private data; if the collected data is non-private data, outputting it directly and generating a result set; if the collected data is private data, further judging whether the private data has an association requirement; if an association requirement exists, encrypting the private data by means of data encryption; if no association requirement exists, desensitizing the private data by means of data desensitization; and, after the encryption or desensitization operation, outputting the data and generating the result set, wherein the data in the result set can be output externally. The invention protects the collected private data while still collecting complete data, and prevents private data from being output directly, which would expose the user's privacy.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a method for optimizing data extraction.
Background
A credit information platform needs to use a large amount of enterprise and personal data, and as the platform collects more and more data, it inevitably collects some private data relating to personal privacy. During collection and use, the extracted data is generally transmitted directly to the domain where it is needed, so personal privacy is easily leaked.
Of course, with the development of network technology, the protection of private data has become increasingly important. At present, credit information platforms usually perform a data desensitization operation on private data. Such an operation generally provides desensitization methods for ten common kinds of private data in total, such as identity card numbers, contact telephone numbers, email addresses, and home addresses; if a project has special private data that needs desensitization, the method must be extended to cover it. In addition, when encrypted fields are configured in the data desensitization stage, the input and output of each field must be maintained one by one, so maintenance takes a long time when the data volume is large and the fields are numerous.
Disclosure of Invention
To address the needs and shortcomings of the prior art, the invention provides a method for optimizing data extraction.
To solve the above technical problems, the method for optimizing data extraction of the invention adopts the following technical scheme:
A method for optimizing data extraction, the method comprising:
1) collecting data;
2) judging whether the collected data is non-private data or private data:
2a) if the collected data is non-private data, executing step 3);
2b) if the collected data is private data, further judging whether the private data has an association requirement:
2b-1) if an association requirement exists, encrypting the private data by means of data encryption, and executing step 3);
2b-2) if no association requirement exists, desensitizing the private data by means of data desensitization, and executing step 3);
3) outputting the data and generating a result set, wherein the data in the result set can be output externally.
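The branching in steps 1) to 3) can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `ExtractionRouter`, the `Sensitivity` enum, and the idea of supplying a per-field classification map are all hypothetical, since the text does not say how data is judged private or how an association requirement is detected. The encryption and desensitization operations are passed in as plain functions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Sketch of steps 1) to 3): route each collected field by privacy and association requirement. */
class ExtractionRouter {

    /** Hypothetical per-field classification; the source does not define how it is obtained. */
    enum Sensitivity { NON_PRIVATE, PRIVATE_WITH_ASSOCIATION, PRIVATE_NO_ASSOCIATION }

    private final UnaryOperator<String> encryptor;    // applied in step 2b-1)
    private final UnaryOperator<String> desensitizer; // applied in step 2b-2)

    ExtractionRouter(UnaryOperator<String> encryptor, UnaryOperator<String> desensitizer) {
        this.encryptor = encryptor;
        this.desensitizer = desensitizer;
    }

    /** Builds the result set (step 3) from the collected fields and their classification. */
    Map<String, String> buildResultSet(Map<String, String> collected,
                                       Map<String, Sensitivity> classification) {
        Map<String, String> resultSet = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : collected.entrySet()) {
            // Assumption: an unclassified field is treated as private without association,
            // so it is desensitized rather than output in the clear.
            Sensitivity s = classification.getOrDefault(
                    field.getKey(), Sensitivity.PRIVATE_NO_ASSOCIATION);
            String value = field.getValue();
            switch (s) {
                case NON_PRIVATE:
                    resultSet.put(field.getKey(), value);                     // step 2a)
                    break;
                case PRIVATE_WITH_ASSOCIATION:
                    resultSet.put(field.getKey(), encryptor.apply(value));    // step 2b-1)
                    break;
                default:
                    resultSet.put(field.getKey(), desensitizer.apply(value)); // step 2b-2)
                    break;
            }
        }
        return resultSet;
    }
}
```

A caller would construct the router with its own encryption and desensitization functions, for example `new ExtractionRouter(v -> cipher.encrypt(v), v -> "***")`, where `cipher` is whatever encryption component the platform uses.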
Specifically, the process of encrypting the private data includes:
reading all field information of the private data through Java;
screening the fields that need encryption according to the passed-in parameters;
encrypting the fields that need encryption with the SM4 national cryptographic symmetric algorithm;
adding the encrypted fields to the specified fields of the result set;
the data contained in the processed result set can then be output externally.
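The field-level encryption process above can be sketched as follows. One substitution must be stated plainly: the standard JDK does not ship an SM4 cipher, so this sketch uses AES-128 in GCM mode from `javax.crypto` as a stand-in; a real deployment would obtain SM4 from a cryptographic provider that implements it. SM4, like AES-128, uses a 128-bit key and a 128-bit block, so the key handling is comparable. The class name `FieldEncryptor`, the Base64 output format, and the use of a random IV are assumptions of this sketch. Because of the random IV, equal plaintexts yield different ciphertexts, so any association would have to be done after decryption; the source does not say how association is performed.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Sketch: encrypt only the screened fields of a record and place them in the result set. */
class FieldEncryptor {
    private static final int IV_BYTES = 12;   // recommended IV length for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    private final SecretKeySpec key;
    private final SecureRandom random = new SecureRandom();

    /** @param key16 a 16-byte key (AES-128 here as a stand-in for SM4's 128-bit key) */
    FieldEncryptor(byte[] key16) {
        this.key = new SecretKeySpec(key16, "AES");
    }

    /** Encrypts one value; the output is Base64(IV || ciphertext || tag). */
    String encrypt(String plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reverses encrypt(), so that encrypted fields can be associated after decryption. */
    String decrypt(String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    /** Reads all fields, encrypts those named in fieldsToEncrypt, and returns the result set. */
    Map<String, String> encryptFields(Map<String, String> record, Set<String> fieldsToEncrypt) {
        Map<String, String> resultSet = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            boolean selected = fieldsToEncrypt.contains(field.getKey());
            resultSet.put(field.getKey(), selected ? encrypt(field.getValue()) : field.getValue());
        }
        return resultSet;
    }
}
```

Here `fieldsToEncrypt` plays the role of the passed-in parameters that select which fields to encrypt.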
Specifically, when the SM4 national cryptographic symmetric algorithm is used to encrypt the fields that need encryption:
the encryption key supports two modes, a fixed key and a custom key;
the fixed key is stored at a specified path;
the custom key applies to specific types of private data.
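The two key modes can be illustrated with a small resolver. This is a sketch under assumptions: the class `KeyResolver`, the idea of keying custom keys by a data-type name, and the storage of the fixed key as 16 raw bytes in a file are all hypothetical, because the source only states that the fixed key lives at a specified path and that custom keys apply to specific types of private data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Sketch: choose a custom key for a data type if one is registered, else the fixed key. */
class KeyResolver {
    private static final int KEY_BYTES = 16; // SM4 uses a 128-bit key

    private final Path fixedKeyPath;
    private final Map<String, byte[]> customKeys = new HashMap<>();

    /** @param fixedKeyPath file assumed to hold the fixed key as 16 raw bytes */
    KeyResolver(Path fixedKeyPath) {
        this.fixedKeyPath = fixedKeyPath;
    }

    /** Registers a custom key for one specific type of private data. */
    void registerCustomKey(String dataType, byte[] key) {
        customKeys.put(dataType, requireLength(key).clone());
    }

    /** Returns the custom key for dataType if present, otherwise the fixed key from the path. */
    byte[] keyFor(String dataType) {
        byte[] custom = customKeys.get(dataType);
        if (custom != null) {
            return custom.clone();
        }
        try {
            return requireLength(Files.readAllBytes(fixedKeyPath));
        } catch (IOException e) {
            throw new UncheckedIOException("cannot read fixed key", e);
        }
    }

    private static byte[] requireLength(byte[] key) {
        if (key == null || key.length != KEY_BYTES) {
            throw new IllegalArgumentException("key must be " + KEY_BYTES + " bytes");
        }
        return key;
    }
}
```

In practice the key file would need restricted file permissions or a dedicated key store; that concern is outside what the source describes.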
Optionally, the process of desensitizing the private data includes:
reading all field information of the private data through Java;
screening out the fields that need desensitization and their desensitization types;
performing the corresponding desensitization operation on those fields in Java according to the desensitization type;
adding the desensitized fields to the specified fields of the result set;
the data contained in the processed result set can then be output externally.
Further, during the desensitization of the private data:
when the fields that need desensitization are screened, the desensitization types are divided according to the screening results;
desensitization rules are defined according to the desensitization types, the rules including masking, transformation, substitution, randomization, format-preserving encryption, and data encryption algorithms;
in Java, at least one desensitization rule is selected according to the desensitization type of a field that needs desensitization, and the field is desensitized accordingly.
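Selecting a rule by desensitization type can be sketched as below, using masking as the only rule for brevity. The class `Desensitizer`, the three types in `DesensitizationType`, and the numbers of characters kept at each end are illustrative assumptions; the source does not fix any of them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: pick a masking rule according to the desensitization type of each field. */
class Desensitizer {

    /** Hypothetical desensitization types; the source does not enumerate them. */
    enum DesensitizationType { PHONE, ID_CARD, EMAIL }

    /** Masking rule: keep keepHead leading and keepTail trailing characters, star the rest. */
    static String mask(String value, int keepHead, int keepTail) {
        if (value == null) {
            return null;
        }
        int n = value.length();
        if (keepHead + keepTail >= n) {
            return stars(n); // value too short to keep anything safely
        }
        return value.substring(0, keepHead) + stars(n - keepHead - keepTail)
                + value.substring(n - keepTail);
    }

    /** Applies the rule chosen for the given desensitization type. */
    static String apply(DesensitizationType type, String value) {
        if (value == null) {
            return null;
        }
        switch (type) {
            case PHONE:
                return mask(value, 3, 4);
            case ID_CARD:
                return mask(value, 6, 4);
            case EMAIL: {
                int at = value.indexOf('@');
                if (at < 1) {
                    return stars(value.length());
                }
                // keep the first character of the local part and the whole domain
                return value.charAt(0) + stars(at - 1) + value.substring(at);
            }
            default:
                throw new IllegalArgumentException("unsupported type: " + type);
        }
    }

    /** Desensitizes the screened fields and copies all fields into the result set. */
    static Map<String, String> desensitizeFields(Map<String, String> record,
                                                 Map<String, DesensitizationType> typesByField) {
        Map<String, String> resultSet = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            DesensitizationType type = typesByField.get(field.getKey());
            resultSet.put(field.getKey(),
                    type == null ? field.getValue() : apply(type, field.getValue()));
        }
        return resultSet;
    }

    private static String stars(int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append('*');
        }
        return sb.toString();
    }
}
```

For example, `Desensitizer.apply(DesensitizationType.PHONE, "13812345678")` returns `138****5678`.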
Optionally, the process of desensitizing the private data includes:
locating the sensitive fields in the private data;
formulating generation rules for the sensitive fields and storing them in a sensitive-field generation rule base;
reading the original private data; whenever a sensitive field is found, invoking the sensitive-field generation rule base and using the corresponding generation rule to generate a new field different from the sensitive field; replacing the sensitive field with the new field according to a given transformation rule, until all sensitive fields in the original private data have been replaced, thereby forming desensitized fields; storing the desensitized fields in the result set; and outputting the data contained in the result set.
Further, the generation rules formulated for the sensitive fields include masking, transformation, substitution, randomization, format-preserving encryption, and data encryption algorithms.
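The rule-base variant can be sketched as a map from sensitive field names to generation rules. The class `SensitiveFieldRuleBase` and the two example rules (fixed substitution and digit randomization) are hypothetical; they stand in for whichever of the listed rule kinds a deployment registers. Note one limitation of this sketch: a randomly generated value could coincide with the original by chance, so a stricter version would regenerate until the new value differs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;
import java.util.function.UnaryOperator;

/** Sketch: a rule base that maps sensitive field names to generation rules. */
class SensitiveFieldRuleBase {
    private final Map<String, UnaryOperator<String>> rules = new LinkedHashMap<>();

    /** Stores the generation rule for one sensitive field in the rule base. */
    void register(String fieldName, UnaryOperator<String> rule) {
        rules.put(fieldName, rule);
    }

    /** Substitution rule: replace the value with a fixed placeholder. */
    static UnaryOperator<String> substitution(String placeholder) {
        return value -> placeholder;
    }

    /** Randomization rule: replace each digit with a random digit, keeping length and format. */
    static UnaryOperator<String> randomDigits(Random random) {
        return value -> {
            StringBuilder sb = new StringBuilder(value.length());
            for (int i = 0; i < value.length(); i++) {
                char ch = value.charAt(i);
                sb.append(Character.isDigit(ch) ? (char) ('0' + random.nextInt(10)) : ch);
            }
            return sb.toString();
        };
    }

    /** Replaces every field that has a rule; other fields pass through unchanged. */
    Map<String, String> desensitize(Map<String, String> original) {
        Map<String, String> resultSet = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : original.entrySet()) {
            UnaryOperator<String> rule = rules.get(field.getKey());
            resultSet.put(field.getKey(),
                    rule == null ? field.getValue() : rule.apply(field.getValue()));
        }
        return resultSet;
    }
}
```

Keeping the rules in a single registry means a new kind of sensitive field is handled by registering one more rule rather than by changing the replacement loop.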
Specifically, the data can be collected by connecting to an external device through a universal interface, or through manual input.
Compared with the prior art, the method for optimizing data extraction has the following beneficial effects:
The method collects data, judges the type of the collected data, further judges whether the private data has an association requirement, and, according to that requirement, outputs the collected data safely after an encryption or data desensitization operation. Private data is thus protected while complete data is still collected, and is prevented from being output directly, which would expose the user's privacy.
Drawings
FIG. 1 is a schematic flow diagram of the present invention.
Detailed Description
In order to make the technical solutions, technical problems to be solved, and technical effects of the present invention more clearly apparent, the technical solutions of the present invention are described below in detail and completely with reference to specific embodiments, and it is obvious that the described embodiments are only a part of embodiments of the present invention, but not all embodiments. All embodiments that can be obtained by a person skilled in the art without making any inventive step on the basis of the embodiments of the present invention are within the scope of protection of the present invention.
Embodiment one:
With reference to FIG. 1, this embodiment provides a method for optimizing data extraction, the method comprising:
1) collecting data by a credit information platform;
2) judging whether the collected data is non-private data or private data:
2a) if the collected data is non-private data, executing step 3);
2b) if the collected data is private data, further judging whether the private data has an association requirement:
2b-1) if an association requirement exists, encrypting the private data by means of data encryption, and executing step 3);
2b-2) if no association requirement exists, desensitizing the private data by means of data desensitization, and executing step 3);
3) outputting the data and generating a result set, wherein the data in the result set can be output externally.
In this embodiment, the process of encrypting the private data includes:
reading all field information of the private data through Java;
screening the fields that need encryption according to the passed-in parameters;
encrypting the fields that need encryption with the SM4 national cryptographic symmetric algorithm;
adding the encrypted fields to the specified fields of the result set;
the data contained in the processed result set can then be output externally.
In this embodiment, when the SM4 national cryptographic symmetric algorithm is used to encrypt the fields that need encryption:
the encryption key supports two modes, a fixed key and a custom key;
the fixed key is stored at a specified path;
the custom key applies to specific types of private data.
In this embodiment, the process of desensitizing the private data includes:
reading all field information of the private data through Java;
screening out the fields that need desensitization and their desensitization types;
performing the corresponding desensitization operation on those fields in Java according to the desensitization type;
adding the desensitized fields to the specified fields of the result set;
the data contained in the processed result set can then be output externally.
During the desensitization of the private data:
when the fields that need desensitization are screened, the desensitization types are divided according to the screening results;
desensitization rules are defined according to the desensitization types, the rules including masking, transformation, substitution, randomization, format-preserving encryption, and data encryption algorithms;
in Java, at least one desensitization rule is selected according to the desensitization type of a field that needs desensitization, and the field is desensitized accordingly.
In this embodiment, the credit information platform collects data by connecting to an external device through a universal interface.
Embodiment two:
With reference to FIG. 1, this embodiment provides a method for optimizing data extraction, the method comprising:
1) collecting data;
2) judging whether the collected data is non-private data or private data:
2a) if the collected data is non-private data, executing step 3);
2b) if the collected data is private data, further judging whether the private data has an association requirement:
2b-1) if an association requirement exists, encrypting the private data by means of data encryption, and executing step 3);
2b-2) if no association requirement exists, desensitizing the private data by means of data desensitization, and executing step 3);
3) outputting the data and generating a result set, wherein the data in the result set can be output externally.
In this embodiment, the process of encrypting the private data includes:
reading all field information of the private data through Java;
screening the fields that need encryption according to the passed-in parameters;
encrypting the fields that need encryption with the SM4 national cryptographic symmetric algorithm;
adding the encrypted fields to the specified fields of the result set;
the data contained in the processed result set can then be output externally.
In this embodiment, when the SM4 national cryptographic symmetric algorithm is used to encrypt the fields that need encryption:
the encryption key supports two modes, a fixed key and a custom key;
the fixed key is stored at a specified path;
the custom key applies to specific types of private data.
In this embodiment, the process of desensitizing the private data includes:
locating the sensitive fields in the private data;
formulating generation rules for the sensitive fields and storing them in a sensitive-field generation rule base;
reading the original private data; whenever a sensitive field is found, invoking the sensitive-field generation rule base and using the corresponding generation rule to generate a new field different from the sensitive field; replacing the sensitive field with the new field according to a given transformation rule, until all sensitive fields in the original private data have been replaced, thereby forming desensitized fields; storing the desensitized fields in the result set; and outputting the data contained in the result set.
The generation rules formulated for the sensitive fields include masking, transformation, substitution, randomization, format-preserving encryption, and data encryption algorithms.
In this embodiment, the credit information platform collects data not only by connecting to an external device through a universal interface but also through manual input.
In summary, the method for optimizing data extraction of the invention collects data, judges the type of the collected data, further judges whether the private data has an association requirement, and, according to that requirement, outputs the collected data safely after an encryption or data desensitization operation.
The principle and embodiments of the present invention have been described in detail using specific examples. These examples serve only to aid understanding of the core technical content of the invention and do not limit its scope of protection; the technical solution of the invention is not limited to the specific embodiments above. Any improvements and modifications made by those skilled in the art on the basis of the above embodiments, without departing from the principle of the invention, shall fall within the scope of protection of the invention.
Claims (6)
1. A method for optimizing data extraction, the method comprising:
1) collecting data;
2) judging whether the collected data is non-private data or private data:
2a) if the collected data is non-private data, executing step 3);
2b) if the collected data is private data, further judging whether the private data has an association requirement:
2b-1) if an association requirement exists, encrypting the private data by means of data encryption, the specific encryption process comprising: reading all field information of the private data through Java, screening the fields that need encryption according to the passed-in parameters, encrypting those fields with the SM4 national cryptographic symmetric algorithm, adding the encrypted fields to the specified fields of the result set, outputting the data contained in the processed result set externally, and executing step 3);
2b-2) if no association requirement exists, desensitizing the private data by means of data desensitization, the specific desensitization process comprising: reading all field information of the private data through Java, screening out the fields that need desensitization and their desensitization types, performing the corresponding desensitization operation on those fields in Java according to the desensitization type, adding the desensitized fields to the specified fields of the result set, outputting the data contained in the processed result set externally, and executing step 3);
3) outputting the data and generating a result set, wherein the data in the result set can be output externally.
2. The method for optimizing data extraction according to claim 1, wherein, when the SM4 national cryptographic symmetric algorithm is used to encrypt the fields that need encryption:
the encryption key supports two modes, a fixed key and a custom key;
the fixed key is stored at a specified path;
the custom key applies to specific types of private data.
3. The method for optimizing data extraction according to claim 1, wherein, during the desensitization of the private data:
when the fields that need desensitization are screened, the desensitization types are divided according to the screening results;
desensitization rules are defined according to the desensitization types, the rules including masking, transformation, substitution, randomization, format-preserving encryption, and data encryption algorithms;
in Java, at least one desensitization rule is selected according to the desensitization type of a field that needs desensitization, and the field is desensitized accordingly.
4. The method for optimizing data extraction according to claim 1, wherein the process of desensitizing the private data includes:
locating the sensitive fields in the private data;
formulating generation rules for the sensitive fields and storing them in a sensitive-field generation rule base;
reading the original private data; whenever a sensitive field is found, invoking the sensitive-field generation rule base and using the corresponding generation rule to generate a new field different from the sensitive field; replacing the sensitive field with the new field according to a given transformation rule, until all sensitive fields in the original private data have been replaced, thereby forming desensitized fields; storing the desensitized fields in the result set; and outputting the data contained in the result set.
5. The method for optimizing data extraction according to claim 4, wherein the generation rules formulated for the sensitive fields include masking, transformation, substitution, randomization, format-preserving encryption, and data encryption algorithms.
6. The method for optimizing data extraction according to claim 1, wherein the data can be collected by connecting to an external device through a universal interface, or through manual input.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910456202.XA CN110147680B (en) | 2019-05-29 | 2019-05-29 | Method for optimizing data extraction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110147680A (en) | 2019-08-20 |
CN110147680B (en) | 2022-07-26 |
Family
ID=67593715
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910456202.XA Active CN110147680B (en) | 2019-05-29 | 2019-05-29 | Method for optimizing data extraction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110147680B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203145A (en) * | 2016-08-04 | 2016-12-07 | 北京网智天元科技股份有限公司 | Data desensitization method and relevant device |
CN108418676A (en) * | 2018-01-26 | 2018-08-17 | 山东超越数控电子股份有限公司 | A kind of data desensitization method based on permission |
CN109614816A (en) * | 2018-11-19 | 2019-04-12 | 平安科技(深圳)有限公司 | Data desensitization method, device and storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9467279B2 (en) * | 2014-09-26 | 2016-10-11 | Intel Corporation | Instructions and logic to provide SIMD SM4 cryptographic block cipher functionality |
Also Published As
Publication number | Publication date |
---|---|
CN110147680A (en) | 2019-08-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110457912B (en) | Data processing method and device and electronic equipment | |
CN104660589B (en) | Method, system and terminal for encrypting control and information analysis of information | |
CN109698884B (en) | Fraud call identification method and system | |
Anwar et al. | Forensic SIM card cloning using authentication algorithm | |
CN110933063A (en) | Data encryption method, data decryption method and equipment | |
WO2020233014A1 (en) | Message sending method and apparatus, and computer device and storage medium | |
CN112529586B (en) | Transaction information management method, device, equipment and storage medium | |
CN113836578A (en) | Method and system for maintaining security of sensitive data of big data | |
Kitsaki et al. | A forensic investigation of Android mobile applications | |
Yuliani et al. | Forensic analysis whatsapp mobile application on android-based smartphones using national institute of standard and technology (nist) framework | |
CN113946862A (en) | Data processing method, device and equipment and readable storage medium | |
CN112287371B (en) | Method and device for storing industrial data and computer equipment | |
CN110147680B (en) | Method for optimizing data extraction | |
Al-Mousa et al. | Examining Digital Forensic Evidence for Android Applications | |
CN116644472A (en) | Data encryption and data decryption methods and devices, electronic equipment and storage medium | |
CN116861477A (en) | Data processing method, system, terminal and storage medium based on privacy protection | |
CN115422579A (en) | Data encryption storage and query method and system after storage | |
Sulisdyantoro et al. | Identification of Whatsapp digital evidence on Android smartphones using the Android backup APK (application package kit) downgrade method | |
CN102768671B (en) | File processing method and system | |
CN110990848A (en) | Sensitive word encryption method and device based on hive data warehouse and storage medium | |
CN111914271B (en) | Privacy protection system and method for big data release | |
CN116776365A (en) | Data query method, device and storage medium | |
CN115292729A (en) | Privacy-protecting multi-party data processing method, device and equipment | |
CN113674083A (en) | Internet financial platform credit risk monitoring method, device and computer system | |
CN115080987A (en) | Password management method, device, system, storage medium and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: 271000 Langchao science and Technology Park, 527 Dongyue street, Tai'an City, Shandong Province; Applicant after: INSPUR SOFTWARE Co.,Ltd.; Address before: No. 1036, Shandong high tech Zone wave road, Ji'nan, Shandong; Applicant before: INSPUR SOFTWARE Co.,Ltd. |
GR01 | Patent grant | ||