CN108683643B - Data desensitization system based on streaming processing and desensitization method thereof - Google Patents

Data desensitization system based on streaming processing and desensitization method thereof

Info

Publication number
CN108683643B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810378506.4A
Other languages
Chinese (zh)
Other versions
CN108683643A (en)
Inventor
张黎
邹开红
詹金凯
肖增辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Flash It Co ltd
Original Assignee
Flash It Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Flash It Co ltd
Priority to CN201810378506.4A
Publication of CN108683643A
Application granted
Publication of CN108683643B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/157 Transformation using dictionaries or tables

Abstract

The invention relates to the field of network information communication security, and in particular to a data desensitization system based on streaming processing and a desensitization method thereof. The system comprises an acquisition module for acquiring data, a desensitization module for desensitizing the data, and a sending module for sending the desensitized data, and further comprises a cache module and a judging module. The judging module judges whether the one or more data packets currently received contain a complete record line and stores the collected data packets in the cache module. Unlike the prior art, the invention desensitizes the data sent by a server in a streaming manner, which avoids the need for a large-capacity storage space for data caching, gives a high desensitization speed, and reduces the delay problem.

Description

Data desensitization system based on streaming processing and desensitization method thereof
Technical Field
The invention relates to the field of network information communication security, in particular to a data desensitization system based on streaming processing and a desensitization method thereof.
Background
With the progress of the times, Internet communication and its applications have ushered in and sustained the era of big data. Compared with traditional data, big data is characterized by large application traffic, high speed, and many types. The Internet has therefore become an open, complex system that makes communication more convenient but also brings complex and unknown problems, including network security threats and risks.
In big data processing, data security during storage and distribution has become a focus of attention, and against this background data desensitization technology has come into use. Data desensitization transforms sensitive information according to desensitization rules so that sensitive private data is reliably protected. Where customer security data or commercially sensitive data is involved, the real data is modified, without violating system rules, before being provided for test use. Personal information such as identification numbers, mobile phone numbers, card numbers, and customer numbers needs to be desensitized.
In the prior art, data desensitization is implemented as follows. The data packet to be desensitized is first cached. If the current data packet is a complete protocol packet, the data is converted according to the set rules and the desensitization process is complete. If it is not a complete protocol packet, data continues to be cached until the cached data forms a complete protocol packet, and the cached data is then converted according to the set rules. This approach has two defects.
Defect one: the cache size required during desensitization is unpredictable. Because data may need to be cached before desensitization, the amount cached depends on the specific database query and cannot be known in advance. When several desensitization operations are in progress at once, the system's available storage space may be insufficient for it to keep operating properly.
Defect two: real-time response is poor. All data must be cached first, and the result can be returned only after the cached data has been desensitized. Caching all the data and then desensitizing a large volume of it takes a long time, so the desensitized data is returned with a delay.
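The two defects can be made concrete with a small calculation. The sketch below is illustrative only: it assumes that record lines are delimited by a newline byte (the source does not say how record lines are delimited), and both function names are hypothetical. It compares the peak number of bytes held when a whole response is cached before desensitization with the peak when only the current incomplete record line is held.

```python
def peak_buffer_whole_response(packets: list[bytes]) -> int:
    """Prior-art style: everything is cached before desensitization starts."""
    return sum(len(p) for p in packets)


def peak_buffer_per_record(packets: list[bytes], delimiter: bytes = b"\n") -> int:
    """Streaming style: only bytes after the last complete record line stay cached."""
    held = b""
    peak = 0
    for packet in packets:
        held += packet
        peak = max(peak, len(held))
        # Everything up to and including the last delimiter can be
        # desensitized and released, so only the tail remains cached.
        cut = held.rfind(delimiter)
        if cut != -1:
            held = held[cut + len(delimiter):]
    return peak


# Assumed sample data: two record lines spread over three packets.
sample = [b"id=1,tel=158", b"32321212\nid=2,", b"tel=13900001111\n"]
whole = peak_buffer_whole_response(sample)
streamed = peak_buffer_per_record(sample)
```

With the whole-response strategy the peak grows with the size of the query result, whereas with the per-record strategy it is bounded by roughly one record line plus one packet.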
Disclosure of Invention
The invention aims to provide a data desensitization system based on streaming processing and a desensitization method thereof which, unlike the prior art, desensitize the data sent by a server in a streaming manner. This avoids the need for a large-capacity storage space for data caching, gives a high desensitization speed, and reduces the delay problem.
The technical purpose of the invention is achieved by the following technical scheme: a data desensitization system based on streaming processing comprises an acquisition module for acquiring data, a desensitization module for desensitizing the data, and a sending module for sending the desensitized data, and further comprises:
a cache module; and
a judging module, used for judging whether the one or more data packets currently received contain a complete record line and for storing the collected data packets in the cache module.
Preferably, the desensitization module comprises an extraction module, a mapping module, and a replacement module. The extraction module searches for and extracts the sensitive data in the data together with the target desensitization rule corresponding to that sensitive data. The mapping module looks up, in a mapping relationship, the target data dictionary corresponding to the target desensitization rule. The replacement module replaces the corresponding sensitive data using the target data dictionary.
Preferably, when the judging module judges that a single data packet contains a complete record line, the desensitization module desensitizes that data packet directly.
Preferably, when the judging module judges that a complete record line is spread across two or more data packets, a merging module first splices those data packets, and the desensitization module then desensitizes the merged record line data.
Preferably, when the judging module judges that a data packet contains partial data of both the previous record line and the next record line, the merging module splices that data packet with all earlier data packets containing the previous record line, and the result is desensitized. After desensitization, the sending module sends all the data packets that precede that data packet and contain the previous record line.
A desensitization method of a data desensitization system based on streaming processing, comprising the following steps:
Step one: a packet capture step,
in which the acquisition module acquires the data packets sent from the server one by one and caches them in the cache module;
Step two: a judging step,
in which the judging module judges whether the data packets contain a complete record line; if so, the method proceeds to the next step, and if not, packets continue to be captured until a complete record line is present;
Step three: a desensitization step,
in which the desensitization module desensitizes the data packets acquired in step two that contain the complete record line;
Step four: a data sending step,
in which the sending module sends the desensitized data to the client.
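The four steps above can be sketched as a single loop. This is a minimal illustration under stated assumptions, not the patented implementation: record lines are assumed to end with a newline byte, the `desensitize` callable is a placeholder supplied by the caller, and `stream_desensitize` is a hypothetical name. Sending is modeled by yielding to the caller.

```python
from typing import Callable, Iterable, Iterator


def stream_desensitize(
    packets: Iterable[bytes],
    desensitize: Callable[[bytes], bytes],
    delimiter: bytes = b"\n",
) -> Iterator[bytes]:
    """Yield each record line, desensitized, as soon as it is complete."""
    cache = b""  # stands in for the cache module
    for packet in packets:  # step one: capture packets one by one
        cache += packet
        # Step two: does the cache now hold a complete record line?
        while delimiter in cache:
            record, cache = cache.split(delimiter, 1)
            # Step three: desensitize. Step four: send (here, yield).
            yield desensitize(record) + delimiter
    # Trailing bytes without a delimiter are an incomplete record line
    # and are deliberately not emitted.


# Assumed sample data; upper-casing stands in for a real desensitization rule.
sample = [b"alice,158", b"32321212\nbob,", b"13900001111\n"]
result = list(stream_desensitize(sample, lambda record: record.upper()))
```

Because the cache is trimmed every time a record line completes, it never holds more than one partial record line plus the packet just received.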
Preferably, in step two, if the currently acquired data packet contains a complete record line, the desensitization module desensitizes that data packet directly, and the sending module then sends it to the client.
Preferably, in step two, if the currently acquired data packet does not contain a complete record line, the acquisition module continues to capture the next data packet until the judging module judges that the data packets captured so far contain a complete record line. The merging module then splices all the data packets that make up the complete record line, after which the data is desensitized and sent.
Preferably, in step two, if the currently acquired data packet does not contain a complete record line, the acquisition module continues to capture the next data packet until the judging module judges that the data packets captured so far contain a complete record line. When the last data packet contains data of both the current record line and the next record line, the merging module splices the first through last data packets and the result is desensitized, but the last data packet is held back and not sent, while all the preceding data packets are sent to the client by the sending module.
In conclusion, the invention has the following beneficial effects:
1. The technical scheme uses streaming desensitization and can operate as soon as a complete record line has been cached, so its cache capacity requirement is low.
2. The desensitization operation is efficient and has low delay.
Description of the drawings:
FIG. 1 is a schematic diagram of Embodiment 1;
FIG. 2 is a detailed schematic diagram of the data desensitization system of FIG. 1.
In the figure.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
This embodiment only explains the invention and does not limit it. After reading this specification, those skilled in the art may modify the embodiment as needed without inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the invention.
Embodiment 1, as shown in FIG. 1, includes a server side at the top and a client side at the bottom. The client may be implemented as a PC, a notebook computer, an iPad, a smartphone, a tablet computer, and the like, which is not limited here. The client sends a data request to the data desensitization system, which forwards the request directly to the server. The server reads the request and sends the corresponding original data to the data desensitization system, which desensitizes the data according to certain rules and then sends the desensitized data to the client.
Specifically, as shown in FIG. 2, the data packets sent by the server to the data desensitization system arrive one by one, such as packet 1, packet 2, packet 3, …, packet N, and are cached in the cache module one by one. Unlike the prior art, this technical scheme requires the judging module to judge whether a record line is complete, and at least three cases arise.
In case one, the first acquired packet 1 itself contains a complete record line. Packet 1 is desensitized directly and then sent to the client through the sending module.
In case two, the first acquired packet 1 contains only part of a record line, and the rest of that record line lies in subsequent packets [2-N], for example in packet 2, packet 3, and packet 4. The merging module splices the data of all these packets, that is, packets 1 to 4, the merged record data is desensitized, and the desensitized packets 1 to 4 are then sent individually.
This happens because a record line usually contains several fields, and these fields do not necessarily all fit in one data packet.
Case three differs from case two in that the last packet contains not only fields of the previous record line but also some fields of the next record line.
For example, packet 1, packet 2, and packet 3 each hold data of record line A. Packet 4 holds part of the data of record line A and also part of the data of the next record line, record line B. In this case the data of packets 1 to 4 can still be spliced and desensitized, but only the desensitized data of packets 1 to 3 is sent, while packet 4 is kept to take part in the desensitization of the next record line. Because packet 4 still contains an incomplete record line B, it must remain cached until subsequent packets complete record line B.
The desensitization itself proceeds as follows. First, the sensitive data items are obtained together with the target desensitization rule corresponding to each. Then the target data dictionary corresponding to each target desensitization rule is looked up in a mapping relationship, which records the correspondence between desensitization rules and data dictionaries. Finally, the corresponding sensitive data is replaced using the target data dictionary that was found. For example, the phone number 15832321212 would be desensitized to 158****1212.
In the prior art, desensitization can proceed only once the data packets form a complete protocol packet. As a result, the cache size required during desensitization is unpredictable, and the process is time-consuming, so the desensitized data reaches the client with a delay. This technical scheme instead uses streaming desensitization and can operate as soon as a complete record line has been cached, which gives a low cache capacity requirement, efficient desensitization, and low delay.
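The extraction, mapping, and replacement stages might look like the sketch below. The source does not define what a desensitization rule or a data dictionary contains, so everything concrete here is an assumption: a rule is modeled as a regular expression, a data dictionary entry as a triple of leading characters kept, trailing characters kept, and mask character, and the names `RULES`, `DICTIONARIES`, and `desensitize` are hypothetical. The phone rule masks the middle four digits, which is one plausible reading of the phone-number example.

```python
import re

# Hypothetical desensitization rules: rule name -> pattern that finds the data.
RULES = {
    "mobile_phone": re.compile(r"(?<!\d)1\d{10}(?!\d)"),
}

# Hypothetical mapping relationship: rule name -> data dictionary entry,
# modeled as (leading characters kept, trailing characters kept, mask character).
DICTIONARIES = {
    "mobile_phone": (3, 4, "*"),
}


def desensitize(text: str) -> str:
    for rule_name, pattern in RULES.items():  # extraction: data and its rule
        # Mapping: look up the dictionary entry that belongs to the rule.
        keep_head, keep_tail, mask_char = DICTIONARIES[rule_name]

        def replace(match):  # replacement: rewrite one matched value
            value = match.group(0)
            hidden = len(value) - keep_head - keep_tail
            tail = value[len(value) - keep_tail:]
            return value[:keep_head] + mask_char * hidden + tail

        text = pattern.sub(replace, text)
    return text


masked = desensitize("name=Zhang,tel=15832321212")
```

Keeping rules and dictionaries in separate tables means a new kind of sensitive field can be supported by adding one entry to each, without touching the replacement logic.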

Claims (3)

1. A data desensitization system based on streaming processing, comprising an acquisition module for acquiring data, a desensitization module for desensitizing the data, and a sending module for sending the desensitized data, characterized by further comprising: a cache module; and a judging module, configured to judge whether the one or more data packets currently received contain a complete record line and to store the collected data packets in the cache module; wherein, when the judging module judges that a single data packet contains a complete record line, the desensitization module desensitizes that data packet directly; when the judging module judges that a complete record line is spread across two or more data packets, a merging module first splices those data packets and the desensitization module then desensitizes the merged record line data; and when the judging module judges that a data packet contains partial data of both a previous record line and a next record line, the merging module splices that data packet with all earlier data packets containing the previous record line, desensitization is then performed, and after desensitization the sending module sends only the data packets that precede that data packet and contain the previous record line.
2. The data desensitization system based on streaming processing according to claim 1, characterized in that: the desensitization module comprises an extraction module, a mapping module, and a replacement module, wherein the extraction module is used for searching for and extracting the sensitive data in the data and the target desensitization rule corresponding to the sensitive data, the mapping module is used for looking up, in a mapping relationship, the target data dictionary corresponding to the target desensitization rule, and the replacement module is used for replacing the corresponding sensitive data using the target data dictionary.
3. A desensitization method of a data desensitization system based on streaming processing, characterized by comprising the following steps: step one: a packet capture step, in which an acquisition module acquires the data packets sent from a server one by one and caches them in a cache module; step two: a judging step, in which a judging module judges whether the data packets contain a complete record line, and if so the method proceeds to the next step, and if not packets continue to be captured until a complete record line is present; step three: a desensitization step, in which a desensitization module desensitizes the data packets acquired in step two that contain the complete record line; step four: a data sending step, in which a sending module sends the desensitized data to a client; wherein, when the judging module judges that a single data packet contains a complete record line, the desensitization module desensitizes that data packet directly; when the judging module judges that a complete record line is spread across two or more data packets, a merging module first splices those data packets and the desensitization module then desensitizes the merged record line data; and when the judging module judges that a data packet contains partial data of both a previous record line and a next record line, the merging module splices that data packet with all earlier data packets containing the previous record line, desensitization is then performed, and after desensitization the sending module sends only the data packets that precede that data packet and contain the previous record line.
CN201810378506.4A 2018-04-25 2018-04-25 Data desensitization system based on streaming processing and desensitization method thereof Active CN108683643B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810378506.4A CN108683643B (en) 2018-04-25 2018-04-25 Data desensitization system based on streaming processing and desensitization method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810378506.4A CN108683643B (en) 2018-04-25 2018-04-25 Data desensitization system based on streaming processing and desensitization method thereof

Publications (2)

Publication Number Publication Date
CN108683643A CN108683643A (en) 2018-10-19
CN108683643B true CN108683643B (en) 2020-11-13

Family

ID=63801662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810378506.4A Active CN108683643B (en) 2018-04-25 2018-04-25 Data desensitization system based on streaming processing and desensitization method thereof

Country Status (1)

Country Link
CN (1) CN108683643B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111143880B (en) * 2019-12-27 2022-06-07 中电长城网际系统应用有限公司 Data processing method and device, electronic equipment and readable medium
CN111935081B (en) * 2020-06-24 2022-06-21 武汉绿色网络信息服务有限责任公司 Data packet desensitization method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8958562B2 (en) * 2007-01-16 2015-02-17 Voltage Security, Inc. Format-preserving cryptographic systems
US7797341B2 (en) * 2007-04-30 2010-09-14 Hewlett-Packard Development Company, L.P. Desensitizing database information
US8176080B2 (en) * 2009-03-06 2012-05-08 Hewlett-Packard Development Company, L.P. Desensitizing character strings
CN106372071B (en) * 2015-07-20 2019-07-12 阿里巴巴集团控股有限公司 The information acquisition method and device of data warehouse
CN106203145A (en) * 2016-08-04 2016-12-07 北京网智天元科技股份有限公司 Data desensitization method and relevant device

Also Published As

Publication number Publication date
CN108683643A (en) 2018-10-19

Similar Documents

Publication Publication Date Title
US20190222603A1 (en) Method and apparatus for network forensics compression and storage
CN110650128B (en) System and method for detecting digital currency stealing attack of Etheng
CN108712426B (en) Crawler identification method and system based on user behavior buried points
US20220294821A1 (en) Risk control method, computer device, and readable storage medium
CN104125163B (en) Data processing method and device and terminal
CN104615760A (en) Phishing website recognizing method and phishing website recognizing system
CN103885990B (en) Searching method and system
CN108133008A (en) The processing method of business datum, device, equipment and storage medium in database
CN107395782A (en) A kind of IP limitation controlled source information extraction methods based on agent pool
CN102752288A (en) Method and device for identifying network access action
CN110019873B (en) Face data processing method, device and equipment
CN105550222A (en) Distributed storage-based image service system and method
CN106294826A (en) A kind of company-data Query method in real time and system
CN105511812A (en) Method and device for optimizing big data of memory system
CN108683643B (en) Data desensitization system based on streaming processing and desensitization method thereof
KR20180074774A (en) How to identify malicious websites, devices and computer storage media
CN106330963A (en) Cross-network multi-node log collecting method
CN102148805A (en) Feature matching method and device
CN109669795A (en) Crash info processing method and processing device
CN109145040A (en) A kind of data administering method based on double message queues
CN109947729A (en) A kind of real-time data analysis method and device
CN103366008A (en) Resource searching method and device
CN103546829A (en) Method and device for processing video service
CN110134846A (en) Proper noun processing method, device and the computer equipment of text
CN110602059B (en) Method for accurately restoring clear text length fingerprint of TLS protocol encrypted transmission data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 612, Building 5, No. 998 Wenyi West Road, Yuhang District, Hangzhou City, Zhejiang Province, 311100

Applicant after: HANGZHOU SECSMART INFORMATION TECHNOLOGY CO.,LTD.

Address before: Room 612, Building 5, No. 998 Wenyi West Road, Yuhang District, Hangzhou City, Zhejiang Province, 311100

Applicant before: HANGZHOU SECSMART INFORMATION TECHNOLOGY Co.,Ltd.

CB02 Change of applicant information

Address after: 310000 Room 608, Building No. 998 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant after: HANGZHOU SECSMART INFORMATION TECHNOLOGY CO.,LTD.

Address before: Room 612, Building 5, No. 998 Wenyi West Road, Yuhang District, Hangzhou City, Zhejiang Province, 311100

Applicant before: HANGZHOU SECSMART INFORMATION TECHNOLOGY CO.,LTD.

CB02 Change of applicant information

Address after: 310000 Room 608, building 5, No. 998, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant after: Flash it Co.,Ltd.

Address before: 310000 Room 608, building 5, No. 998, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant before: HANGZHOU SECSMART INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20181019

Assignee: Hangzhou Jintou Finance Leasing Co.,Ltd.

Assignor: Flash it Co.,Ltd.

Contract record no.: X2022980028282

Denomination of invention: A data desensitization system based on stream processing and its desensitization method

Granted publication date: 20201113

License type: Exclusive License

Record date: 20230112

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A data desensitization system based on stream processing and its desensitization method

Effective date of registration: 20230115

Granted publication date: 20201113

Pledgee: Hangzhou Jintou Finance Leasing Co.,Ltd.

Pledgor: Flash it Co.,Ltd.

Registration number: Y2023980031389

CP02 Change in the address of a patent holder

Address after: 311121 Room 101, Building 9, No. 998, Wenyi West Road, Wuchang Subdistrict, Yuhang District, Hangzhou City, Zhejiang Province

Patentee after: Flash it Co.,Ltd.

Address before: 310000 Room 608, Building No. 998 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Patentee before: Flash it Co.,Ltd.

EC01 Cancellation of recordation of patent licensing contract

Assignee: Hangzhou Jintou Finance Leasing Co.,Ltd.

Assignor: Flash it Co.,Ltd.

Contract record no.: X2022980028282

Date of cancellation: 20240327

PC01 Cancellation of the registration of the contract for pledge of patent right

Granted publication date: 20201113

Pledgee: Hangzhou Jintou Finance Leasing Co.,Ltd.

Pledgor: Flash it Co.,Ltd.

Registration number: Y2023980031389