CN110377977A - Detection method, device and storage medium for sensitive information leakage

Publication number
CN110377977A
Authority
CN
China
Prior art keywords
sensitive
text information
leakage
information
information leakage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910579777.0A
Other languages
Chinese (zh)
Inventor
陈霖
许爱东
明哲
杨航
陈华军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Southern Power Grid Co Ltd
Research Institute of Southern Power Grid Co Ltd
Original Assignee
China Southern Power Grid Co Ltd
Research Institute of Southern Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Southern Power Grid Co Ltd and Research Institute of Southern Power Grid Co Ltd
Priority to CN201910579777.0A
Publication of CN110377977A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/602 - Providing cryptographic facilities or services
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 - Computer-aided design [CAD]
    • G06F30/20 - Design optimisation, verification or simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Bioethics (AREA)
  • Geometry (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a detection method for sensitive information leakage, comprising: capturing network data packets to be detected; classifying the packets; sending the packets to several processors, which perform TCP stream reassembly in parallel to obtain reassembled text information; applying natural language processing to the reassembled text information to determine whether it contains sensitive text information, and calculating the leakage rate of the sensitive text information; and marking the network data packets that contain the sensitive text information as sensitive-information leakage packets and tracing their source addresses to locate the source of the leak. The disclosed method can increase packet capture speed and reduce the time needed to identify and detect sensitive information on a network. The invention also discloses a detection device and a storage medium for sensitive information leakage.

Description

Detection method, device and storage medium for sensitive information leakage
Technical field
The present invention relates to the technical field of sensitive information detection, and in particular to a detection method, device and storage medium for sensitive information leakage.
Background art
With the rapid progress of enterprise informatization, more and more OA office systems, internal mail systems and instant messaging tools are in wide use, which brings great convenience to daily work. Alongside this convenience, however, numerous information security problems have emerged, and sensitive information leakage incidents occur from time to time.
At present, most enterprises take two kinds of precautions against sensitive information leakage: first, strengthening security training for staff who handle classified information; second, having the confidentiality department inspect internal networked computers or subordinate units by spot checks. Although these measures solve the problem to some extent, the risk of sensitive information leakage remains. On the one hand, training cannot cover all personnel, especially highly mobile staff from external cooperating units, and even when it does, there is no guarantee that everyone follows the confidentiality requirements; on the other hand, irregular manual spot checks require a large investment of manpower. To guard effectively against information leakage inside the enterprise while reducing personnel investment, intelligent techniques for assessing and analyzing sensitive information leakage need to be applied further.
In the prior art, sensitive information on a network is identified and detected as follows: (1) for information collection on high-speed networks, the network card is set to promiscuous mode and packets are copied in combination with libpcap;
(2) captured packets must be restored to the application layer for content analysis, for which TCP stream reassembly is currently the most common method;
(3) content restored to the application layer is analyzed with methods such as sensitive-word search and similar-text analysis.
In the course of practicing the invention, the inventors found the following technical problems in the prior art:
The packet capture speed of network sensitive-information detection techniques is limited, and packet loss occurs easily; existing system calls and the copying of data from kernel space to user space often reduce capture speed and thus cause packet loss; if the data structures used for processing are poorly designed, the space and time complexity of TCP stream reassembly becomes excessive; and the volume of text data reassembled from the network in real time is very large, so retrieval and parsing are very time-consuming.
This technical solution captures packets on a high-speed network, performs parallel TCP stream reassembly on the captured packets, applies intelligent text analysis to the parsed and merged application-layer content to assess the degree of sensitive information leakage, and finally traces the source of the leaked sensitive information.
Summary of the invention
Embodiments of the present invention provide a detection method for sensitive information leakage that can increase packet capture speed and reduce the time needed to identify and detect sensitive information on a network.
Embodiment one of the present invention provides a detection method for sensitive information leakage, comprising:
capturing network data packets to be detected;
classifying the network data packets to be detected;
sending the network data packets to be detected to several processors, and using the processors to perform TCP stream reassembly in parallel to obtain reassembled text information;
applying natural language processing to the reassembled text information to determine whether it contains sensitive text information, and calculating the leakage rate of the sensitive text information;
marking the network data packets that contain the sensitive text information as sensitive-information leakage packets, and tracing the source addresses of the sensitive-information leakage packets to locate the source of the leak.
As an improvement of the above scheme, capturing the network data packets to be detected specifically comprises:
sending the captured network data packets to be detected into a pre-allocated address space;
wherein the address space corresponds to a buffer queue; the buffer queue uses a first-in, first-out mechanism, with the head of the queue used for reading data and the tail of the queue used for writing the received data to be analyzed.
As an improvement of the above scheme, classifying the network data packets to be detected specifically comprises:
performing address resolution on the network data packets to be detected that were collected within a certain time period;
clustering the parsed network data packets according to the network segment of their destination addresses, and classifying them according to the clustering result.
As an improvement of the above scheme, sending the network data packets to be detected to several processors and using the processors to perform TCP stream reassembly in parallel to obtain the reassembled text information specifically comprises:
using the TCP connection identifiers SIP, SPT, DIP and DPT (source IP, source port, destination IP and destination port) as the key, the hash chained-list calculation formula is as follows:
computing the hash chained list and assigning every TCP connection to an entry in the hash chained list, thereby achieving TCP stream reassembly and obtaining the reassembled text information.
As an improvement of the above scheme, the method further comprises:
organizing all the connections of each entry in the hash chained list with a splay tree to obtain the corresponding connection identifiers;
computing a hash address from the connection identifier with the hash function, and then searching the splay tree corresponding to that hash address, so that the reassembled text information can be looked up, deleted and modified.
As an improvement of the above scheme, applying natural language processing to the reassembled text information, determining whether it contains sensitive text information, and calculating the leakage rate of the sensitive text information specifically comprises:
building a sensitive dictionary from keywords related to preset sensitive information;
comparing the reassembled text information with the keywords in the sensitive dictionary;
identifying the text in the reassembled text information that is identical to keywords in the sensitive dictionary, thereby obtaining the sensitive text information;
calculating the leakage rate X of the sensitive text information according to the following formula;
where S is the total quantity of sensitive text information and S' is the total number of keywords in the sensitive dictionary.
As an improvement of the above scheme, the method further comprises:
setting a sensitive-information leakage recognition threshold;
comparing the sensitive-information leakage rate X with the threshold;
according to the range in which X falls relative to the threshold, judging the text to be a suspected information leak, an information leak, or, in the highest range with X ≤ 100%, a serious information leak.
Correspondingly, embodiment two of the present invention provides a detection device for sensitive information leakage, comprising:
a packet capture unit, configured to capture network data packets to be detected;
a packet classification unit, configured to classify the network data packets to be detected;
a packet reassembly unit, configured to send the network data packets to be detected to several processors and use the processors to perform TCP stream reassembly in parallel to obtain reassembled text information;
a leakage calculation unit, configured to apply natural language processing to the reassembled text information, determine whether it contains sensitive text information, and calculate the leakage rate of the sensitive text information;
a leak source locating unit, configured to mark the network data packets that contain the sensitive text information as sensitive-information leakage packets, and to trace the source addresses of the sensitive-information leakage packets to locate the source of the leak.
Correspondingly, embodiment three of the present invention provides a detection device for sensitive information leakage, comprising: a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the detection method for sensitive information leakage described in embodiment one of the present invention.
Correspondingly, embodiment four of the present invention provides a computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored computer program, wherein, when the computer program runs, the device on which the computer-readable storage medium resides is controlled to execute the detection method for sensitive information leakage described in embodiment one of the present invention.
The detection method for sensitive information leakage provided by the embodiments of the present invention has the following beneficial effects:
Through the selection of the interface circuit and the partitioning of the simulation system, the large-step electromagnetic transient simulation model of the AC system and the PSCAD/EMTDC simulation model of the DC system can perform hybrid computation accurately and smoothly, which exploits the speed advantage of the large-step electromagnetic transient program and increases the electromagnetic transient simulation rate of the AC/DC system while retaining the accuracy with which the PSCAD/EMTDC simulation model simulates the DC system; the disconnecting switch is switched only after steady state is reached, which reduces the power fluctuation caused by switch closure; an ideal voltage source is placed in the interface section so that the start-up of the DC system does not disrupt the initialization of the AC system; accurate and efficient hybrid simulation is thereby achieved.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the detection method for sensitive information leakage provided by embodiment one of the present invention.
Fig. 2 is a schematic structural diagram of the detection device for sensitive information leakage provided by embodiment two of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, a schematic flowchart of the detection method for sensitive information leakage provided by embodiment one of the present invention, the method comprises:
S101, capturing network data packets to be detected;
S102, classifying the network data packets to be detected;
S103, sending the network data packets to be detected to several processors, and using the processors to perform TCP stream reassembly in parallel to obtain reassembled text information;
S104, applying natural language processing to the reassembled text information to determine whether it contains sensitive text information, and calculating the leakage rate of the sensitive text information;
S105, marking the network data packets that contain the sensitive text information as sensitive-information leakage packets, and tracing the source addresses of the sensitive-information leakage packets to locate the source of the leak.
Further, capturing the network data packets to be detected specifically comprises:
sending the captured network data packets to be detected into a pre-allocated address space;
wherein the address space corresponds to a buffer queue; the buffer queue uses a first-in, first-out mechanism, with the head of the queue used for reading data and the tail of the queue used for writing the received data to be analyzed.
Preferably, the network data packets to be detected are captured by DMA, which gives direct memory access to the data.
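The first-in, first-out buffer queue over pre-allocated space can be illustrated with a minimal Python sketch. It is only an assumption-laden illustration: the class name `PacketRing` and its methods are hypothetical, a Python list stands in for the pre-allocated address space, and no actual DMA or kernel mapping is involved.

```python
class PacketRing:
    """Fixed-capacity FIFO over slots that are allocated once and reused."""

    def __init__(self, capacity):
        self._slots = [None] * capacity  # stands in for the pre-allocated address space
        self._capacity = capacity
        self._head = 0   # next slot to read (head of the queue)
        self._tail = 0   # next slot to write (tail of the queue)
        self._count = 0  # number of packets currently buffered

    def push(self, packet):
        """Write a packet at the tail; return False if the ring is full."""
        if self._count == self._capacity:
            return False  # the caller decides whether this counts as packet loss
        self._slots[self._tail] = packet
        self._tail = (self._tail + 1) % self._capacity
        self._count += 1
        return True

    def pop(self):
        """Read the oldest packet from the head; return None if the ring is empty."""
        if self._count == 0:
            return None
        packet = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % self._capacity
        self._count -= 1
        return packet
```

Because the slots are allocated once, steady-state capture does not allocate new buffers; a full ring makes back-pressure explicit rather than silently growing memory.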
Further, classifying the network data packets to be detected specifically comprises:
performing address resolution on the network data packets to be detected that were collected within a certain time period;
clustering the parsed network data packets according to the network segment of their destination addresses, and classifying them according to the clustering result.
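As a rough illustration of grouping packets by destination network segment, the sketch below uses Python's standard `ipaddress` module. The packet representation (a dict with a `dst_ip` field), the function name `group_by_dst_segment` and the /24 prefix length are assumptions made for the example; the text does not specify the clustering procedure or the segment size.

```python
import ipaddress
from collections import defaultdict


def group_by_dst_segment(packets, prefix_len=24):
    """Group parsed packets by the network segment of their destination address.

    `packets` is an iterable of dicts with a 'dst_ip' key (hypothetical layout);
    `prefix_len` is an assumed segment size.
    """
    groups = defaultdict(list)
    for pkt in packets:
        # strict=False lets a host address stand in for its enclosing network
        segment = ipaddress.ip_network(f"{pkt['dst_ip']}/{prefix_len}", strict=False)
        groups[str(segment)].append(pkt)
    return dict(groups)
```

Each resulting group can then be treated as one class of packets for the later reassembly stage.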
Preferably, the network card is set to promiscuous mode to capture packets on the network, and data-link-layer packets are filtered by the BPF packet filtering mechanism. However, the traditional libpcap capture approach reduces the packet capture rate because of data copying, system calls and hardware interrupt handling. To avoid CPU data copies and to reduce system calls and interrupts, in the detection method provided by the invention, during data reading the DMA technique maps the address space that stores the queued data into user space, so that user space can access this memory directly and the system calls for memory copying are avoided.
Further, sending the network data packets to be detected to several processors and using the processors to perform TCP stream reassembly in parallel to obtain the reassembled text information specifically comprises:
using the TCP connection identifiers SIP, SPT, DIP and DPT (source IP, source port, destination IP and destination port) as the key; since the corresponding hash function is required to be source-destination symmetric, that is, the hash value computed after exchanging SIP with DIP and SPT with DPT in the function formula must be the same, the hash chained-list calculation formula is as follows:
computing the hash chained list and assigning every TCP connection to an entry in the hash chained list, thereby achieving TCP stream reassembly and obtaining the reassembled text information.
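The hash formula itself is not reproduced in this text, so the following Python sketch only illustrates the stated symmetry requirement and should not be read as the patent's formula. It assumes an XOR-based combination, a hypothetical function name `symmetric_flow_hash`, and an arbitrary table size of 1024 entries.

```python
import ipaddress


def symmetric_flow_hash(sip, spt, dip, dpt, table_size=1024):
    """Map a TCP 4-tuple to a table index that is the same in both directions.

    XOR is commutative, so swapping (sip, spt) with (dip, dpt) leaves the
    result unchanged. The XOR scheme and table_size are illustrative assumptions.
    """
    src = int(ipaddress.ip_address(sip))
    dst = int(ipaddress.ip_address(dip))
    return ((src ^ dst) ^ (spt ^ dpt)) % table_size
```

With such a function, packets travelling client-to-server and server-to-client land in the same table entry, so both halves of a connection can be reassembled together.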
Preferably, packets with the same destination IP are likely to belong to the same message, and packets with the same destination IP network segment are likely to come from the same network segment; but if all packets captured within a time period were handed to the same CPU or thread, queuing would greatly reduce the efficiency of stream reassembly. Therefore, the detection method provided by the invention uses clustering to assign packets with the same destination IP network segment to the same CPU for stream reassembly, while packets of different network segments are processed in parallel.
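One way to picture the segment-to-CPU assignment is a stable mapping from destination network segment to a worker index. This is a sketch under assumptions: the function name `worker_for`, the /24 prefix length and the use of `zlib.crc32` as the stable hash are all illustrative choices, not details given in the text.

```python
import ipaddress
import zlib


def worker_for(dst_ip, n_workers, prefix_len=24):
    """Pick a worker index so that one destination segment always maps to one worker."""
    segment = ipaddress.ip_network(f"{dst_ip}/{prefix_len}", strict=False)
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str
    return zlib.crc32(str(segment).encode("ascii")) % n_workers
```

Packets of one segment are then reassembled sequentially by a single worker, while different segments can proceed in parallel on other workers.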
Preferably, the delivery of packets on a TCP connection has the following characteristics: (1) once a packet has been identified by its connection identifier as belonging to a certain TCP connection, the next packet is likely to come from the same TCP connection; (2) after a packet of a certain TCP connection arrives, the next packet of that connection also arrives soon.
To reduce the time spent searching and traversing packets during TCP reassembly, the most-recently-accessed-first principle is needed, so a data structure that combines a hash chained list with splay trees is used for lookup and traversal in TCP stream reassembly. Based on the principle of locality, this data structure moves the node hit by each lookup to the front, so that the number of comparisons needed per lookup gradually decreases, which speeds up searching.
The key to the hash chained list is the design of the hash function: the hash values it computes should distribute the keys uniformly over the address range.
Further, the method also comprises: organizing all the connections of each entry in the hash chained list with a splay tree to obtain the corresponding connection identifiers;
computing a hash address from the connection identifier with the hash function, and then searching the splay tree corresponding to that hash address, so that the reassembled text information can be looked up, deleted and modified.
A splay tree is a self-adjusting binary search tree: on each access, the accessed node is moved to the root through a series of rotations, which improves lookup efficiency. Because the arrival of network packets exhibits locality, the next packet of the connection to which the current packet belongs also arrives soon, so the node just accessed is likely to be accessed again, and that lookup then needs only one comparison. After the tree has undergone a series of accesses, the recently and frequently accessed nodes lie close to the root, which reduces the average number of lookup traversals and increases the lookup rate.
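The move-to-root behaviour described above can be sketched with a small recursive splay tree in Python. This is a generic textbook-style sketch, not the patent's implementation; the names `Node`, `splay`, `insert` and `find` are hypothetical, and keys may be any mutually comparable values (for example a connection identifier tuple).

```python
class Node:
    __slots__ = ("key", "value", "left", "right")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None


def _rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def _rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Move the node with `key` (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig
            root.left.left = splay(root.left.left, key)
            root = _rotate_right(root)
        elif key > root.left.key:                    # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = _rotate_left(root.left)
        return root if root.left is None else _rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                         # zig-zig (mirror)
        root.right.right = splay(root.right.right, key)
        root = _rotate_left(root)
    elif key < root.right.key:                       # zig-zag (mirror)
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = _rotate_right(root.right)
    return root if root.right is None else _rotate_left(root)


def insert(root, key, value):
    """Insert or update `key`; the new or updated node becomes the root."""
    if root is None:
        return Node(key, value)
    root = splay(root, key)
    if root.key == key:
        root.value = value
        return root
    node = Node(key, value)
    if key < root.key:
        node.right, node.left, root.left = root, root.left, None
    else:
        node.left, node.right, root.right = root, root.right, None
    return node


def find(root, key):
    """Return (new_root, value or None); a hit leaves the found node at the root."""
    root = splay(root, key)
    hit = root is not None and root.key == key
    return root, (root.value if hit else None)
```

After a successful `find`, a repeated lookup of the same key succeeds on the first comparison at the root, which matches the locality argument above.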
Further, applying natural language processing to the reassembled text information, determining whether it contains sensitive text information, and calculating the leakage rate of the sensitive text information specifically comprises:
building a sensitive dictionary from keywords related to preset sensitive information;
comparing the reassembled text information with the keywords in the sensitive dictionary;
identifying the text in the reassembled text information that is identical to keywords in the sensitive dictionary, thereby obtaining the sensitive text information;
calculating the leakage rate X of the sensitive text information according to the following formula;
where S is the total quantity of sensitive text information and S' is the total number of keywords in the sensitive dictionary.
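The formula is not reproduced in this text. The Python sketch below therefore rests on an explicit assumption, namely that X is the ratio S / S' expressed as a percentage, with S counted as the number of distinct dictionary keywords found in the text (which keeps X at or below 100%). The function name `leakage_rate` and plain substring matching are likewise illustrative choices.

```python
def leakage_rate(text, sensitive_words):
    """Return (matched keywords, X), assuming X = S / S' * 100.

    S  = number of distinct dictionary keywords found in `text` (assumption)
    S' = number of distinct keywords in the sensitive dictionary
    """
    vocabulary = set(sensitive_words)
    if not vocabulary:
        return set(), 0.0
    matched = {word for word in vocabulary if word in text}  # plain substring match
    return matched, 100.0 * len(matched) / len(vocabulary)
```

A production system would likely tokenize the text first (especially for Chinese) instead of using raw substring checks.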
Through natural language processing, keywords are extracted to build the sensitive dictionary, which improves the efficiency of sensitive information identification.
Further, the method also comprises: setting a sensitive-information leakage recognition threshold;
comparing the sensitive-information leakage rate X with the threshold;
according to the range in which X falls relative to the threshold, judging the text to be a suspected information leak, an information leak, or, in the highest range with X ≤ 100%, a serious information leak.
Early warning, source tracing and other handling are then carried out according to the degree of sensitive information leakage.
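The three-level grading can be sketched as a simple comparison against cut-off values. The threshold values are not recoverable from this text, so the sketch takes them as required parameters; the parameter names `suspected_below` and `serious_from`, the function name `classify_leak` and the handling of boundary values are all hypothetical.

```python
def classify_leak(x, suspected_below, serious_from):
    """Grade a leakage rate X (in percent) into one of three levels.

    Both cut-offs are caller-supplied, hypothetical parameters.
    """
    if not 0.0 <= x <= 100.0:
        raise ValueError("leakage rate must lie between 0 and 100 percent")
    if not suspected_below <= serious_from:
        raise ValueError("cut-offs must be ordered")
    if x < suspected_below:
        return "suspected information leakage"
    if x < serious_from:
        return "information leakage"
    return "serious information leakage"
```

For example, with illustrative cut-offs of 30 and 60, a rate of 45 would be graded as an information leak.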
In a specific embodiment, after it is determined that sensitive information is present, the source of the leak is traced from the source IP of the sensitive-information packets and the destination host is then located, so that the party responsible for the leak is identified and source-based blocking is applied to the leak.
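A first step of such source tracing could be to rank the source addresses of the flagged packets, as in the Python sketch below. The packet layout (a dict with a `src_ip` field) and the function name `rank_leak_sources` are assumptions for illustration; locating the host and blocking it are outside the sketch.

```python
from collections import Counter


def rank_leak_sources(leak_packets):
    """Return (source IP, packet count) pairs, most frequent source first."""
    return Counter(pkt["src_ip"] for pkt in leak_packets).most_common()
```

The top entries of the ranking are natural candidates for further investigation or for a source-based blocking rule.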
The detection method, device and storage medium for sensitive information leakage provided by the embodiments of the present invention have the following beneficial effects:
The network data packets to be detected are captured by DMA, which gives direct memory access to the data, avoids the system calls for memory copying and improves access efficiency; packets are clustered by destination IP network segment, so that packets with the same destination IP network segment are assigned to the same CPU for stream reassembly while packets of different network segments are processed in parallel, and the number of comparisons needed per lookup gradually decreases, which reduces the number of traversal comparisons and increases query speed; through natural language processing, keywords are extracted to build the sensitive dictionary, which improves the efficiency of sensitive information identification; after it is determined that sensitive information is present, the source of the leak is traced from the source IP of the sensitive-information packets and the destination host is then located, so that the party responsible for the leak is identified and source-based blocking is applied to the leak.
Referring to Fig. 2, a schematic structural diagram of the detection device for sensitive information leakage provided by embodiment two of the present invention, the device comprises:
a packet capture unit 201, configured to capture network data packets to be detected;
a packet classification unit 202, configured to classify the network data packets to be detected;
a packet reassembly unit 203, configured to send the network data packets to be detected to several processors and use the processors to perform TCP stream reassembly in parallel to obtain reassembled text information;
a leakage calculation unit 204, configured to apply natural language processing to the reassembled text information, determine whether it contains sensitive text information, and calculate the leakage rate of the sensitive text information;
a leak source locating unit 205, configured to mark the network data packets that contain the sensitive text information as sensitive-information leakage packets, and to trace the source addresses of the sensitive-information leakage packets to locate the source of the leak.
The correspondence of the embodiment of the present invention three provides a kind of detection device of sensitive information leakage, including processor, memory And the computer program executed by the processor is stored in the memory and is configured as, the processor executes institute The detection method of the sensitive information leakage as described in the embodiment of the present invention one is realized when stating computer program.The sensitive information is let out The detection device of dew can be desktop PC, notebook, palm PC and cloud server etc. and calculate equipment.The sensitivity The detection device of information leakage may include, but be not limited only to, processor, memory.
The correspondence of the embodiment of the present invention four provides a kind of computer readable storage medium, which is characterized in that the computer Readable storage medium storing program for executing includes the computer program of storage, wherein controlling the computer in computer program operation can Equipment executes the detection method of the sensitive information leakage as described in the embodiment of the present invention one where reading storage medium.
Alleged processor can be central processing unit (Central Processing Unit, CPU), can also be it His general processor, digital signal processor (Digital Signal Processor, DSP), specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field- Programmable Gate Array, FPGA) either other programmable logic device, discrete gate or transistor logic, Discrete hardware components etc..General processor can be microprocessor or the processor is also possible to any conventional processor Deng the processor is the control centre of the detection device of the sensitive information leakage, whole using various interfaces and connection The various pieces of the detection device of a sensitive information leakage.
The memory can be used for storing the computer program and/or module, and the processor is by operation or executes Computer program in the memory and/or module are stored, and calls the data being stored in memory, described in realization The various functions of the detection device of sensitive information leakage.The memory can mainly include storing program area and storage data area, Wherein, storing program area can application program needed for storage program area, at least one function (such as sound-playing function, figure As playing function etc.) etc.;Storage data area, which can be stored, uses created data (such as audio data, phone according to mobile phone This etc.) etc..In addition, memory may include high-speed random access memory, it can also include nonvolatile memory, such as firmly Disk, memory, plug-in type hard disk, intelligent memory card (Smart Media Card, SMC), secure digital (Secure Digital, SD) block, flash card (Flash Card), at least one disk memory, flush memory device or other volatile solid-states Part.
Wherein, if the integrated module/unit of the detection device of the sensitive information leakage is with the shape of SFU software functional unit Formula realize and when sold or used as an independent product, can store in a computer readable storage medium.It is based on Such understanding, the present invention realize above-described embodiment method in all or part of the process, can also by computer program come Relevant hardware is instructed to complete, the computer program can be stored in a computer readable storage medium, the computer Program is when being executed by processor, it can be achieved that the step of above-mentioned each embodiment of the method.Wherein, the computer program includes meter Calculation machine program code, the computer program code can be source code form, object identification code form, executable file or certain Intermediate form etc..The computer-readable medium may include: can carry the computer program code any entity or Device, recording medium, USB flash disk, mobile hard disk, magnetic disk, CD, computer storage, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), electric carrier signal, telecommunication signal and software Distribution medium etc..
It should be noted that the apparatus embodiments described above are merely exemplary, wherein described be used as separation unit The unit of explanation may or may not be physically separated, and component shown as a unit can be or can also be with It is not physical unit, it can it is in one place, or may be distributed over multiple network units.It can be according to actual It needs that some or all of the modules therein is selected to achieve the purpose of the solution of this embodiment.In addition, device provided by the invention In embodiment attached drawing, the connection relationship between module indicate between them have communication connection, specifically can be implemented as one or A plurality of communication bus or signal wire.Those of ordinary skill in the art are without creative efforts, it can understand And implement.
The above is a preferred embodiment of the present invention. It should be noted that those skilled in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications are also considered to fall within the protection scope of the present invention.

Claims (10)

1. A method for detecting sensitive information leakage, characterized by comprising:
Acquiring network data packets to be detected;
Classifying the network data packets to be detected;
Sending the network data packets to be detected to several processors respectively, and performing parallel TCP stream restoration with the processors to obtain restored text information;
Performing natural language processing on the restored text information to judge whether sensitive text information exists in the restored text information, and calculating the leakage rate of the sensitive text information;
Determining the network data packet containing the sensitive text information to be a sensitive information leakage data packet, and performing source tracing analysis on the source address of the sensitive information leakage data packet to locate the source of the sensitive information leakage.
2. The method for detecting sensitive information leakage according to claim 1, characterized in that acquiring the network data packets to be detected specifically comprises:
Sending the captured network data packets to be detected into a pre-allocated address space;
Wherein the address space corresponds to a buffer queue; the buffer queue uses a first-in, first-out mechanism, the head of the queue is used for reading data, and the tail of the queue is used for writing the received data.
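The first-in, first-out buffering described in claim 2 could be sketched roughly as follows. This is only an illustration: the class name `PacketBuffer`, the `capacity` parameter, and the choice to refuse packets when the buffer is full are assumptions, not details given in the claim.

```python
from collections import deque


class PacketBuffer:
    """FIFO buffer: packets are written at the tail and read from the head."""

    def __init__(self, capacity):
        # Hypothetical stand-in for the size of the pre-allocated address space.
        self.capacity = capacity
        self._queue = deque()

    def write(self, packet):
        """Append a captured packet at the tail; refuse it when the buffer is full."""
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(packet)
        return True

    def read(self):
        """Remove and return the oldest packet from the head, or None if empty."""
        return self._queue.popleft() if self._queue else None
```

Reading always returns packets in the order in which they were captured, which is what lets the later classification step see the traffic in arrival order.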
3. The method for detecting sensitive information leakage according to claim 1, characterized in that classifying the network data packets to be detected specifically comprises:
Performing address resolution on the network data packets to be detected that were collected within a certain time period;
Clustering the parsed network data packets to be detected according to the address segment of the destination address, and classifying them according to the clustering result.
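One simple way to picture the classification in claim 3 is to group packets by the network segment of their destination address. The sketch below uses fixed-prefix grouping as a stand-in for the clustering step, which the claim does not specify further; the packet format (a dict with a `"dst"` key) and the `/24` default are assumptions made for illustration.

```python
from collections import defaultdict
from ipaddress import ip_network


def classify_by_destination(packets, prefix_len=24):
    """Group packets by the network segment of their destination address.

    `packets` is an iterable of dicts with a "dst" key (a hypothetical
    parsed-packet format); `prefix_len` is an assumed segment width.
    """
    groups = defaultdict(list)
    for pkt in packets:
        # strict=False masks off the host bits, e.g. 10.0.1.5/24 -> 10.0.1.0/24.
        segment = ip_network(f"{pkt['dst']}/{prefix_len}", strict=False)
        groups[str(segment)].append(pkt)
    return dict(groups)
```

Each resulting group can then be handed to a different processor for stream restoration.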
4. The method for detecting sensitive information leakage according to claim 1, characterized in that sending the network data packets to be detected to several processors respectively, and performing parallel TCP stream restoration with the processors to obtain the restored text information, specifically comprises:
Taking the TCP connection identifiers SIP, SPT, DIP and DPT as keys, the hash chained list is calculated by the following formula:
[formula not reproduced in the source text]
Calculating the hash chained list and assigning all TCP connection points to the entries of the hash chained list, thereby realizing TCP stream restoration and obtaining the restored text information.
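The hash formula of claim 4 is not reproduced in this text, so the sketch below substitutes Python's built-in `hash` over the 4-tuple; it is not the patent's formula. SIP, SPT, DIP and DPT are assumed to denote source IP, source port, destination IP and destination port. `NUM_BUCKETS`, `StreamTable` and the method names are hypothetical, and the reassembly is deliberately simplified: it orders segments by sequence number and ignores gaps, overlaps and sequence-number wraparound.

```python
from collections import defaultdict

NUM_BUCKETS = 1024  # assumed table size; the source does not give one


def bucket_index(sip, spt, dip, dpt):
    """Map a connection 4-tuple to a bucket (stand-in for the source's formula)."""
    return hash((sip, spt, dip, dpt)) % NUM_BUCKETS


class StreamTable:
    """Chained hash table: bucket index -> {4-tuple -> {sequence number -> payload}}."""

    def __init__(self):
        self.buckets = defaultdict(dict)

    def add_segment(self, sip, spt, dip, dpt, seq, payload):
        """File one TCP segment under its connection."""
        key = (sip, spt, dip, dpt)
        conn = self.buckets[bucket_index(*key)].setdefault(key, {})
        conn.setdefault(seq, payload)  # keep the first copy, ignore retransmissions

    def restore(self, sip, spt, dip, dpt):
        """Concatenate the payloads of one connection in sequence order."""
        key = (sip, spt, dip, dpt)
        conn = self.buckets[bucket_index(*key)].get(key, {})
        data = b"".join(conn[seq] for seq in sorted(conn))
        return data.decode("utf-8", errors="replace")
```

Because every segment of a connection hashes to the same bucket, segments that arrive out of order still end up in the same entry and can be restored into one text.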
5. The method for detecting sensitive information leakage according to claim 4, characterized by further comprising:
Organizing all connection points of each entry in the hash chained list with a Splay tree to obtain the corresponding connection identifiers;
Calculating the hash address from the connection identifier with a hash function, and then searching the Splay tree corresponding to the hash address, so that the restored text information can be looked up, deleted and modified.
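To illustrate the per-entry organization in claim 5, the following is a minimal textbook splay tree keyed by the connection 4-tuple. It is a generic sketch rather than the patent's implementation; the names `Node`, `splay`, `insert` and `search` are illustrative, and deletion is omitted for brevity.

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Bring `key` (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                    # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                         # zig-zig
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                       # zig-zag
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)


def insert(root, key, value):
    """Insert or overwrite `key`; return the new root."""
    if root is None:
        return Node(key, value)
    root = splay(root, key)
    if root.key == key:
        root.value = value                           # modify an existing entry
        return root
    node = Node(key, value)
    if key < root.key:
        node.right, node.left, root.left = root, root.left, None
    else:
        node.left, node.right, root.right = root, root.right, None
    return node


def search(root, key):
    """Return (new root, value or None); a hit is moved to the root."""
    root = splay(root, key)
    found = root is not None and root.key == key
    return root, (root.value if found else None)
```

A splay tree moves recently accessed connections toward the root, so packets belonging to currently active connections tend to be found after only a few comparisons.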
6. The method for detecting sensitive information leakage according to claim 1, characterized in that performing natural language processing on the restored text information to judge whether sensitive text information exists in the restored text information, and calculating the leakage rate of the sensitive text information, specifically comprises:
Building keywords related to preset sensitive information into a sensitive dictionary;
Comparing the restored text information with the keywords in the sensitive dictionary;
Identifying the text in the restored text information that is identical to keywords in the sensitive dictionary, thereby obtaining the sensitive text information;
Calculating the leakage rate X of the sensitive text information according to the following formula:
[formula not reproduced in the source text]
In the formula, S is the total quantity of sensitive text information, and S' is the total number of keywords in the sensitive dictionary.
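The leakage-rate formula of claim 6 is missing from this text. From the definitions of S and S', and from the bound X ≤ 100% in claim 7, a plausible reading is X = S / S' × 100%, with S counted as the number of distinct dictionary keywords found; the sketch below rests on that assumption and should not be taken as the patent's exact formula. The function name and the plain substring matching are likewise illustrative.

```python
def leakage_rate(text, sensitive_dictionary):
    """Return (matched keywords, leakage rate in percent).

    Assumes X = S / S' * 100, where S counts the distinct dictionary
    keywords found in the text and S' is the dictionary size.
    """
    keywords = set(sensitive_dictionary)
    if not keywords:
        return [], 0.0
    # Plain substring comparison stands in for the keyword comparison step.
    matched = sorted(k for k in keywords if k in text)
    return matched, 100.0 * len(matched) / len(keywords)
```

For example, a text containing two of four dictionary keywords would yield a leakage rate of 50% under this reading.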
7. The method for detecting sensitive information leakage according to claim 6, characterized by further comprising:
Setting sensitive information leakage recognition thresholds [threshold symbols not reproduced in the source text];
Comparing the sensitive information leakage rate X with the thresholds;
If [condition not reproduced], the text is determined to be a suspected information leak; if [condition not reproduced] and [condition not reproduced], it is determined to be an information leak; if [condition not reproduced] and X ≤ 100%, it is determined to be a serious information leak.
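The three-level judgment in claim 7 can be pictured as a comparison against two thresholds. The threshold symbols and values are not reproduced in this text, so `low` and `high` below, their default values, and the choice of strict versus non-strict inequalities are all assumptions made for illustration.

```python
def classify_leak(x, low=30.0, high=60.0):
    """Map a leakage rate X (in percent) to one of three severity levels.

    `low` and `high` are hypothetical thresholds; the source does not
    reproduce the actual symbols, values, or boundary conventions.
    """
    if not 0.0 <= x <= 100.0:
        raise ValueError("leakage rate must lie between 0 and 100 percent")
    if x < low:
        return "suspected information leak"
    if x < high:
        return "information leak"
    return "serious information leak"
```

In practice this check would only be applied to texts in which sensitive text information has already been found.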
8. A device for detecting sensitive information leakage, characterized by comprising:
A data packet acquisition unit, configured to acquire network data packets to be detected;
A data packet classification unit, configured to classify the network data packets to be detected;
A data packet restoration unit, configured to send the network data packets to be detected to several processors respectively, and to perform parallel TCP stream restoration with the processors to obtain restored text information;
A leakage calculation unit, configured to perform natural language processing on the restored text information to judge whether sensitive text information exists in the restored text information, and to calculate the leakage rate of the sensitive text information;
A leakage source positioning unit, configured to determine the network data packet containing the sensitive text information to be a sensitive information leakage data packet, and to perform source tracing analysis on the source address of the sensitive information leakage data packet to locate the source of the sensitive information leakage.
9. A device for detecting sensitive information leakage, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the method for detecting sensitive information leakage according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored computer program, wherein, when the computer program runs, the device on which the computer-readable storage medium resides is controlled to execute the method for detecting sensitive information leakage according to any one of claims 1 to 7.
CN201910579777.0A 2019-06-28 2019-06-28 Detection method, device and the storage medium of sensitive information leakage Pending CN110377977A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910579777.0A CN110377977A (en) 2019-06-28 2019-06-28 Detection method, device and the storage medium of sensitive information leakage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910579777.0A CN110377977A (en) 2019-06-28 2019-06-28 Detection method, device and the storage medium of sensitive information leakage

Publications (1)

Publication Number Publication Date
CN110377977A true CN110377977A (en) 2019-10-25

Family

ID=68251312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910579777.0A Pending CN110377977A (en) 2019-06-28 2019-06-28 Detection method, device and the storage medium of sensitive information leakage

Country Status (1)

Country Link
CN (1) CN110377977A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181830A (en) * 2020-09-28 2021-01-05 厦门美柚股份有限公司 Memory leak detection method, device, terminal and medium
CN112597770A (en) * 2020-12-16 2021-04-02 盐城数智科技有限公司 Sensitive information query method based on deep learning
CN113704752A (en) * 2021-08-31 2021-11-26 上海观安信息技术股份有限公司 Data leakage behavior detection method and device, computer equipment and storage medium
CN113765852A (en) * 2020-06-03 2021-12-07 深信服科技股份有限公司 Data packet detection method, system, storage medium and computing device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6631422B1 (en) * 1999-08-26 2003-10-07 International Business Machines Corporation Network adapter utilizing a hashing function for distributing packets to multiple processors for parallel processing
CN102779176A (en) * 2012-06-27 2012-11-14 北京奇虎科技有限公司 System and method for key word filtering
CN103746919A (en) * 2014-01-14 2014-04-23 浪潮电子信息产业股份有限公司 Method for quickly classifying network packets through combining multi-way decision tree and Hash tables
JP2014175781A (en) * 2013-03-07 2014-09-22 Hitachi High-Technologies Corp Parallel packet processing apparatus, method and program
US20170214709A1 (en) * 2009-04-21 2017-07-27 Bandura, Llc Structuring data and pre-compiled exception list engines and internet protocol threat prevention
CN109547389A (en) * 2017-08-08 2019-03-29 中国移动通信集团宁夏有限公司 A kind of method and device of ASCII stream file ASCII recombination
CN109766525A (en) * 2019-01-14 2019-05-17 湖南大学 A kind of sensitive information leakage detection framework of data-driven

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6631422B1 (en) * 1999-08-26 2003-10-07 International Business Machines Corporation Network adapter utilizing a hashing function for distributing packets to multiple processors for parallel processing
US20170214709A1 (en) * 2009-04-21 2017-07-27 Bandura, Llc Structuring data and pre-compiled exception list engines and internet protocol threat prevention
CN102779176A (en) * 2012-06-27 2012-11-14 北京奇虎科技有限公司 System and method for key word filtering
JP2014175781A (en) * 2013-03-07 2014-09-22 Hitachi High-Technologies Corp Parallel packet processing apparatus, method and program
CN103746919A (en) * 2014-01-14 2014-04-23 浪潮电子信息产业股份有限公司 Method for quickly classifying network packets through combining multi-way decision tree and Hash tables
CN109547389A (en) * 2017-08-08 2019-03-29 中国移动通信集团宁夏有限公司 A kind of method and device of ASCII stream file ASCII recombination
CN109766525A (en) * 2019-01-14 2019-05-17 湖南大学 A kind of sensitive information leakage detection framework of data-driven

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113765852A (en) * 2020-06-03 2021-12-07 深信服科技股份有限公司 Data packet detection method, system, storage medium and computing device
CN113765852B (en) * 2020-06-03 2023-05-12 深信服科技股份有限公司 Data packet detection method, system, storage medium and computing device
CN112181830A (en) * 2020-09-28 2021-01-05 厦门美柚股份有限公司 Memory leak detection method, device, terminal and medium
CN112181830B (en) * 2020-09-28 2022-08-09 厦门美柚股份有限公司 Memory leak detection method, device, terminal and medium
CN112597770A (en) * 2020-12-16 2021-04-02 盐城数智科技有限公司 Sensitive information query method based on deep learning
CN112597770B (en) * 2020-12-16 2024-06-11 盐城数智科技有限公司 Sensitive information query method based on deep learning
CN113704752A (en) * 2021-08-31 2021-11-26 上海观安信息技术股份有限公司 Data leakage behavior detection method and device, computer equipment and storage medium
CN113704752B (en) * 2021-08-31 2024-01-26 上海观安信息技术股份有限公司 Method and device for detecting data leakage behavior, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110377977A (en) Detection method, device and the storage medium of sensitive information leakage
CN104239501B (en) Mass video semantic annotation method based on Spark
WO2022134794A1 (en) Method and apparatus for processing public opinions about news event, storage medium, and computer device
CN109815788A (en) A kind of picture clustering method, device, storage medium and terminal device
CN107818077A (en) A kind of sensitive content recognition methods and device
CN105389341B (en) A kind of service calls repeat the text cluster and analysis method of incoming call work order
CN109218223B (en) Robust network traffic classification method and system based on active learning
CN108304442A (en) A kind of text message processing method, device and storage medium
CN108959399A (en) Distributed data deletes flow control method, device, electronic equipment and storage medium
TW201833851A (en) Risk control event automatic processing method and apparatus
CN105550253B (en) Method and device for acquiring type relationship
CN110457481A (en) A kind of method, apparatus, equipment and the storage medium of disaggregated model training
CN108664538A (en) A kind of automatic identification method and system of the doubtful familial defect of power transmission and transforming equipment
CN110147657A (en) A kind of user right configuration method and device
CN109871686A (en) Rogue program recognition methods and device based on icon representation and software action consistency analysis
CN109657063A (en) A kind of processing method and storage medium of magnanimity environment-protection artificial reported event data
CN109885597A (en) Tenant group processing method, device and electric terminal based on machine learning
WO2023035558A1 (en) Anchor point cut-based image processing method and apparatus, device, and medium
CN108537270A (en) Image labeling method, terminal device and storage medium based on multi-tag study
CN115514784A (en) Multisource data acquisition middle platform based on Internet of things
CN107357834A (en) A kind of image search method of view-based access control model conspicuousness fusion
CN107493275A (en) The extracted in self-adaptive and analysis method and system of heterogeneous network security log information
Ding et al. Railway foreign object intrusion detection based on deep learning
CN113297249A (en) Slow query statement identification and analysis method and device and query statement statistical method and device
CN112613362A (en) Article mark identification system based on Internet of things

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191025

RJ01 Rejection of invention patent application after publication