CN108011809A - Anti-data-leakage analysis method and system based on user behavior and document content - Google Patents

Info

Publication number
CN108011809A
Authority
CN
China
Prior art keywords
mail
vector
data
document
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711262779.4A
Other languages
Chinese (zh)
Inventor
魏效征
王志海
喻波
安鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wondersoft Technology Co Ltd
Original Assignee
Beijing Wondersoft Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wondersoft Technology Co Ltd
Priority to CN201711262779.4A
Publication of CN108011809A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/42 Mailbox-related aspects, e.g. synchronisation of mailboxes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H04L63/205 Network architectures or network communication protocols for network security for managing network security; network security policies in general involving negotiation or determination of the one or more network security mechanisms to be used, e.g. by negotiation between the client and the server or between peers or by selection according to the capabilities of the entities involved

Abstract

The invention discloses an anti-data-leakage analysis method and system based on user behavior and document content. The method comprises the following steps: obtaining the user's outbound email behavior data for a predetermined long time period and a predetermined short time period, and averaging and normalizing the data to obtain the user's long-term behavior data vector and short-term behavior data vector, respectively; determining whether the user's outbound email behavior is abnormal by comparing the distance between the long-term and short-term behavior data vectors with a predetermined vector distance threshold; for outbound email of a user whose behavior is abnormal, extracting the email content document and determining the topic category of the document; and selecting, according to the topic category, the exact text-matching policy rule associated with that category, and determining whether the document contains sensitive data. The technical solution can significantly improve the accuracy with which sensitive data leakage events are judged, and effectively reduces the false-positive rate of judgments based on content matching alone.

Description

Anti-data-leakage analysis method and system based on user behavior and document content
Technical field
The present invention relates to the field of data security, and in particular to an anti-data-leakage analysis method and system based on user behavior and document content.
Background
The main function of an enterprise data leakage prevention system is to prevent employees from sending sensitive data outside the enterprise. Accurately judging whether the data an employee sends out is sensitive is therefore the key to such a system. The traditional approach relies on exact matching, for example counting hits of keywords or regular expressions, and tends to produce many false positives. A data leakage prevention system therefore urgently needs to consider more factors when judging whether an employee's outbound data behavior constitutes a security incident.
Reference document 1
Publication number: 105357217A. Title of invention: Method and system for assessing data theft risk based on user behavior analysis.
This prior art analyzes the network behavior of internal network terminal users to find terminals with potentially risky operations, thereby protecting data and improving the security of the internal network.
The prior art obtains operation behavior pairs of a terminal user; from the operation behavior pairs, it obtains the dangerous operation behavior pairs and their number, and calculates a first risk coefficient; from the dangerous operation behavior pairs, it obtains the numbers of matches and mismatches between the service type of the website access behavior and the registered service type, and calculates a second risk coefficient; from the copy behavior, it obtains the dangerous copy behaviors and the number of dangerously copied files, and calculates a third risk coefficient and a fourth risk coefficient; and from the first, second, third and fourth risk coefficients, it calculates the terminal risk coefficient using a preset risk assessment model.
The above prior art calculates the risk coefficient from the terminal's operation pairs, including: intercepting a network data stream; performing protocol parsing on the network data stream to obtain a character stream; obtaining a preset detection character string and/or syntax analysis library function corresponding to a programming language; and judging, according to the detection character string and/or syntax analysis library function, whether the parsed character stream contains source code, and if so, blocking the network data stream.
The above patent document has the following shortcomings:
(1) It performs risk assessment on the user's operation pairs on the terminal and judges danger from the assessed value without considering the content of the data itself, which easily produces a very high false-positive rate.
(2) An abnormality in terminal operation behavior is not necessarily equivalent to a data theft security incident. Abnormal operation behavior is related to many factors, such as the operator's mood or a temporary change of work, so an approach that does not combine other factors has poor practicality.
Summary of the invention
To solve the above technical problems, the present invention provides an anti-data-leakage analysis method based on user behavior and document content, characterized in that the method comprises the following steps:
1) obtaining the user's outbound email behavior data for a predetermined long time period and a predetermined short time period, and averaging and normalizing the data to obtain the user's long-term behavior data vector and short-term behavior data vector, respectively;
2) calculating the distance between the user's long-term behavior data vector and short-term behavior data vector, and determining whether the user's outbound email behavior is abnormal by comparing the calculated distance with a predetermined vector distance threshold; if it is abnormal, proceeding to step 3), otherwise proceeding to step 5);
3) for outbound email of a user whose behavior is abnormal, extracting the email content document and determining the topic category of the document;
4) selecting, according to the topic category of the document, the exact text-matching policy rule associated with that category, and using that matching policy rule to determine whether the document contains sensitive data;
5) ending.
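The control flow of steps 1) to 5) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `analyze_outgoing_mail`, the pluggable `distance`, `classify_topic` and `rules_by_topic` parameters, and the choice to treat a topic with no associated rule as "not sensitive" are all assumptions made for the example.

```python
def analyze_outgoing_mail(long_vec, short_vec, document, *,
                          distance, threshold, classify_topic, rules_by_topic):
    """Return True if the outbound email is judged to leak sensitive data.

    All callables are supplied by the caller (hypothetical interfaces):
      distance(a, b)        -> float, vector distance used in step 2
      classify_topic(doc)   -> topic label used in step 3
      rules_by_topic[topic] -> callable(doc) -> bool, exact-match rule for step 4
    """
    # Step 2: compare the long-term/short-term distance with the threshold.
    if distance(long_vec, short_vec) <= threshold:
        return False  # behavior is normal, go straight to step 5

    # Step 3: behavior is abnormal, so determine the document's topic category.
    topic = classify_topic(document)

    # Step 4: apply the exact matching rule associated with that topic.
    rule = rules_by_topic.get(topic)
    if rule is None:
        return False  # assumption: no rule for this topic means no finding
    return rule(document)


# Illustrative stand-ins for the real distance, classifier and rules.
demo_distance = lambda a, b: max(abs(x - y) for x, y in zip(a, b))
demo_classifier = lambda doc: "finance"
demo_rules = {"finance": lambda doc: "account" in doc}
```

Content inspection only runs when the behavior check fires, which is how the two signals are combined in the steps above.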
According to an embodiment of the invention, preferably, the outbound email behavior data in step 1) includes: email sending time, sender address, sender domain, recipient address, recipient domain, recipient top-level domain, email subject type, number of emails sent, number of emails received, email size, mail client IP address, and mail server IP address.
According to an embodiment of the invention, preferably, the distance in step 2) between the user's long-term behavior data vector and short-term behavior data vector is the Mahalanobis distance, and the vector distance threshold is determined by the chi-square test method; if the distance between the vectors is greater than the vector distance threshold, the user's outbound email behavior is judged to be abnormal.
According to an embodiment of the invention, preferably, in step 3) the extracted email document content is segmented into words, and the linear discriminant analysis (LDA, Linear Discriminant Analysis) method is then used to determine the topic category of the document from the words it contains.
According to an embodiment of the invention, preferably, the exact matching policy rules in step 4) include regular expression matching policy rules and keyword matching policy rules.
To solve the above technical problems, the present invention provides an anti-data-leakage analysis system based on user behavior and document content, characterized in that the system comprises:
a data vector establishment module, which obtains the user's outbound email behavior data for a predetermined long time period and a predetermined short time period, and averages and normalizes the data to obtain the user's long-term behavior data vector and short-term behavior data vector, respectively;
an abnormality determination module, which calculates the distance between the user's long-term behavior data vector and short-term behavior data vector, and determines whether the user's outbound email behavior is abnormal by comparing the calculated distance with a predetermined vector distance threshold;
a document topic category determination module, which, for outbound email of a user whose behavior is abnormal, extracts the email content document and determines the topic category of the document;
an exact analysis module, which selects, according to the topic category of the document, the exact text-matching policy rule associated with that category, and uses that matching policy rule to determine whether the document contains sensitive data.
According to an embodiment of the invention, preferably, the outbound email behavior data includes: email sending time, sender address, sender domain, recipient address, recipient domain, recipient top-level domain, email subject type, number of emails sent, number of emails received, email size, mail client IP address, and mail server IP address.
According to an embodiment of the invention, preferably, the distance between the user's long-term behavior data vector and short-term behavior data vector is the Mahalanobis distance, and the vector distance threshold is determined by the chi-square test method;
if the abnormality determination module determines that the distance between the vectors is greater than the vector distance threshold, the user's outbound email behavior is judged to be abnormal.
According to an embodiment of the invention, preferably, the document topic category determination module first converts the email documents to be inspected uniformly into txt text format, segments the extracted email document content into words, and then uses the linear discriminant analysis (LDA, Linear Discriminant Analysis) method to determine the topic category of the document from the words it contains.
To solve the above technical problems, the present invention provides a computer-readable storage medium, characterized in that the medium contains computer program instructions, and one of the above methods is implemented by executing the computer program instructions.
With the technical solution of the present invention, the dual detection method for sensitive data leakage, based on user behavior and content matching, can significantly improve the accuracy with which sensitive data leakage events are judged and strengthens the enterprise's ability to control the security of source code data. The method can effectively reduce the false-positive rate of judgments based on content matching alone.
Brief description of the drawings
Fig. 1 is the analysis flowchart of the present invention.
Detailed description
The present invention proposes and implements a data leakage detection method that considers both data content and user behavior. On top of matching the data content, the method also takes user behavior into account, thereby greatly reducing the number of false positives of the data leakage prevention system.
The present invention is further described below with reference to the accompanying drawing and specific embodiments, but the scope of protection of the present invention is not limited thereto.
<Composite analysis method>
The dual monitoring mechanism proposed by the present invention, based on user behavior and data content, addresses the need to detect the sensitivity of enterprise data and effectively reduces the false-positive rate of security incidents reported by enterprise data leakage prevention systems. This patent monitors data content according to topic patterns and exact feature-matching patterns. It monitors user behavior mainly by analyzing time, quantity, and outbound recipient community relationships. Finally, the results of content detection and behavior detection are combined through logical combination.
First-level detection: anomaly analysis of user behavior. The outbound email behavior of each user in the enterprise is analyzed in the following aspects: email sending time, sender address, sender domain, recipient address, recipient domain, recipient top-level domain, email subject type, number of emails sent, number of emails received, email size, mail client IP address, mail server IP address, and so on. By analyzing user data over a long period (typically more than three months), the average of each of the above aspects is computed for each user and normalized, giving the user's daily (long-term) behavior data vector. Specifically, the average value is subtracted from the value of each data item to be counted, the result is divided by the standard deviation, the resulting value is used as the exponent of e, and finally the softmax function is applied, giving the user's daily behavior data vector. The user's data for one day, three days, or one week is normalized by the same method to obtain the user's short-term behavior vector.
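The normalization described above (subtract the mean, divide by the standard deviation, exponentiate, apply softmax) can be sketched as follows. The function name `normalize_behavior` and the convention that the per-item means and standard deviations are supplied by the caller are assumptions for illustration; the source does not say how a zero standard deviation is handled, so the guard below is also an assumption.

```python
import math


def normalize_behavior(values, means, stds):
    """Z-score each data item, then apply softmax so the vector sums to 1."""
    # Standardize each item; an item with zero spread is mapped to 0 (assumption).
    z = [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(values, means, stds)]
    # Subtracting the maximum before exponentiating avoids overflow and
    # does not change the softmax result.
    peak = max(z)
    exps = [math.exp(x - peak) for x in z]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `normalize_behavior([12, 3.0, 5], [10, 3.0, 4], [2, 1.0, 1])` standardizes the items to 1, 0 and 1, so the first and third components of the result are equal and larger than the second.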
The distance between the long-term average user behavior vector and the short-term behavior vector is calculated (the Mahalanobis distance is recommended), and the distance threshold is obtained using the chi-square test method. If the distance between the short-term behavior vector and the long-term behavior vector is greater than the threshold, the user's outbound email behavior for that day is deemed abnormal. Abnormal short-term behavior does not by itself establish a data leakage security incident, so the data itself must then be analyzed.
The chi-square test is a widely used hypothesis testing method. Its applications in statistical inference on categorical data include: chi-square tests comparing two rates or two constituent ratios; chi-square tests comparing multiple rates or multiple constituent ratios; and correlation analysis of categorical data. The chi-square test measures the degree of deviation between the actual observed values of a statistical sample and the theoretically inferred values. This deviation determines the size of the chi-square value: the larger the chi-square value, the poorer the agreement; the smaller the chi-square value, the smaller the deviation and the better the agreement. If the two values are exactly equal, the chi-square value is 0, indicating that the observations agree completely with the theoretical values.
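As a concrete illustration of the deviation measure just described, the standard Pearson chi-square statistic sums the squared difference between each observed and expected count, divided by the expected count. The function name `chi_square_statistic` is chosen for this sketch.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

When the observed counts equal the expected counts the statistic is 0, matching the statement above that identical values indicate complete agreement.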
For each vector in the sample data to be counted, the distance to the mean is calculated. The resulting distance values are all mapped onto the chi-square function, and the distance threshold is obtained by taking the value at the chi-square zero point.
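A minimal sketch of the Mahalanobis comparison with a chi-square-based threshold is given below. Several points are assumptions not stated in the source: the covariance matrix is supplied by the caller (for example, estimated from the user's historical daily vectors); the squared distance is compared against an upper quantile of the chi-square distribution, a common convention when the data is roughly Gaussian; the 0.99 confidence level is arbitrary; and the quantile is computed with the Wilson-Hilferty approximation so the example needs only the standard library. The source's exact rule for reading the threshold off the chi-square function is not specified here.

```python
from statistics import NormalDist


def solve(matrix, rhs):
    """Solve matrix * y = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def mahalanobis_squared(short_term, long_term, covariance):
    """Squared Mahalanobis distance d^T * inverse(covariance) * d."""
    diff = [s - l for s, l in zip(short_term, long_term)]
    return sum(d * y for d, y in zip(diff, solve(covariance, diff)))


def chi_square_threshold(dof, confidence=0.99):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(confidence)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def is_anomalous(short_term, long_term, covariance, confidence=0.99):
    """Flag the behavior when the squared distance exceeds the quantile."""
    d2 = mahalanobis_squared(short_term, long_term, covariance)
    return d2 > chi_square_threshold(len(long_term), confidence)
```

With an identity covariance matrix the squared Mahalanobis distance reduces to the squared Euclidean distance, which makes the sketch easy to sanity-check by hand.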
Second-level detection: topic pattern analysis of data content. The attachments or body text of an email contain a large number of words. Determining what type a document (mainly an email attachment) is, from the perspective of topic pattern analysis, is important for the subsequent exact content matching. After the document content is segmented into words, the LDA analysis method is used to determine the topic category of the document from the words it contains.
Third-level detection: exact matching analysis of data content. A document involving sensitive data must contain definite sensitive features, whether keywords or numeric string features such as those matched by regular expressions. If the detections of all three stages above are satisfied, the document under inspection necessarily contains sensitive data.
With reference to Fig. 1, the processing flow of the dual detection method proposed by the present invention, based on user behavior and data content, is described in detail below. The flow mainly comprises three processes: behavior anomaly analysis, topic analysis, and exact content matching.
(1) Anomaly analysis based on user behavior
For a specific sender, the sender's historical sending information is first collected, in particular the number of emails sent, email size, recipient address information, mail domain categories, and so on associated with that sender, finally giving a normalized email behavior vector (number of emails sent, number of emails, number of recipients, number of mail domains, ...). The email behavior vector associated with the sender for the current day or the current time interval is then calculated. Finally, the Mahalanobis distance or the cosine of the angle between the two vectors is calculated. If the Mahalanobis distance exceeds the threshold, the email sending behavior is deemed abnormal.
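The cosine alternative mentioned above can be sketched in a few lines. The function name `cosine_similarity` is chosen for illustration; note that, unlike a distance, a cosine close to 1 means the two behavior vectors point in nearly the same direction, so any threshold on it would run in the opposite sense.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two behavior vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / norm
```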
Behavior analysis can be performed in many ways, including judging the Mahalanobis distance between the long-term behavior vector and the short-term behavior vector, or judging the distance between the population mean vector and an individual's behavior vector. Vector distance calculation methods of all kinds that embody the present invention do not depart from its essence and fall within its scope of protection.
(2) Topic judgment of document content
The analysis and training of topic categories is carried out using the LDA method and should be completed before any email is sent out; that is, the LDA model should be built in advance. When an email is sent out, the documents to be inspected (in formats such as doc, xls and pdf) are first converted uniformly into txt text format. The text content is then segmented into words according to a dictionary, and the LDA method is used to determine the topic category to which the text belongs.
(3) Exact content matching
According to the topic category judgment result, the exact matching policy rules associated with the category are selected. The policy rules include a regular expression matching policy and a keyword threshold matching policy. For regular expression matching, if the match succeeds, script matching must also be performed; if the match fails, the document content is normal and does not include sensitive data. If regular expression matching and script matching indicate that sensitive data may exist, keyword matching must further be performed: if it fails, the document content is normal and does not include sensitive data; if keyword matching succeeds, the document contains sensitive data and the corresponding judgment result is output, for example by sending a warning to the user and the administrator and recording a log entry. This is only an example of output and is not limiting; all other ways of outputting the result fall within the scope of protection of the invention.
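The staged check described above (regular expression match, then script match, then keyword match) can be sketched as follows. Everything concrete in this example is hypothetical: the rule layout, the 16-digit pattern, the use of a Luhn checksum as the "script" stage, and the keyword list are illustrations only and are not taken from the source.

```python
import re


def luhn_valid(digits):
    """Hypothetical script stage: Luhn checksum over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        n = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            n = n * 2 - 9 if n > 4 else n * 2
        total += n
    return total % 10 == 0


# Hypothetical rule for one topic category.
CARD_RULE = {
    "pattern": re.compile(r"\b\d{16}\b"),
    "script": luhn_valid,
    "keywords": ("card", "account"),
}


def contains_sensitive_data(text, rule):
    """Apply the three stages in order; any failed stage means 'normal'."""
    hits = rule["pattern"].findall(text)
    if not hits:
        return False  # regular expression match failed
    if not any(rule["script"](hit) for hit in hits):
        return False  # script match failed
    lowered = text.lower()
    return any(keyword in lowered for keyword in rule["keywords"])
```

Each stage can only narrow the result, so a document is reported only when all three stages agree.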
An anti-data-leakage content rule is usually that the number of occurrences of certain keywords exceeds some threshold, that the number of distinct kinds of regular expression features that appear exceeds a specific threshold, or a particular logical combination of these two conditions. Exact content matching is a common anti-data-leakage method and is easy to implement.
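A threshold rule of this kind might look like the sketch below. The function names, the case-insensitive substring counting, and the use of `any` or `all` as the logical combination are assumptions for illustration.

```python
import re


def keyword_hits(text, keywords):
    """Total number of occurrences of all keywords (case-insensitive)."""
    lowered = text.lower()
    return sum(lowered.count(keyword.lower()) for keyword in keywords)


def regex_kinds(text, patterns):
    """Number of distinct patterns that match at least once."""
    return sum(1 for pattern in patterns if re.search(pattern, text))


def rule_fires(text, keywords, patterns,
               keyword_threshold, kind_threshold, combine=any):
    """Combine the two threshold checks with `any` (OR) or `all` (AND)."""
    checks = [
        keyword_hits(text, keywords) > keyword_threshold,
        regex_kinds(text, patterns) > kind_threshold,
    ]
    return combine(checks)
```

Passing `combine=all` requires both conditions to exceed their thresholds, while the default `any` fires when either one does.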
If, when a user sends outbound email, the user's outbound email behavior is first detected to be abnormal, for example a sharp change in the number of outbound emails, a sharp increase in sending frequency, or a marked difference in the group of target recipients, then a content check is required. If topic analysis in the content check can determine the topic of the document, and exact content matching for that topic hits a matching rule, the outbound behavior can be determined to be a data leak.
<Composite analysis system>
The present invention provides an anti-data-leakage analysis system based on user behavior and document content, characterized in that the system comprises:
a data vector establishment module, which obtains the user's outbound email behavior data for a predetermined long time period and a predetermined short time period, and averages and normalizes the data to obtain the user's long-term behavior data vector and short-term behavior data vector, respectively;
an abnormality determination module, which calculates the distance between the user's long-term behavior data vector and short-term behavior data vector, and determines whether the user's outbound email behavior is abnormal by comparing the calculated distance with a predetermined vector distance threshold;
a document topic category determination module, which, for outbound email of a user whose behavior is abnormal, extracts the email content document and determines the topic category of the document;
an exact analysis module, which selects, according to the topic category of the document, the exact text-matching policy rule associated with that category, and uses that matching policy rule to determine whether the document contains sensitive data.
The outbound email behavior data includes: email sending time, sender address, sender domain, recipient address, recipient domain, recipient top-level domain, email subject type, number of emails sent, number of emails received, email size, mail client IP address, and mail server IP address.
The distance between the user's long-term behavior data vector and short-term behavior data vector is the Mahalanobis distance, and the vector distance threshold is determined by the chi-square test method;
if the abnormality determination module determines that the distance between the vectors is greater than the vector distance threshold, the user's outbound email behavior is judged to be abnormal.
The document topic category determination module first converts the email documents to be inspected uniformly into txt text format, segments the extracted email document content into words, and then uses the linear discriminant analysis (LDA, Linear Discriminant Analysis) method to determine the topic category of the document from the words it contains.
The exact matching policy rules used by the exact analysis module include regular expression matching policy rules and keyword matching policy rules.
<Application example>
Shortly before applying to resign, a bank employee frequently sent sensitive internal bank documents out through the bank's mailbox; both the number of outbound emails and their byte count increased markedly.
The dual monitoring method described in this patent judged the user's outbound behavior to be a definite sensitive data leakage security incident, so blocking control measures were taken, effectively protecting the bank's data assets.
The dual detection method and system for sensitive data leakage proposed here, based on user behavior and content matching, can significantly improve the accuracy with which sensitive data leakage events are judged and strengthens the enterprise's ability to control the security of source code data. The method can effectively reduce the false-positive rate of judgments based on content matching alone.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. An anti-data-leakage analysis method based on user behavior and document content, characterized in that the method comprises the following steps:
1) obtaining the user's outbound email behavior data for a predetermined long time period and a predetermined short time period, and averaging and normalizing the data to obtain the user's long-term behavior data vector and short-term behavior data vector, respectively;
2) calculating the distance between the user's long-term behavior data vector and short-term behavior data vector, and determining whether the user's outbound email behavior is abnormal by comparing the calculated distance with a predetermined vector distance threshold; if it is abnormal, proceeding to step 3), otherwise proceeding to step 5);
3) for outbound email of a user whose behavior is abnormal, extracting the email content document and determining the topic category of the document;
4) selecting, according to the topic category of the document, the exact text-matching policy rule associated with that category, and using that matching policy rule to determine whether the document contains sensitive data;
5) ending.
2. The method according to claim 1, wherein the outbound email behavior data in step 1) includes: email sending time, sender address, sender domain, recipient address, recipient domain, recipient top-level domain, email subject type, number of emails sent, number of emails received, email size, mail client IP address, and mail server IP address.
3. The method according to claim 1, wherein the distance in step 2) between the user's long-term behavior data vector and short-term behavior data vector is the Mahalanobis distance, the vector distance threshold is determined by the chi-square test method, and if the distance between the vectors is greater than the vector distance threshold, the user's outbound email behavior is judged to be abnormal.
4. The method according to claim 1, wherein in step 3) the extracted email document content is segmented into words, and the linear discriminant analysis (LDA, Linear Discriminant Analysis) method is then used to determine the topic category of the document from the words it contains.
5. The method according to claim 1, wherein the exact matching policy rules in step 4) include regular expression matching policy rules and keyword matching policy rules.
6. An anti-data-leakage analysis system based on user behavior and document content, characterized in that the system comprises:
a data vector establishment module, which obtains the user's outbound email behavior data for a predetermined long time period and a predetermined short time period, and averages and normalizes the data to obtain the user's long-term behavior data vector and short-term behavior data vector, respectively;
an abnormality determination module, which calculates the distance between the user's long-term behavior data vector and short-term behavior data vector, and determines whether the user's outbound email behavior is abnormal by comparing the calculated distance with a predetermined vector distance threshold;
a document topic category determination module, which, for outbound email of a user whose behavior is abnormal, extracts the email content document and determines the topic category of the document;
an exact analysis module, which selects, according to the topic category of the document, the exact text-matching policy rule associated with that category, and uses that matching policy rule to determine whether the document contains sensitive data.
7. system according to claim 6, the outgoing mail behavior related data includes:Post time, mail Sender address, e-mail sender domain, mail recipient address, mail recipient domain, mail recipient's top level domain, mail master Inscribe type, the number of mail sent, received number of mail, the size of mail, Mail Clients IP address, mail server IP Address.
8. The system according to claim 6, wherein the inter-vector distance between the user's long-term behavior data vector and short-term behavior data vector is the Mahalanobis distance, and the vector distance threshold is determined by the chi-square test;
If the anomaly determining module determines that the inter-vector distance exceeds the vector distance threshold, the user's outgoing-mail behavior is judged to be abnormal.
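The comparison in claim 8 can be sketched as follows. Several details are assumptions rather than statements from the patent: the covariance matrix is taken as given (for example, estimated from the user's historical feature vectors), the squared Mahalanobis distance is compared against a chi-square critical value at the 0.95 level with one degree of freedom per feature, and the function names and the small `CHI2_95` table are illustrative only.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis_squared(x, y, cov):
    """Squared Mahalanobis distance (x - y)^T cov^{-1} (x - y)."""
    diff = [a - b for a, b in zip(x, y)]
    return sum(d * s for d, s in zip(diff, solve(cov, diff)))


# Chi-square critical values at the 0.95 level, keyed by degrees of freedom.
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def is_anomalous(long_term, short_term, cov):
    """Flag the behavior when the squared distance exceeds the threshold."""
    d2 = mahalanobis_squared(short_term, long_term, cov)
    return d2 > CHI2_95[len(long_term)]
```

Solving the linear system instead of inverting the covariance matrix explicitly gives the same distance with less arithmetic; the table would need extending for vectors with more than five features.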
9. The system according to claim 6, wherein the document subject category judging module first converts the mail documents to be examined to a uniform txt text format, segments the extracted document content into words, and then determines the subject category of each document from the words it contains using the linear discriminant analysis (LDA) method.
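The preprocessing that precedes classification in claim 9 can be sketched as below. This covers only word segmentation and the conversion of a text into a term-count vector that a classifier could consume; the LDA classifier itself is not shown. The regular-expression tokenizer is an assumption that suits whitespace-delimited text only (Chinese text would need a dedicated word segmenter), and the function names and vocabulary are hypothetical.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of ASCII letters and digits.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def segment(text):
    """Split plain text into lowercase word tokens."""
    return [token.lower() for token in TOKEN.findall(text)]


def term_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(segment(text))
    return [counts[term] for term in vocabulary]
```

For example, `term_vector("Invoice total: pay the invoice", ["invoice", "contract", "pay"])` yields `[2, 0, 1]`, a vector that a trained subject classifier could take as input.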
10. A computer-readable storage medium, characterized in that the medium contains computer program instructions which, when executed, carry out the method of any one of claims 1-5.
CN201711262779.4A 2017-12-04 2017-12-04 Anti-data-leakage analysis method and system based on user behavior and document content Pending CN108011809A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711262779.4A CN108011809A (en) 2017-12-04 2017-12-04 Anti-data-leakage analysis method and system based on user behavior and document content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711262779.4A CN108011809A (en) 2017-12-04 2017-12-04 Anti-data-leakage analysis method and system based on user behavior and document content

Publications (1)

Publication Number Publication Date
CN108011809A true CN108011809A (en) 2018-05-08

Family

ID=62056374

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711262779.4A Pending CN108011809A (en) 2017-12-04 2017-12-04 Anti-data-leakage analysis method and system based on user behavior and document content

Country Status (1)

Country Link
CN (1) CN108011809A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108809813A (en) * 2018-06-14 2018-11-13 北京明朝万达科技股份有限公司 File processing method and system using the mail of anti-data-leakage technology
CN109040110A (en) * 2018-08-31 2018-12-18 新华三信息安全技术有限公司 Outgoing behavior detection method and device
CN109101574A (en) * 2018-07-18 2018-12-28 北京明朝万达科技股份有限公司 Task approval method and system of data leakage prevention system
CN109218168A (en) * 2018-09-26 2019-01-15 江苏神州信源系统工程有限公司 Method and device for blocking sensitive e-mail messages
CN109766715A (en) * 2018-12-24 2019-05-17 贵州航天计量测试技术研究所 Automatic identification method and system for privacy information leakage prevention in big data environments
CN110519150A (en) * 2018-05-22 2019-11-29 深信服科技股份有限公司 Mail-detection method, apparatus, equipment, system and computer readable storage medium
CN110601904A (en) * 2019-09-27 2019-12-20 苏州浪潮智能科技有限公司 E-mail fault identification method and device and computer readable storage medium
CN113132297A (en) * 2019-12-30 2021-07-16 北京国双科技有限公司 Data leakage detection method and device
CN113343227A (en) * 2021-06-28 2021-09-03 深信服科技股份有限公司 Method, device, equipment and medium for identifying divulgence behavior
WO2022040398A1 (en) * 2020-08-20 2022-02-24 Saudi Arabian Oil Company System and method to extend data loss prevention (dlp) to leverage sensitive outbound emails investigations - (antileaks)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140006347A1 (en) * 2011-10-11 2014-01-02 Zenprise, Inc. Secure container for protecting enterprise data on a mobile device
CN104506545A (en) * 2014-12-30 2015-04-08 北京奇虎科技有限公司 Data leakage prevention method and data leakage prevention device
CN104794176A (en) * 2015-04-02 2015-07-22 中国科学院信息工程研究所 Multiattribute-based detection method for missent e-mail
CN105357217A (en) * 2015-12-02 2016-02-24 北京北信源软件股份有限公司 User behavior analysis-based data theft risk assessment method and system
CN106156628A (en) * 2015-04-16 2016-11-23 阿里巴巴集团控股有限公司 User behavior analysis method and device
CN106682527A (en) * 2016-12-25 2017-05-17 北京明朝万达科技股份有限公司 Data security control method and system based on data classification and grading
CN106778259A (en) * 2016-12-28 2017-05-31 北京明朝万达科技股份有限公司 Abnormal behavior discovery method and system based on big data machine learning
CN106845272A (en) * 2017-01-19 2017-06-13 浙江中都信息技术有限公司 Terminal-agent-based threat monitoring and data leakage prevention method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王珏: ""基于异常检测的入侵检测系统设计"", 《万方》 *
马俊: ""面向内部威胁的数据泄漏防护关键技术研究"", 《中国博士学位论文全文数据库信息科技辑》 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110519150B (en) * 2018-05-22 2022-09-30 深信服科技股份有限公司 Mail detection method, device, equipment, system and computer readable storage medium
CN110519150A (en) * 2018-05-22 2019-11-29 深信服科技股份有限公司 Mail-detection method, apparatus, equipment, system and computer readable storage medium
CN108809813A (en) * 2018-06-14 2018-11-13 北京明朝万达科技股份有限公司 File processing method and system using the mail of anti-data-leakage technology
CN109101574A (en) * 2018-07-18 2018-12-28 北京明朝万达科技股份有限公司 Task approval method and system of data leakage prevention system
CN109101574B (en) * 2018-07-18 2020-09-25 北京明朝万达科技股份有限公司 Task approval method and system of data leakage prevention system
CN109040110A (en) * 2018-08-31 2018-12-18 新华三信息安全技术有限公司 Outgoing behavior detection method and device
CN109218168A (en) * 2018-09-26 2019-01-15 江苏神州信源系统工程有限公司 Method and device for blocking sensitive e-mail messages
CN109766715A (en) * 2018-12-24 2019-05-17 贵州航天计量测试技术研究所 Automatic identification method and system for privacy information leakage prevention in big data environments
CN110601904A (en) * 2019-09-27 2019-12-20 苏州浪潮智能科技有限公司 E-mail fault identification method and device and computer readable storage medium
CN113132297A (en) * 2019-12-30 2021-07-16 北京国双科技有限公司 Data leakage detection method and device
CN113132297B (en) * 2019-12-30 2023-04-18 北京国双科技有限公司 Data leakage detection method and device
WO2022040398A1 (en) * 2020-08-20 2022-02-24 Saudi Arabian Oil Company System and method to extend data loss prevention (dlp) to leverage sensitive outbound emails investigations - (antileaks)
CN113343227A (en) * 2021-06-28 2021-09-03 深信服科技股份有限公司 Method, device, equipment and medium for identifying divulgence behavior

Similar Documents

Publication Publication Date Title
CN108011809A (en) Anti-data-leakage analysis method and system based on user behavior and document content
CN107577939B (en) Data leakage prevention method based on keyword technology
CN110399925B (en) Account risk identification method, device and storage medium
US20210250320A1 (en) Method and system for analyzing electronic communications and customer information to recognize and mitigate message-based attacks
US7594277B2 (en) Method and system for detecting when an outgoing communication contains certain content
US7665140B2 (en) Fraudulent message detection
US7903549B2 (en) Content-based policy compliance systems and methods
US8254698B2 (en) Methods for document-to-template matching for data-leak prevention
US8051187B2 (en) Methods for automatic categorization of internal and external communication for preventing data loss
US7444403B1 (en) Detecting sexually predatory content in an electronic communication
US20150180896A1 (en) Collaborative phishing attack detection
US20120215853A1 (en) Managing Unwanted Communications Using Template Generation And Fingerprint Comparison Features
CN105337993B (en) Mail security detection device and method based on combined dynamic and static analysis
US11627152B2 (en) Real-time classification of content in a data transmission
Aggarwal et al. Identification and detection of phishing emails using natural language processing techniques
US20240048514A1 (en) Method for electronic impersonation detection and remediation
CN108011881B (en) Sensitive data slow leakage detection method and system based on self-adaptive sensing
CN110866108A (en) Sensitive data detection system and detection method thereof
US9171171B1 (en) Generating a heat map to identify vulnerable data users within an organization
CN113408281A (en) Mailbox account abnormity detection method and device, electronic equipment and storage medium
CN109918638B (en) Network data monitoring method
US10586046B1 (en) Automated security feed analysis for threat assessment
KR20220117187A (en) Security compliance automation method
US11757816B1 (en) Systems and methods for detecting scam emails
US11240266B1 (en) System, device and method for detecting social engineering attacks in digital communications

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180508