CN107196942A - An insider threat detection method based on user language features - Google Patents
An insider threat detection method based on user language features
- Publication number
- CN107196942A (application CN201710374486.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/552—Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
Abstract
The invention discloses an insider threat detection method based on user language features. The method first analyzes users' language data, extracts language features, and builds a numerical feature vector that characterizes each user's personality traits. It then builds and trains a classifier to identify users with abnormal personality traits. Finally, it analyzes the feature-vector deviation of those users to filter out false positives, and reports the remaining users to the security administrator as potential malicious insiders for analysis and response. The invention takes into account the psychological characteristics of the attacker in an insider attack, models psychology from the perspective of personality, and builds an anomaly detection classifier on that basis. This compensates for existing detection methods, which focus only on the attack process and ignore the attacking subject, so that "abnormal" and "malicious" can be distinguished at a fine granularity, insider threats can be analyzed and detected comprehensively, and the high false-positive and false-negative rates of conventional insider threat detection methods are effectively reduced.
Description
Technical field
The present invention relates to an insider threat detection method based on user language features, and belongs to the field of information security and network security technology.
Background
With the development of networks, the security of networked information has drawn increasing attention, and security products such as anti-virus software, firewalls, and intrusion detection systems are widely deployed. These products, however, only defend against intrusion and theft from outside. As awareness of network security and the underlying technology have matured, it has become clear that leaks and intrusions caused by internal staff account for a significant proportion of incidents. The 2013 Snowden "PRISM" incident is a typical example of a leak by an insider. Responding to insider threats therefore deserves the same attention as resisting external intrusion, yet in practice there is still no effective insider threat detection mechanism.
Insider attackers are usually employees (current or former), contractors, or business partners of an enterprise or organization, and they hold access rights to the organization's systems, networks, and data. Insider threats are therefore typically both well concealed and highly damaging, and a traditional defense-in-depth system built on security devices such as firewalls and IDS cannot handle them effectively.
The key to detecting insider threats is thorough internal security auditing. Its core is user-centric: it records all of a user's key operations and behaviors on systems and the network, forming the user's behavior trail within the internal network. Current internal security auditing focuses on the following behaviors:
Document audit: audits file operations such as writing, creation, copying, and deletion;
Print audit: audits user-initiated print events and the content of the printed files;
Login audit: audits user logins to the system, as well as logout, restart, and shutdown operations;
Process audit: audits the processes a user creates and closes;
Network monitoring: audits web access behavior, including the target IP/port of each access and page requests;
Device audit: audits the use of removable storage devices such as USB drives, for example the files copied or deleted;
Mail audit: audits user mail behavior, such as the sender and recipients, the subject, part of the body, and the number (and type) of attachments.
Multi-dimensional, fine-grained internal security auditing inevitably produces a huge volume of data, and the resulting sharp increase in detection complexity is a challenge for insider threat detection. Modeling user behavior with big data analytics, and in particular big data security research on internal security audit logs, has therefore become a research hotspot. In practice, however, insider threat detection systems suffer from high false-positive rates and poor practicality because their data sources describe users along too few dimensions and their detection architecture is too simple. An insider threat detection system with good practicality is therefore needed.
Existing insider threat detection systems focus on anomaly detection: they build an insider threat classifier from users' internal security audit logs. The main steps are as follows:
Internal security audit collection: an internal security audit system is deployed to collect users' system and network behavior, such as document access; the data is formatted and passed to the classifier construction module;
Anomaly detection classifier: an anomaly detection method learns a user behavior model from the received data and builds an anomaly detection classifier;
User behavior detection: the anomaly detection classifier examines the user behavior logs of a given time window and decides whether they indicate an insider threat.
The anomaly-detection-based method above handles most insider attack scenarios in practice, but its underlying assumption has a flaw that cannot be ignored. It assumes that the malicious behavior of a malicious insider necessarily differs from normal work behavior, so that malicious behavior can be singled out by anomaly detection. In practice, malicious behavior and abnormal behavior are not equivalent; the two sets of behaviors are not equal. If only behavioral anomalies are considered, the result is necessarily a high false-positive rate (normal users identified as malicious) and a high false-negative rate (malicious users identified as normal). Two examples illustrate this:
1. Project managers A and B routinely coordinate project matters by e-mail. One day A e-mails B project technical material that should have been kept confidential (malicious behavior that is not abnormal);
2. Purchasing agent A normally buys from supplier B, and one day suddenly buys from supplier C. This alone does not show that A has taken a kickback from C (abnormal behavior that is not necessarily malicious).
As described above, the core of existing insider threat detection systems is to build an anomaly classifier from policy checks and from anomaly detection on user behavior during the attack process. The assumption behind analyzing only attack-process features blurs the boundary between "abnormal" and "malicious": in practice, malicious user behavior is not always abnormal, and abnormal behavior is not always malicious. The system and network behavior data in the collected internal security audit logs is not sufficient on its own to separate "abnormal" from "malicious" at a fine granularity, so detection systems built on the existing data dimensions inevitably suffer from high false-positive and false-negative rates. High false positives lower the quality of alerts: analysts cannot examine every alert thoroughly, system usability drops, and the detection system ends up existing in name only. High false negatives cause the security defense to fail outright and expose the assets of the enterprise or organization to high risk. High false-positive and false-negative rates are the key factors limiting the practicality of insider threat detection systems and the main problem such systems currently face.
Summary of the invention
To address the high false-positive and false-negative rates of prior-art insider threat detection methods based on policy checks and behavioral anomaly detection, the invention provides an insider threat detection method based on user language features. It can analyze and detect insider threats comprehensively and effectively reduces the high false-positive and false-negative rates of conventional insider threat detection methods.
The technical solution adopted by the invention is an insider threat detection method based on user language features, characterized in that it first analyzes users' language data, extracts language features, and builds a numerical feature vector that characterizes each user's personality traits; then builds and trains a classifier to identify users with abnormal personality traits; and finally analyzes the feature-vector deviation of those users to filter out false positives, reporting the remaining users to the security administrator as potential malicious insiders for analysis and response.
Preferably, the insider threat detection method based on user language features comprises the following steps:
1) Data preprocessing: the user language data from the internal audit system is processed in at least three stages: automated auditing, automated content processing, and automated aggregation;
2) Personality feature vector construction: the language data of each user is first analyzed to obtain the word frequencies of the relevant word categories, which serve as the Chinese LIWC analysis result; the LIWC word categories are then associated with the traits of the Big Five (five-factor) personality model, and the computed values of 18 Big Five sub-dimensions are used as the user's personality feature vector;
3) Classifier training: a classifier is first built and the user language data audited during an initial period is selected; the personality feature vector of each user is then computed, and a one-class support vector machine is trained on these vectors to obtain the psychological model of the initial user group; finally, in any later period, the personality feature vector modeled from each user's language data is computed and tested against the group psychological model, and the set of users judged abnormal is denoted AbnormalUsers;
4) Threat confidence calculation: a threat confidence is computed for the users in AbnormalUsers to screen them further. The calculation comprises the following steps:
41) For the users in AbnormalUsers, their 18-dimensional feature vectors are assembled into a matrix Matrix_1 whose number of rows is the number of users in AbnormalUsers and whose number of columns is 18;
42) The Z-score of every row of Matrix_1 is computed column by column to obtain Matrix_2, using the following formula:
$$Z_{ij} = \frac{X_{ij} - \bar{X}_j}{\sigma_j}$$
where, for the i-th user in Matrix_1, $X_{ij}$ is the value of the user's j-th dimension, $\bar{X}_j$ is the mean of the j-th column of the matrix, and $\sigma_j$ is the standard deviation of the j-th column;
once the Z-scores of every user in Matrix_1 have been computed, they form the new matrix Matrix_2;
43) The mean of each column of Matrix_2 is computed, giving the 18-dimensional mean vector Mean_value;
44) For each user in AbnormalUsers in turn, each of the 18 dimensions of the user's feature vector is compared with the corresponding value in Mean_value, with a '1' recorded where it exceeds that value, and the resulting 18-dimensional binary vector is taken as the user's threat confidence TCD; if the number of '1's in TCD exceeds the threshold K, the user is marked as a normal user and deleted from AbnormalUsers;
45) Steps 41) to 44) are repeated until every user in AbnormalUsers has been judged; the users remaining in AbnormalUsers are reported to the security administrator as potential malicious insiders for analysis and response.
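Step 3) can be illustrated with a minimal sketch. It assumes scikit-learn's OneClassSVM as the one-class support vector machine; the patent does not name a library, kernel, or parameters, so the RBF kernel, the nu value, and the function names train_group_model and find_abnormal_users are illustrative assumptions only.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def train_group_model(initial_vectors, nu=0.05):
    """Fit a one-class SVM on the personality vectors of the initial period.

    initial_vectors: array of shape (n_users, 18).
    nu and the RBF kernel are assumed settings, not values from the patent.
    """
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
    model.fit(np.asarray(initial_vectors, dtype=float))
    return model


def find_abnormal_users(model, user_ids, new_vectors):
    """Return the AbnormalUsers set for a later period.

    OneClassSVM.predict returns +1 for inliers and -1 for outliers.
    """
    labels = model.predict(np.asarray(new_vectors, dtype=float))
    return [uid for uid, label in zip(user_ids, labels) if label == -1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    initial = rng.normal(0.0, 1.0, size=(200, 18))   # synthetic training data
    later = np.vstack([np.zeros(18), np.full(18, 50.0)])
    print(find_abnormal_users(train_group_model(initial), ["u1", "u2"], later))
```

The synthetic data here only exercises the code path; real input would be the 18-dimensional personality vectors produced in step 2).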
Preferably, the user language data comprises work mail data, electronic document data, and social application data. The work mail data is the text content of work mail sent by the user; the electronic document data is the work-related text content written by the user and stored in electronic form; and the social application data is the text content crawled from the user's social status updates.
Preferably, the work mail data is processed as follows:
111) Automated auditing: the work mail data of a given period is collected;
112) Automated content processing: only mail sent by the user is analyzed; for each message, the mail header information is stripped and only the text content is extracted; for sent mail carrying multiple timestamps, only the most recently sent text is considered;
113) Automated aggregation: for each user, the text content obtained from the automated auditing and automated content processing of that user's work mail is aggregated into one large text and stored.
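The mail steps above can be sketched with Python's standard email module. This is only an illustration under stated assumptions: the quote-marker heuristic used to keep "only the most recent text" is hypothetical (the patent does not say how replies and forwards are split), and the function names extract_body, latest_text, and aggregate_mail are invented for the example.

```python
import re
from collections import defaultdict
from email import message_from_string

# Hypothetical heuristic: text from the first quoted-reply marker onward is
# treated as older content and dropped.
QUOTE_MARKER = re.compile(
    r"^(>|-----Original Message-----|On .* wrote:)", re.MULTILINE
)


def extract_body(raw_mail):
    """Return (sender, plain-text body); all header fields are discarded."""
    msg = message_from_string(raw_mail)
    sender = msg.get("From", "")
    if msg.is_multipart():
        parts = [
            part.get_payload()
            for part in msg.walk()
            if part.get_content_type() == "text/plain"
        ]
        body = "\n".join(parts)
    else:
        body = msg.get_payload()
    return sender, body


def latest_text(body):
    """Keep only the text written in the most recent send."""
    match = QUOTE_MARKER.search(body)
    return (body[: match.start()] if match else body).strip()


def aggregate_mail(raw_mails):
    """Aggregate each sender's cleaned mail text into one large text."""
    per_user = defaultdict(list)
    for raw in raw_mails:
        sender, body = extract_body(raw)
        per_user[sender].append(latest_text(body))
    return {user: "\n".join(texts) for user, texts in per_user.items()}
```

A production pipeline would also need to decode transfer encodings and character sets, which this sketch leaves out.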
Preferably, the electronic document data is processed as follows:
121) Automated auditing: the electronic documents produced at work during a given period are collected;
122) Automated content processing: headings at all levels, formatting data, and image and audio data are removed from each electronic document, and only its plain text content is extracted;
123) Automated aggregation: for each user, the text content obtained from the automated auditing and automated content processing of that user's electronic documents is aggregated into one large text and stored.
Preferably, the social application data is processed as follows:
131) Automated auditing: the social application status data of internal users during a given period is collected;
132) Automated content processing: images, audio, and hyperlinks are removed from the social application status data, and only the text written by the user in each status is processed;
133) Automated aggregation: for each user, the text content obtained from the automated auditing and automated content processing of that user's social application data is aggregated into one large text and stored.
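The social-status steps above might look like the following minimal sketch. The record layout (a dict with "user", "text", and "media" keys) and the function names are hypothetical, since the patent does not describe how crawled statuses are stored; images and audio are assumed to live in the "media" field and are simply ignored.

```python
import re
from collections import defaultdict

# Matches http/https hyperlinks embedded in the status text.
URL_PATTERN = re.compile(r"https?://\S+")


def clean_status(text):
    """Remove hyperlinks and collapse the whitespace they leave behind."""
    without_links = URL_PATTERN.sub("", text)
    return " ".join(without_links.split())


def aggregate_statuses(records):
    """Aggregate each user's cleaned status text into one large text.

    records: iterable of dicts with assumed keys "user", "text", "media".
    Only "text" is used; "media" (images, audio) is dropped.
    """
    per_user = defaultdict(list)
    for record in records:
        cleaned = clean_status(record.get("text", ""))
        if cleaned:  # skip statuses that contained no user-written text
            per_user[record["user"]].append(cleaned)
    return {user: "\n".join(texts) for user, texts in per_user.items()}
```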
Preferably, during personality feature vector construction, the TextMind ("Wenxin") Chinese psychological analysis system of the Institute of Psychology, Chinese Academy of Sciences is used to analyze each user's mail text and obtain the word frequencies of the relevant word categories, which serve as the Chinese LIWC analysis result; the LIWC word categories are associated with the traits of the Big Five personality model, and the values of 18 Big Five sub-dimensions are computed and used as the user's personality feature vector.
Preferably, the 18 Big Five sub-dimensions are: anxiety, anger, depression, self-consciousness, impulsiveness, vulnerability, trust, morality, altruism, cooperation, modesty, sympathy, self-efficacy, orderliness, dutifulness, achievement striving, self-discipline, and cautiousness.
Preferably, the 18 sub-dimension feature values are calculated as follows:
For the i-th of the 18 sub-dimensions, the statistical correlation between that sub-dimension and the LIWC word categories is:
$$Feat_i = \{(q_{i,j}, c_{i,j}) \mid j = 1, \ldots, N_i\} \quad (1)$$
where $Feat_i$ denotes the i-th sub-dimension, $(q_{i,j}, c_{i,j})$ denotes an associated LIWC word category $q_{i,j}$ and its statistical correlation $c_{i,j}$, and $N_i$ is the number of LIWC word categories that have a statistically significant correlation with the i-th sub-dimension;
On the basis of formula (1), the user's personality feature vector is calculated by formula (2):
$$Feat_i = \sum_{j=1}^{N_i} q_j \, c_j \quad (2)$$
where $Feat_i$ denotes the feature value of any one of the user's 18 dimensions, and $q_j$ and $c_j$ denote, respectively, the user's word frequency on the j-th LIWC word category associated with the i-th dimension and the corresponding statistical correlation.
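Read as a correlation-weighted sum of word frequencies, the calculation can be sketched as below. The category labels and correlation values in the usage example are placeholders chosen for illustration, not figures from the patent or from any LIWC study, and the function names are invented here.

```python
def sub_dimension_score(word_freq, correlations):
    """Correlation-weighted sum of word frequencies for one sub-dimension.

    word_freq:    {LIWC category: word frequency of this user}
    correlations: {LIWC category: statistical correlation with the sub-dimension}
    Categories missing from word_freq contribute 0.
    """
    return sum(word_freq.get(cat, 0.0) * c for cat, c in correlations.items())


def personality_vector(word_freq, correlation_table):
    """Build the feature vector, one value per sub-dimension.

    correlation_table: {sub-dimension name: {LIWC category: correlation}};
    in the patent's setting it would hold 18 sub-dimensions.
    """
    return [
        sub_dimension_score(word_freq, correlations)
        for correlations in correlation_table.values()
    ]


if __name__ == "__main__":
    # Placeholder numbers for two of the 18 sub-dimensions.
    table = {
        "anxiety": {"negemo": 0.2, "anx": 0.3},
        "trust": {"posemo": 0.1},
    }
    freq = {"negemo": 0.05, "anx": 0.02, "posemo": 0.10}
    print(personality_vector(freq, table))
```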
Preferably, the threat confidence TCD is calculated as follows:
$$TCD_{ij} = \begin{cases} 1, & Z_{ij} > MV_j \\ 0, & \text{otherwise} \end{cases}$$
$$TCD_i = \{1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,1\} \quad (5)$$
where $Z_{ij}$ is the entry in row i, column j of Matrix_2, that is, the Z-score of the j-th feature dimension of the i-th user, and $MV_j$ is the j-th value of the mean vector Mean_value. In the example of formula (5), the user's threat confidence contains 14 values of '1'; if this count of 14 is greater than the set threshold K, the user is reclassified as a normal user and removed from the AbnormalUsers set.
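The threat confidence screening can be sketched with NumPy as follows. Several points are assumptions made for the example: the population standard deviation (ddof=0) is used because the patent does not say which form applies, constant columns are given a Z-score of 0 to avoid division by zero, and the screening is done in a single pass. The function names are illustrative.

```python
import numpy as np


def threat_confidence(matrix_1):
    """Return the binary TCD matrix (one row per user in AbnormalUsers)."""
    x = np.asarray(matrix_1, dtype=float)
    col_mean = x.mean(axis=0)
    col_std = x.std(axis=0)            # assumption: population std (ddof=0)
    col_std[col_std == 0] = 1.0        # assumption: constant column -> Z = 0
    matrix_2 = (x - col_mean) / col_std        # column-wise Z-scores
    mean_value = matrix_2.mean(axis=0)         # Mean_value vector
    return (matrix_2 > mean_value).astype(int)


def filter_false_positives(user_ids, matrix_1, k):
    """Drop users whose count of 1s in TCD exceeds k; return the rest."""
    tcd = threat_confidence(matrix_1)
    return [uid for uid, row in zip(user_ids, tcd) if row.sum() <= k]


if __name__ == "__main__":
    # Three users and three dimensions for brevity; the patent uses 18.
    demo = [[1, 1, 5], [2, 2, 6], [9, 9, 1]]
    print(threat_confidence(demo))
    print(filter_false_positives(["a", "b", "c"], demo, k=1))
```

In the demo, user "c" exceeds the column mean in two dimensions, so with k=1 it is treated as a false positive and removed, while "a" and "b" remain in the reported set.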
The beneficial effects of the invention are as follows. The invention analyzes users' language data, extracts language features, and builds a numerical feature vector that characterizes each user's personality traits. It then trains a classifier to identify, among all users, those with abnormal personality traits, and further analyzes the feature-vector deviation of those users to filter out false positives, reporting the remaining users to the security administrator as potential malicious insiders for analysis and response. The invention takes into account the psychological characteristics of the attacker in an insider attack, models psychology from the perspective of personality, and builds an anomaly detection classifier on that basis. This compensates for existing detection methods, which focus only on the attack process and ignore the attacking subject, so that "abnormal" and "malicious" can be distinguished at a fine granularity, insider threats can be analyzed and detected comprehensively, and the high false-positive and false-negative rates of conventional insider threat detection methods are effectively mitigated.
Compared with the prior art, the invention has the following features:
Modeling attacker characteristics: the invention compensates for existing detection methods, which attend only to attack-process features, by modeling the characteristics of the attacker, making it possible to analyze attack motives and predict attacks. Taking work mail as an example, the invention analyzes the language features of a user's mail and, drawing on research into the statistical correlation between LIWC word categories and personality traits, builds an 18-dimensional feature vector that characterizes the user's personality; a classifier is then obtained by machine learning on these vectors.
Threat confidence TCD: judging malicious users from language-based personality modeling alone would necessarily produce many false positives. For the abnormal users detected by the classifier, the invention therefore further analyzes their deviation from the mean on the 18 personality dimensions; users with a large deviation are judged to be normal users and deleted from AbnormalUsers, which reduces the false-positive rate of the detection method.
In addition to the main advantages above, the invention also addresses shortcomings of traditional psychological assessment. Traditional methods rely mainly on psychological questionnaires completed by the user and on evaluations by colleagues or managers. These not only cost considerable time and money; more importantly, self-assessment and third-party evaluation cannot avoid subjective bias, and they may also violate laws and regulations such as those on privacy protection. The detection method of the invention is built on the internal audit system. The whole analysis runs automatically without human involvement, and the original content files are deleted automatically after the LIWC word-category analysis. Malicious insiders are thus detected while employee privacy is effectively protected. The method reduces the time and financial cost of traditional assessment, lowers legal and ethical risk, and effectively reduces the insider threat risk faced by enterprises and organizations.
Brief description of the drawings
The invention is described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of the method of the invention;
Fig. 2 is a flow chart of insider threat detection performed by the invention.
Detailed description
To explain the technical features of this solution clearly, the invention is described in detail below through specific embodiments with reference to the accompanying drawings. The following disclosure provides many different embodiments or examples for implementing different structures of the invention. To simplify the disclosure, the components and arrangements of specific examples are described below. In addition, the invention may repeat reference numerals and/or letters in different examples. This repetition is for simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or arrangements discussed. Note that the components illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components, processing techniques, and processes are omitted to avoid unnecessarily limiting the invention.
An insider attack (or insider threat) is initiated by an insider of an enterprise or organization and is a new kind of threat distinct from traditional network intrusion. Insiders sit inside the traditional network security perimeter and hold key knowledge of both the security defenses and the attack targets. They can therefore bypass existing defense mechanisms and mount attacks from within the enterprise or organization (for example stealing technology patents or customer lists), causing heavy losses. The object of the invention is to use the internal information audit system of an enterprise or government body to collect users' language data and analyze its features, build a psychological model for all users, and identify from it the potential malicious insiders, that is, the high-risk users most likely to become insider attackers. The list of high-risk users is then submitted to internal security administrators for analysis, so that they can take action to prevent or stop insider attacks.
The insider threat detection method based on user language features of the invention is characterized in that it first analyzes users' language data, extracts language features, and builds a numerical feature vector that characterizes each user's personality traits; then builds and trains a classifier to identify users with abnormal personality traits; and finally analyzes the feature-vector deviation of those users to filter out false positives, reporting the remaining users to the security administrator as potential malicious insiders for analysis and response.
Preferably, as shown in Fig. 1, the insider threat detection method based on user language features comprises the following steps:
1) Data preprocessing: the user language data from the internal audit system is processed in at least three stages: automated auditing, automated content processing, and automated aggregation;
2) Personality feature vector construction: the language data of each user is first analyzed to obtain the word frequencies of the relevant word categories, which serve as the Chinese LIWC analysis result; the LIWC word categories are then associated with the traits of the Big Five (five-factor) personality model, and the computed values of 18 Big Five sub-dimensions are used as the user's personality feature vector;
3) Classifier training: a classifier is first built and the user language data audited during an initial period is selected; the personality feature vector of each user is then computed, and a one-class support vector machine is trained on these vectors to obtain the psychological model of the initial user group; finally, in any later period, the personality feature vector modeled from each user's language data is computed and tested against the group psychological model, and the set of users judged abnormal is denoted AbnormalUsers;
4) Threat confidence calculation: a threat confidence is computed for the users in AbnormalUsers to screen them further. The calculation comprises the following steps:
41) For the users in AbnormalUsers, their 18-dimensional feature vectors are assembled into a matrix Matrix_1 whose number of rows is the number of users in AbnormalUsers and whose number of columns is 18;
42) The Z-score of every row of Matrix_1 is computed column by column to obtain Matrix_2, using the following formula:
$$Z_{ij} = \frac{X_{ij} - \bar{X}_j}{\sigma_j}$$
where, for the i-th user in Matrix_1, $X_{ij}$ is the value of the user's j-th dimension, $\bar{X}_j$ is the mean of the j-th column of the matrix, and $\sigma_j$ is the standard deviation of the j-th column;
once the Z-scores of every user in Matrix_1 have been computed, they form the new matrix Matrix_2;
43) The mean of each column of Matrix_2 is computed, giving the 18-dimensional mean vector Mean_value;
44) For each user in AbnormalUsers in turn, each of the 18 dimensions of the user's feature vector is compared with the corresponding value in Mean_value, with a '1' recorded where it exceeds that value, and the resulting 18-dimensional binary vector is taken as the user's threat confidence TCD; if the number of '1's in TCD exceeds the threshold K, the user is marked as a normal user and deleted from AbnormalUsers;
45) Steps 41) to 44) are repeated until every user in AbnormalUsers has been judged; the users remaining in AbnormalUsers are reported to the security administrator as potential malicious insiders for analysis and response.
Preferably, the user language data comprises work mail data, electronic document data, and social application data. The work mail data is the text content of work mail sent by the user; the electronic document data is the work-related text content written by the user and stored in electronic form; and the social application data is the text content crawled from the user's social status updates.
Preferably, the 18 Big Five sub-dimensions are: anxiety, anger, depression, self-consciousness, impulsiveness, vulnerability, trust, morality, altruism, cooperation, modesty, sympathy, self-efficacy, orderliness, dutifulness, achievement striving, self-discipline, and cautiousness.
The main idea of the invention is to analyze users' language data, extract language features, and build a numerical feature vector that characterizes each user's personality traits; then to train a classifier that identifies, among all users, those with abnormal personality traits; and to further analyze the feature-vector deviation of those users to filter out false positives, reporting the remaining users to the security administrator as potential malicious insiders for analysis and response. As shown in Fig. 2, the overall technical solution can be divided into four main steps: data preprocessing, personality feature vector construction, classifier training, and threat confidence calculation. Each is described in detail below.
I. Data preprocessing
The user language data to be analyzed come from the internal audit system and fall into three main classes:
1. Work email audit: the text content of the work emails sent by the audited user;
2. Electronic document content audit: the text content of work-related documents written by the user (specifications, work reports, etc.) and audited in electronic form, including working documents, spreadsheets, PPT slides and other multimedia formats;
3. Social networking application content audit: the text content crawled from the user's social status updates, such as Weibo posts and WeChat Moments.
The three classes of language data are analyzed and processed in similar ways; for ease of description, the preprocessing of each data source is described separately below.
For work email, the preprocessing consists mainly of:
11) Automated audit: collect the work email data for a certain period (several months or one year);
12) Automated content processing: analyze only the emails the user sent. For each email, strip the header information (subject, sender, recipient, etc.) and extract only the text content. For a sent email that carries multiple timestamps, consider only the most recent text (for example, for a forwarded or replied email, only the text written when replying or forwarding);
13) Automated aggregation: for each user, aggregate all the text content processed by the above steps into one large text and store it for analysis in the next step.
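Steps 12) and 13) can be sketched with Python's standard `email` package. This is only an illustrative sketch: the function names, the `(user, raw_mail)` input shape and the rule that quoted lines begin with `>` are assumptions, not details given in the text.

```python
from collections import defaultdict
from email import message_from_string


def body_text(raw_mail: str) -> str:
    """Drop the header block and return only the plain-text body."""
    msg = message_from_string(raw_mail)
    if msg.is_multipart():
        parts = [p.get_payload() for p in msg.walk()
                 if p.get_content_type() == "text/plain"]
        return "\n".join(parts)
    return msg.get_payload()


def newest_text(body: str) -> str:
    """Keep only the text above the first quoted line (assumed '>' quoting)."""
    kept = []
    for line in body.splitlines():
        if line.lstrip().startswith(">"):
            break
        kept.append(line)
    return "\n".join(kept).strip()


def aggregate(sent_mails):
    """sent_mails: iterable of (user, raw_mail); returns {user: one big text}."""
    texts = defaultdict(list)
    for user, raw in sent_mails:
        texts[user].append(newest_text(body_text(raw)))
    return {user: "\n".join(chunks) for user, chunks in texts.items()}
```

A real mail archive would also need charset decoding and a more robust way of separating new text from quoted or forwarded text.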
For electronic documents:
21) Automated audit: collect the work-related electronic document data for a certain period (several months or one year);
22) Automated content processing: remove headings at all levels, formatting data, and multimedia data such as pictures and audio from each electronic document, and extract only the plain text content;
23) Automated aggregation: for each user, aggregate all the corresponding electronic document content processed by the above steps into one large text and store it for analysis in the next step.
For social networking applications:
31) Automated audit: collect the social networking application status data (such as Weibo posts and WeChat Moments) of internal users for a certain period (several months or one year);
32) Automated content processing: remove unstructured data such as pictures, audio and hyperlinks from the status data, and process only the text written by the user, that is, exclude forwarded content;
33) Automated aggregation: for each user, aggregate all the corresponding status text processed by the above steps into one large text and store it for analysis in the next step.
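Steps 32) and 33) for social status data can be sketched as follows. The record layout (dicts with the keys `text` and `is_forward`) is a hypothetical stand-in for whatever the crawler actually produces, and only hyperlinks are stripped here because pictures and audio are assumed to be stored outside the text field.

```python
import re

# Matches http/https hyperlinks embedded in a status text.
URL_RE = re.compile(r"https?://\S+")


def clean_statuses(statuses):
    """Return one aggregated text built from the user's own status posts.

    statuses: list of dicts with hypothetical keys 'text' and 'is_forward'.
    """
    kept = []
    for status in statuses:
        if status.get("is_forward"):
            continue  # forwarded content was not written by the user
        text = URL_RE.sub("", status["text"]).strip()
        if text:  # skip posts that contained nothing but a link
            kept.append(text)
    return "\n".join(kept)
```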
II. Personality feature construction
The processing and analysis described below applies equally to all three data sources (work email, electronic documents and social networking application status), so they are no longer distinguished one by one.
The TextMind Chinese psychological analysis system of the Institute of Psychology, Chinese Academy of Sciences (http://ccpl.psych.ac.cn/textmind/)【1】 is used to analyze each user's email text and obtain the word frequencies of the relevant word categories, which serve as the Chinese LIWC【2】 analysis result. LIWC (Linguistic Inquiry and Word Count) is a widely used open analysis system for inferring subjective factors such as thought, emotion and personality from language; TextMind is a scientific extension of the original English system to the Chinese lexicon. After this step, the original content file of each user is deleted to protect privacy;
The LIWC word categories are then associated with the Big Five personality【3】 traits, and the values of the 18 Big Five sub-dimension features are calculated to form the user's personality feature vector【4】.
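The word-frequency step above can be illustrated with a toy category counter. This sketch does not reproduce TextMind or the LIWC dictionary: the four-word lexicon and the category names are invented for illustration, and the input is assumed to be already segmented into tokens.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> LIWC-style categories (not the real dictionary).
LEXICON = {
    "I": ["first_person_singular"],
    "worried": ["anxiety", "feeling"],
    "because": ["causation"],
    "the": ["article"],
}


def category_frequencies(tokens):
    """Return {category: share of all tokens that fall into that category}."""
    counts = Counter()
    for tok in tokens:
        for cat in LEXICON.get(tok, []):
            counts[cat] += 1
    total = len(tokens)
    return {cat: n / total for cat, n in counts.items()} if total else {}
```

For example, `category_frequencies(["I", "am", "worried", "because", "the", "deadline", "moved", "I"])` assigns `first_person_singular` a frequency of 0.25, since two of the eight tokens fall into that category.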
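The 18 Big Five personality sub-dimensions are: anxiety, anger, depression, self-consciousness, impulsiveness, vulnerability, trust, morality, altruism, cooperation, modesty, sympathy, self-efficacy, orderliness, dutifulness, achievement-striving, self-discipline and cautiousness.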
Taking vulnerability as an example, the following shows how each sub-dimension value is calculated for a user from the LIWC word-category analysis result. The statistical correlations between the sub-dimensions and the LIWC word categories can be obtained from 【4】. For vulnerability they are: feeling words (0.18), anxiety words (0.16), articles (-0.16), first-person singular words (0.14), reflexive pronouns (0.13), causation words (0.11), discrepancy words (0.11), cognitive process words (0.1), qualifiers (0.1) and second-person words (-0.1).
These are the 10 LIWC word categories most strongly correlated with vulnerability, and the numbers in parentheses are the correlation coefficients. From these correlations and the user's LIWC word-category analysis result, a numerical vulnerability score is obtained for the user and used as one of the 18 sub-dimension values.
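For the i-th of the 18 sub-dimensions, the statistical correlations between that sub-dimension and the LIWC word categories are obtained from study 【4】 as:
$$Feat_i \rightarrow \{(q_{i,1}, c_{i,1}), (q_{i,2}, c_{i,2}), \ldots, (q_{i,N_i-1}, c_{i,N_i-1}), (q_{i,N_i}, c_{i,N_i})\} \tag{1}$$
where $Feat_i$ denotes the i-th sub-dimension, $(q_{i,j}, c_{i,j})$ denotes a LIWC word category $q_{i,j}$ together with its statistical correlation $c_{i,j}$, and $N_i$ is the number of LIWC word categories that have a statistically significant correlation with the i-th sub-dimension. On the basis of formula (1), the calculation formula (2) is:
$$Feat_i = \sum_{j=1}^{N_i} q_{i,j} \times c_{i,j}, \quad (i = 1 \sim 18,\ j = 1 \sim N_i) \tag{2}$$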
The above formula gives, for any user, the method of calculating the 18-dimensional personality feature vector. $q_{i,j}$ and $c_{i,j}$ denote, respectively, the user's word frequency on the j-th LIWC word category associated with the i-th dimension and the corresponding statistical correlation. The remaining values are calculated in the same way from their own correlations. Each user is thus represented by an 18-dimensional feature vector of personality traits, each value being the correlation-weighted sum of the LIWC word-category analysis results.
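The weighted sum in formula (2) can be sketched as follows, using the vulnerability correlations quoted above. The category key names are labels chosen for this sketch, the word frequencies in the usage note are made-up values, and the correlation tables for the other 17 sub-dimensions would have to be taken from 【4】.

```python
# Correlations for vulnerability as quoted in the text from [4];
# the dictionary keys are illustrative labels, not official LIWC names.
VULNERABILITY = {
    "feeling": 0.18, "anxiety": 0.16, "article": -0.16,
    "first_person_singular": 0.14, "reflexive_pronoun": 0.13,
    "causation": 0.11, "discrepancy": 0.11, "cognitive_process": 0.10,
    "qualifier": 0.10, "second_person": -0.10,
}


def sub_dimension_score(freqs, correlations):
    """Formula (2): sum over categories of word frequency times correlation."""
    return sum(freqs.get(cat, 0.0) * c for cat, c in correlations.items())


def personality_vector(freqs, tables):
    """tables: one correlation dict per sub-dimension (18 in the method)."""
    return [sub_dimension_score(freqs, table) for table in tables]
```

With hypothetical frequencies `{"feeling": 0.5, "article": 0.25}`, the vulnerability score is 0.5 × 0.18 + 0.25 × (-0.16) = 0.05; categories missing from the user's result contribute zero.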
III. Classifier training
To build the classifier with a machine learning algorithm, we suggest selecting an initial period (for example, one month). From the user work emails audited in that period, the personality feature vector of each user is calculated by the feature construction process, and a one-class support vector machine (One-Class SVM, from the sklearn 0.19 algorithm library) is then trained to obtain the psychological model PsyModel of the initial user population.
For any later period (for example, a subsequent month), the personality feature vectors based on the users' work email content in that period are calculated by the same feature construction process, and the population psychological model PsyModel obtained in the previous step is used to judge whether each user is anomalous. The set of users judged anomalous is denoted AbnormalUsers.
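A minimal sketch of this training and screening step with scikit-learn's `OneClassSVM` is shown below. The text names the algorithm and library but not its settings, so the RBF kernel, `gamma="auto"` and `nu=0.1` are assumptions, as are the helper function names.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def train_psy_model(train_vectors, nu=0.1):
    """Fit a one-class SVM on the initial period's 18-dimensional vectors."""
    model = OneClassSVM(kernel="rbf", gamma="auto", nu=nu)  # assumed settings
    model.fit(np.asarray(train_vectors, dtype=float))
    return model


def abnormal_users(model, user_ids, vectors):
    """Return the ids whose vectors the model labels -1 (outlier)."""
    labels = model.predict(np.asarray(vectors, dtype=float))
    return [uid for uid, label in zip(user_ids, labels) if label == -1]
```

`predict` returns +1 for vectors the model treats as inliers and -1 for outliers, so the returned list plays the role of AbnormalUsers for the later period.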
IV. Threat confidence calculation
The user set AbnormalUsers judged anomalous by the classifier obtained in classifier training may contain some normal users, so a threat confidence must be calculated to screen the users further. Specifically:
1) For the users in AbnormalUsers, form a matrix Matrix_1 from their 18-dimensional feature vectors; the number of rows is the number of users in AbnormalUsers and the number of columns is 18;
2) Compute, column by column, the Z-score of each row of Matrix_1 to obtain Matrix_2:
$$ZScore_{i,j} = \frac{|X_{i,j} - \bar{X}_j|}{\sigma_j}, \quad j = 1 \sim 18 \tag{3}$$
where, for the i-th user in Matrix_1, $X_{i,j}$ is the value of its j-th dimension, $\bar{X}_j$ is the mean of the j-th column of the matrix, and $\sigma_j$ is the standard deviation of the j-th column. After the Z-score of every user (every row) in Matrix_1 has been calculated, the results form the new matrix Matrix_2;
3) Compute the mean of each column of Matrix_2 to obtain the 18-dimensional mean vector Mean_value;
4) For each user in AbnormalUsers, determine in turn which of its 18 feature values (as Z-scores) exceed the corresponding value in the mean vector Mean_value, and take the resulting 18-dimensional binary vector as its threat confidence (TCD). If the number of '1' entries in TCD exceeds the threshold K, mark the user as a normal user and delete the user from AbnormalUsers. The suggested value of K is 12; it can be adjusted between 12 and 16 as circumstances require. The formulas are:
$$TCD_{i,j} = \begin{cases} 1, & \text{if } Z_{i,j} > MV_j \\ 0, & \text{otherwise} \end{cases}, \quad j = 1 \sim 18 \tag{4}$$
$$TCD_i = \{1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,1\} \tag{5}$$
where $Z_{i,j}$ is the entry in row i, column j of Matrix_2, i.e. the Z-score of the j-th feature dimension of the i-th user, and $MV_j$ is the j-th value of the mean vector Mean_value. In formula (5) the number of '1' entries in the user's threat confidence is 14, which exceeds the set threshold K = 12, so the user is corrected to a normal user and removed from the AbnormalUsers set.
5) Repeat steps 1) to 4) until every user in AbnormalUsers has been judged; the users finally remaining in AbnormalUsers are reported to the security administrator as potential malicious insiders for analysis and response.
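Steps 1) to 4) can be sketched in plain Python as follows. This is a single-pass sketch under stated assumptions: the text does not say whether the standard deviation is the population or the sample form (the population form is used here), a constant column is given a Z-score of 0, and the function names are illustrative.

```python
from statistics import mean, pstdev


def threat_confidence(matrix_1):
    """Return one binary TCD vector per row of matrix_1 (formulas 3 and 4)."""
    cols = list(zip(*matrix_1))
    means = [mean(c) for c in cols]
    stds = [pstdev(c) for c in cols]
    # Formula (3): absolute Z-score; a constant column is scored 0 (assumption).
    matrix_2 = [[abs(x - m) / s if s else 0.0
                 for x, m, s in zip(row, means, stds)] for row in matrix_1]
    mean_value = [mean(c) for c in zip(*matrix_2)]
    # Formula (4): 1 where the Z-score exceeds the column mean of Z-scores.
    return [[1 if z > mv else 0 for z, mv in zip(row, mean_value)]
            for row in matrix_2]


def filter_abnormal(user_ids, matrix_1, k=12):
    """Keep users whose TCD has at most k ones; the others are relabelled normal."""
    tcds = threat_confidence(matrix_1)
    return [uid for uid, tcd in zip(user_ids, tcds) if sum(tcd) <= k]
```

Applied to the vector in formula (5), `sum` gives 14 ones, which exceeds K = 12, so that user would be dropped from AbnormalUsers, matching the worked example in the text.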
The present invention addresses the high false-positive and false-negative rates of existing insider threat detection methods based on policy checking and behavioral-data anomaly detection. It proposes a psychological-modeling detection method that, from the language data features of users (employees), constructs psychological feature vectors characterizing user personality, builds an overall population psychological model with a machine learning algorithm, and finally identifies anomalous insiders from it. On this basis, for the anomalous users identified in the previous step, the invention analyzes their overall degree of deviation across the feature dimensions so as to remove normal users that may have been falsely reported, and finally obtains the potential malicious insiders, who are submitted to the security administrator for further analysis and response. The invention takes full account of the psychological characteristics of the attacker in an insider attack, models them from the perspective of personality, and builds an anomaly detection classifier on that basis. This compensates for the shortcoming of existing detection methods, which focus only on the attack process and ignore the attacking subject, so that "anomalous" and "malicious" can be distinguished at fine granularity, insider threats can be analyzed and detected comprehensively, and the high false-positive and false-negative rates of conventional insider threat detection methods are effectively reduced.
The invention takes user language data from audited work email, audited work documents and audited social media applications (Weibo, WeChat Moments, etc.), deletes identifying information (such as email headers, document metadata and social IDs), aggregates the text data into one large file, and then uses the TextMind Chinese psychological analysis system【1】 to obtain the LIWC word-category results for the Chinese text;
Based on published research【4】 on the statistical correlation between LIWC word categories【2】 and personality traits【3】, the invention builds an 18-dimensional personality feature vector, the first dimension of which is anxiety;
For the user set AbnormalUsers judged anomalous by the classifier, the invention analyzes the column-wise Z-scores of those users, computes the column means as a reference vector, and counts, for each user, the features that exceed the corresponding mean as its threat confidence; if the count exceeds the preset threshold K, the user is judged normal and removed from AbnormalUsers;
References cited by the present invention:
【1】 TextMind Chinese psychological analysis system: http://ccpl.psych.ac.cn/textmind/
【2】 LIWC Program: http://liwc.wpengine.com/
【3】 Big Five personality model: http://www.baike.com/wiki/%E5%A4%A7%E4%BA%94%E4%BA%BA%E6%A0%BC%E7%90%86%E8%AE%BA
【4】 Association between LIWC word categories and the Big Five personality model: https://www.researchgate.net/publication/44687893Personalityin100000WordsAlarge-scaleanalysisofpersonalityandworduseamongbloggers
The above is only a preferred embodiment of the present invention. Those of ordinary skill in the art may make improvements and modifications without departing from the principles of the invention, and such improvements and modifications are also regarded as falling within the scope of protection of the invention.
Claims (10)
1. An insider threat detection method based on user language features, characterized in that the language data of users are first analyzed and language features are extracted to build a numerical feature vector that characterizes each user's personality and psychological traits; a classifier is then built and trained to identify users whose personality traits are anomalous; finally, the feature-vector deviation of those users is analyzed to filter out false positives, and the remaining users are reported to the security administrator as potential malicious insiders for analysis and response.
2. The insider threat detection method based on user language features according to claim 1, characterized in that the method comprises the following steps:
1) Data preprocessing: the user language data from the internal audit system are processed in at least three stages: automated audit, automated content processing and automated aggregation;
2) Personality feature vector construction: the user language data of each user are first analyzed, and the word frequencies obtained for the relevant word categories are taken as the Chinese LIWC analysis result; the LIWC word categories are then associated with the Big Five personality traits, and the 18 Big Five sub-dimension feature values calculated are taken as the user's personality feature vector;
3) Classifier training: a classifier is first built and an initial period is selected; from the user language data audited in that period, the personality feature vector of each user is calculated, and a one-class support vector machine is then trained to obtain the psychological model of the initial user population; finally, for any later period, the personality feature vectors based on the user language data content are calculated and the population psychological model is used to judge whether each user is anomalous, the set of users judged anomalous being denoted AbnormalUsers;
4) Threat confidence calculation: a threat confidence is calculated for the user set AbnormalUsers judged anomalous in order to screen the users further; the threat confidence calculation comprises the following specific steps:
41) For the users in the abnormal user set AbnormalUsers, form a matrix Matrix_1 from their 18-dimensional feature vectors; the number of rows is the number of users in AbnormalUsers and the number of columns is 18;
42) Compute, column by column, the Z-score of each row of Matrix_1 to obtain Matrix_2, which is calculated as follows:
$$ZScore_{i,j} = \frac{|X_{i,j} - \bar{X}_j|}{\sigma_j}, \quad j = 1 \sim 18 \tag{3}$$
where, for the i-th user in Matrix_1, $X_{i,j}$ is the value of its j-th dimension, $\bar{X}_j$ is the mean of the j-th column of the matrix, and $\sigma_j$ is the standard deviation of the j-th column;
after the Z-score of every user in Matrix_1 has been calculated, the results form the new matrix Matrix_2;
43) Compute the mean of each column of Matrix_2 to obtain the 18-dimensional mean vector Mean_value;
44) For each user in the abnormal user set AbnormalUsers, first determine in turn which values of its 18-dimensional feature vector exceed the corresponding values in the mean vector Mean_value, then take the resulting 18-dimensional binary vector as its threat confidence TCD; if the number of '1' entries in TCD exceeds the threshold K, mark the user as a normal user and delete the user from the abnormal user set AbnormalUsers;
45) Repeat steps 41) to 44) until every user in the abnormal user set AbnormalUsers has been judged; the users finally remaining in AbnormalUsers are reported to the security administrator as potential malicious insiders for analysis and response.
3. The insider threat detection method based on user language features according to claim 2, characterized in that the user language data include work email data, electronic document data and social networking application data; the work email data are the text content of the work emails sent by the user, the electronic document data are the text content of work-related documents written by the user and stored in electronic form, and the social networking application data are the text content crawled from the user's social status updates.
4. The insider threat detection method based on user language features according to claim 3, characterized in that the processing of the work email data comprises the following steps:
111) Automated audit: collect the work email data for a certain period;
112) Automated content processing: analyze only the emails the user sent; for each email, strip the email header and extract only the text content; for a sent email that carries multiple timestamps, consider only the most recently sent text;
113) Automated aggregation: aggregate the text content of each user's work email data, after automated audit and automated content processing, into one large text and store it.
5. The insider threat detection method based on user language features according to claim 3, characterized in that the processing of the electronic document data comprises the following steps:
121) Automated audit: collect the work-related electronic document data for a certain period;
122) Automated content processing: remove headings at all levels, formatting data, and picture and audio data from each electronic document, and extract only the plain text content;
123) Automated aggregation: aggregate the text content of each user's electronic document data, after automated audit and automated content processing, into one large text and store it.
6. The insider threat detection method based on user language features according to claim 3, characterized in that the processing of the social networking application data comprises the following steps:
131) Automated audit: collect the social networking application status data of internal users for a certain period;
132) Automated content processing: remove picture, audio and hyperlink data from the social networking application status data, and process only the text written by the user in each status;
133) Automated aggregation: aggregate the text content of each user's social networking application data, after automated audit and automated content processing, into one large text and store it.
7. The insider threat detection method based on user language features according to claim 4, characterized in that, in the construction of the personality feature vector, the TextMind Chinese psychological analysis system of the Institute of Psychology, Chinese Academy of Sciences is used to analyze each user's email text and obtain the word frequencies of the relevant word categories as the Chinese LIWC analysis result; the LIWC word categories are associated with the Big Five personality traits, and the 18 Big Five sub-dimension feature values are calculated as the user's personality feature vector.
8. The insider threat detection method based on user language features according to claim 2, characterized in that the 18 Big Five personality sub-dimensions are: anxiety, anger, depression, self-consciousness, impulsiveness, vulnerability, trust, morality, altruism, cooperation, modesty, sympathy, self-efficacy, orderliness, dutifulness, achievement-striving, self-discipline and cautiousness.
9. The insider threat detection method based on user language features according to claim 2, characterized in that the 18 sub-dimension feature values are calculated as follows:
For the i-th of the 18 sub-dimensions, the statistical correlations between that sub-dimension and the LIWC word categories are:
$$Feat_i \rightarrow \{(q_{i,1}, c_{i,1}), (q_{i,2}, c_{i,2}), \ldots, (q_{i,N_i-1}, c_{i,N_i-1}), (q_{i,N_i}, c_{i,N_i})\} \tag{1}$$
where $Feat_i$ denotes the i-th sub-dimension, $(q_{i,j}, c_{i,j})$ denotes a LIWC word category $q_{i,j}$ together with its statistical correlation $c_{i,j}$, and $N_i$ is the number of LIWC word categories that have a statistically significant correlation with the i-th sub-dimension;
On the basis of formula (1), the user's personality feature vector is calculated by formula (2):
$$Feat_i = \sum_{j=1}^{N_i} q_{i,j} \times c_{i,j}, \quad (i = 1 \sim 18,\ j = 1 \sim N_i) \tag{2}$$
where $Feat_i$ denotes the value of any one of the 18 dimensions of the user's personality feature vector, and $q_{i,j}$ and $c_{i,j}$ denote, respectively, the user's word frequency on the j-th LIWC word category associated with the i-th dimension and the corresponding statistical correlation.
10. The insider threat detection method based on user language features according to claim 2, characterized in that the threat confidence TCD is calculated as follows:
$$TCD_{i,j} = \begin{cases} 1, & \text{if } Z_{i,j} > MV_j \\ 0, & \text{otherwise} \end{cases}, \quad j = 1 \sim 18 \tag{4}$$
$$TCD_i = \{1,1,0,1,0,1,1,1,0,1,1,1,1,0,1,1,1,1\} \tag{5}$$
where $Z_{i,j}$ is the entry in row i, column j of Matrix_2, i.e. the Z-score of the j-th feature dimension of the i-th user, and $MV_j$ is the j-th value of the mean vector Mean_value; in formula (5) the number of '1' entries in the user's threat confidence is 14, and if this number exceeds the set threshold K, the user is corrected to a normal user and removed from the AbnormalUsers set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710374486.9A CN107196942B (en) | 2017-05-24 | 2017-05-24 | Internal threat detection method based on user language features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710374486.9A CN107196942B (en) | 2017-05-24 | 2017-05-24 | Internal threat detection method based on user language features |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107196942A true CN107196942A (en) | 2017-09-22 |
CN107196942B CN107196942B (en) | 2020-05-15 |
Family
ID=59874365
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710374486.9A Active CN107196942B (en) | 2017-05-24 | 2017-05-24 | Internal threat detection method based on user language features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107196942B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110290155A (en) * | 2019-07-23 | 2019-09-27 | 北京邮电大学 | The defence method and device of social engineering attack |
CN110837604A (en) * | 2019-10-16 | 2020-02-25 | 贝壳技术有限公司 | Data analysis method and device based on housing monitoring platform |
WO2021051530A1 (en) * | 2019-09-19 | 2021-03-25 | 平安科技(深圳)有限公司 | Method, apparatus and device for detecting abnormal mail, and storage medium |
CN115022052A (en) * | 2022-06-07 | 2022-09-06 | 山东省计算中心(国家超级计算济南中心) | User binary analysis-based internal user abnormal behavior fusion detection method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120137367A1 (en) * | 2009-11-06 | 2012-05-31 | Cataphora, Inc. | Continuous anomaly detection based on behavior modeling and heterogeneous information analysis |
WO2014205421A1 (en) * | 2013-06-21 | 2014-12-24 | Arizona Board Of Regents For The University Of Arizona | Automated detection of insider threats |
CN105005594A (en) * | 2015-06-29 | 2015-10-28 | 嘉兴慧康智能科技有限公司 | Abnormal Weibo user identification method |
CN105138570A (en) * | 2015-07-26 | 2015-12-09 | 吉林大学 | Calculation method of crime degree of speech data |
Non-Patent Citations (1)
Title |
---|
Yang Guang et al.: "Research on Insider Threat Detection", 《Research on Insider Threat Detection》 * |
Also Published As
Publication number | Publication date |
---|---|
CN107196942B (en) | 2020-05-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kuang et al. | Crime topic modeling | |
Legg et al. | Automated insider threat detection system using user and role-based profile assessment | |
Sun et al. | Detecting anomalous user behavior using an extended isolation forest algorithm: an enterprise case study | |
Ferguson | Policing predictive policing | |
Quinsaat | Competing news frames and hegemonic discourses in the construction of contemporary immigration and immigrants in the United States | |
Krahmann | Beck and beyond: Selling security in the world risk society | |
Legg et al. | Towards a conceptual model and reasoning structure for insider threat detection | |
Steblay et al. | Seventy-two tests of the sequential lineup superiority effect: A meta-analysis and policy discussion. | |
Agrafiotis et al. | Identifying attack patterns for insider threat detection | |
Shaw | The role of behavioral research and profiling in malicious cyber insider investigations | |
CN107196942A (en) | A kind of inside threat detection method based on user language feature | |
Chandra et al. | A taxonomy of cybercrime: Theory and design | |
CN112149749B (en) | Abnormal behavior detection method, device, electronic equipment and readable storage medium | |
Goode et al. | Detecting complex account fraud in the enterprise: The role of technical and non-technical controls | |
Hinduja | Computer crime investigations in the United States: Leveraging knowledge from the past to address the future | |
Jiang et al. | Prediction and detection of malicious insiders' motivation based on sentiment profile on webpages and emails | |
Cockbain et al. | Crime science | |
GB2533289A (en) | System for and method for detection of insider threats | |
Pfleeger | Reflections on the insider threat | |
Scrivens et al. | Triggered by defeat or victory? Assessing the impact of presidential election results on extreme right-wing mobilization online | |
Morris | Identity thieves and levels of sophistication: Findings from a national probability sample of American newspaper articles 1995–2005 | |
Whitty | Developing a conceptual model for insider threat | |
Thompson | Weak models for insider threat detection | |
Holt et al. | An exploratory analysis of the characteristics of ideologically motivated cyberattacks | |
Santos et al. | Intelligence analyses and the insider threat |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||