CN109063721A - Method and device for extracting behavioral feature data
- Publication number: CN109063721A
- Application number: CN201810576742.7A
- Authority: CN (China)
- Prior art keywords
- characteristic
- data
- feature
- weight
- feature set
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
Abstract
The invention discloses a method and device for extracting behavioral feature data. The method comprises: extracting feature data from acquired user behavior data and constructing a feature set from the feature data; selecting the feature data in the feature set one by one as reference feature data, dividing the remaining weighted feature data in the feature set into a neighborhood and a far domain according to a preset distance threshold, and evaluating the weight of the reference feature data by mutual information; feeding the weight back into the weight vector corresponding to the reference feature data, and deleting the feature data with the smallest weight from the feature set; and repeatedly deleting the feature data with the smallest weight until the number of feature data items in the feature set and the weight vector are stable, then outputting the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold. This solves the prior-art technical problems of difficult feature extraction and low efficiency in detecting large-scale anomalies.
Description
Technical field
The present invention relates to the technical field of big data processing, and in particular to a method and device for extracting behavioral feature data.
Background
In recent years, research on network behavior in China and abroad has generally analyzed large volumes of business data to derive mathematical models that reflect certain real properties of networks. Traditional simple feature extraction and anomaly detection based on traffic statistics suffer from high false-negative and false-positive rates and can no longer cope with increasingly complex, dynamic network environments.
The volume of business data for network behavior is large and hard to process, and most existing anomaly detection techniques based on behavior analysis are confined to a single behavior level and a single data source. Detecting anomalies from one behavior level or one data source alone tends to be one-sided and cannot give users a full understanding of the anomaly that occurred and its nature. In anomaly detection at the protocol behavior level, most techniques focus only on transport-layer and network-layer features, yet the operations of certain application protocols are hard to observe at those layers, so application-layer protocol anomaly detection requires further study.
Because the volume of business data for network behavior is large and hard to process, anomaly detection techniques are mainly applied to individual layers and do not consider all layers together. Existing anomaly detection techniques therefore suffer from difficult feature extraction and low efficiency in detecting large-scale anomalies.
Summary of the invention
The present invention provides a method, device, computer-readable storage medium and equipment for extracting behavioral feature data, to solve the prior-art technical problems of difficult feature extraction and low efficiency in detecting large-scale anomalies.
According to one aspect of the present invention, a method for extracting behavioral feature data is provided, the method comprising:
extracting feature data from acquired user behavior data, and constructing a feature set from the feature data;
selecting the feature data in the feature set one by one as reference feature data, dividing the remaining weighted feature data in the feature set into a neighborhood and a far domain according to a preset distance threshold, and evaluating the weight of the reference feature data by mutual information;
feeding the weight back into the weight vector corresponding to the reference feature data, and deleting the feature data with the smallest weight from the feature set;
repeatedly deleting the feature data with the smallest weight and, once the number of feature data items in the feature set and the weight vector are stable, outputting the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold.
Optionally, constructing the feature set from the feature data comprises:
screening the extracted feature data according to the influence of each feature data item on the overall entropy of the feature data, and constructing a candidate feature set from the screened feature data.
Optionally, the acquired user behavior data includes traffic behavior layer data, protocol behavior layer data, and user behavior layer data.
Optionally, before constructing the feature set from the feature data, the method further comprises:
applying normalization and interval alignment to the extracted feature data.
Optionally, evaluating the weight of the reference feature data by mutual information comprises:
evaluating the feature relevance and feature redundancy of the reference feature data by mutual information;
obtaining the weight of the reference feature from the feature relevance and feature redundancy.
According to a second aspect of the present invention, a device for extracting behavioral feature data is provided, the device comprising:
a feature set module, configured to extract feature data from acquired user behavior data and construct a feature set from the feature data;
a weight calculation module, configured to select the feature data in the feature set one by one as reference feature data, divide the remaining weighted feature data in the feature set into a neighborhood and a far domain according to a preset distance threshold, and evaluate the weight of the reference feature data by mutual information;
a screening module, configured to feed the weight back into the weight vector corresponding to the reference feature data and delete the feature data with the smallest weight from the feature set;
a feature data output module, configured to repeatedly delete the feature data with the smallest weight and, once the number of feature data items in the feature set and the weight vector are stable, output the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold.
Optionally, the feature set module includes:
a screening unit, configured to screen the extracted feature data according to the influence of each feature data item on the overall entropy of the feature data, and to construct a candidate feature set from the screened feature data.
Optionally, the acquired user behavior data includes traffic behavior layer data, protocol behavior layer data, and user behavior layer data.
Optionally, the feature set module further includes:
a feature data integration unit, configured to apply normalization and interval alignment to the extracted feature data.
Optionally, the weight calculation module includes:
a feature unit, configured to evaluate the feature relevance and feature redundancy of the reference feature data by mutual information;
a weight unit, configured to obtain the weight of the reference feature from the feature relevance and feature redundancy.
In the method and device for extracting behavioral feature data according to the present invention, the feature data in the feature set are selected one by one as reference feature data; the remaining weighted feature data in the feature set are divided into a neighborhood and a far domain according to a preset distance threshold; the weight of the reference feature data is evaluated by mutual information; the weight is fed back into the weight vector corresponding to the reference feature data; the feature data with the smallest weight is deleted from the feature set; and this deletion is repeated until the number of feature data items in the feature set and the weight vector are stable, at which point the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold are output. By evaluating the weights of the acquired feature data and deleting the feature data whose weights are too small until the weights of the feature data in the feature set stabilize, the invention solves the prior-art technical problems of difficult feature extraction and low efficiency in detecting large-scale anomalies, and improves the efficiency of large-scale anomaly detection.
The above is only an overview of the technical solution of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features and advantages of the embodiments are more apparent, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the embodiments of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a flowchart of a method for extracting behavioral feature data provided by the first embodiment of the present invention;
Fig. 2 is a flowchart of a method for extracting behavioral feature data provided by the second embodiment of the present invention;
Fig. 3 is a flowchart of a method for extracting behavioral feature data provided by the third embodiment of the present invention;
Fig. 4 is a flowchart of a method for extracting behavioral feature data provided by the fourth embodiment of the present invention;
Fig. 5 is a functional block diagram of a device for extracting behavioral feature data provided by the fifth embodiment of the present invention;
Fig. 6 is a functional block diagram of a device for extracting behavioral feature data provided by the sixth embodiment of the present invention;
Fig. 7 is a functional block diagram of a device for extracting behavioral feature data provided by the seventh embodiment of the present invention;
Fig. 8 is a functional block diagram of a device for extracting behavioral feature data provided by the eighth embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Referring to Fig. 1, a flowchart of a method for extracting behavioral feature data provided by the first embodiment of the present invention is shown. In this embodiment, the method includes the following steps:
Step S101: extract feature data from the acquired user behavior data, and construct a feature set from the feature data.
The acquired user behavior data includes feature data collected during normal use by users and feature data collected during abnormal user behavior, and the feature set is constructed from the collected feature data.
Optionally, the acquired user behavior data includes traffic behavior layer data, protocol behavior layer data, and user behavior layer data.
To improve the accuracy of network anomaly detection, traffic behavior layer data, protocol behavior layer data and user behavior layer data are acquired. User behavior layer data includes data such as user login times, searches for sensitive words, and web pages opened by the user. Protocol behavior layer data includes data such as IP addresses, port numbers, IP header lengths, and function codes. Traffic behavior layer data includes data such as the SYN packet ratio, IP information entropy, IP correlation, and TTL distribution. Extracting behavioral features at all three levels and detecting abnormal behavior across multiple levels coordinates the operating relationships among the behavior levels and improves the comprehensiveness and accuracy of anomaly detection.
Step S102: select the feature data in the feature set one by one as reference feature data, divide the remaining weighted feature data in the feature set into a neighborhood and a far domain according to a preset distance threshold, and evaluate the weight of the reference feature data by mutual information.
In a specific implementation, this step includes the following.
Each feature data item in the acquired feature set is weighted, for example according to its degree of influence on network behavior. For instance, a hacker's malicious attack on a network executes its operations through function codes, so the feature data corresponding to the function code is selected and given weight x1. For network communication over the IP protocol, the protocol identifier is selected as feature data with weight x2, to guard against a malicious attack changing the protocol. The source and destination IP addresses can reveal the party behind a malicious attack, so the two IP addresses are selected as feature data with weights x3 and x4. An attacker exploiting the IP protocol may generate malformed messages, and the data length can reveal this change, so the length is selected as feature data with weight x5. When an attack accesses device data, it may expose attack signatures by accessing unknown data addresses, so the feature data corresponding to the data address is selected and given weight x6. This yields the weight vector $w(t)_{t\to\infty} = \{w_1, w_2, \ldots, w_{om}\}$, where $om$ is the number of feature data items.
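The neighborhood of $x_i$, written here as $N(x_i)$, is the set of all points whose weighted distance from $x_i$ under the weight vector $w(t)$ is less than the given threshold:
$$N(x_i) = \{\, x_j \mid d(x_i, x_j \mid w(t)) < r \,\}$$
where $d(x_i, x_j \mid w(t))$ is the weighted distance under $w(t)$ and $r$ is the preset distance threshold.
The far domain of $x_i$, written here as $F(x_i)$, is the set of all points whose weighted distance from $x_i$ under the weight vector $w(t)$ is greater than the given threshold:
$$F(x_i) = \{\, x_j \mid d(x_i, x_j \mid w(t)) > r \,\}$$
Further, every object $x_j$ in the network data stream other than $x_i$ can be assigned a domain label $c_j$ that records whether $x_j$ falls in the neighborhood or the far domain of $x_i$.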
The training data stream can thus be divided into the neighborhood and the far domain of the reference object $x_i$. After the weighted feature data stream in the feature set has been divided into neighborhood and far domain, the weight of the reference feature data is evaluated using mutual information, which in turn gives the importance of the feature.
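The partition around a reference object can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not give the form of the weighted distance, so a feature-weighted Euclidean distance is assumed, and the names `weighted_distance` and `split_domains` are hypothetical.

```python
import math

def weighted_distance(x, y, w):
    # Assumed form of d(x_i, x_j | w(t)): a feature-weighted Euclidean distance.
    return math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y)))

def split_domains(samples, i, w, r):
    # Partition every sample except samples[i] around the reference x_i.
    neighborhood, far_domain = [], []
    for j, x_j in enumerate(samples):
        if j == i:
            continue
        if weighted_distance(samples[i], x_j, w) < r:
            neighborhood.append(j)
        else:
            # Points exactly at distance r are placed in the far domain here,
            # an arbitrary choice made for this sketch.
            far_domain.append(j)
    return neighborhood, far_domain

samples = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.2, 0.1)]
near, far = split_domains(samples, 0, w=[1.0, 1.0], r=0.5)
print(near, far)
```

Returning index lists rather than copies of the samples keeps the domain label of each object easy to derive: an index in `near` belongs to the neighborhood, any other index to the far domain.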
Step S103: feed the weight back into the weight vector corresponding to the reference feature data, and delete the feature data with the smallest weight from the feature set.
In a specific implementation, because the volume of feature data is large, the feature data must be screened: the weight is fed back into the weight vector corresponding to the reference feature data, and the feature data with the smallest weight is deleted from the feature set.
Step S104: repeatedly delete the feature data with the smallest weight and, once the number of feature data items in the feature set and the weight vector are stable, output the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold.
In this embodiment, the feature data in the feature set are selected one by one as reference feature data; the remaining weighted feature data in the feature set are divided into a neighborhood and a far domain according to a preset distance threshold; the weight of the reference feature data is evaluated by mutual information; the weight is fed back into the weight vector corresponding to the reference feature data; the feature data with the smallest weight is deleted from the feature set; and this deletion is repeated until the number of feature data items in the feature set and the weight vector are stable, at which point the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold are output. By evaluating the weights of the acquired feature data and deleting the feature data whose weights are too small until the weights of the feature data in the feature set stabilize, the embodiment solves the prior-art technical problems of difficult feature extraction and low efficiency in detecting large-scale anomalies, and improves the efficiency of large-scale anomaly detection.
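The delete-until-stable loop of steps S103 and S104 might look like the sketch below. It rests on several assumptions that are not in the text: the weight evaluation is passed in as a hypothetical callable `evaluate`, a feature counts as "too small" when its weight is below a hypothetical `min_weight` threshold, and "stable" means the feature count and the weights stop changing between two consecutive passes.

```python
def select_features(features, evaluate, min_weight, tol=1e-6, max_iter=100):
    # evaluate(features) -> {feature_name: weight}; hypothetical interface.
    features = list(features)
    previous = None
    for _ in range(max_iter):
        weights = evaluate(features)
        worst = min(features, key=lambda f: weights[f])
        if weights[worst] < min_weight and len(features) > 1:
            features.remove(worst)   # drop the lowest-weight feature
            previous = None          # the set changed, so it is not stable yet
            continue
        if previous is not None and all(
            abs(weights[f] - previous[f]) <= tol for f in features
        ):
            return features, weights  # count and weights are both stable
        previous = weights
    return features, evaluate(features)

# Toy weights standing in for the mutual-information evaluation.
SCORES = {"function_code": 0.9, "src_ip": 0.6, "ttl": 0.05, "padding": 0.01}

def toy_evaluate(features):
    return {f: SCORES[f] for f in features}

kept, final_weights = select_features(list(SCORES), toy_evaluate, min_weight=0.1)
print(kept)
```

In this toy run the two low-weight features are removed one per pass, after which the remaining weights repeat unchanged and the loop returns.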
Referring to Fig. 2, a flowchart of a method for extracting behavioral feature data provided by the second embodiment of the present invention is shown. In this embodiment, the method includes the following steps:
Step S201: extract feature data from the acquired user behavior data.
Step S202: screen the extracted feature data according to the influence of each feature data item on the overall entropy of the feature data, and construct a candidate feature set from the screened feature data.
In a specific implementation, the feature data in the feature set actually acquired has a very large feature dimension, and bringing all of it into the feature set for computation would consume a great deal of time and space. Therefore, before weighted feature selection, the influence of each feature on the overall entropy of the feature set is evaluated, unimportant or irrelevant attributes are removed, and the candidate feature set $S_m$ is constructed:
The similarity between any two objects in the feature set is calculated to obtain the similarity matrix $S = (S_{ij})_{n \times n}$ of the feature set, and the information entropy between any two feature data items $x_i$ and $x_j$ is defined from it.
The overall entropy $E$ of the data stream is the average information entropy of all feature data in that segment of the network data stream.
For each feature data item $s_k$ in the original feature set $S$, the increase in overall entropy after removing $s_k$ is calculated and used as the pre-evaluation function of the feature data, with $w(t) = [1, \ldots, 1]$:
$$\mathrm{preM}(s_k) = E(S - s_k) - E(S)$$
Here the value of $\mathrm{preM}(s_k)$ lies in $[-1, 1]$. If $\mathrm{preM}(s_k) > 0$, removing this feature data increases the overall entropy of the feature set, which shows that the feature benefits the clustering process. If $\mathrm{preM}(s_k) < 0$, removing the feature reduces the overall entropy of the feature set, which shows that it is irrelevant or noisy feature data that harms clustering; the closer $\mathrm{preM}(s_k)$ is to $-1$, the greater the negative effect of the feature data on clustering. All feature data in the feature set can therefore be ranked by the pre-evaluation function, irrelevant or noisy feature data rejected by setting a threshold, and the remaining feature data used to construct the candidate feature set $S_m$.
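A possible rendering of the pre-evaluation step is sketched below. The pairwise entropy and similarity formulas are not reproduced in the text, so this sketch assumes a common choice: similarity $S_{ij} = e^{-\alpha d_{ij}}$ from the Euclidean distance and pairwise entropy $-(S_{ij}\log_2 S_{ij} + (1 - S_{ij})\log_2(1 - S_{ij}))$, averaged over all pairs. The names `overall_entropy`, `pre_m` and the parameter `alpha` are hypothetical.

```python
import math

def overall_entropy(rows, alpha=1.0):
    # Mean pairwise entropy over all object pairs (assumed definition).
    total, pairs = 0.0, 0
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            s = math.exp(-alpha * math.dist(rows[i], rows[j]))  # assumed similarity
            if 0.0 < s < 1.0:
                total -= s * math.log2(s) + (1 - s) * math.log2(1 - s)
            pairs += 1
    return total / pairs

def pre_m(rows, k, alpha=1.0):
    # preM(s_k) = E(S - s_k) - E(S): entropy change after removing column k.
    reduced = [row[:k] + row[k + 1:] for row in rows]
    return overall_entropy(reduced, alpha) - overall_entropy(rows, alpha)

rows = [(0.0, 0.3), (0.1, 0.9), (2.0, 0.1), (2.1, 0.7)]
scores = [pre_m(rows, k) for k in range(2)]
# Rank features by preM, highest first, as described in the text.
ranking = sorted(range(2), key=lambda k: scores[k], reverse=True)
print(scores, ranking)
```

Because each pairwise entropy lies in [0, 1], the overall entropy does too, which is consistent with the stated range of [-1, 1] for preM.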
Step S203: select the feature data in the feature set one by one as reference feature data, divide the remaining weighted feature data in the feature set into a neighborhood and a far domain according to a preset distance threshold, and evaluate the weight of the reference feature data by mutual information.
Step S204: feed the weight back into the weight vector corresponding to the reference feature data, and delete the feature data with the smallest weight from the feature set.
Step S205: repeatedly delete the feature data with the smallest weight and, once the number of feature data items in the feature set and the weight vector are stable, output the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold.
Steps S203 to S205 have been described in detail in the first embodiment and are not repeated here.
In this embodiment, the influence of each feature data item on the overall entropy of the feature data is calculated, the extracted feature data are screened, and a candidate feature set is constructed from the screened feature data, rejecting irrelevant or noisy features. This reduces the feature dimension of the feature data in the feature set actually acquired; constructing the feature set from the screened feature data reduces the time and space spent processing unimportant feature data.
Referring to Fig. 3, a flowchart of a method for extracting behavioral feature data provided by the third embodiment of the present invention is shown. In this embodiment, the method includes the following steps:
Step S301: extract feature data from the acquired user behavior data, apply normalization and interval alignment to the extracted feature data, and construct a feature set from the processed feature data.
Optionally, the acquired user behavior data includes traffic behavior layer data, protocol behavior layer data, and user behavior layer data.
In a specific implementation, because the acquired data includes traffic behavior layer data, protocol behavior layer data and user behavior layer data, the feature data is of several types. The feature data must be normalized and interval-aligned so that features can be extracted from the acquired data. Normalization and interval alignment are described below.
Normalization: the normal behavior of users on different servers differs within the same time period, and the normal behavior of users on the same server also differs between time periods, which makes anomalies harder to identify. Consider, however, that although normal behavior differs across servers and time periods, within one server and one time period a user only needs to be compared with the normal behavior of that server in that period to decide whether the user is abnormal. Absolute values are therefore converted into relative values by a normalized feature processing method. Specifically, the average of every feature over all users of the same server is calculated for each day, giving a time series of mean feature values that serves as the normal-behavior baseline of that server. For a user u, the feature vector on date t is divided by the baseline of the user's server for that day, and this ratio is the normalized value.
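The per-server, per-day normalization can be illustrated with a short sketch. It handles a single scalar feature for brevity (a feature vector would be divided component by component), and the record layout `(server, day, user, value)` and the function name `normalize` are assumptions made for the example.

```python
from collections import defaultdict

def normalize(records):
    # records: iterable of (server, day, user, value) for one feature.
    sums = defaultdict(lambda: [0.0, 0])
    for server, day, _user, value in records:
        acc = sums[(server, day)]
        acc[0] += value
        acc[1] += 1
    normalized = {}
    for server, day, user, value in records:
        total, count = sums[(server, day)]
        baseline = total / count  # mean over all users of that server on that day
        # Guard against an all-zero baseline; returning 0.0 is a sketch choice.
        normalized[(user, day)] = value / baseline if baseline else 0.0
    return normalized

records = [
    ("s1", 1, "u1", 10.0),
    ("s1", 1, "u2", 30.0),
    ("s2", 1, "u3", 5.0),
]
result = normalize(records)
print(result)
```

A value of 1.0 means the user matches the server's average for that day; values far from 1.0 mark behavior that deviates from the server's baseline regardless of how busy the server is.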
Interval alignment: anomaly detection of user behavior includes horizontal and vertical comparison. Horizontal detection compares the user under test with the behavior of other users; vertical detection compares the user's current behavior with the user's historical behavior. Abnormal users do not generally behave abnormally all the time; the important sampling interval for their behavior is the stage in which the abnormal tendency emerges and develops, whereas normal users have no such stage. If sampling intervals were selected for normal users in the same way as for abnormal users, the sampling interval of every normal user would be the last T days of the data set. The result is that the features of normal users are extracted from the same time period, while the features of abnormal users come from scattered time periods. As noted above, time differences interfere with classification and artificially inflate the difference between normal and abnormal users.
To eliminate this interference, interval alignment is proposed. For a normal user Ua, an abnormal user Ub is selected at random from the same server; the sampling interval of Ub is its stage of abnormal tendency and development, and the sampling interval of Ua is aligned with that of Ub so that the same time window is used. Random alignment is performed several times, and the feature value finally used for comparison is the average over the repeated random alignments.
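The repeated random alignment can be sketched as follows. The data layout is an assumption: the normal user's behavior is a list of daily values, each abnormal user on the same server contributes a `(start, end)` window of day indices, and `feature` is any function computed over a window. The names `aligned_feature` and `trials` are hypothetical.

```python
import random

def aligned_feature(normal_series, abnormal_windows, feature, trials=10, rng=None):
    # Average a feature of normal user Ua over windows borrowed at random
    # from abnormal users Ub on the same server.
    rng = rng or random.Random()
    values = []
    for _ in range(trials):
        start, end = rng.choice(abnormal_windows)  # pick one abnormal user's window
        values.append(feature(normal_series[start:end]))
    return sum(values) / len(values)

def mean(window):
    return sum(window) / len(window)

daily_logins = [2.0] * 30                  # toy series for a normal user
windows = [(0, 7), (10, 17), (20, 27)]     # toy abnormal-stage windows
value = aligned_feature(daily_logins, windows, mean, trials=10, rng=random.Random(0))
print(value)
```

Passing a seeded `random.Random` keeps the sketch reproducible; averaging over several draws reduces the dependence of the normal user's feature on any single abnormal user's window.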
Step S302: select the feature data in the feature set one by one as reference feature data, divide the remaining weighted feature data in the feature set into a neighborhood and a far domain according to a preset distance threshold, and evaluate the weight of the reference feature data by mutual information.
Step S303: feed the weight back into the weight vector corresponding to the reference feature data, and delete the feature data with the smallest weight from the feature set.
Step S304: repeatedly delete the feature data with the smallest weight and, once the number of feature data items in the feature set and the weight vector are stable, output the feature set corresponding to the stabilized feature data and the weight vector corresponding to the preset threshold.
Steps S302 to S304 have been described in detail in the first embodiment and are not repeated here.
In this embodiment, the extracted feature data are normalized and interval-aligned, the weight vector corresponding to the input user behavior data is processed, and the influence of interfering differences is eliminated. This addresses the problems that user behavior data sets are sparse, that intelligent detection algorithms perform poorly, and that users with abnormal behavior cannot be correctly picked out from a large number of users. It further solves the prior-art technical problems of difficult feature extraction and low efficiency in detecting large-scale anomalies, and improves the efficiency of large-scale anomaly detection.
Referring to Fig. 4, a flowchart of a method for extracting behavioral feature data provided by the fourth embodiment of the present invention is shown. In this embodiment, the method includes the following steps:
Step S401: extract feature data from the acquired user behavior data, apply normalization and interval alignment to the extracted feature data, and construct a feature set from the processed feature data.
Optionally, the acquired user behavior data includes traffic behavior layer data, protocol behavior layer data, and user behavior layer data.
Step S402: select the feature data in the feature set one by one as reference feature data, divide the remaining weighted feature data in the feature set into a neighborhood and a far domain according to a preset distance threshold, and evaluate the feature relevance and feature redundancy of the reference feature data by mutual information; obtain the weight of the reference feature from the feature relevance and feature redundancy.
In a specific implementation, mutual information is an information-theoretic measure of the correlation between two sets of events, defined as $I(X, Y) = H(X) + H(Y) - H(X, Y)$, where $H(X, Y)$ is the joint entropy, defined as:
$$H(X, Y) = -\sum_{x}\sum_{y} p(x, y)\log p(x, y)$$
from which it follows that:
$$I(X, Y) = \sum_{x}\sum_{y} p(x, y)\log\frac{p(x, y)}{p(x)\,p(y)}$$
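The identity $I(X, Y) = H(X) + H(Y) - H(X, Y)$ can be checked on discrete samples with a few lines of code. The sketch estimates probabilities by relative frequency and uses base-2 logarithms; both are choices made for the example, as are the function names.

```python
import math
from collections import Counter

def entropy(values):
    # Shannon entropy (in bits) of the empirical distribution of `values`.
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    # I(X, Y) = H(X) + H(Y) - H(X, Y), with H(X, Y) taken over paired samples.
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

xs = [0, 0, 1, 1]
same = [0, 0, 1, 1]         # fully determined by xs
independent = [0, 1, 0, 1]  # carries no information about xs

print(mutual_information(xs, same), mutual_information(xs, independent))
```

A feature identical to the label shares all of its one bit of entropy with it, while an independent feature shares none, which is the behavior the relevance measure relies on.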
Based on the concept of mutual information, the feature relevance and feature redundancy of any feature data item $s_k$ are defined as follows.
Feature relevance expresses the association between the feature data $s_k$ and the domain label $c$. The greater the relevance, the better the feature. It can be expressed by the mutual information between the two:
$$D(s_k, c) = I(s_k, c)$$
where $c$ is the domain label defined above.
Feature redundancy expresses the overlap between the feature data $s_k$ and the other features in the feature set $S_m$. The smaller the redundancy, the better the feature data. It can be expressed by the average mutual information between $s_k$ and the other features in $S_m$:
$$R(s_k) = \frac{1}{|S_m| - 1}\sum_{s_u \in S_m,\ s_u \neq s_k} I(s_k, s_u)$$
where $s_u$ is any feature data item in $S_m$ other than $s_k$.
In theory, a good feature must satisfy two conditions: high feature correlation and low feature redundancy. Accordingly, the feature evaluation factor is given by:
$\rho_j = D(s_j, c) - \lambda R(s_j)$
where $\lambda$ is a balance parameter that controls the relative strength of feature correlation and feature redundancy. When $\lambda$ is 0, the influence of redundancy vanishes and the evaluation concentrates entirely on correlation; when $\lambda$ is 1, redundancy has its greatest influence on the evaluation. Different feature sets under different clustering requirements place different demands on redundancy and correlation: if more weight is placed on redundancy, clustering efficiency is relatively better; if more weight is placed on correlation, clustering accuracy tends to be better. Introducing $\lambda$ helps balance and control the final clustering effect and efficiency.
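The evaluation factor described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the base-2 logarithm, the default value of `lam`, the estimation of probabilities from sample counts, and the averaging of redundancy over the other features are all hypothetical choices, not details given in the embodiment.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X, Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def evaluation_factor(features, labels, k, lam=0.5):
    """rho_k = D(s_k, c) - lam * R(s_k) for the feature column at index k.

    features: list of feature columns (each a list of discrete values)
    labels:   domain label of each sample
    lam:      balance parameter lambda (assumed default, not from the source)
    """
    relevance = mutual_information(features[k], labels)
    others = [f for i, f in enumerate(features) if i != k]
    # Redundancy: average mutual information with the other features (assumed form).
    redundancy = (
        sum(mutual_information(features[k], f) for f in others) / len(others)
        if others
        else 0.0
    )
    return relevance - lam * redundancy
```

With `lam=0` the score reduces to the correlation term alone, which matches the behaviour described for λ = 0.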
In the t-th iteration, a feature evaluation set is obtained, where t is the iteration number and $w(t)$ is the feature weight vector used in the current iteration. Because every feature is assumed by default to influence the data partition equally in the initial state, $w(1) = [1, \ldots, 1, \ldots, 1]$. Each element of the evaluation set is the evaluation value of the j-th feature in the t-th iteration. Updating the feature weight vector yields $w(t+1)$; the update function is as follows:
where $T_m$ is the preset maximum number of iterations, and the update strength decays over time to guarantee the convergence speed. When $w(t+1)$ stabilizes or the iteration count reaches the maximum $T_m$, feature selection stops and returns the result.
It should be noted that each iteration randomly selects a feature data object $x_i$, and the randomness of this selection causes the effect of the neighborhood analysis to vary. In general, the more data objects there are in the neighborhood of the selected $x_i$ and the more compactly they are distributed, the more successful the selection and the greater the reference value of the neighborhood analysis. Conversely, the fewer the objects and the sparser their distribution, the smaller the reference value of the neighborhood analysis; it may even steer feature selection in the wrong direction and reduce efficiency. It is therefore necessary to evaluate the reference value of the neighborhood analysis in each iteration.
Accordingly, the neighborhood analysis reference factor $p_l(x_i)$ of the t-th iteration is defined as follows:
where $m$ is the number of data objects in the neighborhood of $x_i$, and $\delta$ is a distance normalization parameter used to standardize neighborhood distances; $\delta$ differs between data sets. The larger $m$ is and the smaller the average distance between the neighborhood objects and $x_i$, the closer $p_l(x_i)$ is to 1; otherwise it approaches 0.
The reference factor $p_l(x_i)$ is used as the update probability of the feature weight vector in the t-th iteration, so that the update probability is large when the selection of the object $x_i$ is relatively successful and small otherwise. This improves the convergence speed of the iterative algorithm. After the feature weight vector $w(t)$ has stabilized, the features in $S_m$ with low evaluation values are removed, and $S_m$ and the corresponding feature weight vector $w_{opt}$ are output.
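Using the reference factor as an update probability can be sketched as below. The weight update function and the formula for the reference factor are not reproduced in this text, so the sketch takes the proposed weight vector and the probability `p` as inputs; the function names and the convergence tolerance `tol` are hypothetical.

```python
import random


def maybe_update(weights, proposed, p, rng):
    """Accept the proposed weight vector with probability p, else keep the old one."""
    return list(proposed) if rng.random() < p else list(weights)


def has_stabilized(old, new, tol=1e-6):
    """Treat the weight vector as stable when no component moves by more than tol."""
    return all(abs(a - b) <= tol for a, b in zip(old, new))
```

A caller would compute `p` from the neighborhood of the randomly chosen object, call `maybe_update`, and stop once `has_stabilized` holds or the maximum iteration count is reached.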
Step S403: feed the weight back into the weight vector corresponding to the reference feature data, and delete the feature data with the smallest weight from the feature set.
Step S404: delete the feature data with the smallest weight step by step; when the number of feature data in the feature set and the weight vector have stabilized, output the feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold.
Steps S401, S403 and S404 are described in detail in the first embodiment and are not repeated here.
In this embodiment, the feature correlation and feature redundancy of the reference feature data are assessed by mutual information, and the weight of the reference feature is obtained from them. This improves the convergence speed of the iterative algorithm, so that the weight vector stabilizes faster. The feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold are then output. This solves the technical problem in the prior art that feature extraction for anomaly detection is difficult and that large-scale anomalies are detected inefficiently, and it improves the efficiency of large-scale anomaly detection.
Referring to Fig. 5, a functional module diagram of a device 100 for extracting behavioral feature data according to the fifth embodiment of the present invention is shown. The device 100 is applied to computer equipment and includes a feature set module 110, a weight computing module 120, a screening module 130 and a feature data output module 140. The device mainly addresses the technical problem in the prior art that feature extraction for anomaly detection is difficult and that large-scale anomalies are detected inefficiently.
The computer equipment includes, but is not limited to, mobile phones, smartphones, tablet computers, personal computers, personal digital assistants, media players, servers and other electronic devices.
The feature set module 110 is configured to extract feature data from the acquired user behavior data and to construct a feature set from the feature data.
The acquired user behavior data include feature data collected during normal use by the user and feature data collected during abnormal user behavior; the feature set is constructed from the collected feature data.
Optionally, the acquired user behavior data include traffic behavior layer data, protocol behavior layer data and user behavior layer data.
It will be appreciated that traffic behavior layer data, protocol behavior layer data and user behavior layer data are acquired in order to improve the accuracy of network anomaly detection. User behavior layer data include the user's login time, searches for sensitive words, web pages opened by the user, and similar data. Protocol behavior layer data include IP address, port number, IP header length, function code, and similar data. Traffic behavior layer data include SYN packet ratio, IP information entropy, IP correlation, TTL distribution, and similar data. By extracting the behavioral characteristics of these three layers with the relevant techniques and detecting abnormal behavior across multiple layers, the operating relationships between the behavior layers are coordinated, which improves the comprehensiveness and accuracy of anomaly detection.
The weight computing module 120 is configured to select the feature data in the feature set one by one as reference feature data, to weight the feature data in the feature set other than the reference feature data and divide them into a neighborhood and a remote domain according to a preset distance threshold, and to assess the weight of the reference feature data by mutual information.
In a specific implementation, selecting the feature data in the feature set one by one as reference feature data, and weighting the feature data other than the reference feature data and dividing them into a neighborhood and a remote domain according to the preset threshold, specifically includes the following.
Each feature data in the acquired feature set is weighted, for example according to the degree to which each feature data in the feature set influences network behavior. For example, a hacker's malicious attack on the network executes the corresponding operations through function codes, so the feature data corresponding to the function code is selected with weight x1. For network communication over the IP protocol, the protocol identifier is selected as the corresponding feature data with weight x2, to guard against a malicious attack changing the protocol. The source and destination IP addresses can reveal the party behind a malicious attack, so the two IP addresses are selected as the corresponding feature data with weights x3 and x4. An attacker exploiting the IP protocol may generate malformed messages, and the data length can reveal this change, so the length is selected as a feature of the data with weight x5. An attack that accesses device data may reveal an attack signature by accessing unknown data addresses, so the feature data corresponding to the data address is selected with weight x6. This yields the weight vector $w(t)_{t\to\infty} = \{w_1, w_2, \ldots, w_{om}\}$, where $om$ is the number of feature data.
The neighborhood of $x_i$ is the set of all points whose weighted distance from $x_i$ under the weight vector $w(t)$ is less than the given threshold:
$$\{x_j \mid d(x_i, x_j \mid w(t)) < r\}$$
where $d(x_i, x_j \mid w(t))$ is the weighted distance under $w(t)$ and $r$ is the preset distance threshold.
The remote domain of $x_i$ is the set of all points whose weighted distance from $x_i$ under the weight vector $w(t)$ is greater than the given threshold:
$$\{x_j \mid d(x_i, x_j \mid w(t)) > r\}$$
Further, the domain label $c_j$ of any object $x_j$ in the network data stream other than $x_i$ can be obtained as follows:
The training data stream can thus be divided into the neighborhood and the remote domain relative to the object $x_i$. After the feature data in the feature set are weighted and divided into the neighborhood and the remote domain, the weight of the reference feature data is assessed using mutual information, and the importance of each feature is then calculated.
The screening module 130 is configured to feed the weight back into the weight vector corresponding to the reference feature data and to delete the feature data with the smallest weight from the feature set.
In a specific implementation, because the volume of feature data is large, the feature data must be screened: the weight is fed back into the weight vector corresponding to the reference feature data, and the feature data with the smallest weight is deleted from the feature set.
The feature data output module 140 is configured to delete the feature data with the smallest weight step by step and, when the number of feature data in the feature set and the weight vector have stabilized, to output the feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold.
In this embodiment, the feature data in the feature set are selected one by one as reference feature data; according to a preset distance threshold, the feature data other than the reference feature data are weighted and divided into a neighborhood and a remote domain; the weight of the reference feature data is assessed by mutual information; the weight is fed back into the weight vector corresponding to the reference feature data, and the feature data with the smallest weight is deleted from the feature set; the feature data with the smallest weight are deleted step by step; and, when the number of feature data in the feature set and the weight vector have stabilized, the feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold are output. The weights of the acquired feature data are assessed and feature data whose weights are too small are deleted until the weights of the feature data in the feature set stabilize, after which the stabilized feature set and weight vector are output. This solves the technical problem in the prior art that feature extraction for anomaly detection is difficult and that large-scale anomalies are detected inefficiently, and it improves the efficiency of large-scale anomaly detection.
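The step-by-step deletion performed by the screening and output modules can be sketched as a simple loop. This is a hedged illustration: `weigh` is a hypothetical callable standing in for the mutual-information weight assessment, and stopping when the smallest weight reaches `threshold` is one possible reading of the preset threshold, not a rule stated in the text.

```python
def prune_features(feature_names, weigh, threshold):
    """Repeatedly drop the smallest-weight feature until none falls below threshold.

    weigh: hypothetical callable mapping a list of feature names to a
           dict {name: weight}; it is re-run after every deletion.
    Returns the surviving feature names and their weights, in order.
    """
    remaining = list(feature_names)
    weights = weigh(remaining)
    while len(remaining) > 1:
        worst = min(remaining, key=weights.get)
        if weights[worst] >= threshold:
            break  # feature count and weights are treated as stable here
        remaining.remove(worst)
        weights = weigh(remaining)
    return remaining, [weights[name] for name in remaining]
```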
Referring to Fig. 6, a functional module diagram of the device 100 for extracting behavioral feature data according to the sixth embodiment of the present invention is shown. The device is applied to computer equipment, which includes, but is not limited to, mobile phones, smartphones, tablet computers, personal computers, personal digital assistants, media players, servers and other electronic devices. The device 100 includes a feature set module 110, a weight computing module 120, a screening module 130 and a feature data output module 140.
On the basis of the fifth embodiment, the feature set module 110 includes:
A screening unit 111, configured to screen the extracted feature data according to the degree to which each feature data influences the overall entropy of the feature data, and to construct a candidate feature set from the feature data that remain after screening.
In a specific implementation, the feature dimension of the feature set actually acquired is very large, and performing computation on a feature set containing all of the feature data would consume a great deal of time and space. Therefore, before weighted feature selection, the degree to which each feature influences the overall entropy of the feature set is assessed, unimportant or irrelevant attributes are removed, and the candidate feature set $S_m$ is constructed as follows.
The similarity between any two objects in the feature set is calculated to obtain the similarity matrix of the feature set, $S = (S_{ij})_{n \times n}$, and the information entropy between any two feature data $x_i$, $x_j$ is defined.
The overall entropy of the data stream is the average information entropy of all feature data in this segment of the network data stream, calculated as:
For each feature data $s_k$ in the original feature set $S$, the increase in overall entropy after removing $s_k$ is calculated using the formula above and serves as the pre-evaluation function of the feature data:
$preM(s_k) = E(S - s_k) - E(S)$, with $w(t) = [1, \ldots, 1]$
where the value range of $preM(s_k)$ is $[-1, 1]$. If $preM(s_k) > 0$, removing the feature data increases the overall entropy of the feature set, which shows that the feature data benefits the clustering process. If $preM(s_k) < 0$, removing the feature data decreases the overall entropy of the feature set, which shows that it is irrelevant or noisy feature data that harms clustering; the closer $preM(s_k)$ is to $-1$, the greater the negative effect of the feature data on clustering. All feature data in the feature set can therefore be ranked by the pre-evaluation function, irrelevant or noisy feature data can be rejected by setting a threshold, and the remaining feature data are used to construct the candidate feature set $S_m$.
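The pre-evaluation step can be sketched in Python as follows. The entropy formula is not reproduced in this text, so the sketch assumes a common similarity-based form: similarity $S_{ij} = e^{-\alpha d_{ij}}$ with Euclidean distance $d_{ij}$, and a binary entropy per object pair averaged over all pairs. The function names and the parameter `alpha` are hypothetical.

```python
import math


def overall_entropy(rows, alpha=0.5):
    """Average pairwise entropy of a data set (assumed similarity-based form).

    rows: list of objects, each a tuple or list of numeric feature values.
    """
    n = len(rows)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            s = math.exp(-alpha * math.dist(rows[i], rows[j]))  # similarity in (0, 1]
            s = min(max(s, 1e-12), 1 - 1e-12)                   # keep log2 finite
            total -= s * math.log2(s) + (1 - s) * math.log2(1 - s)
            pairs += 1
    return total / pairs if pairs else 0.0


def pre_evaluate(rows, k, alpha=0.5):
    """preM(s_k) = E(S - s_k) - E(S): entropy change when feature k is removed."""
    reduced = [row[:k] + row[k + 1:] for row in rows]
    return overall_entropy(reduced, alpha) - overall_entropy(rows, alpha)
```

Because each pairwise term is a binary entropy in bits, the average lies in [0, 1] and the difference lies in [-1, 1], which is consistent with the value range stated for the pre-evaluation function.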
Referring to Fig. 7, a functional module diagram of the device 100 for extracting behavioral feature data according to the seventh embodiment of the present invention is shown. The device is applied to computer equipment, which includes, but is not limited to, mobile phones, smartphones, tablet computers, personal computers, personal digital assistants, media players, servers and other electronic devices. The device 100 includes a feature set module 110, a weight computing module 120, a screening module 130 and a feature data output module 140.
On the basis of the sixth embodiment, the feature set module 110 further includes:
A feature data integration unit 112, configured to perform normalization and interval alignment on the extracted feature data.
Optionally, the acquired user behavior data include traffic behavior layer data, protocol behavior layer data and user behavior layer data.
In a specific implementation, because the acquired data include traffic behavior layer data, protocol behavior layer data and user behavior layer data, the feature data are of several types. The feature data must be normalized and interval-aligned so that feature data can be extracted from the acquired data. Normalization and interval alignment are described below.
Normalization: the normal behavior of users on different servers differs within the same time period, and the normal behavior of users on the same server also differs between time periods, which makes it difficult to identify anomalies. Consider, however, that although normal behavior differs across servers and time periods, whether a user is abnormal on a given server in a given period only needs to be judged against the normal behavior of that server in that period. Absolute values are therefore converted to relative values, and a normalization method for features is proposed. Specifically, the average of all users' features is calculated for each day on the same server, yielding an average-feature time series that serves as the normal-behavior baseline of the server. For a user u, the feature vector for date t is divided by the baseline of that user's server on that day; this ratio is the normalized value.
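The normalization described above can be sketched as follows. The data layout (a dictionary from user to a per-day list of feature vectors for one server), the function name, and the handling of a zero baseline are assumptions made for this example.

```python
def normalize_by_server_baseline(features_by_user):
    """Divide each user's daily feature vector by the server's daily average.

    features_by_user: dict {user: [[feature values for day 0], [day 1], ...]}
                      for users of one server (assumed layout).
    Returns (baseline, normalized), where baseline[t][k] is the mean of
    feature k over all users on day t.
    """
    users = list(features_by_user)
    n_days = len(features_by_user[users[0]])
    n_feat = len(features_by_user[users[0]][0])
    baseline = [
        [sum(features_by_user[u][t][k] for u in users) / len(users)
         for k in range(n_feat)]
        for t in range(n_days)
    ]
    normalized = {
        u: [
            # A zero baseline maps to 0.0 to avoid division by zero (assumption).
            [v / b if b else 0.0 for v, b in zip(features_by_user[u][t], baseline[t])]
            for t in range(n_days)
        ]
        for u in users
    }
    return baseline, normalized
```

A normalized value near 1 means the user behaves like the server average on that day; values far from 1 indicate deviation from the server's baseline.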
Interval alignment: anomaly detection of user behavior includes horizontal and vertical comparison. Horizontal detection compares the user under test with the behavior of other users; vertical detection compares the user's current behavior with the user's historical behavior. An abnormal user generally does not exhibit abnormal behavior all the time, and the important sampling interval for such a user's behavior covers the stage in which the anomaly begins to appear and the stage in which it develops, whereas normal users have no such stage. If the sampling interval for normal users were chosen by the same method as for abnormal users, the sampling interval of every normal user would be the last T days of the data set. This would produce the following situation: the features of normal users are extracted from the same time period, while the features of abnormal users come from scattered time periods. As noted above, time differences interfere with classification and artificially increase the difference between normal and abnormal users.
To eliminate this interference, interval alignment is proposed: for a normal user Ua, an abnormal user Ub is selected at random on the same server. The sampling interval of Ub is its stage of emerging and developing anomaly, and the sampling interval of Ua is aligned with that of Ub so that the same time window is chosen. The random alignment is repeated several times, and the feature value finally used for comparison is the average over the repeated random alignments.
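Interval alignment can be sketched as repeated random sampling of an abnormal user's window. The representation of a window as a `(start, end)` pair of day indices, the `extract` callable, the number of repeats and the fixed seed are all hypothetical choices for this illustration.

```python
import random


def aligned_feature(normal_series, abnormal_windows, extract, repeats=10, seed=0):
    """Average a feature of a normal user over randomly aligned sampling windows.

    normal_series:    per-day values for the normal user Ua
    abnormal_windows: (start, end) day-index windows of abnormal users Ub
                      on the same server (end exclusive)
    extract:          hypothetical callable turning a window of values into a feature
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(repeats):
        start, end = rng.choice(abnormal_windows)   # pick an abnormal user's window
        samples.append(extract(normal_series[start:end]))
    return sum(samples) / len(samples)
```

For example, `aligned_feature(series, windows, lambda xs: sum(xs) / len(xs))` returns the mean daily value of the normal user over windows aligned with randomly chosen abnormal users.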
Referring to Fig. 8, a functional module diagram of the device 100 for extracting behavioral feature data according to the eighth embodiment of the present invention is shown. The device is applied to computer equipment, which includes, but is not limited to, mobile phones, smartphones, tablet computers, personal computers, personal digital assistants, media players, servers and other electronic devices. The device 100 includes a feature set module 110, a weight computing module 120, a screening module 130 and a feature data output module 140.
On the basis of the fifth embodiment, the weight computing module 120 includes:
A feature unit 121, configured to assess the feature correlation and feature redundancy of the reference feature data by mutual information;
A weight unit 122, configured to obtain the weight of the reference feature from the feature correlation and the feature redundancy.
In a specific implementation, mutual information is an information-theoretic measure of the correlation between two sets of events. It is defined as $I(X, Y) = H(X) + H(Y) - H(X, Y)$, where $H(X, Y)$ is the joint entropy, defined as:
$$H(X, Y) = -\sum_{x}\sum_{y} p(x, y)\log p(x, y)$$
from which it follows that:
$$I(X, Y) = \sum_{x}\sum_{y} p(x, y)\log\frac{p(x, y)}{p(x)\,p(y)}$$
Based on the concept of mutual information, the feature correlation and feature redundancy of any feature data $s_k$ are defined as follows.
Feature correlation expresses the association between the feature data $s_k$ and the domain label $c$. The larger the feature correlation, the better the feature. It can be expressed by the mutual information between the two, calculated as:
$D(s_k, c) = I(s_k, c)$
where $c$ is the domain label given by the domain-label formula.
Feature redundancy expresses the overlap between the feature data $s_k$ and the other features in the feature set $S_m$. The smaller the redundancy, the better the feature data. It can be expressed by the average mutual information between $s_k$ and the other features in $S_m$, calculated as:
$$R(s_k) = \frac{1}{|S_m| - 1}\sum_{s_u \in S_m,\ s_u \neq s_k} I(s_k, s_u)$$
where $s_u$ is any feature data in the feature set $S_m$ other than $s_k$.
In theory, a good feature must satisfy two conditions: high feature correlation and low feature redundancy. Accordingly, the feature evaluation factor is given by:
$\rho_j = D(s_j, c) - \lambda R(s_j)$
where $\lambda$ is a balance parameter that controls the relative strength of feature correlation and feature redundancy. When $\lambda$ is 0, the influence of redundancy vanishes and the evaluation concentrates entirely on correlation; when $\lambda$ is 1, redundancy has its greatest influence on the evaluation. Different feature sets under different clustering requirements place different demands on redundancy and correlation: if more weight is placed on redundancy, clustering efficiency is relatively better; if more weight is placed on correlation, clustering accuracy tends to be better. Introducing $\lambda$ helps balance and control the final clustering effect and efficiency.
In the t-th iteration, a feature evaluation set is obtained, where t is the iteration number and $w(t)$ is the feature weight vector used in the current iteration. Because every feature is assumed by default to influence the data partition equally in the initial state, $w(1) = [1, \ldots, 1, \ldots, 1]$. Each element of the evaluation set is the evaluation value of the j-th feature in the t-th iteration. Updating the feature weight vector yields $w(t+1)$; the update function is as follows:
where $T_m$ is the preset maximum number of iterations, and the update strength decays over time to guarantee the convergence speed. When $w(t+1)$ stabilizes or the iteration count reaches the maximum $T_m$, feature selection stops and returns the result.
It should be noted that each iteration randomly selects a feature data object $x_i$, and the randomness of this selection causes the effect of the neighborhood analysis to vary. In general, the more data objects there are in the neighborhood of the selected $x_i$ and the more compactly they are distributed, the more successful the selection and the greater the reference value of the neighborhood analysis. Conversely, the fewer the objects and the sparser their distribution, the smaller the reference value of the neighborhood analysis; it may even steer feature selection in the wrong direction and reduce efficiency. It is therefore necessary to evaluate the reference value of the neighborhood analysis in each iteration.
Accordingly, the neighborhood analysis reference factor $p_l(x_i)$ of the t-th iteration is defined as follows:
where $m$ is the number of data objects in the neighborhood of $x_i$, and $\delta$ is a distance normalization parameter used to standardize neighborhood distances; $\delta$ differs between data sets. The larger $m$ is and the smaller the average distance between the neighborhood objects and $x_i$, the closer $p_l(x_i)$ is to 1; otherwise it approaches 0.
The reference factor $p_l(x_i)$ is used as the update probability of the feature weight vector in the t-th iteration, so that the update probability is large when the selection of the object $x_i$ is relatively successful and small otherwise. This improves the convergence speed of the iterative algorithm. After the feature weight vector $w(t)$ has stabilized, the features in $S_m$ with low evaluation values are removed, and $S_m$ and the corresponding feature weight vector $w_{opt}$ are output.
An embodiment of the present invention further provides computer equipment, comprising a processor, a memory and a communication bus, where the communication bus implements the connection and communication between the processor and the memory.
The processor is configured to execute a behavioral feature data extraction program stored in the memory, so as to implement the following steps:
Step S101: extract feature data from the acquired user behavior data, and construct a feature set from the feature data.
Step S102: select the feature data in the feature set one by one as reference feature data; according to a preset distance threshold, weight the feature data in the feature set other than the reference feature data and divide them into a neighborhood and a remote domain; and assess the weight of the reference feature data by mutual information.
Step S103: feed the weight back into the weight vector corresponding to the reference feature data, and delete the feature data with the smallest weight from the feature set.
Step S104: delete the feature data with the smallest weight step by step; when the number of feature data in the feature set and the weight vector have stabilized, output the feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold.
Optionally, the executed steps may be replaced by steps S201 to S205, steps S301 to S304, or steps S401 to S404.
Because the implementation of the behavioral feature data extraction method is described in detail in the first to fourth embodiments, it is not repeated in this embodiment.
In this embodiment, the computer equipment includes, but is not limited to, mobile phones, smartphones, tablet computers, personal computers, personal digital assistants, media players, servers and other electronic devices.
An embodiment of the present invention further provides a computer-readable storage medium that stores a behavioral feature data extraction program. When the program is executed by at least one processor, it causes the at least one processor to perform the following steps:
Step S101: extract feature data from the acquired user behavior data, and construct a feature set from the feature data.
Step S102: select the feature data in the feature set one by one as reference feature data; according to a preset distance threshold, weight the feature data in the feature set other than the reference feature data and divide them into a neighborhood and a remote domain; and assess the weight of the reference feature data by mutual information.
Step S103: feed the weight back into the weight vector corresponding to the reference feature data, and delete the feature data with the smallest weight from the feature set.
Step S104: delete the feature data with the smallest weight step by step; when the number of feature data in the feature set and the weight vector have stabilized, output the feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold.
Optionally, the executed steps may be replaced by steps S201 to S205, steps S301 to S304, or steps S401 to S404.
Because the implementation of the behavioral feature data extraction method is described in detail in the first to fourth embodiments, it is not repeated in this embodiment.
In this embodiment, the computer-readable storage medium includes, but is not limited to, ROM, RAM, magnetic disks and optical discs.
In summary, the embodiments of the present invention disclose a method and a device for extracting behavioral feature data. The feature data in the feature set are selected one by one as reference feature data; according to a preset distance threshold, the feature data in the feature set other than the reference feature data are weighted and divided into a neighborhood and a remote domain; the weight of the reference feature data is assessed by mutual information; the weight is fed back into the weight vector corresponding to the reference feature data, and the feature data with the smallest weight is deleted from the feature set; the feature data with the smallest weight are deleted step by step; and, when the number of feature data in the feature set and the weight vector have stabilized, the feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold are output. The weights of the acquired feature data are assessed and feature data whose weights are too small are deleted until the weights of the feature data in the feature set stabilize, after which the stabilized feature set and weight vector are output. This solves the technical problem in the prior art that feature extraction for anomaly detection is difficult and that large-scale anomalies are detected inefficiently, and it improves the efficiency of large-scale anomaly detection.
In the embodiments provided herein, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions and operation of devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment or a portion of code, and the module, program segment or portion of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the boxes may occur in an order different from that noted in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form a single independent part, each module may exist separately, or two or more modules may be integrated to form a single independent part.
In short, the foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. a kind of method that behavioural characteristic data are extracted, which is characterized in that the described method includes:
Characteristic is extracted in the user behavior data got, according to the characteristic construction feature collection;
Select a characteristic for benchmark characteristic one by one in the feature set, according to pre-determined distance threshold value by the feature
The characteristic weighting in addition to reference characteristic data is concentrated to be divided into neighborhood and remote domain, it is special to assess the benchmark by mutual information
Levy the weight of data;
By the weight feedback into the corresponding weight vector of the reference characteristic data, and weight is deleted in the feature set
The smallest characteristic;
The smallest characteristic of weight is gradually deleted, it is steady in the number of the feature set character pair data and the weight vector
Periodically, the corresponding feature set of characteristic and the preset corresponding weight vector of threshold after output is stablized.
2. the method as described in claim 1, which is characterized in that according to the characteristic construction feature collection, comprising:
According to each characteristic to the disturbance degree of characteristic population entropy, the characteristic extracted is sieved
Choosing constructs candidate characteristic set by the characteristic after screening.
3. the method as described in claim 1, which is characterized in that the user behavior data got includes: the traffic behavior number of plies
According to, agreement behavior layer data and user behavior layer data.
4. The method according to claim 3, characterized in that, before constructing the feature set from the feature data, the method further comprises:
performing normalization and interval alignment on the extracted feature data.
5. The method according to claim 1, characterized in that evaluating the weight of the reference feature data by mutual information comprises:
evaluating the feature relevance and the feature redundancy of the reference feature data by mutual information; and
obtaining the weight of the reference feature data from the feature relevance and the feature redundancy.
6. A device for extracting behavioral feature data, characterized in that the device comprises:
a feature set module, configured to extract feature data from acquired user behavior data and to construct a feature set from the feature data;
a weight calculation module, configured to select the feature data in the feature set one by one as reference feature data, to divide, with weighting, the feature data in the feature set other than the reference feature data into a neighborhood and a far domain according to a preset distance threshold, and to evaluate the weight of the reference feature data by mutual information;
a screening module, configured to feed the weight back into the weight vector corresponding to the reference feature data and to delete the feature data with the smallest weight from the feature set; and
a feature data output module, configured to delete the feature data with the smallest weight step by step and, when the number of feature data in the feature set and the weight vector have stabilized, to output the feature set corresponding to the stabilized feature data and the weight vector corresponding to a preset threshold.
7. The device according to claim 6, characterized in that the feature set module comprises:
a screening unit, configured to screen the extracted feature data according to the degree of influence of each item of feature data on the overall entropy of the feature data, and to construct a candidate feature set from the screened feature data.
8. The device according to claim 6, characterized in that the acquired user behavior data comprises: traffic behavior layer data, protocol behavior layer data, and user behavior layer data.
9. The device according to claim 8, characterized in that the feature set module further comprises:
a feature data integration unit, configured to perform normalization and interval alignment on the extracted feature data.
10. The device according to claim 1, characterized in that the weight calculation module comprises:
a feature unit, configured to evaluate the feature relevance and the feature redundancy of the reference feature data by mutual information; and
a weight unit, configured to obtain the weight of the reference feature data from the feature relevance and the feature redundancy.
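The iterative procedure recited in claims 1 and 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The claims do not fix several details, so the following are assumptions made only for illustration: features are discrete sequences; relevance is measured against a hypothetical `labels` sequence; the distance between two features is taken as one minus their normalized mutual information; the neighborhood and far-domain weighting factors `near_w` and `far_w` are arbitrary; and "stable" is read as "the smallest weight is no longer below `min_weight`, or only `min_features` remain".

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy in bits of a discrete sequence."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())


def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def distance(xs, ys):
    """Assumed distance: 1 - normalized mutual information, in [0, 1]."""
    h = max(entropy(xs), entropy(ys))
    return 1.0 if h == 0 else 1.0 - mutual_info(xs, ys) / h


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def weight(name, features, labels, dist_threshold, near_w=1.0, far_w=0.25):
    """Relevance to the labels minus weighted redundancy with other features."""
    ref = features[name]
    near, far = [], []
    for other, column in features.items():
        if other == name:
            continue
        # Split the remaining features by the preset distance threshold.
        bucket = near if distance(ref, column) <= dist_threshold else far
        bucket.append(mutual_info(ref, column))
    relevance = mutual_info(ref, labels)
    redundancy = near_w * _mean(near) + far_w * _mean(far)
    return relevance - redundancy


def select_features(features, labels, dist_threshold=0.5,
                    min_weight=0.0, min_features=1):
    """Repeatedly drop the lowest-weight feature until the set is stable."""
    features = dict(features)  # work on a copy
    while True:
        weights = {n: weight(n, features, labels, dist_threshold)
                   for n in features}
        worst = min(weights, key=weights.get)
        if len(features) <= min_features or weights[worst] >= min_weight:
            return features, weights
        del features[worst]
```

Under these assumptions, calling `select_features` with `min_weight=0.1` on a toy set containing two identical informative columns and one column independent of the labels leaves a single copy of the informative column: each duplicate is fully redundant with the other while both are present, and the independent column carries no relevance.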
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810576742.7A CN109063721A (en) | 2018-06-05 | 2018-06-05 | A kind of method and device that behavioural characteristic data are extracted |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810576742.7A CN109063721A (en) | 2018-06-05 | 2018-06-05 | A kind of method and device that behavioural characteristic data are extracted |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109063721A (en) | 2018-12-21 |
Family
ID=64820507
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201810576742.7A | Pending | CN109063721A (en) | 2018-06-05 | 2018-06-05 | A kind of method and device that behavioural characteristic data are extracted |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109063721A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111459990A (en) * | 2020-03-31 | 2020-07-28 | 腾讯科技(深圳)有限公司 | Object processing method, system, computer readable storage medium and computer device |
CN111724278A (en) * | 2020-06-11 | 2020-09-29 | 国网吉林省电力有限公司 | Fine classification method and system for power multi-load users |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040225627A1 (en) * | 1999-10-25 | 2004-11-11 | Visa International Service Association, A Delaware Corporation | Synthesis of anomalous data to create artificial feature sets and use of same in computer network intrusion detection systems |
US20170134404A1 (en) * | 2015-11-06 | 2017-05-11 | Cisco Technology, Inc. | Hierarchical feature extraction for malware classification in network traffic |
CN106973047A (en) * | 2017-03-16 | 2017-07-21 | 北京匡恩网络科技有限责任公司 | A kind of anomalous traffic detection method and device |
Non-Patent Citations (2)
Title |
---|
刘帅: "面向网络数据流的多层面异常行为分析检测技术研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑(月刊)》 * |
过岩巍等: "网络游戏案例研究:用户行为分析和流失预测", 《中文信息学报》 * |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181221 |