CN109936582A - Method and device for constructing a malicious traffic detection model based on PU learning - Google Patents
- Publication number
- CN109936582A, CN201910333902.XA
- Authority
- CN
- China
- Prior art keywords
- sample data
- assessment
- traffic stream
- malicious traffic
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method and device for constructing a malicious traffic detection model based on PU learning. It relates to the field of network technology, and its main purpose is to build a machine-learning-based model for detecting malicious traffic. The main technical solution is: obtain traffic data as a sample data set; train multiple candidate models on the sample data set; construct an evaluation set from the sample data set; evaluate each candidate model against the evaluation set and a preset evaluation criterion to obtain an evaluation result for each candidate model; select the candidate models whose evaluation results meet a preset condition; and integrate the selected models according to a preset integration method to obtain the malicious traffic detection model. The invention is used to construct the malicious traffic detection model needed in malicious traffic detection.
Description
Technical field
The present invention relates to the field of network technology, and in particular to a method and device for constructing a malicious traffic detection model based on PU learning, and to a malicious traffic detection method and device.
Background
As network technology continues to develop and people's work and daily life become ever more closely tied to networks, network traffic is steadily increasing. Networks commonly carry malicious traffic, which can disrupt the normal operation of websites, databases, or systems. Common examples include network attacks, traffic fraud, and malicious crawlers. Such traffic typically intrudes on, interferes with, or scrapes business data or information by unauthorized means. Because of the harm malicious traffic causes, experts in the field pay increasing attention to its detection.
At present, malicious traffic is generally detected using preset rules; for example, features extracted from malicious traffic serve as the basis for inspecting and judging network traffic. With today's massive volumes of network traffic, however, both the detection quality and the detection efficiency of this approach depend heavily on manual intervention, so handling such volumes consumes considerable human and material resources. Meanwhile, artificial intelligence technology has steadily matured. Machine learning, an inevitable outcome of artificial intelligence research reaching a certain stage, is devoted to improving a system's own performance computationally by making use of experience. In computer systems, "experience" usually takes the form of "data", and a machine learning algorithm produces a "model" from data. That is, given empirical data, the algorithm generates a model from it, and when faced with a new situation the model provides a corresponding judgment, i.e. a prediction. Because existing detection approaches struggle to meet current needs, how to detect malicious traffic on the basis of machine learning has become an urgent problem in the industry.
Summary of the invention
In view of the above problems, the invention proposes a method and device for constructing a malicious traffic detection model based on PU learning. The main purpose is to provide an automated, machine-learning-based malicious traffic detection method that reduces manual effort.
To achieve the above purpose, the invention mainly provides the following technical solutions:
In one aspect, the invention provides a method for constructing a malicious traffic detection model based on PU learning, which specifically includes:
Obtaining traffic data as a sample data set, where the sample data set includes positive sample data carrying a positive label and unlabeled sample data carrying no label, and the positive label denotes malicious traffic;
Training multiple candidate models on the sample data set;
Constructing an evaluation set from the sample data set;
Evaluating each candidate model against the evaluation set and a preset evaluation criterion to obtain an evaluation result for each candidate model;
Selecting the candidate models whose evaluation results meet a preset condition; and integrating the selected models according to a preset integration method to obtain the malicious traffic detection model.
Optionally, training multiple candidate models on the sample data set includes:
Constructing multiple training sets from the sample data set;
Selecting, respectively, from a set of machine learning algorithms, a set of hyperparameter combinations, and the multiple training sets, and training to obtain multiple candidate models, where one machine learning algorithm, one group of hyperparameters, and one training set determine one candidate model.
Optionally, constructing multiple training sets from the sample data set includes:
Constructing one positive-sample training subset from at least part of the positive sample data in the sample data set, sampling the unlabeled sample data in the sample data set multiple times to construct multiple negative-sample training subsets, and combining the positive-sample training subset with each of the negative-sample training subsets to obtain multiple training sets;
Alternatively,
Constructing multiple positive-sample training subsets from at least part of the positive sample data in the sample data set, sampling the unlabeled sample data in the sample data set multiple times to construct multiple negative-sample training subsets, and combining each positive-sample training subset with each of the negative-sample training subsets to obtain multiple training sets.
Optionally, constructing an evaluation set from the sample data set includes:
Sampling the positive sample data in the sample data set to build a positive-sample evaluation subset, sampling the unlabeled sample data in the sample data set to build a negative-sample evaluation subset, and combining the positive-sample evaluation subset and the negative-sample evaluation subset to obtain the evaluation set.
Optionally, constructing an evaluation set from the sample data set includes: constructing multiple evaluation sets from the sample data set, where each evaluation set includes positive sample data and unlabeled sample data serving as negative sample data;
Evaluating each candidate model against the evaluation set and the preset evaluation criterion to obtain an evaluation result for each candidate model includes: for each candidate model, evaluating the candidate model against each of the multiple evaluation sets and the preset evaluation criterion to obtain multiple evaluation results, and merging the multiple evaluation results to obtain the final evaluation result for that candidate model.
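The merging of several evaluation results into one final result can be sketched as follows. The text does not say how results are merged, so the arithmetic mean used here is an assumption, as are the names `final_evaluation` and `accuracy` and the toy data; any other criterion or merge rule could be passed in instead.

```python
from statistics import mean


def final_evaluation(model, evaluation_sets, criterion, merge=mean):
    """Evaluate one candidate on every evaluation set and merge the results."""
    results = [criterion(model, evaluation_set) for evaluation_set in evaluation_sets]
    return merge(results)


def accuracy(model, evaluation_set):
    """Hypothetical criterion: share of samples whose predicted label matches."""
    return mean([1.0 if model(x) == y else 0.0 for x, y in evaluation_set])


def model(x):
    """Stand-in candidate model: flags a sample when its feature exceeds 5."""
    return 1 if x > 5 else 0


# Each evaluation set holds (feature, label) pairs; label 0 marks an
# unlabeled sample that is treated as negative.
evaluation_sets = [
    [(9.0, 1), (8.0, 1), (1.0, 0), (6.0, 0)],  # 3 of 4 predictions match
    [(9.5, 1), (7.0, 1), (2.0, 0), (3.0, 0)],  # 4 of 4 predictions match
]
final_result = final_evaluation(model, evaluation_sets, accuracy)
```

Averaging over several evaluation sets reduces the influence of any single random draw of unlabeled samples, some of which may in fact be malicious.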
Optionally, when the preset evaluation criterion is the maximum-margin method, the evaluation result of each candidate model is the classification margin of that candidate model's prediction results on the evaluation set;
Selecting the candidate models whose evaluation results meet a preset condition includes: selecting the candidate models whose classification margin is greater than a preset value.
Optionally, when the preset evaluation criterion is the method of calculating the AUC value, the evaluation result of each candidate model is that candidate model's AUC value on the evaluation set;
Selecting the candidate models whose evaluation results meet a preset condition includes: selecting the candidate models whose AUC value is greater than a preset value.
Optionally, integrating the selected models according to a preset integration method to obtain the malicious traffic detection model includes:
Assigning a weight to each selected candidate model according to its evaluation result, and integrating the selected candidate models according to the weights.
In another aspect, the invention provides a device for constructing a malicious traffic detection model based on PU learning, which specifically includes:
An acquisition unit, configured to obtain traffic data as a sample data set, where the sample data set includes positive sample data carrying a positive label and unlabeled sample data carrying no label, and the positive label denotes malicious traffic;
A training unit, configured to train multiple candidate models on the sample data set;
A construction unit, configured to construct an evaluation set from the sample data set;
An evaluation unit, configured to evaluate each candidate model against the evaluation set and a preset evaluation criterion to obtain an evaluation result for each candidate model;
A selection unit, configured to select the candidate models whose evaluation results meet a preset condition;
An integration unit, configured to integrate the selected models according to a preset integration method to obtain the malicious traffic detection model.
Optionally, the training unit includes:
A construction module, configured to construct multiple training sets from the sample data set;
A training module, configured to select, respectively, from a set of machine learning algorithms, a set of hyperparameter combinations, and the multiple training sets, and to train multiple candidate models, where one machine learning algorithm, one group of hyperparameters, and one training set determine one candidate model.
Optionally, the construction module includes:
A first construction submodule, configured to construct one positive-sample training subset from at least part of the positive sample data in the sample data set, to sample the unlabeled sample data in the sample data set multiple times to construct multiple negative-sample training subsets, and to combine the positive-sample training subset with each of the negative-sample training subsets to obtain multiple training sets;
A second construction submodule, configured to construct multiple positive-sample training subsets from at least part of the positive sample data in the sample data set, to sample the unlabeled sample data in the sample data set multiple times to construct multiple negative-sample training subsets, and to combine each positive-sample training subset with each of the negative-sample training subsets to obtain multiple training sets.
Optionally, the construction unit is specifically configured to sample the positive sample data in the sample data set to build a positive-sample evaluation subset, to sample the unlabeled sample data in the sample data set to build a negative-sample evaluation subset, and to combine the positive-sample evaluation subset and the negative-sample evaluation subset to obtain the evaluation set.
Optionally, the construction unit is specifically configured to construct multiple evaluation sets from the sample data set, where each evaluation set includes positive sample data and unlabeled sample data serving as negative sample data;
The evaluation unit is specifically configured to, for each candidate model, evaluate the candidate model against each of the multiple evaluation sets and the preset evaluation criterion to obtain multiple evaluation results, and merge the multiple evaluation results to obtain the final evaluation result for that candidate model.
Optionally, when the preset evaluation criterion is the maximum-margin method, the evaluation result of each candidate model is the classification margin of that candidate model's prediction results on the evaluation set;
The selection unit is specifically configured to select the candidate models whose classification margin is greater than a preset value.
Optionally, when the preset evaluation criterion is the method of calculating the AUC value, the evaluation result of each candidate model is that candidate model's AUC value on the evaluation set;
The selection unit is specifically configured to select the candidate models whose AUC value is greater than a preset value.
Optionally, the integration unit is specifically configured to assign a weight to each selected candidate model according to its evaluation result and to integrate the selected candidate models according to the weights.
In another aspect, the invention provides a computer-readable storage medium storing a computer program, where the computer program, when executed by one or more computing devices, implements the above method for constructing a malicious traffic detection model based on PU learning.
In another aspect, the invention provides a system including one or more computing devices and one or more storage devices, where a computer program is recorded on the one or more storage devices and, when executed by the one or more computing devices, causes them to implement the above method for constructing a malicious traffic detection model based on PU learning.
In yet another aspect, the invention provides a malicious traffic detection method based on a PU learning model, including:
Obtaining traffic data to be detected;
Constructing a malicious traffic detection model according to the method of any one of the foregoing first aspect;
Detecting the traffic data to be detected using the resulting malicious traffic detection model.
In yet another aspect, the invention provides a malicious traffic detection system based on a PU learning model, including:
A to-be-detected data acquisition unit, configured to obtain traffic data to be detected;
The device described in any of the above, configured to construct a malicious traffic detection model;
A detection unit, configured to detect the traffic data to be detected using the resulting malicious traffic detection model.
With the above technical solutions, the method and device for constructing a malicious traffic detection model based on PU learning provided by the invention obtain traffic data as a sample data set, train multiple candidate models on it, construct an evaluation set from it, evaluate each candidate model against the evaluation set and a preset evaluation criterion to obtain an evaluation result for each candidate model, and finally select the candidate models whose evaluation results meet a preset condition and integrate them according to a preset integration method to obtain the malicious traffic detection model, which can then be used to detect malicious traffic. Compared with existing approaches that inspect traffic data in a preset manner, the invention avoids manual intervention: malicious traffic is detected automatically through machine learning, which removes the dependence on manual work in the detection process. In addition, because the method incorporates a PU learning model, it can discover latent features and patterns of malicious traffic from known malicious traffic, and so offers better accuracy on unknown traffic data than earlier detection methods that rely only on known features and rules.
The above is only an overview of the technical solutions of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a method for constructing a malicious traffic detection model based on PU learning according to an embodiment of the invention;
Fig. 2 shows a block diagram of a device for constructing a malicious traffic detection model based on PU learning according to an embodiment of the invention;
Fig. 3 shows a block diagram of another device for constructing a malicious traffic detection model based on PU learning according to an embodiment of the invention;
Fig. 4 shows a block diagram of a malicious traffic detection system based on a PU learning model according to an embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the invention are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the invention, it should be understood that the invention may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the invention can be understood thoroughly and its scope conveyed fully to those skilled in the art.
An embodiment of the invention provides a method for constructing a malicious traffic detection model based on PU learning. The method is used to inspect traffic data that needs to be checked: a PU learning model for detecting malicious traffic is built and applied to the traffic data to be detected, so that malicious traffic detection is realized through machine learning and the over-reliance on manual work in existing malicious traffic detection is resolved. The specific steps of the method are shown in Fig. 1 and include:
101. Obtain traffic data as a sample data set.
Today's networks are flooded with network data, which makes up network traffic; network traffic can be understood as the data packets and network requests passing through a particular network node. Network traffic contains various kinds of malicious behavior, such as network attacks, traffic fraud, and malicious crawlers. The overwhelming majority of this malicious traffic comes from automated programs, which typically intrude on, interfere with, or scrape another party's business or data by unauthorized means. Network attacks often exhaust system performance through a flood of requests, stalling the database or system so that it cannot serve external users. Traffic fraud typically inflates visit and read counts on public-account, short-video, and live-streaming platforms, and inflates order counts on e-commerce platforms, which distorts product rankings. Malicious traffic detection can therefore be understood as detecting traffic of this malicious nature within network traffic.
The method described in this embodiment constructs a malicious traffic detection model based on PU learning, so before the model is built, the sample data used to train it must first be obtained; this sample data set consists of traffic data. PU learning (Positive and Unlabeled Learning, PU Learning for short), also called positive and unlabeled sample learning, trains a classification model when only positive sample data and unlabeled sample data are available. In the traffic detection scenario described above, known malicious traffic data is a minority and most traffic data is of unknown nature, which makes PU learning a suitable choice for training the model. Accordingly, a sample data set can be obtained that includes malicious traffic data carrying a positive label as its positive sample data, together with unlabeled sample data carrying no label, and a corresponding classification model can be trained on it through PU learning.
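A minimal sketch of how such a sample data set might be represented, assuming each flow has already been reduced to a numeric feature vector. The names `FlowSample` and `split_pu` and the feature values are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

POSITIVE = 1  # positive label: known malicious traffic


@dataclass(frozen=True)
class FlowSample:
    features: Tuple[float, ...]   # hypothetical per-flow features
    label: Optional[int] = None   # POSITIVE, or None when unlabeled


def split_pu(samples: List[FlowSample]):
    """Split a sample data set into its positive and unlabeled parts."""
    positives = [s for s in samples if s.label == POSITIVE]
    unlabeled = [s for s in samples if s.label is None]
    return positives, unlabeled


sample_data_set = [
    FlowSample((950.0, 0.02), POSITIVE),  # known malicious flow
    FlowSample((12.0, 0.90)),             # unknown: may be benign or malicious
    FlowSample((880.0, 0.05)),
    FlowSample((7.0, 0.75)),
]
positives, unlabeled = split_pu(sample_data_set)
```

Note that there is no negative label at all: an unlabeled flow is not known to be benign, which is what distinguishes the PU setting from ordinary binary classification.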
102. Train multiple candidate models on the sample data set.
Once the sample data set has been obtained, it can be used to train candidate models. In common PU learning model construction, earlier work usually trains a classifier by treating the unlabeled samples as negative samples. However, models trained with different algorithms, hyperparameters, and training sets differ in how well they detect malicious traffic, and in practice choosing the algorithm and hyperparameters relies on the operator's experience, which sets a high barrier to entry. In this embodiment, multiple candidate models can therefore be trained on the sample data set so that more suitable models can be chosen from among them. For this reason, multiple different training sets must also be constructed in this step before the candidate models are trained. In practice, the training sets may be chosen in the manner described above or in other ways; for example, part of the data may be drawn from the positive samples and from the unlabeled samples respectively, to serve as the positive sample set and the negative sample set of a training set. After the different training sets have been obtained, preset machine learning algorithms and hyperparameters can be chosen to train the corresponding candidate models. Specifically, the machine learning algorithm can be selected from a preset set of machine learning algorithms, and the hyperparameters can be taken from a set of hyperparameter combinations. A candidate model is determined by one learning algorithm combined with one selected group of hyperparameters and one of the training sets.
In addition, to further improve the accuracy of the PU learning model that is built, as many candidate models as possible may be trained in this embodiment. The number of candidate models is not specifically limited here and can be chosen according to actual needs in practical applications.
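The enumeration of candidate models can be sketched as a loop over algorithms, hyperparameter combinations, and training sets. The two learners below are deliberately trivial one-feature stand-ins, and all names (`fit_midpoint`, `fit_nearest_mean`, `search_space`) and values are assumptions made for illustration; a real system would plug in actual machine learning algorithms.

```python
from itertools import product
from statistics import mean


def fit_midpoint(pos, neg, weight=0.5):
    """Stand-in learner: score is the signed distance to a cut between class means."""
    cut = weight * mean(pos) + (1 - weight) * mean(neg)
    return lambda x: x - cut


def fit_nearest_mean(pos, neg, power=1):
    """Stand-in learner: score is higher when x is closer to the positive mean."""
    pos_mean, neg_mean = mean(pos), mean(neg)
    return lambda x: abs(x - neg_mean) ** power - abs(x - pos_mean) ** power


# Hypothetical search space: algorithm -> its hyperparameter combinations.
search_space = {
    fit_midpoint: [{"weight": 0.3}, {"weight": 0.5}],
    fit_nearest_mean: [{"power": 1}, {"power": 2}],
}

# Each training set pairs positive samples with unlabeled samples treated as negative.
training_sets = [
    ([9.0, 8.5, 9.5], [1.0, 2.0, 8.0]),
    ([9.0, 8.5, 9.5], [0.5, 3.0, 2.5]),
]

# One algorithm + one group of hyperparameters + one training set = one candidate.
candidates = []
for algorithm, hyperparameter_sets in search_space.items():
    for hyperparameters, (pos, neg) in product(hyperparameter_sets, training_sets):
        model = algorithm(pos, neg, **hyperparameters)
        candidates.append((algorithm.__name__, hyperparameters, model))
```

With two algorithms, two hyperparameter combinations each, and two training sets, the loop produces eight candidates; the count grows multiplicatively, which is why the later evaluation and selection steps matter.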
103. Construct an evaluation set from the sample data set.
Because step 102 yields multiple candidate models, an evaluation must then be carried out to identify suitable models among them. In this step an evaluation set is therefore constructed from the sample data set. Specifically, the evaluation set can be obtained by sampling from the positive sample data and from the unlabeled sample data respectively. To make the evaluation results more reliable, different evaluation sets can also be obtained through repeated sampling, so that the candidate models can later be evaluated on multiple evaluation sets.
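The sampling described above can be sketched as follows, assuming samples are plain feature values and that subset sizes are chosen by the caller. The function name `build_evaluation_set`, the sizes, and the data are hypothetical.

```python
import random


def build_evaluation_set(positives, unlabeled, n_pos, n_neg, rng):
    """Sample positives (label 1) and unlabeled samples treated as negatives (label 0)."""
    pos_subset = rng.sample(positives, n_pos)
    neg_subset = rng.sample(unlabeled, n_neg)
    return [(x, 1) for x in pos_subset] + [(x, 0) for x in neg_subset]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
positives = [9.1, 8.7, 9.4, 8.9]
unlabeled = [1.2, 0.4, 8.8, 2.1, 3.3, 0.9]

# Repeated sampling yields several different evaluation sets.
evaluation_sets = [
    build_evaluation_set(positives, unlabeled, 2, 3, rng) for _ in range(3)
]
```

Each call draws without replacement within one evaluation set, while different evaluation sets may overlap.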
104. Evaluate each candidate model against the evaluation set and a preset evaluation criterion to obtain an evaluation result for each candidate model.
The preset evaluation criterion can be chosen according to actual needs. For example, AUC can be chosen as the criterion, in which case the AUC value obtained on each evaluation set serves as the evaluation result. The preset evaluation criterion is not specifically limited here; any criterion that can be used to assess a model's prediction results may be chosen.
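As an illustration of AUC as a criterion, the sketch below computes it directly from its pairwise reading: the share of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. The scores and labels are made up, and in the PU setting the "negative" samples are unlabeled samples treated as negative.

```python
def auc(scores, labels):
    """Share of (positive, negative) pairs where the positive scores higher; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores produced by one candidate model on one evaluation set.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc_value = auc(scores, labels)
```

This quadratic pairwise loop is meant for clarity; a production system would typically rely on a sort-based implementation from an established library.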
105. Select the candidate models whose evaluation results meet a preset condition.
The preset condition set in this step depends on the preset evaluation criterion. For example, when the preset evaluation criterion in step 104 is AUC, the preset condition in this step can be that a candidate model qualifies when its AUC value is greater than a set AUC threshold.
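A short sketch of this selection step, assuming the evaluation results are kept in a dictionary keyed by model name. The model names, AUC values, and the threshold of 0.80 are hypothetical.

```python
def select_candidates(evaluation_results, threshold):
    """Keep the candidates whose evaluation result exceeds the preset threshold."""
    return {
        name: value
        for name, value in evaluation_results.items()
        if value > threshold
    }


# Hypothetical AUC values for four candidate models.
evaluation_results = {
    "model_a": 0.91,
    "model_b": 0.78,
    "model_c": 0.86,
    "model_d": 0.80,
}
selected = select_candidates(evaluation_results, threshold=0.80)
```

The comparison is strict, matching the "greater than" wording, so a model sitting exactly on the threshold is not selected.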
106. Integrate the selected models according to a preset integration method to obtain the malicious traffic detection model.
In practice, step 105 often yields several candidate models that meet the preset condition. To further ensure the accuracy of the model built through PU learning, the candidate models that meet the preset condition can be integrated in this step. During integration, the candidate models can be ranked by evaluation result and assigned weights accordingly, and then integrated according to those weights.
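One possible reading of this weighted integration is a weighted average of model scores, sketched below. The choice of weights proportional to the evaluation result, the decision threshold of 0.5, and the three stand-in models are all assumptions; the text only states that weights follow from the evaluation results.

```python
def integrate(selected, threshold=0.5):
    """Build a weighted-average detector from (model, evaluation_result) pairs."""
    # Rank by evaluation result, then normalise the results into weights.
    ranked = sorted(selected, key=lambda pair: pair[1], reverse=True)
    total = sum(result for _, result in ranked)
    weighted = [(model, result / total) for model, result in ranked]

    def detect(x):
        score = sum(weight * model(x) for model, weight in weighted)
        return score, score >= threshold

    return detect


# Hypothetical selected candidates: each model maps a feature to a score in [0, 1].
selected = [
    (lambda x: 1.0 if x > 5 else 0.0, 0.90),
    (lambda x: 1.0 if x > 7 else 0.0, 0.60),
    (lambda x: min(max(x / 10, 0.0), 1.0), 0.50),
]
detector = integrate(selected)
score, is_malicious = detector(9.0)
```

Because the weights sum to one, the combined score stays in the same range as the individual model scores, so a single threshold can be applied to the result.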
Further, as a refinement and extension of the foregoing embodiment, steps 101-106 may specifically be carried out as follows.
When multiple candidate models are trained on the sample data set in step 102, each candidate model is determined by one machine learning algorithm, one group of hyperparameters, and one training set. Therefore, after the sample data set has been obtained, training the candidate models may specifically include: first, constructing from the sample data set multiple training sets for training candidate models; then selecting, respectively, from a set of machine learning algorithms, a set of hyperparameter combinations, and the multiple training sets, and training to obtain multiple candidate models. The machine learning algorithms and hyperparameters can be chosen according to the actual situation as described above and are not limited here. When the multiple training sets are constructed from the sample data set, one positive-sample training subset can first be constructed from at least part of the positive sample data in the sample data set, and the unlabeled sample data in the sample data set can be sampled multiple times to construct multiple negative-sample training subsets; the positive-sample training subset is then combined with each of the negative-sample training subsets to obtain multiple training sets. Note that, instead of constructing a single positive-sample training subset as described above, multiple positive-sample training subsets can be constructed by drawing part of the positive samples from the positive sample data. Specifically: first, multiple positive-sample training subsets are constructed from at least part of the positive sample data in the sample data set, and the unlabeled sample data in the sample data set is sampled multiple times to construct multiple negative-sample training subsets; then each positive-sample training subset is combined with each of the negative-sample training subsets to obtain multiple training sets.
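Both ways of building training sets can be sketched with one helper. Pairing every positive subset with every negative subset is one reading of "respectively combined", and the subset sizes, the two-thirds rule for positive subsets, and the name `build_training_sets` are assumptions made for this illustration.

```python
import random


def build_training_sets(positives, unlabeled, n_pos_subsets, n_neg_subsets, neg_size, rng):
    """Pair every positive subset with every negative subset drawn from the unlabeled data."""
    if n_pos_subsets == 1:
        # Single positive subset: here simply all positive samples.
        pos_subsets = [list(positives)]
    else:
        pos_size = max(1, len(positives) * 2 // 3)  # hypothetical subset size
        pos_subsets = [rng.sample(positives, pos_size) for _ in range(n_pos_subsets)]
    # Repeated sampling of unlabeled data; each draw is treated as negative samples.
    neg_subsets = [rng.sample(unlabeled, neg_size) for _ in range(n_neg_subsets)]
    return [(pos, neg) for pos in pos_subsets for neg in neg_subsets]


rng = random.Random(42)
positives = [f"malicious_{i}" for i in range(6)]
unlabeled = [f"unknown_{i}" for i in range(20)]

one_positive_subset = build_training_sets(positives, unlabeled, 1, 3, 4, rng)
many_positive_subsets = build_training_sets(positives, unlabeled, 2, 3, 4, rng)
```

Drawing several small negative subsets, rather than using all unlabeled data at once, limits how much any hidden malicious flow among the unlabeled samples can mislead a single candidate model.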
Meanwhile when in step 03 based on sample data set construction assessment collection, it is based on the aforementioned mistake in implementation process
What is obtained after being trained based on sample data set in journey is multiple candidate families, and for these models, accuracy is
Different, therefore, it is also desirable to assess these candidate families, to obtain relatively accurate model, therefore, before assessment
It when carrying out the construction of assessment collection, can also carry out in the following manner: firstly, the positive sample number concentrated to the sample data
According to sampling building positive sample assessment subset is carried out, it is negative that sampling building is carried out to the unmarked sample data that the sample data is concentrated
Then positive sample is assessed subset and negative sample assessment sub-combinations obtains assessment collection by Samples Estimates subset.In addition, in order into
The accuracy of the raising assessment result of one step can also construct multiple assessment collection in this step, comment so that later use is multiple
Estimate collection repeatedly to assess each candidate family, and determine comprehensive assessment effect according to multiple assessment result, i.e., based on described
Sample data set constructs multiple assessment collection, wherein each assessment is concentrated including positive sample data and as negative sample data not
Marker samples data.
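A minimal sketch of building several evaluation sets in this way is shown below. The function name `build_evaluation_sets`, the subset sizes, and sampling without replacement are assumptions made for illustration; the text does not fix any of them.

```python
import random

def build_evaluation_sets(positives, unlabeled, n_sets, n_pos, n_neg, seed=0):
    """Build n_sets evaluation sets from sampled positives and sampled unlabeled data."""
    rng = random.Random(seed)
    evaluation_sets = []
    for _ in range(n_sets):
        pos = rng.sample(positives, n_pos)   # positive evaluation subset
        neg = rng.sample(unlabeled, n_neg)   # unlabeled data used as negatives
        evaluation_sets.append([(x, 1) for x in pos] + [(x, 0) for x in neg])
    return evaluation_sets

# Hypothetical toy data.
positives = ["p1", "p2", "p3", "p4"]
unlabeled = [f"u{i}" for i in range(30)]
evaluation_sets = build_evaluation_sets(positives, unlabeled,
                                        n_sets=3, n_pos=2, n_neg=4)
```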
When each candidate model is evaluated according to the evaluation set and the preset evaluation condition, and multiple evaluation sets have been built, the evaluation can proceed as follows: first, each candidate model is evaluated on each of the multiple evaluation sets under the preset evaluation condition, giving multiple evaluation results. Then the multiple evaluation results of each candidate model are fused, and the final evaluation result obtained by the fusion is taken as the actual evaluation result of that candidate model. In addition, in the embodiments of the present invention, the preset evaluation condition directly affects both the evaluation procedure and the evaluation result, so different preset evaluation conditions correspond to different evaluation results. For example, when the preset evaluation condition is the maximum margin method, the evaluation result of each candidate model is the classification margin of that model's predictions on the evaluation set. When the preset evaluation condition is the method of calculating the AUC value, the evaluation result of each candidate model is its AUC value on the evaluation set. The AUC value can be understood as a probability: if one positive sample and one negative sample are picked at random, the AUC value is the probability that the current classification algorithm ranks the positive sample ahead of the negative sample according to the scores it computes. The larger the AUC value, the more likely the current classification model is to rank positive samples ahead of negative samples, and therefore the better it classifies; on this basis the model's classification performance is judged to be more accurate.
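The probabilistic reading of AUC given above can be computed directly by comparing every positive/negative pair. The sketch below is illustrative only; counting ties as half a win is a common convention and an assumption here, since the text does not address ties.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical model scores on a four-sample evaluation set.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
value = auc(scores, labels)  # 3 of the 4 pairs are ordered correctly
```

The pairwise loop is quadratic in the evaluation set size; it is written this way to mirror the definition, not for efficiency.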
When selecting the candidate models whose evaluation results satisfy the preset condition, the selection method also depends on which preset evaluation condition was used. On the one hand, when the preset evaluation condition is the maximum margin method, the evaluation result of each candidate model is the classification margin of its predictions on the evaluation set, and this step can be: select the candidate models whose classification margin is greater than a preset value. On the other hand, when the preset evaluation condition is the method of calculating the AUC value, the evaluation result of each candidate model is its AUC value on the evaluation set, and this step can be: select the candidate models whose AUC value is greater than a preset value.
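Both selection rules reduce to a threshold test on the evaluation result. A minimal sketch follows; the model names, the result values and the threshold of 0.8 are hypothetical.

```python
def select_candidates(results, threshold):
    """Keep the candidate models whose evaluation result exceeds the preset value."""
    return {name: value for name, value in results.items() if value > threshold}

# Hypothetical AUC values (or classification margins) per candidate model.
results = {"model_a": 0.91, "model_b": 0.62, "model_c": 0.84}
selected = select_candidates(results, threshold=0.8)
```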
In practice, selecting the candidate models whose evaluation results satisfy the preset condition often yields more than one model, and the accuracy of these candidate models is not identical. In this case the models need to be integrated. To make the integrated malicious traffic detection model more accurate, a weight can be chosen for each candidate model according to its evaluation result during integration. Accordingly, integrating the selected models according to the preset integration method to obtain the malicious traffic detection model can specifically be: assign each selected candidate model a weight value according to its evaluation result, and integrate the selected candidate models according to the weight values.
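One possible reading of this weighted integration is a weighted average of the selected models' scores. The sketch below assumes that each model is a callable returning a score and that weights are the evaluation results normalized to sum to 1; the text only says that weights follow from the evaluation results, so both choices are assumptions.

```python
def ensemble_score(models, evaluation_results, flow):
    """Weighted average of model scores, weights proportional to evaluation results."""
    total = sum(evaluation_results)
    weights = [r / total for r in evaluation_results]  # assumption: normalize to 1
    return sum(w * model(flow) for w, model in zip(weights, models))

# Hypothetical stand-ins for two selected candidate models.
def always_malicious(flow):
    return 1.0

def always_benign(flow):
    return 0.0

score = ensemble_score([always_malicious, always_benign], [0.75, 0.25],
                       flow={"id": "f1"})
```

With this weighting, a model that evaluated better pulls the combined score further toward its own prediction.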
In addition, in connection with a specific application scenario, an embodiment of the present invention further provides a malicious traffic detection method for detecting malicious traffic in traffic data. The method can be implemented as follows:
First, the traffic data to be detected can be obtained as sample data; the known malicious traffic in it serves as the labeled positive sample data, and the remaining unknown traffic data serves as the unlabeled sample data.
Then, the positive sample data and unlabeled sample data determined above are taken as the sample data, and on this basis a malicious traffic detection model is built for detecting malicious traffic. The building process can follow the steps of the above embodiment, specifically:
First, traffic data is obtained as a sample data set. The sample data set includes positive sample data with a positive label and unlabeled sample data without a label, where the positive label denotes malicious traffic.
Second, multiple candidate models are obtained by training on the sample data set. Each candidate model is trained from one machine learning algorithm, one set of hyperparameters and one training set, and different algorithms differ in how well they detect malicious traffic. Therefore, when training candidate models, multiple training sets can be built first and one model trained on each training set, to obtain multiple candidate malicious traffic detection models.
Third, the evaluation set is built from the sample data set. Because the candidate models differ in how accurately they detect malicious traffic, a suitable model can be selected by evaluation; before the evaluation, multiple evaluation sets therefore need to be built from the sample data set according to the method of this step.
Fourth, each candidate model is evaluated according to the evaluation set and the preset evaluation condition to obtain an evaluation result for each candidate model. The evaluation is carried out under an evaluation condition, so any existing evaluation condition can be chosen when picking, from the multiple candidate models, those suitable for malicious traffic detection. For example, when the preset evaluation condition is the maximum margin method, the evaluation result of each candidate model is the classification margin of its predictions on the evaluation set; when the preset evaluation condition is the method of calculating the AUC value, the evaluation result of each candidate model is its AUC value on the evaluation set.
Fifth, the candidate models whose evaluation results satisfy the preset condition are selected. Because different evaluation conditions produce different kinds of evaluation results, the way of selecting the candidate models that satisfy the preset condition also differs. For example, under the maximum margin method, selecting the candidate models whose evaluation results satisfy the preset condition includes selecting the candidate models whose classification margin is greater than a preset value; under the AUC method, it includes selecting the candidate models whose AUC value is greater than a preset value.
Sixth, the selected models are integrated according to the preset integration method to obtain the malicious traffic detection model. Specifically, the integration can assign each selected candidate model a weight value according to its evaluation result and integrate the selected candidate models according to the weight values.
Finally, the malicious traffic detection model is used to detect malicious traffic, so that latent patterns are learned from a small amount of known malicious traffic and used to detect malicious traffic among unknown traffic.
Specifically, during detection the malicious traffic detection model can score each piece of traffic data to be detected, giving a score for each piece of traffic data. The traffic data is then sorted by these scores, and the resulting order is used to determine which traffic is malicious or potentially malicious.
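The score-then-sort procedure can be sketched as follows. The flow fields, the toy scoring function and the use of a top-k cut-off are hypothetical; the text only states that malicious traffic is determined from the sorted order.

```python
def rank_flows(flows, score_fn, top_k):
    """Score every flow, sort by descending score, and return the top_k suspects."""
    scored = [(score_fn(flow), flow) for flow in flows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Hypothetical flows with a single illustrative feature.
flows = [
    {"id": "f1", "failed_logins": 0},
    {"id": "f2", "failed_logins": 7},
    {"id": "f3", "failed_logins": 3},
]

def toy_model(flow):
    # Stand-in for the integrated detection model's scoring function.
    return flow["failed_logins"] / 10.0

suspects = rank_flows(flows, toy_model, top_k=2)
```

A score threshold could be used in place of `top_k`; which cut-off suits a deployment depends on how many alerts can be reviewed.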
In addition, to support accurate identification and prevention of malicious traffic later on, the data that the malicious traffic detection model flags as malicious traffic can be analyzed statistically and summarized to obtain a feature or feature set of malicious traffic, which then serves as the basis for preventing and identifying malicious traffic.
Further, as an implementation of the above method for building a malicious traffic detection model based on PU learning, an embodiment of the present invention provides a device for building a malicious traffic detection model based on PU learning. The device is mainly used to detect traffic data that needs to be inspected. Its purpose is to build a PU learning model for detecting malicious traffic and apply it to the traffic data to be detected, thereby realizing machine-learning-based malicious traffic detection and solving the over-reliance on manual work in existing malicious traffic detection. For ease of reading, this device embodiment does not repeat the details of the foregoing method embodiment one by one, but it should be clear that the device in this embodiment can implement everything in the foregoing method embodiment. The device is shown in Fig. 2 and specifically includes:
an acquiring unit 21, which can be used to obtain traffic data as a sample data set, the sample data set including positive sample data with a positive label and unlabeled sample data without a label, where the positive label denotes malicious traffic;
a training unit 22, which can be used to train on the sample data set obtained by the acquiring unit 21 to obtain multiple candidate models;
a construction unit 23, which can be used to build an evaluation set from the sample data set obtained by the acquiring unit 21;
an evaluation unit 24, which can be used to evaluate each candidate model trained by the training unit 22 according to the evaluation set built by the construction unit 23 and a preset evaluation condition, to obtain an evaluation result for each candidate model;
a selection unit 25, which can be used to select the candidate models whose evaluation results, as obtained by the evaluation unit 24, satisfy a preset condition;
an integration unit 26, which can be used to integrate the models selected by the selection unit 25 according to a preset integration method, to obtain the malicious traffic detection model.
Further, as shown in Fig. 3, the training unit 22 includes:
a building module 221, which can be used to build multiple training sets from the sample data set;
a training module 222, which can be used to select respectively from a set of machine learning algorithms, a set of hyperparameter combinations and the multiple training sets built by the building module 221, and to train multiple candidate models; one machine learning algorithm, one set of hyperparameters and one training set determine one candidate model.
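Since one algorithm, one set of hyperparameters and one training set determine one candidate model, the pool of candidates can be enumerated as a Cartesian product. The sketch below is an assumption-laden illustration: the algorithm names, the hyperparameter key `c` and the `train` placeholder are hypothetical, and real hyperparameter sets would normally be specific to each algorithm.

```python
from itertools import product

# Hypothetical choices; none of these names come from the text.
algorithms = ["logistic_regression", "random_forest"]
hyperparameter_sets = [{"c": 0.1}, {"c": 1.0}]
training_sets = ["train_0", "train_1", "train_2"]

def train(algorithm, hyperparameters, training_set):
    """Placeholder for real training: records which combination produced the model."""
    return {"algorithm": algorithm,
            "hyperparameters": hyperparameters,
            "training_set": training_set}

# One candidate model per (algorithm, hyperparameters, training set) combination.
candidates = [train(a, h, t)
              for a, h, t in product(algorithms, hyperparameter_sets, training_sets)]
```

In practice a subset of combinations may be chosen instead of the full product, since the text says only that selections are made from the three sets.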
Further, as shown in Fig. 3, the building module 221 includes:
a first building submodule 2211, which can be used to build one positive sample training subset from at least part of the positive sample data in the sample data set, perform multiple sampling operations on the unlabeled sample data in the sample data set to build multiple negative sample training subsets, and combine the positive sample training subset with each of the multiple negative sample training subsets to obtain multiple training sets;
a second building submodule 2212, which can be used to build multiple positive sample training subsets from at least part of the positive sample data in the sample data set, perform multiple sampling operations on the unlabeled sample data in the sample data set to build multiple negative sample training subsets, and combine each positive sample training subset with each of the multiple negative sample training subsets to obtain multiple training sets.
Further, as shown in Fig. 3, the construction unit 23 can be specifically used to sample the positive sample data in the sample data set to build a positive sample evaluation subset, sample the unlabeled sample data in the sample data set to build a negative sample evaluation subset, and combine the positive sample evaluation subset and the negative sample evaluation subset to obtain the evaluation set.
Further, as shown in Fig. 3, the construction unit 23 can be specifically used to build multiple evaluation sets from the sample data set, where each evaluation set includes positive sample data and unlabeled sample data serving as negative sample data;
the evaluation unit 24 can also be specifically used, for each candidate model, to evaluate the candidate model on each of the multiple evaluation sets built by the construction unit 23 under the preset evaluation condition, obtain multiple evaluation results, and fuse the multiple evaluation results to obtain the final evaluation result of the candidate model.
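The fusion rule is not specified in the text. A minimal sketch, assuming the simplest choice of averaging each model's results across evaluation sets, is given below; the model names and values are hypothetical.

```python
from statistics import mean

def fuse_results(per_set_results):
    """Fuse each model's per-evaluation-set results into one final result (mean)."""
    return {name: mean(values) for name, values in per_set_results.items()}

# Hypothetical AUC values of two candidate models on three evaluation sets.
per_set_results = {
    "model_a": [0.9, 0.8, 1.0],
    "model_b": [0.5, 0.7, 0.6],
}
final_results = fuse_results(per_set_results)
```

Other fusion rules, such as the median or the minimum, would penalize models whose quality varies between evaluation sets more strongly.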
Further, as shown in Fig. 3, when the preset evaluation condition is the maximum margin method, the evaluation result of each candidate model is the classification margin of its predictions on the evaluation set;
the selection unit 25 can be specifically used to select the candidate models whose classification margin is greater than a preset value.
Further, as shown in Fig. 3, when the preset evaluation condition is the method of calculating the AUC value, the evaluation result of each candidate model is its AUC value on the evaluation set;
the selection unit 25 can also be specifically used to select the candidate models whose AUC value is greater than a preset value.
Further, as shown in Fig. 3, the integration unit 26 can be specifically used to assign each selected candidate model a weight value according to its evaluation result, and to integrate the selected candidate models according to the weight values.
Further, as an implementation of the above malicious traffic detection function, an embodiment of the present invention provides a malicious traffic detection system based on a PU learning model. The system is mainly used to detect traffic data that needs to be inspected. Its purpose is to build a PU learning model for detecting malicious traffic and apply it to the traffic data to be detected, thereby realizing machine-learning-based malicious traffic detection and solving the over-reliance on manual work in existing malicious traffic detection. For ease of reading, this system embodiment does not repeat the details of the foregoing method embodiment one by one, but it should be clear that the system in this embodiment can implement everything in the foregoing method embodiment. The system is shown in Fig. 4 and specifically includes:
a to-be-detected data acquiring unit 41, which can be used to obtain the traffic data to be detected;
a device 42 for building a malicious traffic detection model based on PU learning, which can be used to build the malicious traffic detection model from the traffic data to be detected obtained by the acquiring unit 41; the device 42 can specifically be the device shown in Fig. 2 or Fig. 3;
a detection unit 43, which can be used to detect the traffic data to be detected using the malicious traffic detection model obtained by the device 42.
Further, an embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by one or more computing devices, implements the above method for building a malicious traffic detection model based on PU learning.
In addition, an embodiment of the present invention also provides a system including one or more computing devices and one or more storage devices. A computer program is recorded on the one or more storage devices, and the computer program, when executed by the one or more computing devices, causes the one or more computing devices to implement the above method for building a malicious traffic detection model based on PU learning.
In summary, the method and device for building a malicious traffic detection model based on PU learning proposed by the embodiments of the present invention obtain traffic data as a sample data set, train multiple candidate models on the sample data set, build an evaluation set from the sample data set, evaluate each candidate model according to the evaluation set and a preset evaluation condition to obtain an evaluation result for each candidate model, select the candidate models whose evaluation results satisfy a preset condition, and integrate the selected models according to a preset integration method to obtain the malicious traffic detection model, which can then be used to detect malicious traffic. Compared with existing approaches that inspect traffic data in a predefined manner, the present invention avoids manual intervention: malicious traffic detection is carried out automatically by machine learning, which removes the dependence on manual work in the detection process. Moreover, because the method combines a PU learning model, it can discover latent features and patterns of malicious traffic from known malicious traffic, so that when detecting unknown traffic data it achieves better accuracy than earlier detection means that rely only on known features and rules.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
It can be understood that related features of the above method and device can be referred to one another. In addition, "first", "second" and the like in the above embodiments are used to distinguish the embodiments and do not indicate that any embodiment is better or worse than another.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, device and units described above can be found in the corresponding processes of the foregoing method embodiment and are not repeated here.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a system is apparent. Furthermore, the present invention is not directed at any particular programming language. It should be understood that the content of the invention described herein can be implemented in a variety of programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
In addition, the memory may take the form of volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
Those skilled in the art should understand that the embodiments of the present application can be provided as a method, a system or a computer program product. Therefore, the present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPU), input/output interfaces, network interfaces and memory.
The memory may take the form of volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity or device that includes the element.
The above are only embodiments of the present application and are not intended to limit the present application. For those skilled in the art, various modifications and changes to the present application are possible. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.
Claims (10)
1. A method for building a malicious traffic detection model based on PU learning, wherein the method comprises:
obtaining traffic data as a sample data set, the sample data set including positive sample data with a positive label and unlabeled sample data without a label, wherein the positive label denotes malicious traffic;
training on the sample data set to obtain multiple candidate models;
building an evaluation set from the sample data set;
evaluating each candidate model according to the evaluation set and a preset evaluation condition, to obtain an evaluation result for each candidate model;
selecting the candidate models whose evaluation results satisfy a preset condition;
integrating the selected models according to a preset integration method, to obtain the malicious traffic detection model.
2. The method of claim 1, wherein training on the sample data set to obtain multiple candidate models comprises:
building multiple training sets from the sample data set;
selecting respectively from a set of machine learning algorithms, a set of hyperparameter combinations and the multiple training sets, and training to obtain multiple candidate models; wherein one machine learning algorithm, one set of hyperparameters and one training set determine one candidate model.
3. The method of claim 2, wherein building multiple training sets from the sample data set comprises:
building one positive sample training subset from at least part of the positive sample data in the sample data set, performing multiple sampling operations on the unlabeled sample data in the sample data set to build multiple negative sample training subsets, and combining the positive sample training subset with each of the multiple negative sample training subsets to obtain multiple training sets;
or,
building multiple positive sample training subsets from at least part of the positive sample data in the sample data set, performing multiple sampling operations on the unlabeled sample data in the sample data set to build multiple negative sample training subsets, and combining each positive sample training subset with each of the multiple negative sample training subsets to obtain multiple training sets.
4. The method of claim 1, wherein building an evaluation set from the sample data set comprises:
sampling the positive sample data in the sample data set to build a positive sample evaluation subset, sampling the unlabeled sample data in the sample data set to build a negative sample evaluation subset, and combining the positive sample evaluation subset and the negative sample evaluation subset to obtain the evaluation set.
5. The method of claim 1, wherein
building an evaluation set from the sample data set comprises: building multiple evaluation sets from the sample data set, wherein each evaluation set includes positive sample data and unlabeled sample data serving as negative sample data;
evaluating each candidate model according to the evaluation set and the preset evaluation condition, to obtain an evaluation result for each candidate model, comprises: for each candidate model, evaluating the candidate model on each of the multiple evaluation sets under the preset evaluation condition to obtain multiple evaluation results, and fusing the multiple evaluation results to obtain the final evaluation result of the candidate model.
6. A malicious traffic detection method based on a PU learning model, wherein the method comprises:
obtaining traffic data to be detected;
building a malicious traffic detection model according to the method of any one of claims 1 to 5;
detecting the traffic data to be detected using the obtained malicious traffic detection model.
7. A device for constructing a malicious traffic detection model based on PU learning, wherein the device comprises:
an acquisition unit, configured to obtain traffic data as a sample data set, the sample data set including positive sample data with a positive label and unlabeled sample data without a label, wherein the positive label indicates malicious traffic;
a training unit, configured to train multiple candidate models based on the sample data set;
a construction unit, configured to construct an assessment set based on the sample data set;
an assessment unit, configured to assess each candidate model according to the assessment set and a preset assessment condition, obtaining an assessment result for each candidate model;
a selection unit, configured to select the candidate models whose assessment results meet a preset condition; and
an integration unit, configured to integrate the selected models according to a preset integration method to obtain the malicious traffic detection model.
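The units listed in claim 7 map naturally onto a small pipeline. The following is a rough sketch under several assumptions that are not in the patent text: candidate models are nearest-centroid classifiers on a one-dimensional feature, the assessment metric is accuracy, the preset condition is a score of at least 0.7, and integration is majority voting. All function names are hypothetical.

```python
import random

def train_centroid_model(train_set):
    """Fit a nearest-centroid classifier on (feature, label) pairs (1-D features)."""
    pos = [x for x, y in train_set if y == 1]
    neg = [x for x, y in train_set if y == 0]
    c_pos, c_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return lambda x: 1 if abs(x - c_pos) < abs(x - c_neg) else 0

def sample_set(positives, unlabeled, n_pos, n_neg, rng):
    """Sample labeled positives plus unlabeled data treated as negatives."""
    return ([(x, 1) for x in rng.sample(positives, n_pos)] +
            [(x, 0) for x in rng.sample(unlabeled, n_neg)])

def accuracy(model, assessment_set):
    return sum(model(x) == y for x, y in assessment_set) / len(assessment_set)

def build_detector(positives, unlabeled, n_candidates=5, min_score=0.7, seed=0):
    rng = random.Random(seed)
    # Training unit: one candidate per randomly sampled training set.
    candidates = [train_centroid_model(sample_set(positives, unlabeled, 3, 3, rng))
                  for _ in range(n_candidates)]
    # Construction unit: build the assessment set.
    assessment_set = sample_set(positives, unlabeled, 3, 4, rng)
    # Assessment unit: score every candidate.
    scores = [accuracy(m, assessment_set) for m in candidates]
    # Selection unit: keep candidates that meet the (assumed) preset condition.
    selected = [m for m, s in zip(candidates, scores) if s >= min_score]
    if not selected:  # fall back to the single best candidate
        selected = [candidates[scores.index(max(scores))]]
    # Integration unit: majority vote over the selected models.
    def detector(x):
        return 1 if 2 * sum(m(x) for m in selected) > len(selected) else 0
    return detector

# Acquisition unit stand-in: toy data; 98.0 is a hidden malicious sample
# sitting among the unlabeled data.
positives = [90.0, 95.0, 100.0, 105.0, 110.0]
unlabeled = [1.0, 2.0, 3.0, 4.0, 5.0, 98.0]

detector = build_detector(positives, unlabeled)
```

In this toy data the resulting detector still flags `98.0` as malicious even though that sample was treated as a negative during training and assessment, which illustrates why PU approaches tolerate some contamination in the unlabeled pool.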
8. A malicious traffic detection system based on a PU learning model, comprising:
a to-be-detected data acquisition unit, configured to obtain traffic data to be detected;
the device of claim 7, configured to construct a malicious traffic detection model; and
a detection unit, configured to detect the traffic data to be detected using the resulting malicious traffic detection model.
9. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by one or more computing devices, implements the method of any one of claims 1 to 6.
10. A system comprising one or more computing devices and one or more storage devices, the one or more storage devices storing a computer program that, when executed by the one or more computing devices, causes the one or more computing devices to implement the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910333902.XA CN109936582B (en) | 2019-04-24 | 2019-04-24 | Method and device for constructing malicious traffic detection model based on PU learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910333902.XA CN109936582B (en) | 2019-04-24 | 2019-04-24 | Method and device for constructing malicious traffic detection model based on PU learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109936582A (en) | 2019-06-25 |
CN109936582B (en) | 2020-04-28 |
Family
ID=66990886
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910333902.XA Active CN109936582B (en) | 2019-04-24 | 2019-04-24 | Method and device for constructing malicious traffic detection model based on PU learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109936582B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110414187A (en) * | 2019-07-03 | 2019-11-05 | 北京百度网讯科技有限公司 | System and method for model safety delivery automation |
CN111950738A (en) * | 2020-08-10 | 2020-11-17 | 中国平安人寿保险股份有限公司 | Machine learning model optimization effect evaluation method and device, terminal and storage medium |
WO2021057427A1 (en) * | 2019-09-25 | 2021-04-01 | 西安交通大学 | Pu learning based cross-regional enterprise tax evasion recognition method and system |
CN112884190A (en) * | 2019-11-29 | 2021-06-01 | 杭州海康威视数字技术股份有限公司 | Flow prediction method and device |
CN113111928A (en) * | 2021-04-01 | 2021-07-13 | 中国地质大学(北京) | Semi-supervised learning mineral resource quantitative prediction method based on geoscience database |
CN113271236A (en) * | 2021-06-11 | 2021-08-17 | 国家计算机网络与信息安全管理中心 | Engine evaluation method, device, equipment and storage medium |
CN113849645A (en) * | 2021-09-28 | 2021-12-28 | 平安科技(深圳)有限公司 | Mail classification model training method, device, equipment and storage medium |
CN113935031A (en) * | 2020-12-03 | 2022-01-14 | 网神信息技术(北京)股份有限公司 | Method and system for file feature extraction range configuration and static malicious software identification |
CN114095284A (en) * | 2022-01-24 | 2022-02-25 | 军事科学院系统工程研究院网络信息研究所 | Intelligent traffic scheduling protection method and system |
CN114553496A (en) * | 2022-01-28 | 2022-05-27 | 中国科学院信息工程研究所 | Malicious domain name detection method and device based on semi-supervised learning |
CN114629718A (en) * | 2022-04-07 | 2022-06-14 | 浙江工业大学 | Hidden malicious behavior detection method based on multi-model fusion |
CN115859106A (en) * | 2022-12-05 | 2023-03-28 | 中国地质大学(北京) | Mineral exploration method and device based on semi-supervised learning and storage medium |
CN116910501A (en) * | 2023-07-28 | 2023-10-20 | 中国电子科技集团公司第十五研究所 | Error case driven data identification method, device and equipment |
CN113849645B (en) * | 2021-09-28 | 2024-06-04 | 平安科技(深圳)有限公司 | Mail classification model training method, device, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169506A (en) * | 2017-04-14 | 2017-09-15 | 微梦创科网络科技(中国)有限公司 | Random assortment method and device based on assembled classifier |
CN108111489A (en) * | 2017-12-07 | 2018-06-01 | 阿里巴巴集团控股有限公司 | URL attack detection methods, device and electronic equipment |
US10063582B1 (en) * | 2017-05-31 | 2018-08-28 | Symantec Corporation | Securing compromised network devices in a network |
- 2019-04-24: CN application CN201910333902.XA, patent CN109936582B (en), status Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169506A (en) * | 2017-04-14 | 2017-09-15 | 微梦创科网络科技(中国)有限公司 | Random assortment method and device based on assembled classifier |
US10063582B1 (en) * | 2017-05-31 | 2018-08-28 | Symantec Corporation | Securing compromised network devices in a network |
CN108111489A (en) * | 2017-12-07 | 2018-06-01 | 阿里巴巴集团控股有限公司 | URL attack detection methods, device and electronic equipment |
Non-Patent Citations (1)
Title |
---|
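Song Qun et al.: "Intrusion detection method based on ensemble PU learning data stream classification", Microelectronics & Computer (《微电子学与计算机》) * |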
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110414187A (en) * | 2019-07-03 | 2019-11-05 | 北京百度网讯科技有限公司 | System and method for model safety delivery automation |
CN110414187B (en) * | 2019-07-03 | 2021-09-17 | 北京百度网讯科技有限公司 | System and method for model safety delivery automation |
WO2021057427A1 (en) * | 2019-09-25 | 2021-04-01 | 西安交通大学 | Pu learning based cross-regional enterprise tax evasion recognition method and system |
CN112884190A (en) * | 2019-11-29 | 2021-06-01 | 杭州海康威视数字技术股份有限公司 | Flow prediction method and device |
CN112884190B (en) * | 2019-11-29 | 2023-11-03 | 杭州海康威视数字技术股份有限公司 | Flow prediction method and device |
CN111950738B (en) * | 2020-08-10 | 2023-09-15 | 中国平安人寿保险股份有限公司 | Machine learning model optimization effect evaluation method, device, terminal and storage medium |
CN111950738A (en) * | 2020-08-10 | 2020-11-17 | 中国平安人寿保险股份有限公司 | Machine learning model optimization effect evaluation method and device, terminal and storage medium |
CN113935031A (en) * | 2020-12-03 | 2022-01-14 | 网神信息技术(北京)股份有限公司 | Method and system for file feature extraction range configuration and static malicious software identification |
CN113111928A (en) * | 2021-04-01 | 2021-07-13 | 中国地质大学(北京) | Semi-supervised learning mineral resource quantitative prediction method based on geoscience database |
CN113111928B (en) * | 2021-04-01 | 2023-12-29 | 中国地质大学(北京) | Semi-supervised learning mineral resource quantitative prediction method based on geometrics database |
CN113271236A (en) * | 2021-06-11 | 2021-08-17 | 国家计算机网络与信息安全管理中心 | Engine evaluation method, device, equipment and storage medium |
CN113849645A (en) * | 2021-09-28 | 2021-12-28 | 平安科技(深圳)有限公司 | Mail classification model training method, device, equipment and storage medium |
CN113849645B (en) * | 2021-09-28 | 2024-06-04 | 平安科技(深圳)有限公司 | Mail classification model training method, device, equipment and storage medium |
CN114095284B (en) * | 2022-01-24 | 2022-04-15 | 军事科学院系统工程研究院网络信息研究所 | Intelligent traffic scheduling protection method and system |
CN114095284A (en) * | 2022-01-24 | 2022-02-25 | 军事科学院系统工程研究院网络信息研究所 | Intelligent traffic scheduling protection method and system |
CN114553496A (en) * | 2022-01-28 | 2022-05-27 | 中国科学院信息工程研究所 | Malicious domain name detection method and device based on semi-supervised learning |
CN114553496B (en) * | 2022-01-28 | 2022-11-15 | 中国科学院信息工程研究所 | Malicious domain name detection method and device based on semi-supervised learning |
CN114629718A (en) * | 2022-04-07 | 2022-06-14 | 浙江工业大学 | Hidden malicious behavior detection method based on multi-model fusion |
CN115859106A (en) * | 2022-12-05 | 2023-03-28 | 中国地质大学(北京) | Mineral exploration method and device based on semi-supervised learning and storage medium |
CN116910501A (en) * | 2023-07-28 | 2023-10-20 | 中国电子科技集团公司第十五研究所 | Error case driven data identification method, device and equipment |
CN116910501B (en) * | 2023-07-28 | 2024-04-12 | 中国电子科技集团公司第十五研究所 | Error case driven data identification method, device and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109936582B (en) | 2020-04-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109936582A (en) | Method and device for constructing malicious traffic detection model based on PU learning | |
CN110084374A (en) | Method and device for constructing model based on PU learning, and prediction method and device | |
Li et al. | Localizing and quantifying damage in social media images | |
CN110348580B (en) | Method and device for constructing GBDT model, and prediction method and device | |
CN106201871B (en) | Cost-sensitive semi-supervised software defect prediction method | |
KR102142205B1 (en) | Explainable AI Modeling and Simulation System and Method | |
CN105069470A (en) | Classification model training method and device | |
CN107203467A (en) | Benchmark test method and device for supervised learning algorithms in a distributed environment | |
CN108009593A (en) | Method and system for selecting an optimal transfer learning algorithm | |
CN108830443A (en) | Contract review method and device | |
CN108647800A (en) | Method for predicting missing attributes of online social network users based on node embedding | |
CN110263934A (en) | Artificial intelligence data labeling method and device | |
CN112364352A (en) | Interpretable software vulnerability detection and recommendation method and system | |
CN113780342A (en) | Intelligent detection method and device based on self-supervision pre-training and robot | |
CN113641906A (en) | System, method, device, processor and medium for realizing similar target person identification processing based on fund transaction relation data | |
CN106935038A (en) | Parking detection system and detection method | |
Ullah et al. | Adaptive data balancing method using stacking ensemble model and its application to non-technical loss detection in smart grids | |
CN110009012A (en) | Risk sample identification method and device, and electronic equipment | |
CN113158084B (en) | Method, device, computer equipment and storage medium for processing movement track data | |
CN114723554B (en) | Abnormal account identification method and device | |
CN109669964A (en) | Model iterative training method and device | |
CN107291722B (en) | Descriptor classification method and device | |
CN115470524A (en) | Method, system, equipment and medium for detecting leakage of confidential documents | |
CN110163470A (en) | Case evaluating method and device | |
Sinha et al. | Implication of Soft Computing and Machine Learning Method for Software Quality, Defect and Model Prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||