CN106209845A

CN106209845A - A method for determining malicious HTTP requests based on Bayesian learning theory

Info

Publication number: CN106209845A
Application number: CN201610546795.5A
Authority: CN
Inventors: 何清林; 马秀娟; 张家琦; 王子厚; 王大伟; 朱佳伟; 刘培朋; 王维晟
Original assignee: National Computer Network and Information Security Management Center
Current assignee: National Computer Network and Information Security Management Center
Priority date: 2016-07-12
Filing date: 2016-07-12
Publication date: 2016-12-07

Abstract

The invention discloses a method for determining malicious HTTP requests based on Bayesian learning theory. The method comprises the following steps: collecting a set number of normal HTTP requests and malicious HTTP requests; processing the collected normal and malicious HTTP requests to obtain a sample set, in which each sample comprises a sample class and a sample feature space; using the sample set as the training set and applying a Bayesian classification learning algorithm to learn a binary classifier; and, for an HTTP request to be judged, extracting judgment features to obtain a judgment feature space, using the binary classifier to predict whether the request is a malicious HTTP request or a normal HTTP request, and labelling the request with the result, thereby obtaining the determination result. The method can determine whether an HTTP request initiated from the user terminal side is a malicious request or a normal request.

Description

A method for determining malicious HTTP requests based on Bayesian learning theory

Technical field

The invention belongs to the technical field of network security, and specifically relates to a method for determining malicious HTTP requests based on Bayesian learning theory.

Background technology

Because the HTTP protocol is standardized and widely applicable, it is used not only by ordinary websites but also, increasingly, by emerging mobile applications (apps) for data communication. Many applications stay resident in the background and automatically send HTTP request messages to the server to transmit data. If an application is malicious, these HTTP requests may involve malicious behavior such as stealing user privacy or propagating botnet and Trojan messages.

An HTTP request is a message initiated from the user side to the server side, and generally uses the HTTP GET method or the HTTP POST method. For the HTTP GET method, the request message is as follows:

"/domain-name/demo_form.jsp?name1=value1&name2=value2"

For the POST method, the request message is as follows:

"POST /test/demo_form.jsp HTTP/1.1, Host: w3schools.com

name1=value1&name2=value2".

As can be seen from the above, both HTTP GET requests and HTTP POST requests contain fields of the form "name=value". These fields are added by the application itself, and it is through them that the application transmits content from the user side. These fields are the key to judging whether an HTTP request is malicious behavior.
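As a minimal sketch of how the "name=value" fields could be pulled out of the two request forms shown above, the following uses Python's standard urllib.parse module. The function names are hypothetical, and the sketch assumes a form-encoded POST body separated from the headers by a blank line; the patent does not prescribe any particular parser.

```python
from urllib.parse import urlsplit, parse_qsl


def extract_pairs_from_get(request_target: str) -> list[tuple[str, str]]:
    """Return the (name, value) pairs in the query string of a GET target."""
    query = urlsplit(request_target).query
    return parse_qsl(query, keep_blank_values=True)


def extract_pairs_from_post(raw_request: str) -> list[tuple[str, str]]:
    """Return the (name, value) pairs in a form-encoded POST body."""
    # The body is everything after the first blank line that ends the headers.
    _, _, body = raw_request.replace("\r\n", "\n").partition("\n\n")
    return parse_qsl(body.strip(), keep_blank_values=True)


# The two example messages from the text above.
get_target = "/domain-name/demo_form.jsp?name1=value1&name2=value2"
post_request = (
    "POST /test/demo_form.jsp HTTP/1.1\r\n"
    "Host: w3schools.com\r\n"
    "\r\n"
    "name1=value1&name2=value2"
)

get_pairs = extract_pairs_from_get(get_target)
post_pairs = extract_pairs_from_post(post_request)
```

Both calls yield the same two pairs, which is the point of the paragraph above: the fields look alike regardless of the request method.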

How to judge whether an HTTP request sent from the user side is normal or malicious is a technical problem that needs to be solved. The present invention proposes a method based on Bayesian theory that can automatically predict and judge whether an HTTP request is malicious behavior. The method is based mainly on Bayesian learning classification theory, which has been applied in areas such as spam filtering. Bayes' theorem is a basic principle of probability theory: building on conditional probability and the law of total probability, it uses prior probabilities to determine posterior probabilities.
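For reference, the relationship between prior and posterior probability described above can be written as follows. The notation is an assumption introduced here and does not appear in the patent: C is the class of a request (normal or malicious) and x is the set of features observed in it. The denominator is the law of total probability applied over the two classes.

```latex
P(C = c \mid x)
  = \frac{P(x \mid C = c)\, P(C = c)}
         {\sum_{c'} P(x \mid C = c')\, P(C = c')}
```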

Summary of the invention

In view of this, the invention provides a method for determining malicious HTTP requests based on Bayesian learning theory, which can determine whether an HTTP request initiated from the user terminal side is a malicious request or a normal request.

To achieve the above object, the technical solution of the invention is a method for determining malicious HTTP requests based on Bayesian learning theory, comprising the following steps:

S1. Collect a set number of normal HTTP requests and malicious HTTP requests.

S2. Process the collected normal HTTP requests and malicious HTTP requests according to S2.1 to S2.4 below to obtain a sample set, specifically:

S2.1. Manually label the collected HTTP requests: label a normal HTTP request 0 and a malicious HTTP request 1.

S2.2. For all collected HTTP requests, extract the "value" strings from their "name=value" fields; the "value" strings that appear in all HTTP requests in the sample set form the feature space.

S2.3. Take each HTTP request as one sample to form the sample set; each sample comprises a sample class and a sample feature space:

The class of a sample is the label assigned manually in S2.1, either 0 or 1.

The sample feature space is the feature space from S2.2: in it, the fields corresponding to all "value" strings that appear in the sample are marked 1, and all other fields are marked 0.
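Steps S2.2 and S2.3 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the labelled requests are made-up data, the helper names values_of and to_vector are hypothetical, and sorting the feature space is only a convenience to fix the field order.

```python
from urllib.parse import parse_qsl

# Hypothetical labelled samples: (query string, label), 0 = normal, 1 = malicious.
labelled_requests = [
    ("user=alice&action=view", 0),
    ("user=bob&action=view", 0),
    ("imei=860001&action=upload_contacts", 1),
]


def values_of(query: str) -> set[str]:
    """Return the set of "value" strings in the name=value fields of a query."""
    return {value for _, value in parse_qsl(query, keep_blank_values=True)}


# S2.2: the feature space is every distinct value seen in the collected requests.
feature_space = sorted(set().union(*(values_of(q) for q, _ in labelled_requests)))


def to_vector(query: str, space: list[str]) -> list[int]:
    """S2.3: mark a field 1 if its value occurs in this request, otherwise 0."""
    present = values_of(query)
    return [1 if value in present else 0 for value in space]


# Each sample pairs a 0/1 feature vector with its manually assigned class.
samples = [(to_vector(q, feature_space), label) for q, label in labelled_requests]
```

Each sample therefore has one field per distinct value in the training data, so the vector length grows with the number of distinct values collected.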

S3. Use the sample set from step S2 as the training set and apply a Bayesian classification learning algorithm to learn a binary classifier.
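The text does not name a specific Bayesian classification learning algorithm. One common choice for 0/1 features is Bernoulli naive Bayes with Laplace smoothing, and the sketch below assumes that choice. The function name, the smoothing parameter alpha, and the small training set are all hypothetical.

```python
import math


def train_bernoulli_nb(vectors, labels, alpha=1.0):
    """Estimate log-priors and smoothed P(field = 1 | class) for classes 0 and 1.

    Assumes both classes occur in the training set, as step S1 requires.
    """
    n_features = len(vectors[0])
    model = {}
    for c in (0, 1):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        prior = len(rows) / len(vectors)
        # Laplace-smoothed probability that each field is 1 within class c.
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (math.log(prior), probs)
    return model


# Hypothetical training set over a 3-field feature space.
vectors = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]]
labels = [0, 0, 1, 1]
model = train_bernoulli_nb(vectors, labels)
```

Smoothing keeps every estimated probability strictly between 0 and 1, so a value never seen in one class does not force that class's posterior to zero.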

S4. For an HTTP request to be judged, extract judgment features as follows: build a judgment feature space identical to the feature space in S2.2, with all fields initially marked 0; then update to 1 the fields corresponding to all "value" strings that appear in the HTTP request to be judged, leaving the others at 0.

S5. Feed the HTTP request to be judged from S4 into the binary classifier from S3 for prediction, determine whether it is a malicious HTTP request or a normal HTTP request, and label the request with the result, thereby obtaining the determination result.
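Steps S4 and S5 can be sketched as below. The feature space and the model parameters are made-up stand-ins for the outputs of S2.2 and S3, and the scoring rule assumes a Bernoulli naive Bayes classifier, which the text does not specify. Values that fall outside the feature space are simply ignored here, which is also an assumption.

```python
import math

# Hypothetical feature space and parameters, standing in for S2.2 and S3 output.
feature_space = ["value1", "value2", "upload_contacts"]
log_prior = {0: math.log(0.5), 1: math.log(0.5)}
# P(field = 1 | class) for each field of the feature space.
p_present = {0: [0.75, 0.5, 0.25], 1: [0.25, 0.5, 0.75]}


def judgement_features(values, space):
    """S4: start from all zeros, then set to 1 each field whose value occurs."""
    vector = [0] * len(space)
    index = {value: i for i, value in enumerate(space)}
    for value in values:
        if value in index:  # values outside the feature space are ignored
            vector[index[value]] = 1
    return vector


def predict(vector):
    """S5: return the label (0 normal, 1 malicious) with the larger log-posterior."""
    scores = {}
    for c in (0, 1):
        score = log_prior[c]
        for x, p in zip(vector, p_present[c]):
            score += math.log(p if x else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


pending_values = ["upload_contacts", "never_seen_before"]
label = predict(judgement_features(pending_values, feature_space))
```

Working in log space avoids numerical underflow when the feature space is large, since the product of many small probabilities becomes a sum.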

Further, in S5, after the determination result is obtained, the HTTP request labelled with the result is added to the training set as a new sample, and steps S2 and S3 are repeated to update the binary classifier until the classifier is stable.

Beneficial effect:

The method is based on Bayesian learning classification theory. It learns, from the probability with which "name=value" fields occur in HTTP requests of known class, whether a request is malicious; it then extracts the "name=value" field information from an unclassified HTTP request to judge whether that request is malicious. The method can judge quickly and accurately whether an HTTP request is malicious.

Detailed description of the invention

The present invention is described in detail below with reference to an embodiment.

The present invention proposes a method based on Bayesian theory that can automatically predict and judge whether an HTTP request is malicious behavior. The method is based mainly on Bayesian learning classification theory, which has been applied in areas such as spam filtering. Bayes' theorem is a basic principle of probability theory: building on conditional probability and the law of total probability, it uses prior probabilities to determine posterior probabilities. The method learns, from the probability with which "name=value" fields occur in HTTP requests of known class, whether a request is malicious; it then extracts the "name=value" field information from an unclassified HTTP request to judge whether that request is malicious. The method comprises the following steps:

S1. First, collect a certain number of normal HTTP requests and malicious HTTP requests;

S2. Label the collected HTTP requests and extract their features, to be used as the training set input;

S2 further comprises the following steps:

S2.1. First, manually label the collected HTTP requests: label a normal HTTP request 0 and a malicious HTTP request 1;

S2.2. For all collected HTTP requests, extract the "value" strings from their "name=value" fields, and take all "value" strings that appear as the feature space;

S2.3. Take each HTTP request as one sample. The class of the sample is the label assigned manually in S2.1, either 0 or 1; the feature space of the sample is the feature space from S2.2: if a given "value" string appears in the sample, the corresponding feature field is marked 1, otherwise 0;

S2.4. Use each collected HTTP request sample as training set input;

S3. Use the sample set from step S2 as the training set and apply a Bayesian classification learning algorithm to learn a binary classifier;

S4. For an HTTP request that needs to be judged, first extract and compute its features so that it can be predicted as a sample. The features are computed as follows: take the feature space from S2.2 as the feature space, with all fields marked 0; extract the "value" strings from all "name=value" fields that appear in the HTTP request, and update the features corresponding to these strings to 1, leaving the others at 0;

S5. Feed the sample to be predicted from S4 into the binary classifier learned in S3 for prediction, and determine whether it is a malicious HTTP request or a normal HTTP request;

S6. Add the samples predicted in S5, after selective manual confirmation, to the training set as new samples, and repeat steps S2 and S3 to reinforce the classifier's learning until the classifier is stable.
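The feedback loop formed by steps S1 to S6 can be sketched end to end as follows. Several things here are assumptions rather than part of the disclosure: the classifier is a Bernoulli naive Bayes with Laplace smoothing, the data are made up, and confirm is a hypothetical stand-in for the selective manual confirmation. The text gives no criterion for when the classifier counts as stable, so the sketch processes a fixed batch and does not attempt one.

```python
import math


def train(samples, alpha=1.0):
    """S2-S3: build the feature space and smoothed per-class estimates.

    Assumes both classes occur in the samples.
    """
    space = sorted(set().union(*(values for values, _ in samples)))
    model = {}
    for c in (0, 1):
        rows = [values for values, y in samples if y == c]
        model[c] = (
            math.log(len(rows) / len(samples)),
            {v: (sum(v in r for r in rows) + alpha) / (len(rows) + 2 * alpha)
             for v in space},
        )
    return model


def predict(model, values):
    """S4-S5: score both classes over the fixed feature space, pick the larger."""
    def score(c):
        log_prior, probs = model[c]
        return log_prior + sum(
            math.log(p if v in values else 1.0 - p) for v, p in probs.items()
        )
    return max((0, 1), key=score)


def confirm(values, predicted):
    """Hypothetical stand-in for the selective manual confirmation in S6."""
    return predicted


# Hypothetical data: each sample is (set of "value" strings, label).
training = [({"alice", "view"}, 0), ({"bob", "view"}, 0), ({"860001", "upload"}, 1)]
incoming = [{"carol", "view"}, {"860002", "upload"}, {"dave", "view"}]

model = train(training)
for values in incoming:
    label = confirm(values, predict(model, values))
    training.append((values, label))
    model = train(training)  # S6: retrain with the confirmed sample added
```

Retraining from scratch after every sample keeps the sketch short; a practical system would more likely update the per-class counts incrementally.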

In summary, the above is only a preferred embodiment of the invention and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims

1. A method for determining malicious HTTP requests based on Bayesian learning theory, characterized in that the method comprises the following steps:

S1. Collect a set number of normal HTTP requests and malicious HTTP requests;

S2. Process the collected normal HTTP requests and malicious HTTP requests according to S2.1 to S2.4 below to obtain a sample set, specifically:

S2.1. Manually label the collected HTTP requests: label a normal HTTP request 0 and a malicious HTTP request 1;

S2.2. For all collected HTTP requests, extract the "value" strings from their "name=value" fields; the "value" strings that appear in all HTTP requests in the sample set form the feature space;

S2.3. Take each HTTP request as one sample to form the sample set; each sample comprises a sample class and a sample feature space:

The class of a sample is the label assigned manually in S2.1, either 0 or 1;

The sample feature space is the feature space from S2.2: in it, the fields corresponding to all "value" strings that appear in the sample are marked 1, and all other fields are marked 0;

S3. Use the sample set from step S2 as the training set and apply a Bayesian classification learning algorithm to learn a binary classifier;

S4. For an HTTP request to be judged, extract judgment features as follows: build a judgment feature space identical to the feature space in S2.2, with all fields initially marked 0; then update to 1 the fields corresponding to all "value" strings that appear in the HTTP request to be judged, leaving the others at 0;

S5. Feed the HTTP request to be judged from S4 into the binary classifier from S3 for prediction, determine whether it is a malicious HTTP request or a normal HTTP request, and label the request with the result, thereby obtaining the determination result.

2. The method for determining malicious HTTP requests based on Bayesian learning theory according to claim 1, characterized in that, in said S5, after the determination result is obtained, the HTTP request labelled with the result is added to the training set as a new sample, and steps S2 and S3 are repeated to update the binary classifier until the classifier is stable.