KR20100058695A - Profile-based models and selective use for web application attack detection - Google Patents

Profile-based models and selective use for web application attack detection

Info

Publication number
KR20100058695A
Authority
KR
South Korea
Prior art keywords
model
factor
profile
models
value
Prior art date
Application number
KR1020080117161A
Other languages
Korean (ko)
Inventor
박영민
Original Assignee
박영민
중앙대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 박영민, 중앙대학교 산학협력단 filed Critical 박영민
Priority to KR1020080117161A priority Critical patent/KR20100058695A/en
Publication of KR20100058695A publication Critical patent/KR20100058695A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/82 Protecting input, output or interconnection devices
    • G06F21/83 Protecting input, output or interconnection devices input devices, e.g. keyboards, mice or controllers thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/30 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/302 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information gathering intelligence information for situation awareness or reconnaissance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/30 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/308 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information retaining data, e.g. retaining successful, unsuccessful communication attempts, internet access, or e-mail, internet telephony, intercept related information or call content

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Technology Law (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

Several profile-based techniques have recently been proposed for detecting input manipulation attacks on web applications, but they have relatively high false positive rates and long detection times. To alleviate these drawbacks, the present invention improves the error rate and detection time by adding new models to the existing models.

Description

Profile-based Models and Selective Use for Web Application Attack Detection

Web application

Web application attacks are increasing and diversifying as more and more people use the web to share information, carry out financial transactions, and buy and sell products. The methods of attacking web servers are diverse and keep evolving, and research on preventing them has been conducted steadily.

These studies mainly follow two approaches: preventing attacks by scanning the web application itself for vulnerabilities, and detecting abnormal behavior by inspecting the input and output of the web application. The input/output inspection approach can be broadly divided into signature-based, policy-based, and profile-based methods.

Kruegel's research, one of the profile-based methods, presents positive models for the length of arguments, character distribution, token information, structure inference, presence of arguments, and order of arguments, and sums their results to judge abnormality. This approach widens the scope of detection and makes tampering harder for an attacker, but its slow detection speed makes real-time processing difficult.

Therefore, the problem to be solved by the present invention is to reduce the false positive rate and the execution time by using a number of positive models and selectively applying them according to the characteristics of each factor.

As described above, the method of preventing input manipulation attacks on a web server according to the present invention slightly improves the false positive rate over the related research, as shown in Fig. 1, and significantly improves the false negative rate, as shown in Fig. 2. Fig. 3 compares the execution speeds: with selective model application, the execution speed improves by 50% over the related research.

To achieve the above technical objective, the present invention uses an anomaly detection method that identifies abnormal behavior based on the characteristics of normal behavior. For this purpose, it uses positive models that describe the characteristics of normal factor values. The length model, character distribution model, and token model from Kruegel's research are used in slightly modified form, together with the newly added value range model, character composition model, structural analysis model, and factor configuration model.

Each model calculates the probability that the factor is normal, and the final decision is made using Equation 1. Equation 1 combines three quantities: the probability that the factor is normal under model i, the threshold defined for each model, and a threshold for the mean probability over all models.
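Equation 1 survives only as an image reference, so the following Python sketch is one plausible reading of the decision rule, not the patent's exact formula. It assumes a factor is judged normal when every model's probability reaches that model's threshold and the mean probability reaches the global threshold. The function and parameter names are hypothetical.

```python
from statistics import fmean


def is_normal(probabilities, model_thresholds, mean_threshold):
    """Return True when the factor value is judged normal.

    Assumed rule (Equation 1 itself is not recoverable from the text):
    each model's probability must reach its own threshold, and the mean
    probability over all models must reach the global threshold.
    """
    if not probabilities:
        raise ValueError("at least one model probability is required")
    if len(probabilities) != len(model_thresholds):
        raise ValueError("one threshold is required per model")
    each_ok = all(p >= t for p, t in zip(probabilities, model_thresholds))
    return each_ok and fmean(probabilities) >= mean_threshold
```

For example, `is_normal([0.9, 0.8], [0.5, 0.5], 0.7)` accepts the value, while a single model falling below its own threshold rejects it regardless of the mean.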

The length model computes the mean and the variance of the length of each factor value during training. At detection, the Chebyshev inequality is used to quantify how far the length of the factor value under examination is from the mean. In Equation 2, the probability that the length of any normal factor value x deviates from the mean by more than the length l of the value under examination is smaller than p(l), so p(l) may be used as the probability that the factor is normal.
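A minimal Python sketch of the length model follows. Equation 2 is only available as an image, so the bound used here is the standard Chebyshev form p(l) = variance / (l - mean)^2, capped at 1; the patent's "slightly modified" variant may differ. The class and method names are hypothetical.

```python
from statistics import fmean, pvariance


class LengthModel:
    """Length model sketch: Chebyshev bound on the deviation from the mean."""

    def __init__(self):
        self.mu = 0.0
        self.var = 0.0

    def train(self, values):
        # Learn mean and (population) variance of the normal value lengths.
        lengths = [len(v) for v in values]
        self.mu = fmean(lengths)
        self.var = pvariance(lengths, self.mu)

    def probability(self, value):
        # Chebyshev: P(|x - mu| >= d) <= var / d^2, used here as p(l).
        distance = abs(len(value) - self.mu)
        if distance == 0:
            return 1.0
        return min(1.0, self.var / distance ** 2)
```

Values whose length is close to the training mean receive a probability of 1.0, and the probability falls off quadratically as the length moves away from it.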

In the present invention, as in Kruegel's study, the character distribution of a string s is defined as the relative frequencies of the characters of s, sorted in descending order. In the character distribution model, the ideal character distribution of each factor is calculated during training. The sorted frequencies at positions 0 to 255 are grouped into six bins: [0], [1,3], [4,6], [7,11], [12,15], and [16,255]. At detection, a chi-square test gives the probability of the observed deviation from the ideal character distribution: the value obtained from Equation 3 is looked up in the chi-square table with 5 degrees of freedom to obtain the probability that the factor is normal.
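The following Python sketch illustrates the character distribution test under some assumptions: Equation 3 is taken to be the usual Pearson chi-square statistic over the six bins, and the table lookup is replaced by the closed-form tail probability for 5 degrees of freedom. The handling of empty values and of bins with zero expected count is a choice made for this sketch, and all function names are hypothetical.

```python
import math
from collections import Counter

# Positions in the sorted frequency list that form the six bins.
BINS = [(0, 0), (1, 3), (4, 6), (7, 11), (12, 15), (16, 255)]


def sorted_relative_frequencies(data: bytes):
    """Relative frequency of each of the 256 byte values, sorted descending."""
    counts = Counter(data)
    return sorted((counts.get(b, 0) / len(data) for b in range(256)), reverse=True)


def bin_sums(values):
    return [sum(values[lo:hi + 1]) for lo, hi in BINS]


def train_ideal_distribution(samples):
    """Average the sorted distributions of all non-empty training values."""
    dists = [sorted_relative_frequencies(s) for s in samples if s]
    return [sum(col) / len(dists) for col in zip(*dists)]


def chi2_survival_5dof(x):
    """P(X >= x) for a chi-square variable with 5 degrees of freedom."""
    if x <= 0:
        return 1.0
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3)
    return math.erfc(math.sqrt(x / 2)) + tail


def probability_normal(ideal, data: bytes):
    if not data:
        return 1.0  # assumption: an empty value has nothing to test
    expected = [e * len(data) for e in bin_sums(ideal)]
    observed = [o * len(data) for o in bin_sums(sorted_relative_frequencies(data))]
    chi2 = 0.0
    for o, e in zip(observed, expected):
        if e == 0:
            if o > 0:
                return 0.0  # assumption: mass in a bin never seen in training
            continue
        chi2 += (o - e) ** 2 / e
    return chi2_survival_5dof(chi2)
```

A value made of one repeated character, such as a padding-style payload, concentrates all mass in the first bin and therefore receives a much lower probability than a value resembling the training data.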

In the token model, training determines whether a factor is of an enumeration type; if it is, the hash values of the possible factor values are stored. Whether a factor is an enumeration is determined by the method used in Kruegel's work. This model applies only to factors of the enumeration type. At detection, the hash value of the factor value under examination is compared with the stored normal values.
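A minimal Python sketch of the token model follows. The enumeration check here is a simple stand-in (few distinct values relative to the number of training samples) and is not Kruegel's actual test; the `max_distinct_ratio` parameter, the choice of SHA-256, and the class name are assumptions made for illustration.

```python
import hashlib


class TokenModel:
    """Token model sketch: hash set of the values of an enumerated factor."""

    def __init__(self, max_distinct_ratio=0.1):
        # Hypothetical stand-in for Kruegel's enumeration test.
        self.max_distinct_ratio = max_distinct_ratio
        self.is_enumeration = False
        self.known_hashes = set()

    @staticmethod
    def _hash(value: str) -> str:
        return hashlib.sha256(value.encode("utf-8")).hexdigest()

    def train(self, values):
        distinct = set(values)
        self.is_enumeration = (
            bool(values) and len(distinct) / len(values) <= self.max_distinct_ratio
        )
        if self.is_enumeration:
            self.known_hashes = {self._hash(v) for v in distinct}
        else:
            self.known_hashes = set()

    def probability(self, value):
        if not self.is_enumeration:
            return None  # model does not apply to this factor
        return 1.0 if self._hash(value) in self.known_hashes else 0.0
```

Returning `None` for non-enumerated factors lets a caller drop the model from that factor's model set instead of treating every value as abnormal.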

In the value range model, if the data type of a factor is numeric, a non-numeric value or a value outside the normal range may be considered abnormal. For each numeric factor, the mean and variance of its values are calculated during training. At detection, the Chebyshev inequality is used to find the probability p(v) that the factor value v is normal.
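The value range check can be sketched in Python as below, again using the standard Chebyshev bound, this time on the numeric value instead of the length. Treating non-numeric and non-finite inputs as probability 0 is an assumption of this sketch, and the class name is hypothetical.

```python
import math
from statistics import fmean, pvariance


class ValueRangeModel:
    """Value range sketch: Chebyshev bound on the numeric value of a factor."""

    def __init__(self):
        self.mu = 0.0
        self.var = 0.0

    def train(self, values):
        numbers = [float(v) for v in values]
        self.mu = fmean(numbers)
        self.var = pvariance(numbers, self.mu)

    def probability(self, value):
        try:
            v = float(value)
        except ValueError:
            return 0.0  # non-numeric input for a numeric factor
        if not math.isfinite(v):
            return 0.0  # assumption: reject "nan" and "inf"
        distance = abs(v - self.mu)
        if distance == 0:
            return 1.0
        return min(1.0, self.var / distance ** 2)
```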

Since one character is one byte, it takes one of the 256 values from 0 to 255 and belongs to one of the five character sets shown in Table 2. In the character composition model, the frequencies of each character set are averaged over all training values to obtain the ideal distribution of the character sets. Each entry of this distribution is a probability between 0 and 1 that, in the ideal case, represents the probability that one character of the factor value belongs to the corresponding character set. At detection time, the character composition of the factor value is computed and tested for how much it differs from the ideal case.
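Table 2 is only available as an image, so the five character sets in the Python sketch below (letters, digits, whitespace, ASCII punctuation, other bytes) are hypothetical placeholders. The text also does not name the test used to compare compositions; this sketch uses one minus the total variation distance as a stand-in similarity score.

```python
import string


def char_class(byte: int) -> int:
    """Map a byte to one of five hypothetical character sets."""
    ch = chr(byte)
    if ch in string.ascii_letters:
        return 0
    if ch in string.digits:
        return 1
    if ch in string.whitespace:
        return 2
    if ch in string.punctuation:
        return 3
    return 4  # control characters and bytes above 127


def composition(data: bytes):
    """Fraction of the characters of data that fall in each character set."""
    counts = [0] * 5
    for b in data:
        counts[char_class(b)] += 1
    return [c / len(data) for c in counts]


def train_ideal(samples):
    """Average the compositions of all non-empty training values."""
    comps = [composition(s) for s in samples if s]
    return [sum(col) / len(comps) for col in zip(*comps)]


def similarity(ideal, data: bytes) -> float:
    """1 minus total variation distance; 1.0 means identical composition."""
    if not data:
        return 1.0  # assumption: an empty value has nothing to test
    observed = composition(data)
    return 1.0 - 0.5 * sum(abs(o - i) for o, i in zip(observed, ideal))
```

With training values such as `b"john42"`, an injection-style value containing quotes, spaces, and operators scores far lower than another alphanumeric value.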

In the structural analysis model, the structure of the factor value is determined during training. If the factor value has a structure, that structure is determined by its special characters. First, in the tokenization step, each run of consecutive plain text is treated as one token and each special character is treated as one token. Each item of the profiling data is tokenized and a state machine is created for it. The individual state machines are then combined into a single state machine by state merging. At detection, the model verifies that the factor value under examination conforms to the structure. A value that follows the structure is judged normal, and a value that does not is judged abnormal.
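The Python sketch below illustrates the tokenization and a deliberately simplified form of state merging: all states reached by the same token are merged, so the automaton reduces to a set of allowed token-to-token transitions. The patent does not specify its merging algorithm, so this is a stand-in, and treating only ASCII letters and digits as plain text is also an assumption.

```python
import re

# Group 1: a run of plain text; group 2: any single special character.
TOKEN_RE = re.compile(r"([A-Za-z0-9]+)|(.)", re.DOTALL)


def tokenize(value: str):
    """Runs of plain text become 'TEXT'; each special character is a token."""
    return ["TEXT" if m.group(1) else m.group(2) for m in TOKEN_RE.finditer(value)]


class StructureModel:
    """Simplified automaton: one state per token, merged across samples."""

    START, END = "<start>", "<end>"

    def __init__(self):
        self.transitions = set()

    def _path(self, value):
        return [self.START] + tokenize(value) + [self.END]

    def train(self, values):
        for value in values:
            path = self._path(value)
            self.transitions.update(zip(path, path[1:]))

    def conforms(self, value) -> bool:
        path = self._path(value)
        return all(step in self.transitions for step in zip(path, path[1:]))
```

Because of the aggressive merging, this sketch generalizes more than a full state machine would: trained on dates like `2008-11-25`, it also accepts `2008-11`, while still rejecting values that introduce an unseen special character such as a quote.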

In the factor configuration model, the normal factor configuration of each URL is saved during training. This is done by hashing the sequence of factor names in the normal configuration and storing the hash value. At detection, the hash value of the factor configuration under examination is computed, and if it matches a stored normal value the request is judged normal.
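A minimal Python sketch of this check follows, assuming the configuration is the URL path plus the ordered list of query parameter names and that SHA-256 serves as the hash. Both choices, and the names used, are assumptions for illustration.

```python
import hashlib
from urllib.parse import parse_qsl, urlsplit


def configuration_hash(url: str) -> str:
    """Hash the path together with the ordered query parameter names."""
    parts = urlsplit(url)
    names = [name for name, _ in parse_qsl(parts.query, keep_blank_values=True)]
    key = parts.path + "?" + "&".join(names)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


class FactorConfigurationModel:
    def __init__(self):
        self.known = set()

    def train(self, urls):
        self.known.update(configuration_hash(u) for u in urls)

    def is_normal(self, url: str) -> bool:
        return configuration_hash(url) in self.known
```

Since only names are hashed, different values for the same parameters still match, while an added, missing, or reordered parameter does not.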

The present invention proposes a method of selectively applying only the models that are important to each factor, instead of applying all models to all factors. After profiling, the characteristics of each factor are identified, and the set of models to be applied to that factor is configured in advance.

The model set is determined in four steps. First, models that cannot be applied given the data type and value characteristics of the factor are removed. Second, the models that are important for the factor are identified. Third, the models that hinder detection for the factor are identified. Fourth, the models that are redundant for the factor are identified.
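The four steps can be sketched in Python as below. The text does not give the criteria for "important", "hinders detection", or "redundant", so this sketch assumes them: a model is important if it flags at least one known attack sample, hinders detection if it flags too many normal samples, and is redundant if its verdicts match those of a model already selected. The function name, the labeled sample format, and `max_false_positive_rate` are all hypothetical.

```python
def select_models(verdicts, labels, max_false_positive_rate=0.05):
    """Choose the model set for one factor.

    verdicts: {model_name: [True, False, or None per sample]}, where True
              means "judged normal" and None means "model not applicable".
    labels:   [True if the sample really is normal, else False].
    """
    selected = []
    seen_patterns = set()
    for name, results in verdicts.items():
        # Step 1: drop models that cannot be applied to this factor.
        if any(r is None for r in results):
            continue
        normal = [r for r, ok in zip(results, labels) if ok]
        attack = [r for r, ok in zip(results, labels) if not ok]
        # Step 2 (assumed criterion): keep only models that catch an attack.
        if not any(r is False for r in attack):
            continue
        # Step 3 (assumed criterion): drop models that flag normal values.
        false_positives = sum(1 for r in normal if r is False)
        if normal and false_positives / len(normal) > max_false_positive_rate:
            continue
        # Step 4 (assumed criterion): drop models with duplicate verdicts.
        pattern = tuple(results)
        if pattern in seen_patterns:
            continue
        seen_patterns.add(pattern)
        selected.append(name)
    return selected
```

The result would be computed once per factor after profiling, so that detection only evaluates the selected models for that factor.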

Table 1

Figure 112008081001305-PAT00001

Table 2

Figure 112008081001305-PAT00002

Equation 1

Figure 112008081001305-PAT00003

Equation 2

Figure 112008081001305-PAT00004

Equation 3

Figure 112008081001305-PAT00005

Claims (2)

A method that checks input values in front of a web server to prevent input manipulation, with performance improved by modifying existing profile-based positive models (length model; character distribution model; token model) and by adding new models to reduce false positives (value range model; character composition model; structural analysis model; factor configuration model).
KR1020080117161A 2008-11-25 2008-11-25 Profile-based models and selective use for web application attack detection KR20100058695A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020080117161A KR20100058695A (en) 2008-11-25 2008-11-25 Profile-based models and selective use for web application attack detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020080117161A KR20100058695A (en) 2008-11-25 2008-11-25 Profile-based models and selective use for web application attack detection

Publications (1)

Publication Number Publication Date
KR20100058695A true KR20100058695A (en) 2010-06-04

Family

ID=42360105

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020080117161A KR20100058695A (en) 2008-11-25 2008-11-25 Profile-based models and selective use for web application attack detection

Country Status (1)

Country Link
KR (1) KR20100058695A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101959544B1 (en) 2018-06-01 2019-03-18 주식회사 에프원시큐리티 Web attack detection and prevention system and method
US11171919B1 (en) 2018-06-01 2021-11-09 F1 Security Inc. Web attack detecting and blocking system and method thereof

Similar Documents

Publication Publication Date Title
Arp et al. Dos and don'ts of machine learning in computer security
Abu-Nimeh et al. A comparison of machine learning techniques for phishing detection
Cao et al. Machine learning to detect anomalies in web log analysis
US8037536B2 (en) Risk scoring system for the prevention of malware
CN103559235B (en) A kind of online social networks malicious web pages detection recognition methods
CN105956180B (en) A kind of filtering sensitive words method
CN104123501B (en) A kind of viral online test method based on many assessor set
CN105072214A (en) C&C domain name identification method based on domain name feature
CN109886016B (en) Method, apparatus, and computer-readable storage medium for detecting abnormal data
Kim et al. Differential effects of prior experience on the malware resolution process
Dey et al. Detection of fake accounts in Instagram using machine learning
Layton et al. Unsupervised authorship analysis of phishing webpages
CN108171054A (en) The detection method and system of a kind of malicious code for social deception
CN112765660A (en) Terminal security analysis method and system based on MapReduce parallel clustering technology
US9600644B2 (en) Method, a computer program and apparatus for analyzing symbols in a computer
YANG et al. Phishing website detection using C4. 5 decision tree
Zhang et al. A novel anomaly detection approach for mitigating web-based attacks against clouds
Meena Siwach Anomaly detection for web log data analysis: a review
US11321453B2 (en) Method and system for detecting and classifying malware based on families
KR20100058695A (en) Profile-based models and selective use for web application attack detection
CN110472416A (en) A kind of web virus detection method and relevant apparatus
Alazab et al. Malicious code detection using penalized splines on OPcode frequency
CN114257391B (en) Risk assessment method, apparatus and computer readable storage medium
Parish et al. Password guessers under a microscope: an in-depth analysis to inform deployments
Miyamoto et al. A proposal of the AdaBoost-based detection of phishing sites

Legal Events

Date Code Title Description
N231 Notification of change of applicant
WITN Withdrawal due to no request for examination