KR20100058695A - Profile-based models and selective use for web application attack detection - Google Patents
- Publication number
- KR20100058695A (application number KR1020080117161A)
- Authority
- KR
- South Korea
- Prior art keywords
- model
- factor
- profile
- models
- value
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/70—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
- G06F21/82—Protecting input, output or interconnection devices
- G06F21/83—Protecting input, output or interconnection devices input devices, e.g. keyboards, mice or controllers thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/30—Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
- H04L63/302—Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information gathering intelligence information for situation awareness or reconnaissance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/30—Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
- H04L63/308—Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information retaining data, e.g. retaining successful, unsuccessful communication attempts, internet access, or e-mail, internet telephony, intercept related information or call content
Landscapes
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computing Systems (AREA)
- Technology Law (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
Several profile-based techniques exist for detecting input-manipulation attacks on web applications, but they suffer from relatively high false-positive rates and long detection times. To alleviate these drawbacks, the present invention improves both the error rate and the detection time by adding new models to the existing ones and applying the models selectively.
Description
Web application
Web application attacks have recently been increasing and diversifying as more and more people use the web not only to share information but also to conduct financial transactions and to buy and sell products. The methods of attacking web servers are diverse and evolving, and research to prevent them has been conducted steadily.
These studies mainly comprise two approaches: preventing attacks through vulnerability scanning of the web application itself, and detecting abnormal behavior by inspecting the inputs and outputs of the web application. Input/output inspection methods can in turn be divided into signature-based, policy-based, and profile-based approaches.
Kruegel's research, one of the profile-based methods, presented positive models for the length of arguments, character distribution, token information, structural inference, presence of arguments, and order of arguments; the results of the individual models are summed to judge whether a request is abnormal. This approach widens the scope of detection and makes tampering difficult for an attacker, but its detection speed is slow, making real-time processing difficult.
Therefore, the problem to be solved by the present invention is to reduce both the false-positive rate and the execution time by using a number of positive models and selectively applying each model according to the characteristics of each argument.
As described above, the method of preventing input-manipulation attacks on a web server according to the present invention achieves a false-positive rate slightly better than that of related work, as shown in Fig. 1, and a false-negative rate significantly better than that of related work, as shown in Fig. 2. Figure 3 compares the execution speeds: with selective model application, the execution speed is improved by 50% over the related studies.
To achieve the above technical objective, the present invention uses an anomaly-detection method that judges abnormal behavior based on the characteristics of normal behavior. To this end, positive models that describe the characteristics of normal argument values are used: slightly modified versions of the length model, character-distribution model, and token model from Kruegel's research, together with three newly added models, namely a character-composition model, a structural-analysis model, and an argument-composition model.
Each model calculates the probability that an argument is normal, and a final decision is made using Equation 1, where p_i is the probability that the argument is normal according to model i, t_i is the threshold defined for each model, and T is a threshold on the mean probability over all applied models.
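Equation 1 itself is not reproduced in the text (it appears only as a figure placeholder). Based on the surrounding definitions — a per-model probability, a per-model threshold, and a threshold on the mean probability over all models — the decision rule can plausibly be sketched as follows, where the symbols p_i, t_i, and T are assumed names for the three quantities the text describes and M is the set of models applied to the argument:

```latex
\text{normal}(v) \iff
\Big( \bigwedge_{i \in M} p_i(v) > t_i \Big)
\;\wedge\;
\frac{1}{|M|} \sum_{i \in M} p_i(v) > T
```

That is, a value v is judged normal only when every applied model individually rates it above that model's threshold and the mean probability over all applied models also exceeds the global threshold T.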
The length model learns the mean and variance of the length of each argument's values during training. At detection time, the Chebyshev inequality is used to quantify how far the length of the value under examination lies from the mean. As in Equation 2, since the probability that the length of any normal argument value falls as far from the mean as the length l of the value under examination is bounded by p(l), p(l) may be used as the probability that the argument is normal.
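A minimal sketch of the length model described above; the class name and method names are illustrative, not from the patent, and the handling of values at or below the mean is an assumption.

```python
class LengthModel:
    """Length model: learn the mean and variance of value lengths during
    training, then score new values with the Chebyshev bound."""

    def __init__(self):
        self.lengths = []

    def train(self, value: str) -> None:
        self.lengths.append(len(value))

    def probability(self, value: str) -> float:
        n = len(self.lengths)
        mean = sum(self.lengths) / n
        var = sum((l - mean) ** 2 for l in self.lengths) / n
        l = len(value)
        if l <= mean or var == 0:
            return 1.0  # at or below the mean: treated as normal here
        # Chebyshev: P(|X - mean| >= |l - mean|) <= var / (l - mean)^2,
        # so this bound serves as the probability that the value is normal.
        return min(1.0, var / (l - mean) ** 2)
```

Training on typical values keeps short inputs at probability 1.0 while an abnormally long input (e.g. an injected payload) receives a probability near zero.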
In the present invention, as in Kruegel's study, the character distribution of a string s is defined as the relative frequencies of the characters of s, sorted in descending order. In the character-distribution model, the idealized character distribution of each argument is computed during training. The sorted frequencies of the 256 possible byte values are grouped into the six bins [0], [1,3], [4,6], [7,11], [12,15], and [16,255]. At detection time, the deviation from the idealized character distribution is computed: the chi-square value obtained from Equation 3 is looked up in a chi-square table with 5 degrees of freedom to obtain the probability that the argument is normal.
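A sketch of the character-distribution statistic under the binning described above. The bin boundaries are taken from the text; everything else (names, the averaging of binned training distributions, and the omission of the final chi-square-table lookup that maps the statistic to a probability) is illustrative.

```python
from collections import Counter

# Bins over the *sorted* per-byte relative frequencies, as given in the text.
BINS = [(0, 0), (1, 3), (4, 6), (7, 11), (12, 15), (16, 255)]

def sorted_char_freqs(s: str):
    """Relative frequency of each of the 256 byte values, sorted descending."""
    counts = Counter(s.encode("utf-8", "ignore"))
    freqs = [counts.get(b, 0) / max(len(s), 1) for b in range(256)]
    return sorted(freqs, reverse=True)

def bin_freqs(freqs):
    return [sum(freqs[lo:hi + 1]) for lo, hi in BINS]

class CharDistributionModel:
    def __init__(self):
        self.sum_bins = [0.0] * len(BINS)
        self.n = 0

    def train(self, value: str) -> None:
        for i, b in enumerate(bin_freqs(sorted_char_freqs(value))):
            self.sum_bins[i] += b
        self.n += 1

    def chi_square(self, value: str) -> float:
        """Chi-square statistic between the observed binned character counts
        and the idealized (averaged) training distribution. Bins with zero
        expected mass are skipped in this sketch."""
        expected = [s / self.n for s in self.sum_bins]
        length = len(value)
        observed = [b * length for b in bin_freqs(sorted_char_freqs(value))]
        return sum((o - e * length) ** 2 / (e * length)
                   for o, e in zip(observed, expected) if e > 0)
```

A value whose character mix resembles the training data yields a small statistic; a value dominated by a single repeated character yields a large one.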
In the token model, it is determined during training whether an argument is of an enumeration type; if so, the hash values of its possible values are stored. Whether an argument is an enumeration is determined following the method used in Kruegel's work. This model applies only to arguments of enumeration type. At detection time, the hash value of the argument value under examination is compared with the stored normal values.
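A sketch of the token model. The hashing-and-membership check follows the description above; the enumeration test here is a simplified stand-in (a fixed cap on distinct values) for Kruegel's statistical test, which the patent references but does not reproduce, and all names are illustrative.

```python
import hashlib

def _h(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

class TokenModel:
    """Token model: if an argument takes values from a small fixed set,
    remember hashes of the observed values and flag anything unseen."""

    def __init__(self):
        self.hashes = set()

    def train(self, value: str) -> None:
        self.hashes.add(_h(value))

    def is_enumeration(self, max_distinct: int = 10) -> bool:
        # Simplified stand-in for the statistical enumeration test: treat the
        # argument as enumerated when few distinct values were observed.
        return len(self.hashes) <= max_distinct

    def probability(self, value: str) -> float:
        return 1.0 if _h(value) in self.hashes else 0.0
```

For an argument like a sort direction (`asc`/`desc`), any value outside the learned set scores zero.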
If the data type of an argument is numeric, a non-numeric value or a value outside the normal range may be considered abnormal. For numeric arguments, the mean and variance of the values are computed for each argument during training. At detection time, the Chebyshev inequality is used to find the probability p(v) that the argument value v is normal.
Since one character is one byte, it takes one of the 256 values 0-255 and belongs to one of the five character sets shown in Table 2. In the character-composition model, the average frequency of each character set over all training values is computed to obtain the distribution of the character sets in the ideal case. Each resulting value is a probability between 0 and 1 representing, in the ideal case, the probability that a character of the argument belongs to that set. At detection time, the character composition of the argument value is computed and tested for how much it differs from the ideal case.
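A sketch of the character-composition model. Table 2 (the definition of the five character sets) is not reproduced in the text, so the five classes below — digits, lowercase, uppercase, punctuation, other — are an assumption for illustration, as are the names and the total-variation distance used as the deviation measure.

```python
import string

def char_class(ch: str) -> int:
    """Assumed stand-in for Table 2's five character sets."""
    if ch in string.digits:
        return 0
    if ch in string.ascii_lowercase:
        return 1
    if ch in string.ascii_uppercase:
        return 2
    if ch in string.punctuation:
        return 3
    return 4  # whitespace / control / non-ASCII

class CharCompositionModel:
    def __init__(self):
        self.sums = [0.0] * 5
        self.n = 0

    def train(self, value: str) -> None:
        if not value:
            return
        for c in range(5):
            self.sums[c] += sum(char_class(ch) == c for ch in value) / len(value)
        self.n += 1

    def ideal(self):
        """Average per-class frequency over the training values."""
        return [s / self.n for s in self.sums]

    def deviation(self, value: str) -> float:
        """Total-variation distance between the value's composition and the
        ideal composition: 0.0 = identical, 1.0 = disjoint."""
        if not value:
            return 0.0
        obs = [sum(char_class(ch) == c for ch in value) / len(value)
               for c in range(5)]
        return sum(abs(o, ) if False else abs(o - i)
                   for o, i in zip(obs, self.ideal())) / 2
```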
In the structural-analysis model, the structure of the argument values is determined during training. If an argument value has a structure, that structure is delimited by special characters. First, through a tokenization process, each run of consecutive plain text is treated as one token and each special character as one token. Each value in the profiling data is tokenized, and a state machine is created for each; these state machines are then combined into one using state merging. At detection time, the model verifies that the argument under examination conforms to the learned structure: if it follows the structure it can be judged normal, otherwise abnormal.
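A sketch of the tokenization step described above. The state-machine construction with state merging is replaced here by a simple set-membership check over token sequences — a deliberate simplification of the patent's method, with illustrative names.

```python
import re

def tokenize(value: str):
    """Runs of alphanumeric text collapse to one PLAIN token; each special
    character becomes a token of its own."""
    return ["PLAIN" if t.isalnum() else t
            for t in re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]", value)]

class StructureModel:
    """Simplified stand-in for the state-machine/state-merging step:
    store the token structures seen in training and check membership."""

    def __init__(self):
        self.structures = set()

    def train(self, value: str) -> None:
        self.structures.add(tuple(tokenize(value)))

    def conforms(self, value: str) -> bool:
        return tuple(tokenize(value)) in self.structures
```

A date-shaped argument trained on `2008-11-25` accepts other dates with the same PLAIN-dash-PLAIN-dash-PLAIN structure but rejects a value with injected special characters.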
The argument-composition model stores the normal argument composition of each URL during training. This is done by hashing the consecutive argument names of each normal argument composition and storing the hash value. At detection time, the hash value of the composition of the request under examination is checked; if it belongs to the stored normal values, the request is judged normal.
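A sketch of the argument-composition model: hash the ordered argument names of each training request, then check membership at detection time. The joining delimiter and class/method names are illustrative.

```python
import hashlib

class ArgCompositionModel:
    """Argument-composition model for one URL: hash the ordered argument
    names seen during training; an unseen composition is abnormal."""

    @staticmethod
    def _h(names) -> str:
        return hashlib.sha256("&".join(names).encode("utf-8")).hexdigest()

    def __init__(self):
        self.known = set()

    def train(self, arg_names) -> None:
        self.known.add(self._h(arg_names))

    def is_normal(self, arg_names) -> bool:
        return self._h(arg_names) in self.known
```

Because consecutive names are hashed in order, both an extra argument and a reordering of arguments produce an unseen hash.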
The present invention proposes a method of selectively applying only the models that are important for each argument, rather than applying every model to every argument. After profiling, the characteristics of each argument are identified, and the set of models to be applied to that argument is configured in advance.
The model set is determined in four steps. First, models that cannot be applied given the data type and value characteristics of the argument are removed. Second, it is determined which models are important for the argument. Third, it is determined which models hinder detection for the argument. Fourth, it is determined which models are redundant for the argument.
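As a hedged sketch of the first (and partly the fourth) step above — dropping models that cannot apply given the argument's data type, and dropping redundant ones — the following function is illustrative only: the predicate names, the model names, and the specific exclusion rules are assumptions, and steps two and three would additionally require per-site profiling statistics that the text does not detail.

```python
ALL_MODELS = {"length", "char_dist", "token", "numeric",
              "char_comp", "structure"}

def select_models(is_numeric: bool, is_enum: bool, has_structure: bool):
    """Prune the model set by data type and value characteristics (step 1),
    removing models assumed redundant for numeric values (part of step 4)."""
    models = set(ALL_MODELS)
    if is_numeric:
        # character-oriented models add little for purely numeric values
        models -= {"char_dist", "char_comp", "structure"}
    else:
        models.discard("numeric")
    if not is_enum:
        models.discard("token")
    if not has_structure:
        models.discard("structure")
    return models
```

For a purely numeric argument only the length and numeric models survive, while a structured enumeration keeps the token and structure models.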
Table 1
Table 2
Equation 1
Equation 2
Equation 3
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020080117161A KR20100058695A (en) | 2008-11-25 | 2008-11-25 | Profile-based models and selective use for web application attack detection |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20100058695A true KR20100058695A (en) | 2010-06-04 |
Family
ID=42360105
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020080117161A KR20100058695A (en) | 2008-11-25 | 2008-11-25 | Profile-based models and selective use for web application attack detection |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20100058695A (en) |
- 2008-11-25: KR application KR1020080117161A filed; status: not active (Application Discontinuation)
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101959544B1 (en) | 2018-06-01 | 2019-03-18 | 주식회사 에프원시큐리티 | Web attack detection and prevention system and method |
US11171919B1 (en) | 2018-06-01 | 2021-11-09 | F1 Security Inc. | Web attack detecting and blocking system and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
N231 | Notification of change of applicant | ||
WITN | Withdrawal due to no request for examination |