CN107644166A - A web application security protection method based on automatic learning - Google Patents
- Publication number
- CN107644166A
- Authority
- CN
- China
- Prior art keywords
- rule
- request
- web application
- attack
- white list
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a web application security protection method based on automatic learning, comprising the following steps. Step 1: screen the access log for the entries of non-attack requests. Step 2: from the fields in the screened log, generate by machine learning a set of regular expressions that follow specific rules, forming the whitelist rules. Step 3: match each received request against the regular expression set, and intercept or mark any request not covered by the whitelist rules. Step 4: identify each marked request; if it is normal, add it to the whitelist rules, and if it is an attack, intercept it. The invention generates whitelist rules through autonomous learning, has low false-negative and false-positive rates in use, and can also protect against unknown vulnerabilities.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a defense method for a web application firewall system based on automatic learning.
Background art
A web application system is a transaction processing system built with various dynamic web technologies on the B/S (browser/server) model. Web security threats are currently intensifying, and for users web security is a serious problem. The most common measure today is a firewall, which can filter out data on non-service ports and so prevent the exploitation of vulnerabilities in non-web services. However, traditional web application firewalls all check requests against an attack signature database to decide whether a request is normal. If the request is normal, the requested content is returned; if it is an attack request, the request is intercepted and a prompt message is returned. Because a traditional web application firewall has to serve many different types of website, it produces false positives and false negatives. It can also defend only against publicly disclosed vulnerabilities; an unknown vulnerability cannot be defended against until the rules have been updated.
Summary of the invention
The present invention provides a web application security protection method based on automatic learning.
The technical solution adopted by the invention is a web application security protection method based on automatic learning, comprising the following steps:
Step 1: Extract the access log of the web application and screen it for the entries of non-attack requests;
Step 2: From the fields in the log entries screened in Step 1, generate by machine learning a set of regular expressions that follow specific rules, forming the whitelist rules;
Step 3: Match each received request against the regular expression set generated in Step 2, and intercept or mark any request not covered by the whitelist rules;
Step 4: Identify each request marked in Step 3; if it is normal, add it to the whitelist rules; if it is an attack, intercept it.
Further, the identification method in Step 4 is traditional WAF rule detection or association analysis.
Further, the method of screening non-attack requests in Step 1 is to perform keyword matching with a script and filter out attack requests.
Further, the regular expression set in Step 2 is generated as follows:
S1: Across the field values, find the longest common substring that starts at the beginning of the strings (that is, the longest common prefix), and generate a rule from this substring;
S2: Remove the longest common prefix, and compute the pairwise similarity of the remaining parts according to string edit distance;
S3: Extract the strings whose similarity to the other strings is below a certain threshold, generate a separate matching rule for them, and concatenate it with the rule generated in step S1;
S4: Repeat steps S1-S3 on the strings remaining after the extraction in step S3, and merge the result with the rule generated in step S3, until all strings have been traversed.
Further, the fields in Step 2 include the URL, Cookie and Referer fields and the fields in custom records.
The beneficial effects of the invention are:
(1) the invention generates whitelist rules by automatic learning from the web logs of normal access;
(2) the invention can update the whitelist rule set in real time, so it can be used for different types of website;
(3) the invention has low false-negative and false-positive rates in use, and can also protect against unknown vulnerabilities.
Brief description of the drawings
Fig. 1 is a flow diagram of the invention.
Detailed description
The invention is further described below with reference to the accompanying drawing and a specific embodiment.
As shown in Fig. 1, a web application security protection method based on automatic learning comprises the following steps:
Step 1: Extract the access log of the web application from the web server or from a traditional WAF device, and screen it for the entries of non-attack requests;
The screening method is to perform keyword matching with a script and filter out attack requests. Attack entries recorded in the WAF log can also be filtered out, or the entries can be identified one by one manually to determine whether each is an attack.
For example, given a set of URLs:
The screening method above filters out the following set of attack-request URLs:
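The keyword-matching screen described in Step 1 might be sketched as follows. This is a minimal illustration only: the keyword pattern, the function names `is_attack` and `screen_log`, and the sample log entries are all assumptions introduced here, not taken from the patent.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical keyword list (an assumption); a real deployment would rely on
# a much larger, maintained set of attack indicators.
ATTACK_KEYWORDS = re.compile(
    r"union\s+select|<script|\.\./|/etc/passwd|\bor\s+1\s*=\s*1\b",
    re.IGNORECASE,
)


def is_attack(url: str) -> bool:
    """Return True if the URL-decoded string contains any attack keyword."""
    return ATTACK_KEYWORDS.search(unquote_plus(url)) is not None


def screen_log(urls):
    """Split logged URLs into (non_attack, attack) lists, preserving order."""
    normal, attacks = [], []
    for url in urls:
        (attacks if is_attack(url) else normal).append(url)
    return normal, attacks


# Hypothetical log entries for illustration.
sample_log = [
    "http://www.xxxx.com/list",
    "http://www.xxxx.com/download",
    "http://www.xxxx.com/news/detail?Id=7126",
    "http://www.xxxx.com/news/detail?Id=1%20union%20select%201,2,3",
    "http://www.xxxx.com/list?q=<script>alert(1)</script>",
]
normal_urls, attack_urls = screen_log(sample_log)
```

Only `normal_urls` would be passed on to rule generation; decoding the URL before matching keeps percent-encoded payloads from slipping past the keyword check.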
Step 2: From the URL, Cookie, Referer and other custom-record fields in the log entries screened in Step 1, generate by machine learning a set of regular expressions that follow specific rules, forming the whitelist rules;
The regular expression generation method is illustrated below, taking the following set of URLs as an example.
S1: Find the longest common substring that starts at the beginning of all the URLs, and generate a rule directly from this substring;
For example, the longest such substring in the URL set above is "http://www.xxxx.com/"
The generated rule is http:\/\/www\.xxxx\.com\/;
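Step S1 amounts to taking the longest common prefix and escaping it for use in a regular expression. The sketch below assumes a small URL set reconstructed from the rules shown later in the text (the original listing is not reproduced here), and the helper name `common_prefix_rule` is hypothetical.

```python
import os
import re


def common_prefix_rule(strings):
    """Return (prefix, regex) for the longest common leading substring."""
    # os.path.commonprefix compares character by character, not per path part.
    prefix = os.path.commonprefix(list(strings))
    return prefix, re.escape(prefix)


# Assumed sample URLs, reconstructed for illustration only.
urls = [
    "http://www.xxxx.com/download",
    "http://www.xxxx.com/list",
    "http://www.xxxx.com/news/detail?Id=7126",
    "http://www.xxxx.com/news/detail?Id=4512",
]
prefix, rule = common_prefix_rule(urls)

# The parts left over after the prefix feed the similarity step (S2).
remainders = [u[len(prefix):] for u in urls]
```

Depending on the Python version, `re.escape` may leave `/` and `:` unescaped, so the rule string can look different from the `\/` form written in the text while matching exactly the same input.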
S2: Remove the longest common prefix, and compute the pairwise similarity of the remaining substrings according to string edit distance, obtaining the results shown in Table 1;
Table 1: Similarity calculation results
Extract the strings whose similarity to the other strings is relatively low, and generate separate rules for them;
S3: A value below 50.0 in the table above indicates that a string's similarity to the other strings is too low, and the corresponding string is extracted directly into a matching rule;
The rule obtained in this step is (?:download|list). Concatenating it with the result of step S1 gives the rule:
http:\/\/www\.xxxx\.com\/(?:download|list);
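Steps S2 and S3 can be sketched as below. The patent does not give the similarity formula, so the 0-100 score used here, 100 × (1 − distance / longer length), is an assumption, as is the reading that a string is extracted when its best similarity to every other string is below the threshold. The function names and sample strings are hypothetical.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed 0-100 score: 100 * (1 - distance / longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - edit_distance(a, b) / longest)


def split_outliers(strings, threshold=50.0):
    """Separate strings whose best similarity to any other is below threshold."""
    outliers, rest = [], []
    for i, s in enumerate(strings):
        best = max((similarity(s, t) for j, t in enumerate(strings) if j != i),
                   default=0.0)
        (outliers if best < threshold else rest).append(s)
    return outliers, rest


# Assumed remainders after the common prefix has been removed.
remainders = ["download", "list", "news/detail?Id=7126", "news/detail?Id=4512"]
outliers, rest = split_outliers(remainders)

# Build the separate rule for the outliers and splice it onto the S1 rule.
outlier_rule = "(?:" + "|".join(re.escape(s) for s in outliers) + ")"
spliced = re.escape("http://www.xxxx.com/") + outlier_rule
```

With these inputs the short, dissimilar paths are extracted as outliers, while the two `news/detail` strings stay together for the next round of S1-S3.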
S4: Repeat steps S1-S3 on the strings remaining after the extraction in step S3, and merge the result with the rule generated in step S3, until all strings have been traversed.
Repeating steps S1-S3 gives news\/detail\?Id=(?:7126|4512|1231|7793);
This can be further optimized to: news\/detail\?Id=\d+;
After merging with the rule generated in step S3, the result is:
http:\/\/www\.xxxx\.com\/(?:news\/detail\?Id=\d+|download|list)$.
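The generalization and merging in S4 might look like the following sketch. It assumes that an alternation made up entirely of digit strings may be collapsed to `\d+`, which is one plausible reading of "further optimized"; the helper names `generalize`, `rule_for_group` and `merge`, and the sample strings, are hypothetical.

```python
import os
import re


def generalize(values):
    """Collapse a list of alternatives into a single pattern."""
    if values and all(v.isdigit() for v in values):
        return r"\d+"  # every observed value is numeric
    return "(?:" + "|".join(re.escape(v) for v in values) + ")"


def rule_for_group(strings):
    """Apply the prefix step to one group and generalize what remains."""
    prefix = os.path.commonprefix(list(strings))
    tails = [s[len(prefix):] for s in strings]
    return re.escape(prefix) + generalize(tails)


def merge(prefix, branches):
    """Join sub-rules under the shared prefix and anchor the end."""
    return re.escape(prefix) + "(?:" + "|".join(branches) + ")$"


# Assumed strings left over after the outliers were extracted.
news = ["news/detail?Id=7126", "news/detail?Id=4512",
        "news/detail?Id=1231", "news/detail?Id=7793"]
news_rule = rule_for_group(news)

whitelist = re.compile(
    merge("http://www.xxxx.com/", [news_rule, "download", "list"]))
```

Generalizing to `\d+` lets the rule accept identifiers that never appeared in the training log, at the cost of being less strict than the literal alternation.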
Step 3: Match each received request against the regular expression set generated in Step 2, and intercept or mark any request not covered by the whitelist rules;
For each new request the server receives, the firewall system attempts to extract its parameters and match them against the rules. Suppose, for example, that a visitor requests the URL:
http://www.xxxx.com/news/detail?Id=1126 union select 1,2,3,4
The regular expression "news\/detail\?Id=\d+" only allows the Id parameter to be numeric, but this request also contains strings such as "union", so the expression cannot match the trailing content and the request falls outside the whitelist rules. Depending on the user's settings, the request can be intercepted directly, or it can be marked and analyzed further to determine whether it is an attack request.
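The matching decision in Step 3 can be sketched as below. The `Action` enum, the `check_request` function and its `on_miss` parameter are hypothetical names standing in for the user setting that chooses between interception and marking; the whitelist pattern assumes the sample path carries its parameter as `detail?Id=`.

```python
import re
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    INTERCEPT = "intercept"
    MARK = "mark"


# Whitelist rule set; a single assumed rule for illustration.
WHITELIST = [
    re.compile(
        r"http:\/\/www\.xxxx\.com\/"
        r"(?:news\/detail\?Id=\d+|download|list)"
    ),
]


def check_request(url: str, on_miss: Action = Action.MARK) -> Action:
    """Allow a URL that fully matches a whitelist rule, else apply on_miss."""
    if any(rule.fullmatch(url) for rule in WHITELIST):
        return Action.ALLOW
    return on_miss


ok = check_request("http://www.xxxx.com/news/detail?Id=1126")
bad = check_request(
    "http://www.xxxx.com/news/detail?Id=1126 union select 1,2,3,4",
    on_miss=Action.INTERCEPT,
)
```

Using `fullmatch` rather than a prefix match is what rejects the request with the trailing injected text.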
Step 4: Identify each request marked in Step 3; if it is normal, add it to the whitelist rule set; if it is an attack, intercept it.
A marked request is analyzed and identified together with the context of the other requests from the same IP address or the same logged-in session. The analysis can be traditional WAF rule detection, association analysis based on a visitor's recent access records, or a combination with other detection methods.
Traditional WAF detection mainly matches a visitor's request against rules for known vulnerabilities (regular expressions written from the exploitation details of publicly disclosed vulnerabilities). Association analysis examines a visitor's recent access records: for example, if the firewall cannot tell whether a given request is an attack, but the access history shows that the visitor's previous several requests were all confirmed attacks, the current request is judged to be an attack request. If the verdict is an attack, the request is intercepted directly; if the verdict is a normal request, it is added to the whitelist rules.
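The association-analysis branch of Step 4 might be sketched as follows. The window size, the attack-count threshold, and the names `VisitorHistory` and `judge_marked` are assumptions for illustration; the patent does not specify them, and in practice this check would be combined with WAF rule detection rather than used alone.

```python
from collections import defaultdict, deque


class VisitorHistory:
    """Keep the most recent verdicts per visitor (keyed by IP or session)."""

    def __init__(self, window: int = 5, min_attacks: int = 3):
        self.min_attacks = min_attacks  # assumed: attacks needed to condemn
        # assumed: only the last `window` verdicts per visitor are kept
        self._verdicts = defaultdict(lambda: deque(maxlen=window))

    def record(self, visitor: str, was_attack: bool) -> None:
        self._verdicts[visitor].append(was_attack)

    def looks_like_attacker(self, visitor: str) -> bool:
        return sum(self._verdicts[visitor]) >= self.min_attacks


def judge_marked(visitor, url, history, whitelist_urls):
    """Intercept if recent history is hostile; otherwise whitelist the URL."""
    if history.looks_like_attacker(visitor):
        history.record(visitor, True)
        return "intercept"
    whitelist_urls.add(url)
    history.record(visitor, False)
    return "whitelisted"


history = VisitorHistory()
for _ in range(3):
    history.record("10.0.0.7", True)  # earlier confirmed attacks

approved = set()
verdict_bad = judge_marked("10.0.0.7", "http://www.xxxx.com/x", history, approved)
verdict_ok = judge_marked("10.0.0.8", "http://www.xxxx.com/about", history, approved)
```

A bounded window keeps old behaviour from condemning a visitor indefinitely, which is a design choice of this sketch rather than something the text prescribes.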
Further, a manual review can be carried out at the end to edit the generated rules. This step mainly checks the correctness of the machine-generated rules and lets the administrator add whitelist entries manually.
The manual review mainly looks for errors in the automatically generated rules that cause false interceptions, as follows:
Check the interception log for normal access requests that were intercepted (normal requests are distinguished by the characteristics of common attack types; see the OWASP documentation). If a false interception is found, edit the corresponding rule (add the request to the whitelist set).
The invention overcomes the drawback that attack detection based on the signature database of a traditional web firewall may produce many false positives and false negatives. By analyzing the historical web logs of normal access, it generates whitelist rules; depending on the user's settings, only requests covered by the whitelist rules are allowed through, thereby defending against attacks launched by hackers.
In this text, a regular expression is a logical formula for operating on strings, built from predefined special characters and combinations of them. WAF refers to a web application firewall. A vulnerability is a flaw in the implementation of hardware, software or a protocol, or in a system's security policy, that allows an attacker to access or damage the system without authorization. Edit distance is the minimum number of edit operations needed to transform one of two strings into the other; the permitted operations are substituting one character for another, inserting a character, and deleting a character. In general, the smaller the edit distance, the greater the similarity of the two strings. OWASP is the Open Web Application Security Project. URL stands for Uniform Resource Locator. A Cookie is data stored on the user's local terminal. Referer is the address of the source page.
Claims (5)
- 1. A web application security protection method based on automatic learning, characterized by comprising the following steps: Step 1: extract the access log of the web application and screen it for the entries of non-attack requests; Step 2: from the fields in the log entries screened in Step 1, generate by machine learning a set of regular expressions that follow specific rules, forming the whitelist rules; Step 3: match each received request against the regular expression set generated in Step 2, and intercept or mark any request not covered by the whitelist rules; Step 4: identify each request marked in Step 3; if it is normal, add it to the whitelist rules; if it is an attack, intercept it.
- 2. The web application security protection method based on automatic learning according to claim 1, characterized in that the identification method in Step 4 is traditional WAF rule detection or association analysis.
- 3. The web application security protection method based on automatic learning according to claim 1, characterized in that the method of screening non-attack requests in Step 1 is to perform keyword matching with a script and filter out attack requests.
- 4. The web application security protection method based on automatic learning according to claim 1, characterized in that the regular expression set in Step 2 is generated as follows: S1: across the field values, find the longest common substring that starts at the beginning of the strings, and generate a rule from this substring; S2: remove the longest common substring, and compute the pairwise similarity of the remaining parts according to string edit distance; S3: extract the strings whose similarity to the other strings is below a certain threshold, generate a separate matching rule for them, and concatenate it with the rule generated in step S1; S4: repeat steps S1-S3 on the strings remaining after the extraction in step S3, and merge the result with the rule generated in step S3, until all strings have been traversed.
- 5. The web application security protection method based on automatic learning according to claim 1, characterized in that the fields in Step 2 include the URL, Cookie and Referer fields and the fields in custom records.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710863641.3A CN107644166A (en) | 2017-09-22 | 2017-09-22 | A web application security protection method based on automatic learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107644166A | 2018-01-30 |
Family
ID=61111896
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710863641.3A Pending CN107644166A (en) | A web application security protection method based on automatic learning | 2017-09-22 | 2017-09-22 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107644166A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101727447A (en) * | 2008-10-10 | 2010-06-09 | 浙江搜富网络技术有限公司 | Generation method and device of regular expression based on URL |
CN103166966A (en) * | 2013-03-07 | 2013-06-19 | 星云融创(北京)信息技术有限公司 | Method and device for distinguishing illegal access request to website |
CN103428196A (en) * | 2012-12-27 | 2013-12-04 | 北京安天电子设备有限公司 | URL white list-based WEB application intrusion detecting method and apparatus |
CN104361283A (en) * | 2014-12-05 | 2015-02-18 | 网宿科技股份有限公司 | Web attack protection method |
US20160344696A1 (en) * | 2013-03-27 | 2016-11-24 | Fortinet, Inc. | Firewall policy management |
CN106415507A (en) * | 2014-06-06 | 2017-02-15 | 日本电信电话株式会社 | Log analysis device, attack detection device, attack detection method and program |
CN106657006A (en) * | 2016-11-17 | 2017-05-10 | 北京中电普华信息技术有限公司 | Software information safety protection method and device |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108520180B (en) * | 2018-03-01 | 2020-04-24 | 中国科学院信息工程研究所 | Multi-dimension-based firmware Web vulnerability detection method and system |
CN108520180A (en) * | 2018-03-01 | 2018-09-11 | 中国科学院信息工程研究所 | Multi-dimension-based firmware Web vulnerability detection method and system |
CN111953638B (en) * | 2019-05-17 | 2023-06-27 | 北京京东尚科信息技术有限公司 | Network attack behavior detection method and device and readable storage medium |
CN111953638A (en) * | 2019-05-17 | 2020-11-17 | 北京京东尚科信息技术有限公司 | Network attack behavior detection method and device and readable storage medium |
CN110661680A (en) * | 2019-09-11 | 2020-01-07 | 深圳市永达电子信息股份有限公司 | Method and system for detecting data stream white list based on regular expression |
CN110661680B (en) * | 2019-09-11 | 2023-03-14 | 深圳市永达电子信息股份有限公司 | Method and system for detecting data stream white list based on regular expression |
CN113259303A (en) * | 2020-02-12 | 2021-08-13 | 网宿科技股份有限公司 | White list self-learning method and device based on machine learning technology |
EP3886394A4 (en) * | 2020-02-12 | 2021-09-29 | Wangsu Science & Technology Co., Ltd. | Machine learning technique based whitelist self-learning method and device |
CN111835737A (en) * | 2020-06-29 | 2020-10-27 | 中国平安财产保险股份有限公司 | WEB attack protection method based on automatic learning and related equipment thereof |
CN111835737B (en) * | 2020-06-29 | 2024-04-02 | 中国平安财产保险股份有限公司 | WEB attack protection method based on automatic learning and related equipment thereof |
CN111935133A (en) * | 2020-08-06 | 2020-11-13 | 北京顶象技术有限公司 | White list generation method and device |
CN112148842A (en) * | 2020-10-13 | 2020-12-29 | 厦门安胜网络科技有限公司 | Method, device and storage medium for reducing false alarm rate in attack detection |
CN113162909A (en) * | 2021-03-10 | 2021-07-23 | 北京顶象技术有限公司 | White list generation method and device based on AI (Artificial Intelligence), electronic equipment and readable medium |
CN113660230A (en) * | 2021-08-06 | 2021-11-16 | 杭州安恒信息技术股份有限公司 | Cloud security protection test method, system, computer and readable storage medium |
CN113660230B (en) * | 2021-08-06 | 2023-02-28 | 杭州安恒信息技术股份有限公司 | Cloud security protection testing method and system, computer and readable storage medium |
CN114039778A (en) * | 2021-11-09 | 2022-02-11 | 深信服科技股份有限公司 | Request processing method, device, equipment and readable storage medium |
CN114422206A (en) * | 2021-12-29 | 2022-04-29 | 北京致远互联软件股份有限公司 | JAVA WEB dynamic configuration security defense method |
CN114422206B (en) * | 2021-12-29 | 2024-02-02 | 北京致远互联软件股份有限公司 | JAVA WEB dynamic configuration security defense method |
CN114500018B (en) * | 2022-01-17 | 2022-10-14 | 武汉大学 | Web application firewall security detection and reinforcement system and method based on neural network |
CN114500018A (en) * | 2022-01-17 | 2022-05-13 | 武汉大学 | Web application firewall security detection and reinforcement system and method based on neural network |
CN117201194A (en) * | 2023-11-06 | 2023-12-08 | 华中科技大学 | URL classification method, device and system based on character string similarity calculation |
CN117201194B (en) * | 2023-11-06 | 2024-01-05 | 华中科技大学 | URL classification method, device and system based on character string similarity calculation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180130 |