CN109714341A - Web malicious attack identification method, terminal device and storage medium - Google Patents
- Publication number
- CN109714341A (application number CN201811619182.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- web
- word
- idf
- character
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention relates to the field of Web security and proposes a Web malicious attack identification method, a terminal device and a storage medium. The method comprises two major stages: model establishment and data identification. Model establishment comprises: Step 1: collecting blacklist and whitelist sample data from a large volume of web access data, decoding the URLs in the sample data into a unified encoding, and applying character processing to the decoded URLs; Step 2: performing feature extraction on the sample data processed in Step 1 by means of the TF-IDF algorithm and calculating the feature value of each sample; Step 3: training a support vector machine on the feature values of the blacklist and whitelist sample data and saving the trained classification model, which is used to distinguish blacklist data from whitelist data. The invention applies TF-IDF and support vector machines to Web security detection in order to identify malicious attack requests quickly.
Description
Technical field
The present invention relates to the field of Web security, and in particular to a Web malicious attack identification method, a terminal device and a storage medium.
Background
With the development of the global Internet, the world has entered a high-speed information age. Through the network, people can easily browse and share huge amounts of data. At the same time, more and more enterprises implement their core business as Web applications, so that the fortunes of enterprises, and in turn the lives of the general public, are closely tied to network security. However, because of the openness and uncontrollability of the Web itself, security incidents in which hackers exploit network vulnerabilities emerge one after another. Recently Radware, a leading global provider of network security and application delivery solutions, released its second annual Web application security survey report, the Radware 2018 Web application security status report. The report points out that most enterprises (67%) believe hackers can still break into their networks. It also points out that at least 89% of respondents had experienced attacks against Web applications or Web servers in the previous year, and that the share of respondents reporting encrypted Web attacks rose from 12% in 2017 to 50% in 2018. Most respondents (59%) reported being attacked daily or weekly. As Web attacks grow in frequency and complexity, the challenges facing traditional Web defences grow with them, and their shortcomings become increasingly apparent.
Up to now, traditional Web defences have essentially been rule-based blacklist detection mechanisms: both Web application firewalls and intrusion detection systems (IDS) rely on the regular expressions of a detection engine to match messages. Although they can resist most attacks, the following problems remain:
1. The rule base is hard to maintain. Attackers use more and more attack variants, such as different encodings, case changes and alternative statements, which may bypass detection. Adding a signature rule for every variant bloats the signature database and makes it hard to maintain.
2. Writing rules is demanding. Rules that are too broad easily cause false positives; rules that are too narrow are easily bypassed.
3. When there are too many regular expressions, protection performance is seriously affected.
4. Protection against new attack techniques is poor.
The above analysis of traditional rule-based blacklist detection shows that identifying malicious attack requests quickly and accurately among massive numbers of requests is the problem that currently needs to be solved.
Summary of the invention
In view of the above problems, the present invention aims to provide a Web malicious attack identification method, a terminal device and a storage medium that introduce machine learning techniques into the field of Web security, applying TF-IDF and support vector machines to Web security detection in order to identify malicious attack requests quickly.
The specific scheme is as follows:
A Web malicious attack identification method, comprising the following steps:
(1) Establishing the classification model
Step 1: Collect blacklist and whitelist sample data from a large volume of web access data, decode the URLs in the sample data and convert them into a unified encoding, then apply character processing to the decoded URLs in order to remove the influence of meaningless characters and unify the format.
Step 2: Perform feature extraction on the sample data processed in Step 1 by means of the TF-IDF algorithm, and calculate the feature value of each sample.
Step 3: Train a support vector machine on the feature values of the blacklist and whitelist sample data, and obtain and save the trained classification model, which is used to distinguish blacklist data from whitelist data.
(2) Data identification
Step 4: Decode the URL of the received access data, convert it into the encoding used in Step 1, and apply the same character processing to the decoded URL.
Step 5: Perform feature extraction on the data processed in Step 4 by means of the TF-IDF algorithm, and calculate the feature value of the data.
Step 6: According to the feature value of the data, identify the access data with the classification model and judge whether it belongs to the blacklist data.
Further, the character processing is: converting all letters uniformly to upper case or uniformly to lower case, and replacing all Chinese characters and digits with a specific character, the specific character being different from every character in the URL other than Chinese characters and digits.
Further, the feature value is calculated as follows:
(1) Set the word length in the data to s, and split the data in sequence into multiple words of length s.
(2) Calculate the word frequency TF of each word: TF = 1 + ln(N), where N is the number of times the word occurs in the data.
(3) Calculate the inverse document frequency IDF of each word: IDF = 1 + ln(p/q), where p is the total number of data items and q is the number of data items containing the word.
(4) Calculate the feature value TF-IDF of the data:
Further, the word length is s = 3.
Further, in Step 3, the 1000 sample data with the largest TF-IDF values from Step 2 are selected as the training data for the support vector machine algorithm.
A Web malicious attack identification terminal device, comprising a processor, a memory and a computer program that is stored in the memory and can run on the processor, wherein the processor implements the steps of the above method of the embodiments of the present invention when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the steps of the above method of the embodiments of the present invention are implemented when the computer program is executed by a processor.
With the above technical solution, the present invention introduces machine learning techniques into the field of Web security and applies TF-IDF and support vector machines to Web security detection in order to identify malicious attack requests quickly. The resulting model predicts SQL injection attacks and XSS attacks with high accuracy, and is capable of recognising attack variants, recognising new attack patterns and performing semantic analysis.
Brief description of the drawings
Fig. 1 is a flow diagram of Embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of the support vector machine algorithm of Embodiment 1 of the present invention.
Detailed description of the embodiments
To further illustrate the embodiments, the present invention is provided with drawings. These drawings are part of the disclosure of the invention; they mainly illustrate the embodiments and, together with the related description in the specification, explain how the embodiments operate. With reference to this material, a person of ordinary skill in the art will be able to understand other possible embodiments and the advantages of the present invention.
The present invention is further described below with reference to the drawings and specific embodiments.
Embodiment 1:
Referring to Fig. 1, the present invention provides a Web malicious attack identification method, comprising the following steps:
(1) Establishing the classification model
Step 1: Collect blacklist and whitelist sample data from a large volume of web access data, decode the URLs in the sample data and convert them into a unified encoding, then apply character processing to the decoded URLs in order to remove the influence of meaningless characters and unify the format.
A person skilled in the art can define the character processing as required. In this embodiment it is: converting all letters uniformly to upper case or uniformly to lower case, and replacing all Chinese characters and digits with a specific character, the specific character being different from every character in the URL other than Chinese characters and digits.
Replacing Chinese characters and digits with a specific character removes the influence of useless information: for the purpose of Web blacklist judgment, Chinese characters and digits carry no meaning, so setting them to a specific character simplifies feature extraction and speeds up identification.
In this embodiment all letters are converted to lower case, and the specific character is therefore set to the capital letter "N". A person skilled in the art may also choose another character.
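The character processing of Step 1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `normalize_url`, the repeated percent-decoding (to undo double encoding), the CJK range `\u4e00-\u9fff` and the choice to replace each digit or Chinese character individually are all assumptions; only the lower-casing and the placeholder "N" come from the embodiment.

```python
import re
from urllib.parse import unquote

PLACEHOLDER = "N"  # the embodiment's choice; cannot collide with lower-cased letters


def normalize_url(url: str, max_rounds: int = 3) -> str:
    """Decode a URL, lower-case it, and mask digits and Chinese characters."""
    # Assumption: decode repeatedly so that double-encoded input such as
    # "%2527" is reduced to the same form as "%27".
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    url = url.lower()
    # Assumption: every single digit or CJK unified ideograph becomes one "N".
    return re.sub(r"[0-9\u4e00-\u9fff]", PLACEHOLDER, url)


print(normalize_url("/Index.PHP?id=12%27"))  # /index.php?id=NN'
```

Because the same function would be applied to training samples and to incoming requests, variants that differ only in case or encoding map to the same string before feature extraction.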
Step 2: Perform feature extraction on the sample data processed in Step 1 by means of the TF-IDF algorithm, and calculate the feature value of each sample.
TF-IDF (term frequency-inverse document frequency) is a weighting technique commonly used in information retrieval and data mining. It assesses how important a word is to one document within a document set or corpus. In this embodiment it is applied to the field of Web security: this statistical method is used to extract features from a large volume of web access data and obtain the most representative key words, thereby achieving feature conversion.
(1) In the feature conversion process, the word frequency TF (term frequency) of the sample data processed in Step 1 must first be counted. In the specific implementation, the length of one word is set to three characters; the data are also smoothed and normalised in order to improve classification accuracy after feature conversion.
With smoothing, the formula for the word frequency TF becomes:
TF = 1 + ln(N)
where N is the number of times a given word occurs in the data.
A specific example follows.
Suppose an access datum is "/css/css_js.php". With a word length of three characters, the datum is split into 13 words: /cs, css, ss/, s/c, /cs, css, ss_, s_j, _js, js., s.p, .ph, php. The words "/cs" and "css" each occur twice, and the remaining 9 words occur only once. Applying the TF formula gives a TF value of 1.693 for "/cs" and "css", and a TF value of 1 for each of the other words.
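The word splitting and smoothed TF calculation in the example above can be reproduced with a few lines of Python. The function names `split_words` and `term_frequency` are illustrative, and the one-character sliding window is inferred from the 13 words listed in the example.

```python
import math
from collections import Counter


def split_words(data: str, s: int = 3) -> list[str]:
    """Slide a window of length s over the string, one character at a time."""
    return [data[i:i + s] for i in range(len(data) - s + 1)]


def term_frequency(data: str, s: int = 3) -> dict[str, float]:
    """Smoothed word frequency TF = 1 + ln(N) for each distinct word."""
    counts = Counter(split_words(data, s))
    return {word: 1.0 + math.log(n) for word, n in counts.items()}


tf = term_frequency("/css/css_js.php")
print(len(split_words("/css/css_js.php")))                  # 13
print(round(tf["/cs"], 3), round(tf["css"], 3), tf["php"])  # 1.693 1.693 1.0
```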
(2) In the above example, the TF values of "/cs", "css" and the other words can serve as the dimensions of the access datum after feature conversion and be supplied to the classification algorithm for detection. Because "/cs" and "css" occur more often than the other words, they would play a larger role in detection. However, if such words occur in large numbers in both the blacklist samples and the whitelist samples, they do little to distinguish the two, and their effect on detection becomes very small. A feature conversion that considers word frequency alone therefore struggles to capture the key features of blacklist and whitelist samples, making it hard to separate normal from abnormal web access data. Conversely, if a word occurs only in blacklist samples, its weight in detecting blacklist samples should still be high even though it occurs rarely relative to the total number of samples. Each word is therefore given a weight according to how representative it is in the blacklist and whitelist samples: the greater a word's ability to discriminate normal from abnormal data, the larger its weight, and vice versa. For example, if the word "css" appears only in blacklist samples, its weight when predicting blacklist samples is larger; otherwise it is smaller. In this embodiment, this is measured with the inverse document frequency IDF.
The formula for IDF is set as:
IDF = 1 + ln(p/q)
where p is the total number of samples and q is the number of samples containing the word.
Suppose there are 100,000 sample data with comparable volumes of each type, of which 200 contain the word "css" and 1000 contain the word "/cs". Then:
the weight of "css" in the samples is IDF = 1 + ln(100000/200) = 7.215,
the weight of "/cs" in the samples is IDF = 1 + ln(100000/1000) = 5.605.
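The IDF formula and the two figures above can be checked with the following sketch. The helper `idf_table`, which derives q for every word from a list of per-sample word sets, is a hypothetical convenience and is not described in the source.

```python
import math


def inverse_document_frequency(p: int, q: int) -> float:
    """IDF = 1 + ln(p / q); p = total samples, q = samples containing the word."""
    if q <= 0 or q > p:
        raise ValueError("q must satisfy 0 < q <= p")
    return 1.0 + math.log(p / q)


def idf_table(samples: list[set[str]]) -> dict[str, float]:
    """IDF of every word, given the set of distinct words in each sample."""
    p = len(samples)
    doc_counts: dict[str, int] = {}
    for words in samples:
        for word in words:
            doc_counts[word] = doc_counts.get(word, 0) + 1
    return {w: inverse_document_frequency(p, q) for w, q in doc_counts.items()}


print(round(inverse_document_frequency(100_000, 200), 3))   # 7.215
print(round(inverse_document_frequency(100_000, 1000), 3))  # 5.605
```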
(3) From the TF values calculated above and the IDF values, the TF-IDF value of each word is calculated:
TF-IDF = TF × IDF
Thus the TF-IDF values of "/cs" and "css" are approximately 9.49 and 12.22 respectively.
It follows that the word "css" will play the larger role in detection.
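As a quick check of the arithmetic, the per-word product can be computed directly from the counts used in the example (a word occurring twice in the datum, 100,000 samples in total). The function name `tf_idf` and its parameter names are illustrative only.

```python
import math


def tf_idf(occurrences: int, total_samples: int, samples_with_word: int) -> float:
    """TF-IDF = (1 + ln(N)) * (1 + ln(p / q)) for a single word."""
    tf = 1.0 + math.log(occurrences)
    idf = 1.0 + math.log(total_samples / samples_with_word)
    return tf * idf


print(f"{tf_idf(2, 100_000, 1000):.2f}")  # 9.49  ("/cs")
print(f"{tf_idf(2, 100_000, 200):.2f}")   # 12.22 ("css")
```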
(4) The feature value of the sample data, i.e. its TF-IDF value, is calculated according to the following formula:
where n is the number of words contained in the sample data.
(5) The feature value of the sample data is normalised. In this embodiment the Frobenius norm is used for normalisation, with the following formula:
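The normalisation formula itself is not reproduced in this text. A minimal sketch of one plausible reading follows, assuming each sample is represented as a vector of per-word TF-IDF values and that Frobenius-norm normalisation means dividing that vector by its Euclidean length (for a vector, the two norms coincide). The function name `l2_normalize` is illustrative.

```python
import math


def l2_normalize(vector: dict[str, float]) -> dict[str, float]:
    """Scale a sparse feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector.values()))
    if norm == 0.0:
        return dict(vector)  # an all-zero vector is returned unchanged
    return {word: v / norm for word, v in vector.items()}


unit = l2_normalize({"/cs": 3.0, "css": 4.0})
print(unit)  # {'/cs': 0.6, 'css': 0.8}
```

Under this assumption, long and short URLs yield feature vectors of the same length, so the classifier compares the mix of words rather than the raw amount of text.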
With the above method, this embodiment takes into account both word frequency and how representative each word is in the samples: each datum is split into words of three characters, and the combined TF-IDF value of each word is then calculated as a feature value, which achieves the goal of feature conversion.
Step 3: Train a support vector machine on the feature values of the blacklist and whitelist sample data, tune the parameters to obtain the optimal trained classification model, and save the model. The classification model is used to distinguish blacklist data from whitelist data.
The higher the dimensionality, the more complex the computation, and with a huge training set this easily leads to the "curse of dimensionality". Moreover, additional dimensions do not necessarily improve accuracy much. In the specific implementation, the 1000 words with the largest TF-IDF values were therefore selected as the dimensions used for training and testing the support vector machine algorithm.
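The source does not say how a word's TF-IDF value is aggregated across samples before the top 1000 are chosen. The sketch below assumes each word is ranked by the highest TF-IDF value it reaches in any sample; `select_vocabulary` and `to_dense` are hypothetical helper names.

```python
def select_vocabulary(sample_vectors: list[dict[str, float]], k: int = 1000) -> list[str]:
    """Return the k words with the largest TF-IDF value seen in any sample."""
    best: dict[str, float] = {}
    for vector in sample_vectors:
        for word, score in vector.items():
            if score > best.get(word, float("-inf")):
                best[word] = score
    # Sort by score (descending), then alphabetically so ties are deterministic.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def to_dense(vector: dict[str, float], vocabulary: list[str]) -> list[float]:
    """Project a sparse TF-IDF vector onto the fixed vocabulary order."""
    return [vector.get(word, 0.0) for word in vocabulary]


vectors = [{"abc": 2.0, "bcd": 1.0}, {"bcd": 3.0, "cde": 0.5}]
vocab = select_vocabulary(vectors, k=2)
print(vocab)                             # ['bcd', 'abc']
print(to_dense({"abc": 1.5}, vocab))     # [0.0, 1.5]
```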
The data are trained and predicted with the support vector machine (SVM) algorithm. A support vector machine is a classification algorithm that improves generalisation by seeking minimal structural risk, minimising both the empirical risk and the confidence interval, so that good statistical regularities can be obtained even when the number of samples is small. As shown in Fig. 2, it is a binary classification model whose basic form is the linear classifier with the largest margin in feature space; that is, the learning strategy of the support vector machine is margin maximisation. The classification model trained by the support vector machine algorithm on the feature values of the blacklist and whitelist sample data can distinguish blacklist data from whitelist data.
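The source names neither a kernel nor a solver. As a stand-in, the following sketch trains a linear soft-margin SVM by plain subgradient descent on the hinge loss; the function names, the hyperparameters `lam`, `lr` and `epochs`, and the label convention (+1 for blacklist, -1 for whitelist) are all assumptions. In practice an established library implementation would normally be used instead.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + hinge loss by subgradient descent.

    X is a list of dense feature vectors; y holds labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The regularisation term always shrinks the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Sample is inside the margin or misclassified: move towards it.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 (blacklist) or -1 (whitelist) from the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data: the first feature dominates in blacklist samples, the second in whitelist.
X = [[2.0, 0.0], [1.5, 0.2], [0.0, 2.0], [0.2, 1.5]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])  # [1, 1, -1, -1]
```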
The support vector machine was chosen as the classification algorithm for the following reasons:
1. It is based on structural risk minimisation, which avoids overfitting and gives strong generalisation ability.
2. The support vector machine is a small-sample learning method with a solid theoretical foundation. It essentially involves neither probability measures nor the law of large numbers. In essence, it avoids the traditional path from induction to deduction and performs efficient "transductive inference" from training samples to prediction samples, which greatly simplifies common classification and regression problems.
3. The final decision function of a support vector machine is determined by only a small number of support vectors, so the computational complexity depends on the number of support vectors rather than on the dimensionality of the sample space. In a sense this avoids the "curse of dimensionality".
4. Because a small number of support vectors determine the final result, the method captures the key samples and "discards" a large number of redundant samples, which makes the algorithm simple and gives it good robustness.
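For reference, the margin maximisation and the support-vector decision function mentioned above are usually written as follows. This is the standard textbook formulation (hard-margin case), not a formula taken from the patent; the symbols w, b, α_i and the kernel K are the conventional ones.

```latex
% Margin maximisation: find the separating hyperplane with the largest margin.
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,(w^{\top} x_i + b) \ge 1, \qquad i = 1, \dots, m

% Decision function: only support vectors (samples with \alpha_i > 0) contribute.
f(x) = \operatorname{sign}\!\Bigl( \sum_{i \in SV} \alpha_i \, y_i \, K(x_i, x) + b \Bigr)
```

The sum in the decision function runs over the support vectors only, which is why points 3 and 4 say that prediction cost depends on their number rather than on the dimensionality of the feature space.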
(2) Data identification
Once the classification model has been established, newly received web access data can be predicted to judge whether they are blacklist data.
Step 4: Decode the URL of the received access data, convert it into the encoding used in Step 1, and replace the Chinese characters and digits in the decoded URL with the character used in Step 1.
Step 5: Perform feature extraction on the data processed in Step 4 by means of the TF-IDF algorithm, and calculate the feature value of the data.
Step 6: According to the feature value of the data, identify the access data with the classification model and judge whether it belongs to the blacklist data.
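Steps 4 to 6 can be tied together in a single function, sketched below under several assumptions: the saved model is a linear one (a weight per vocabulary word plus a bias), the IDF table and vocabulary were saved at training time, features are length-normalised, and a positive score means blacklist. The name `identify` and all values in the usage example are hypothetical.

```python
import math
import re
from collections import Counter
from urllib.parse import unquote


def identify(url, idf, vocabulary, weights, bias, s=3):
    """Return True if the request URL is classified as blacklist data."""
    # Step 4: the same decoding and character processing as in training.
    text = re.sub(r"[0-9\u4e00-\u9fff]", "N", unquote(url).lower())
    # Step 5: TF-IDF over words of length s, using the saved IDF table.
    counts = Counter(text[i:i + s] for i in range(len(text) - s + 1))
    features = [
        (1.0 + math.log(counts[w])) * idf.get(w, 0.0) if counts[w] else 0.0
        for w in vocabulary
    ]
    norm = math.sqrt(sum(f * f for f in features))
    if norm > 0.0:
        features = [f / norm for f in features]
    # Step 6: linear decision function of the saved model.
    score = sum(wt * f for wt, f in zip(weights, features)) + bias
    return score > 0.0


# Hypothetical two-word model, for illustration only.
vocabulary = ["sel", "php"]
idf = {"sel": 2.0, "php": 1.0}
weights, bias = [1.0, -1.0], 0.0
print(identify("/a?id=1%27%20SELECT", idf, vocabulary, weights, bias))  # True
print(identify("/index.php", idf, vocabulary, weights, bias))           # False
```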
In this embodiment, 140,000 real data items were used for training and testing: 80% were randomly selected as training data and 20% as test data, and the prediction accuracy was averaged over 10 such rounds of cross-validation. Four algorithm combinations were tested: N-gram + SVM, TF-IDF + SVM, TF-IDF + KNN and TF-IDF + Logistic Regression. As shown in Table 1, the experiment shows that the TF-IDF + SVM model has the highest accuracy, at 99.89%. The model also predicts SQL injection attacks and XSS attacks with high accuracy, and is capable of recognising attack variants, recognising new attack patterns and performing semantic analysis.
Table 1
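The evaluation protocol described above (a random 80/20 split repeated 10 times, with the accuracies averaged) can be sketched as follows. The function name, the `train_fn` and `predict_fn` callbacks and the fixed seed are assumptions made for illustration; the source does not describe how the splits were drawn.

```python
import random


def repeated_holdout_accuracy(samples, labels, train_fn, predict_fn,
                              rounds=10, train_fraction=0.8, seed=0):
    """Average test accuracy over several random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    accuracies = []
    for _ in range(rounds):
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)
        train_idx, test_idx = indices[:cut], indices[cut:]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Toy usage: a "model" that is just a fixed threshold on a one-dimensional feature.
samples = list(range(20))
labels = [1 if x >= 10 else -1 for x in samples]
accuracy = repeated_holdout_accuracy(
    samples, labels,
    train_fn=lambda xs, ys: 10,
    predict_fn=lambda threshold, x: 1 if x >= threshold else -1,
)
print(accuracy)  # 1.0
```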
Embodiment 2:
The present invention also provides a Web malicious attack identification terminal device, comprising a memory, a processor and a computer program that is stored in the memory and can run on the processor, wherein the processor implements the steps of the method of Embodiment 1 of the present invention when executing the computer program.
Further, as a feasible implementation, the Web malicious attack identification terminal device may be a computing device such as a desktop computer, a notebook, a palmtop computer or a cloud server. The terminal device may include, but is not limited to, a processor and a memory. A person skilled in the art will understand that the above structure is only an example of the Web malicious attack identification terminal device and does not limit it: the device may include more or fewer components than described, combine certain components, or use different components. For example, it may also include input/output devices, network access devices, buses and the like, which the embodiments of the present invention do not restrict.
Further, as a feasible implementation, the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control centre of the Web malicious attack identification terminal device and connects the various parts of the whole device through various interfaces and lines.
The memory may be used to store the computer program and/or modules. The processor implements the various functions of the Web malicious attack identification terminal device by running or executing the computer program and/or modules stored in the memory and by calling the data stored in the memory. The memory may mainly comprise a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function, and the data storage area may store data created through use of the device. In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The present invention also provides a computer-readable storage medium storing a computer program, wherein the steps of the above method of the embodiments of the present invention are implemented when the computer program is executed by a processor.
If the integrated modules/units of the Web malicious attack identification terminal device are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the above method embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), a software distribution medium, and so on.
Although the present invention has been specifically shown and described in connection with preferred embodiments, a person skilled in the art should understand that various changes in form and detail may be made to the invention without departing from its spirit and scope as defined by the appended claims, and that such changes fall within the protection scope of the present invention.
Claims (7)
1. A Web malicious attack identification method, characterised by comprising the following steps:
(1) Establishing the classification model
Step 1: Collect blacklist and whitelist sample data from a large volume of web access data, decode the URLs in the sample data and convert them into a unified encoding, then apply character processing to the decoded URLs in order to remove the influence of meaningless characters and unify the format.
Step 2: Perform feature extraction on the sample data processed in Step 1 by means of the TF-IDF algorithm, and calculate the feature value of each sample.
Step 3: Train a support vector machine on the feature values of the blacklist and whitelist sample data, and obtain and save the trained classification model, which is used to distinguish blacklist data from whitelist data.
(2) Data identification
Step 4: Decode the URL of the received access data, convert it into the encoding used in Step 1, and apply the character processing to the decoded URL.
Step 5: Perform feature extraction on the data processed in Step 4 by means of the TF-IDF algorithm, and calculate the feature value of the data.
Step 6: According to the feature value of the data, identify the access data with the classification model and judge whether it belongs to the blacklist data.
2. The Web malicious attack identification method according to claim 1, characterised in that the character processing is: converting all letters uniformly to upper case or uniformly to lower case, and replacing all Chinese characters and digits with a specific character, the specific character being different from every character in the URL other than Chinese characters and digits.
3. The Web malicious attack identification method according to claim 1, characterised in that the feature value is calculated as follows:
(1) Set the word length in the data to s, and split the data in sequence into multiple words of length s.
(2) Calculate the word frequency TF of each word: TF = 1 + ln(N), where N is the number of times the word occurs in the data.
(3) Calculate the inverse document frequency IDF of each word: IDF = 1 + ln(p/q), where p is the total number of data items and q is the number of data items containing the word.
(4) Calculate the feature value TF-IDF of the data:
4. The Web malicious attack identification method according to claim 3, characterised in that the word length is s = 3.
5. The Web malicious attack identification method according to claim 1, characterised in that in Step 3 the 1000 sample data with the largest TF-IDF values from Step 2 are selected as the training data for the support vector machine algorithm.
6. A Web malicious attack identification terminal device, characterised by comprising a processor, a memory and a computer program that is stored in the memory and runs on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 5 when executing the computer program.
7. A computer-readable storage medium storing a computer program, characterised in that the steps of the method according to any one of claims 1 to 5 are implemented when the computer program is executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811619182.5A CN109714341A (en) | 2018-12-28 | 2018-12-28 | A kind of Web hostile attack identification method, terminal device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109714341A true CN109714341A (en) | 2019-05-03 |
Family
ID=66258816
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811619182.5A Pending CN109714341A (en) | 2018-12-28 | 2018-12-28 | A kind of Web hostile attack identification method, terminal device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109714341A (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110266675A (en) * | 2019-06-12 | 2019-09-20 | 成都积微物联集团股份有限公司 | A kind of xss attack automated detection method based on deep learning |
CN110347724A (en) * | 2019-07-12 | 2019-10-18 | 深圳众赢维融科技有限公司 | Abnormal behaviour recognition methods, device, electronic equipment and medium |
CN111740957A (en) * | 2020-05-21 | 2020-10-02 | 江苏信息职业技术学院 | Automatic XSS attack detection method based on FP-tree optimization |
CN111970251A (en) * | 2020-07-28 | 2020-11-20 | 西安万像电子科技有限公司 | Data processing method and server |
CN112968872A (en) * | 2021-01-29 | 2021-06-15 | 成都信息工程大学 | Malicious flow detection method, system and terminal based on natural language processing |
CN113079127A (en) * | 2020-01-03 | 2021-07-06 | 台达电子工业股份有限公司 | Generation and application method of attack recognition data model |
WO2021159575A1 (en) * | 2020-02-12 | 2021-08-19 | 网宿科技股份有限公司 | Machine learning technique based whitelist self-learning method and device |
CN113783889A (en) * | 2021-09-22 | 2021-12-10 | 南方电网数字电网研究院有限公司 | Firewall control method for linkage access of network layer and application layer and firewall thereof |
CN113886815A (en) * | 2021-10-14 | 2022-01-04 | 北京华清信安科技有限公司 | SQL injection attack detection method based on machine learning |
CN113904834A (en) * | 2021-09-30 | 2022-01-07 | 北京华清信安科技有限公司 | XSS attack detection method based on machine learning |
CN114048740A (en) * | 2021-09-28 | 2022-02-15 | 马上消费金融股份有限公司 | Sensitive word detection method and device and computer readable storage medium |
CN114080783A (en) * | 2019-07-03 | 2022-02-22 | 沙特阿拉伯石油公司 | System and method for securely communicating selective data sets between terminals supporting multiple applications |
CN115987620A (en) * | 2022-12-21 | 2023-04-18 | 北京天云海数技术有限公司 | Method and system for detecting web attack |
CN116980235A (en) * | 2023-09-25 | 2023-10-31 | 成都数智创新精益科技有限公司 | Artificial intelligence-based interception method for WEB illegal request |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103577755A (en) * | 2013-11-01 | 2014-02-12 | 浙江工业大学 | Malicious script static detection method based on SVM (support vector machine) |
CN105553974A (en) * | 2015-12-14 | 2016-05-04 | 中国电子信息产业集团有限公司第六研究所 | Prevention method of HTTP slow attack |
CN105740460A (en) * | 2016-02-24 | 2016-07-06 | 中国科学技术信息研究所 | Webpage collection recommendation method and device |
CN107872452A (en) * | 2017-10-25 | 2018-04-03 | 东软集团股份有限公司 | A kind of recognition methods of malicious websites, device, storage medium and program product |
CN108229156A (en) * | 2017-12-28 | 2018-06-29 | 阿里巴巴集团控股有限公司 | URL attack detection methods, device and electronic equipment |
2018-12-28: CN application CN201811619182.5A (publication CN109714341A), status: Pending
Non-Patent Citations (1)
Title |
---|
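甘宏, 潘丹 (Gan Hong, Pan Dan): "Analysis and Research on Malicious URL Identification Based on SVM and TF-IDF" (基于SVM和TF-IDF的恶意URL识别分析与研究), 《计算机与现代化》 (Computer and Modernization) * |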
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110266675A (en) * | 2019-06-12 | 2019-09-20 | 成都积微物联集团股份有限公司 | A kind of xss attack automated detection method based on deep learning |
CN110266675B (en) * | 2019-06-12 | 2022-11-04 | 成都积微物联集团股份有限公司 | Automatic detection method for xss attack based on deep learning |
CN114080783A (en) * | 2019-07-03 | 2022-02-22 | 沙特阿拉伯石油公司 | System and method for securely communicating selective data sets between terminals supporting multiple applications |
CN110347724A (en) * | 2019-07-12 | 2019-10-18 | 深圳众赢维融科技有限公司 | Abnormal behaviour recognition methods, device, electronic equipment and medium |
CN113079127A (en) * | 2020-01-03 | 2021-07-06 | 台达电子工业股份有限公司 | Generation and application method of attack recognition data model |
CN113079127B (en) * | 2020-01-03 | 2023-06-02 | 台达电子工业股份有限公司 | Method for generating and applying attack recognition data model |
WO2021159575A1 (en) * | 2020-02-12 | 2021-08-19 | 网宿科技股份有限公司 | Machine learning technique based whitelist self-learning method and device |
CN111740957A (en) * | 2020-05-21 | 2020-10-02 | 江苏信息职业技术学院 | Automatic XSS attack detection method based on FP-tree optimization |
CN111970251A (en) * | 2020-07-28 | 2020-11-20 | 西安万像电子科技有限公司 | Data processing method and server |
CN112968872A (en) * | 2021-01-29 | 2021-06-15 | 成都信息工程大学 | Malicious flow detection method, system and terminal based on natural language processing |
CN112968872B (en) * | 2021-01-29 | 2023-04-18 | 成都信息工程大学 | Malicious flow detection method, system and terminal based on natural language processing |
CN113783889A (en) * | 2021-09-22 | 2021-12-10 | 南方电网数字电网研究院有限公司 | Firewall control method for linkage access of network layer and application layer and firewall thereof |
CN114048740B (en) * | 2021-09-28 | 2022-10-28 | 马上消费金融股份有限公司 | Sensitive word detection method and device and computer readable storage medium |
CN114048740A (en) * | 2021-09-28 | 2022-02-15 | 马上消费金融股份有限公司 | Sensitive word detection method and device and computer readable storage medium |
CN113904834A (en) * | 2021-09-30 | 2022-01-07 | 北京华清信安科技有限公司 | XSS attack detection method based on machine learning |
CN113886815A (en) * | 2021-10-14 | 2022-01-04 | 北京华清信安科技有限公司 | SQL injection attack detection method based on machine learning |
CN115987620A (en) * | 2022-12-21 | 2023-04-18 | 北京天云海数技术有限公司 | Method and system for detecting web attack |
CN115987620B (en) * | 2022-12-21 | 2023-11-07 | 北京天云海数技术有限公司 | Method and system for detecting web attack |
CN116980235A (en) * | 2023-09-25 | 2023-10-31 | 成都数智创新精益科技有限公司 | Artificial intelligence-based interception method for WEB illegal request |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109714341A (en) | A kind of Web hostile attack identification method, terminal device and storage medium | |
Li et al. | Significant permission identification for machine-learning-based android malware detection | |
Uwagbole et al. | Applied machine learning predictive analytics to SQL injection attack detection and prevention | |
Han et al. | Learning to predict severity of software vulnerability using only vulnerability description | |
CN104077396B (en) | Method and device for detecting phishing website | |
CN106295333B (en) | method and system for detecting malicious code | |
KR101858620B1 (en) | Device and method for analyzing javascript using machine learning | |
CN111783132A (en) | SQL sentence security detection method, device, equipment and medium based on machine learning | |
CN113098887A (en) | Phishing website detection method based on website joint characteristics | |
Ban et al. | Integration of multi-modal features for android malware detection using linear SVM | |
CN110191096A (en) | A kind of term vector homepage invasion detection method based on semantic analysis | |
CN110263539A (en) | A kind of Android malicious application detection method and system based on concurrent integration study | |
CN106446124A (en) | Website classification method based on network relation graph | |
CN112131249A (en) | Attack intention identification method and device | |
CN106874760A (en) | A kind of Android malicious code sorting techniques based on hierarchy type SimHash | |
CN103488707A (en) | Method of searching for candidate classes based on greedy strategy and heuristic algorithm | |
Si et al. | Malware detection using automated generation of yara rules on dynamic features | |
Hao et al. | SCScan: A SVM-based scanning system for vulnerabilities in blockchain smart contracts | |
Zhu et al. | Making smart contract classification easier and more effective | |
CN112016317A (en) | Sensitive word recognition method and device based on artificial intelligence and computer equipment | |
CN116975865A (en) | Malicious Office document detection method, device, equipment and storage medium | |
Parmar et al. | A review on data balancing techniques and machine learning methods | |
Fettaya et al. | Detecting malicious PDF using CNN | |
CN107239704A (en) | Malicious web pages find method and device | |
Kalouptsoglou et al. | Software vulnerability prediction: A systematic mapping study |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190503 |