CN109714341A - Web malicious attack identification method, terminal device and storage medium - Google Patents

Web malicious attack identification method, terminal device and storage medium

Info

Publication number
CN109714341A
Authority
CN
China
Prior art keywords
data
web
word
idf
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811619182.5A
Other languages
Chinese (zh)
Inventor
陈奋
陈荣有
程长高
姚鸿富
吴顺祥
高云龙
陈柏华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Service Cloud Mdt Infotech Ltd
Original Assignee
Xiamen Service Cloud Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Service Cloud Mdt Infotech Ltd
Priority to CN201811619182.5A
Publication of CN109714341A

Abstract

The present invention relates to the field of Web security and proposes a Web malicious attack identification method, a terminal device, and a storage medium. The method comprises two major stages: model building and data identification. Model building includes: Step 1: collect blacklist and whitelist sample data from a large volume of web access data, decode the network addresses in the sample data into a unified encoding, and apply character processing to the decoded network addresses; Step 2: extract features from the sample data processed in Step 1 using the TF-IDF algorithm, and calculate the feature value of each sample; Step 3: train a support vector machine on the feature values of the blacklist and whitelist samples, and save the trained classification model, which is used to distinguish blacklist data from whitelist data. The invention applies TF-IDF and support vector machines to Web security detection in order to identify malicious attack requests quickly.

Description

Web malicious attack identification method, terminal device and storage medium
Technical Field
The present invention relates to the field of Web security, and in particular to a Web malicious attack identification method, a terminal device, and a storage medium.
Background
With the development of the global Internet, the world has entered a high-speed information age. Through the network, people can easily browse and share vast amounts of data. Meanwhile, more and more enterprises implement their core business through Web applications, which ties the fate of enterprises closely to network security and, in turn, to the daily lives of the general public. However, because the Web itself is open and hard to control, security incidents in which hackers exploit network vulnerabilities occur one after another. Recently, Radware, a leading global provider of network security and application delivery solutions, released its second annual Web application security survey report, the Radware 2018 Web application security status report. The report notes that most enterprises (67%) believe hackers can still penetrate their networks. It also notes that at least 89% of respondents experienced attacks on Web applications or Web servers in the previous year; in particular, the share of respondents reporting encrypted Web attacks rose from 12% in 2017 to 50% in 2018. Most respondents (59%) reported attacks on a daily or weekly basis. As Web attacks grow in frequency and complexity, traditional Web defenses face growing challenges and their shortcomings become increasingly apparent.
To date, traditional Web defenses have relied essentially on rule-based blacklist detection: whether a Web application firewall or an IDS, they depend on the regular expressions of a detection engine to match messages. Although this can withstand most attacks, the following problems remain:
1. The rule base is hard to maintain. Attackers use more and more variants of their attacks: techniques such as alternative encodings, changes of letter case, and equivalent statements can bypass detection and enable all kinds of variant attacks. If a signature rule is added for every such variant, the signature database becomes bloated and difficult to maintain.
2. Writing rules is demanding. A rule written too broadly easily produces false positives; a rule written too narrowly is easily bypassed.
3. When the number of regular expressions grows too large, protection performance is seriously affected.
4. Protection against new attack techniques is poor.
This analysis of traditional rule-based blacklist detection shows that the problem to be solved at present is how to identify malicious attack requests quickly and accurately among a massive volume of requests.
Summary of the Invention
In view of the above problems, the present invention aims to provide a Web malicious attack identification method, a terminal device, and a storage medium. By introducing machine learning techniques into the field of Web security, it applies TF-IDF and support vector machines to Web security detection in order to identify malicious attack requests quickly.
The specific scheme is as follows:
A Web malicious attack identification method, comprising the following steps:
(1) Building the classification model
Step 1: collect blacklist and whitelist sample data from a large volume of web access data, decode the network addresses in the sample data into a unified encoding format, and apply character processing to the decoded network addresses, so as to avoid the influence of meaningless characters and to unify the format;
Step 2: extract features from the sample data processed in Step 1 using the TF-IDF algorithm, and calculate the feature value of each sample;
Step 3: train a support vector machine on the feature values of the blacklist and whitelist samples, and obtain and save the trained classification model, which is used to distinguish blacklist data from whitelist data;
(2) Data identification
Step 4: decode the network address of the received access data into the encoding format used in Step 1, and apply character processing to the decoded network address;
Step 5: extract features from the data processed in Step 4 using the TF-IDF algorithm, and calculate the feature value of the data;
Step 6: identify the access data with the classification model according to its feature value, and judge whether it belongs to the blacklist data.
Further, the character processing is: convert all letters uniformly to uppercase or to lowercase, and replace all Chinese characters and digits with one specific character, the specific character being different from every character in the network address other than Chinese characters and digits.
Further, the feature value is calculated as follows:
(1) Set the word length to s, and split the data, in order, into multiple words of length s;
(2) Calculate the term frequency TF of each word: TF = 1 + ln(N), where N is the number of times the word occurs in the data;
(3) Calculate the inverse document frequency IDF of each word: IDF = 1 + ln(p/q), where p is the total number of data items and q is the number of data items containing the word;
(4) Calculate the TF-IDF feature value of the data:
Further, the word length is s = 3.
Further, in Step 3, the 1000 sample data items with the largest TF-IDF values from Step 2 are selected as the training data for the support vector machine algorithm.
A Web malicious attack identification terminal device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above method of the embodiments of the present invention.
A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the above method of the embodiments of the present invention.
By adopting the above technical solution, the present invention introduces machine learning techniques into the field of Web security and applies TF-IDF and support vector machines to Web security detection in order to identify malicious attack requests quickly. The resulting model achieves high prediction accuracy for SQL injection and XSS attacks, and is capable of recognizing variant attacks, recognizing new attack patterns, and performing semantic analysis.
Brief Description of the Drawings
Fig. 1 is a flow diagram of Embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of the support vector machine algorithm of Embodiment 1 of the present invention.
Detailed Description
To further illustrate the embodiments, the present invention is provided with drawings. These drawings form part of the disclosure; they mainly illustrate the embodiments and, together with the related description in the specification, explain how the embodiments operate. With reference to them, a person of ordinary skill in the art will understand other possible embodiments and the advantages of the present invention.
The present invention is now further described with reference to the drawings and specific embodiments.
Embodiment 1:
Referring to Fig. 1, the present invention provides a Web malicious attack identification method comprising the following steps:
(1) Building the classification model
Step 1: collect blacklist and whitelist sample data from a large volume of web access data, decode the network addresses (URLs) in the sample data into a unified encoding format, and apply character processing to the decoded network addresses, so as to avoid the influence of meaningless characters and to unify the format.
Those skilled in the art may configure the character processing as required. In this embodiment it is specifically: convert all letters uniformly to uppercase or to lowercase, and replace all Chinese characters and digits with one specific character, the specific character being different from every character in the network address other than Chinese characters and digits.
Replacing Chinese characters and digits with a specific character removes the influence of irrelevant information: for Web blacklist judgment, Chinese characters and digits carry no meaning, so setting them to a specific character simplifies feature extraction and speeds up identification.
In this embodiment all letters are converted to lowercase, so the specific character is set to the uppercase letter "N". Those skilled in the art may also choose another character.
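The preprocessing of Step 1 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `normalize_url` is hypothetical, repeated percent-decoding is an assumption about what "unified decoding" involves, and replacing each digit or Chinese character individually (rather than each run) with "N" is likewise assumed.

```python
import re
from urllib.parse import unquote

# Digits and the basic CJK Unified Ideographs range (assumed to cover "Chinese").
_DIGIT_OR_CHINESE = re.compile(r"[0-9\u4e00-\u9fff]")


def normalize_url(raw_url, max_rounds=3):
    """Decode a URL, lowercase it, and mask digits and Chinese characters with 'N'."""
    decoded = raw_url
    # Assumption: decode repeatedly so that double-encoded input (%2520) is unified.
    for _ in range(max_rounds):
        step = unquote(decoded)
        if step == decoded:
            break
        decoded = step
    # Lowercase first, so the uppercase placeholder "N" cannot collide with a letter.
    lowered = decoded.lower()
    return _DIGIT_OR_CHINESE.sub("N", lowered)
```

Lowercasing before substitution is what makes the uppercase "N" a safe placeholder: after that step no other uppercase letter remains in the address.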
Step 2: extract features from the sample data processed in Step 1 using the TF-IDF algorithm, and calculate the feature value of each sample.
TF-IDF (term frequency-inverse document frequency) is a weighting technique commonly used in information retrieval and data mining. It assesses how important a word is to one document in a document set or corpus. In this embodiment it is applied to the field of Web security: this statistics-based method is used to extract features from a large volume of web access data and obtain the most representative key words, thereby accomplishing the feature conversion.
(1) During feature conversion, the term frequency TF must first be computed for the sample data processed in Step 1. In the implementation, three characters are taken as the length of one word, and smoothing and normalization of the data are also applied to improve the prediction accuracy of classification after feature conversion.
With smoothing, the term frequency TF is calculated as:
TF = 1 + ln(N)
where N is the number of times a given word occurs in the data.
A specific example follows.
Suppose an access data item is "/css/css_js.php". With three characters as the word length, it splits into 13 overlapping words: /cs, css, ss/, s/c, /cs, css, ss_, s_j, _js, js., s.p, .ph, php. Here "/cs" and "css" each occur twice and the remaining 9 words occur once each. By the TF formula, the TF value of "/cs" and of "css" is 1 + ln(2) = 1.693, and the TF value of every other word is 1.
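The split and term-frequency calculation in this example can be sketched in a few lines. The function names are hypothetical, and the overlapping split that advances one character at a time is inferred from the 13 words listed above.

```python
import math
from collections import Counter


def split_words(data, s=3):
    """Split a string into overlapping words of length s, one character apart."""
    return [data[i:i + s] for i in range(len(data) - s + 1)]


def term_frequencies(data, s=3):
    """Return the smoothed term frequency TF = 1 + ln(N) for each distinct word."""
    counts = Counter(split_words(data, s))
    return {word: 1.0 + math.log(n) for word, n in counts.items()}


tf = term_frequencies("/css/css_js.php")
```

For this input the split yields 13 words, 11 of them distinct; `tf["/cs"]` and `tf["css"]` are 1 + ln(2), and every other entry is exactly 1.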
(2) In the example above, the TF values of "/cs", "css", and the other words can serve as the dimensions of the access data after feature conversion and be passed to the classification algorithm for detection. Because "/cs" and "css" occur more often than the other words, they would play a larger role in detection. However, if such words occur in large numbers in both the blacklist and the whitelist samples, they do little to tell the two apart, and their actual contribution to detection is very small. Feature conversion that considers term frequency alone therefore struggles to capture the key features of the blacklist and whitelist samples, and hence to separate normal from abnormal web access data. Conversely, if a word occurs only in blacklist samples, its weight in detecting blacklist samples should still be high even though it occurs rarely relative to the total number of samples. Each word is therefore given a weight according to how representative it is in the blacklist and whitelist samples: the better a word predicts whether data is normal or abnormal, the larger its weight, and vice versa. For example, if the word "css" occurs only in blacklist samples, its weight when predicting blacklist samples is larger; otherwise it is smaller. In this embodiment, the inverse document frequency IDF is used as this measure.
The IDF is calculated as:
IDF = 1 + ln(p/q)
where p is the total number of samples and q is the number of samples containing the word.
Suppose there are 100,000 sample data items with comparable amounts of data of each type, of which 200 contain the word "css" and 1,000 contain the word "/cs". Then:
The weight of "css" in the samples is: IDF = 1 + ln(100000/200) = 7.215,
The weight of "/cs" in the samples is: IDF = 1 + ln(100000/1000) = 5.605.
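The IDF formula and the two weights above can be checked with a short sketch. The helper names are hypothetical; `document_frequencies` additionally shows one way q could be counted over a corpus, assuming each sample counts at most once per word.

```python
import math
from collections import Counter


def inverse_document_frequency(total_samples, samples_with_word):
    """IDF = 1 + ln(p / q), with p the sample total and q the samples containing the word."""
    if samples_with_word <= 0:
        raise ValueError("a word must occur in at least one sample")
    return 1.0 + math.log(total_samples / samples_with_word)


def document_frequencies(samples, s=3):
    """Count, for every word of length s, how many samples contain it at least once."""
    df = Counter()
    for text in samples:
        # A set ensures that repeated words within one sample are counted once.
        df.update({text[i:i + s] for i in range(len(text) - s + 1)})
    return df


idf_css = inverse_document_frequency(100000, 200)
idf_cs = inverse_document_frequency(100000, 1000)
```

The rarer word "css" receives the larger weight, which is the behaviour the paragraph above relies on.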
(3) From the TF values calculated above and the IDF values just introduced, the TF-IDF value of each word is calculated:
TF-IDF = TF*IDF
Thus the TF-IDF values of "/cs" and "css" are 9.489 and 12.215, respectively.
These results show that the word "css" will play the larger role in the detection process.
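As a quick check of the product TF-IDF = TF*IDF, the following sketch recomputes both values from the raw counts. The function name `tf_idf` is hypothetical.

```python
import math


def tf_idf(count_in_sample, total_samples, samples_with_word):
    """Per-word weight: (1 + ln(N)) * (1 + ln(p / q))."""
    tf = 1.0 + math.log(count_in_sample)
    idf = 1.0 + math.log(total_samples / samples_with_word)
    return tf * idf


weight_cs = tf_idf(2, 100000, 1000)   # word "/cs"
weight_css = tf_idf(2, 100000, 200)   # word "css"
```

Computed without rounding the intermediate TF and IDF values, the weight of "/cs" comes out at about 9.490; the 9.489 quoted above is what results from multiplying the rounded figures 1.693 and 5.605. The weight of "css" agrees with 12.215 either way.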
(4) The feature value of the sample data, i.e. its TF-IDF value, is calculated according to the following equation:
where n is the number of words contained in the sample data.
(5) The feature value of the sample data is normalized. This embodiment uses the Frobenius norm for normalization, calculated as follows:
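A minimal sketch of this normalization step is given below. It assumes the usual definition, in which every per-word TF-IDF weight of a sample is divided by the square root of the sum of the squared weights (for a vector, the Frobenius norm coincides with the Euclidean norm). The function name and the dictionary representation are hypothetical.

```python
import math


def frobenius_normalize(weights):
    """Scale a {word: tf_idf} mapping so that its vector of weights has unit norm."""
    norm = math.sqrt(sum(value * value for value in weights.values()))
    if norm == 0.0:
        # An empty or all-zero sample is returned unchanged to avoid dividing by zero.
        return dict(weights)
    return {word: value / norm for word, value in weights.items()}


normalized = frobenius_normalize({"/cs": 3.0, "css": 4.0})
```

After normalization, long and short URLs yield feature vectors of the same length, so the classifier compares the relative weight of words rather than the size of the request.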
With the above method, this embodiment takes into account both the term frequency and the "representativeness" of each word in the samples: each data item is split into words of three characters, and the combined TF-IDF value of each word is calculated as the feature value, which completes the feature conversion.
Step 3: train a support vector machine on the feature values of the blacklist and whitelist samples, obtain the optimal classification model through parameter tuning, and save the model; the classification model is used to distinguish blacklist data from whitelist data.
The higher the dimension, the more complex the computation, and a huge training set easily leads to the "curse of dimensionality"; moreover, additional dimensions do not necessarily improve accuracy much. In the implementation, therefore, the 1000 words with the largest TF-IDF values were selected as the dimensions for training and testing the support vector machine algorithm.
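One way to carry out this selection is sketched below. The text does not say how a single TF-IDF value per word is obtained across many samples; ranking each word by the largest weight it reaches in any sample is an assumption made here, and both function names are hypothetical.

```python
import heapq


def select_vocabulary(sample_weights, k=1000):
    """Pick the k words whose best per-sample TF-IDF weight is largest."""
    best = {}
    for weights in sample_weights:
        for word, value in weights.items():
            if value > best.get(word, 0.0):
                best[word] = value
    return heapq.nlargest(k, best, key=best.get)


def to_vector(weights, vocabulary):
    """Project one sample onto the selected words; absent words get 0."""
    return [weights.get(word, 0.0) for word in vocabulary]


samples = [{"abc": 2.0, "bcd": 1.0}, {"abc": 1.5, "xyz": 3.0}]
vocabulary = select_vocabulary(samples, k=2)
```

Fixing the vocabulary once, at training time, gives every sample a vector of the same length, which is what the support vector machine requires.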
The data are trained and predicted with the support vector machine (SVM) algorithm. A support vector machine is a classification algorithm that improves generalization by seeking the minimum structural risk, minimizing both the empirical risk and the confidence interval, so that good statistical regularities can be obtained even when the number of samples is small. As shown in Fig. 2, it is a binary classification model whose basic form is the linear classifier with the largest margin in feature space; that is, the learning strategy of the support vector machine is margin maximization. The classification model trained by the SVM algorithm on the feature values of the blacklist and whitelist samples can distinguish blacklist data from whitelist data.
The support vector machine was chosen as the classification algorithm for the following reasons:
1. It is based on structural risk minimization, which avoids overfitting and gives strong generalization.
2. The support vector machine is a small-sample learning method with a solid theoretical foundation. It essentially does not involve probability measures or the law of large numbers. In essence, it avoids the traditional route from induction to deduction and realizes efficient "transductive inference" from training samples to prediction samples, greatly simplifying common classification and regression problems.
3. The final decision function of the support vector machine is determined by only a small number of support vectors, and the computational complexity depends on the number of support vectors rather than on the dimension of the sample space, which in a sense avoids the "curse of dimensionality".
4. Because a small number of support vectors determines the final result, the method captures the key samples and "rejects" a large number of redundant ones; this makes the algorithm simple and gives it good robustness.
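To make the margin-maximizing classifier concrete, here is a small stand-in written with the standard library only. The text does not state which kernel, solver, or parameters were used, so everything below is an assumption: a linear SVM trained by sub-gradient descent on the regularized hinge loss, with hypothetical names and with labels +1 for blacklist and -1 for whitelist.

```python
def train_linear_svm(vectors, labels, epochs=200, lam=0.01, lr=0.1):
    """Fit w and b by sub-gradient descent on lam/2*|w|^2 + hinge loss."""
    w = [0.0] * len(vectors[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularization term shrinks w, which favours a wide margin.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # The sample lies inside the margin or is misclassified: move toward it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def is_blacklisted(w, b, x):
    """A positive decision value means the blacklist side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0.0


# Toy feature vectors: the first dimension stands for a "malicious" word.
train_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_y = [1, 1, -1, -1]
w, b = train_linear_svm(train_x, train_y)
```

Only samples with a margin below 1 trigger an update, which mirrors the point made above that a few samples near the boundary determine the final decision function.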
(2) Data identification
Once the classification model is built, newly received web access data can be predicted to judge whether they are blacklist data.
Step 4: decode the network address of the received access data into the encoding format used in Step 1, and replace the Chinese characters and digits in the decoded network address with the character used in Step 1;
Step 5: extract features from the data processed in Step 4 using the TF-IDF algorithm, and calculate the feature value of the data;
Step 6: identify the access data with the classification model according to its feature value, and judge whether it belongs to the blacklist data.
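Steps 4 to 6 can be sketched end to end as follows. The sketch assumes that the word list, the IDF table, and a linear model (`w`, `b`) were all saved at training time, that words never seen in training are ignored, and that a positive decision value means blacklist. All names and the toy values are hypothetical.

```python
import math
import re
from collections import Counter
from urllib.parse import unquote


def featurize(raw_url, vocabulary, idf, s=3):
    """Steps 4 and 5: decode, mask, split into words, weight by TF-IDF, normalize."""
    text = re.sub(r"[0-9\u4e00-\u9fff]", "N", unquote(raw_url).lower())
    counts = Counter(text[i:i + s] for i in range(len(text) - s + 1))
    # Words without an IDF entry were not seen during training and are skipped.
    weights = {word: (1.0 + math.log(n)) * idf[word]
               for word, n in counts.items() if word in idf}
    norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
    return [weights.get(word, 0.0) / norm for word in vocabulary]


def is_blacklisted(raw_url, vocabulary, idf, w, b):
    """Step 6: apply the saved linear decision function to the feature vector."""
    x = featurize(raw_url, vocabulary, idf)
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0.0


# Toy stand-ins for a trained model.
vocabulary = ["scr", "css"]
idf = {"scr": 2.0, "css": 1.0}
w, b = [1.0, -1.0], 0.0
```

The important design point is that prediction reuses the preprocessing, vocabulary, and IDF values from training unchanged; recomputing any of them on incoming traffic would put new requests in a different feature space from the one the model was trained on.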
In this embodiment, 140,000 real data items were used for training and testing, with 80% selected at random as training data and 20% as test data; this validation was repeated 10 times and the mean prediction accuracy was taken. Four algorithm combinations were tested: N-gram+SVM, TF-IDF+SVM, TF-IDF+KNN, and TF-IDF+Logistic Regression. As shown in Table 1, the TF-IDF+SVM model achieved the highest accuracy, 99.89%. The model also has high prediction accuracy for SQL injection and XSS attacks, and is capable of recognizing variant attacks, recognizing new attack patterns, and performing semantic analysis.
Table 1
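The evaluation protocol described above (a random 80/20 split, repeated 10 times, with the accuracies averaged) can be sketched as follows. The function name, the `train_and_predict` callback, and the fixed seed are hypothetical; any of the four algorithm combinations could be plugged in as the callback.

```python
import random


def repeated_holdout_accuracy(samples, labels, train_and_predict,
                              rounds=10, train_fraction=0.8, seed=0):
    """Average test accuracy over several random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    split = int(len(indices) * train_fraction)
    accuracies = []
    for _ in range(rounds):
        rng.shuffle(indices)
        train_idx, test_idx = indices[:split], indices[split:]
        predicted = train_and_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / rounds
```

The callback receives the training samples, the training labels, and the test samples, and returns one predicted label per test sample, so the same harness can compare different feature and classifier combinations on identical splits.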
Embodiment 2:
The present invention also provides a Web malicious attack identification terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method of Embodiment 1 of the present invention.
Further, as one possible implementation, the Web malicious attack identification terminal device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the above structure is only an example of the Web malicious attack identification terminal device and does not limit it; the device may include more or fewer components than described, combine certain components, or use different components. For example, it may also include input/output devices, network access devices, a bus, and so on, which the embodiments of the present invention do not limit.
Further, as one possible implementation, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the Web malicious attack identification terminal device and connects the various parts of the entire device through various interfaces and lines.
The memory may be used to store the computer program and/or modules. The processor implements the various functions of the Web malicious attack identification terminal device by running or executing the computer program and/or modules stored in the memory and by calling data stored in the memory. The memory may mainly comprise a program storage area and a data storage area, where the program storage area may store the operating system and the application programs required by at least one function, and the data storage area may store data created through use of the mobile phone, and so on. In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above method of the embodiments of the present invention.
If the integrated modules/units of the Web malicious attack identification terminal device are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the methods of the above embodiments may also be accomplished by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), a software distribution medium, and so on.
Although the present invention has been specifically shown and described in connection with preferred embodiments, those skilled in the art should understand that various changes in form and detail may be made to the invention without departing from its spirit and scope as defined by the appended claims, and that such changes fall within the protection scope of the present invention.

Claims (7)

1. A Web malicious attack identification method, characterized by comprising the following steps:
(1) Building the classification model
Step 1: collect blacklist and whitelist sample data from a large volume of web access data, decode the network addresses in the sample data into a unified encoding format, and apply character processing to the decoded network addresses, so as to avoid the influence of meaningless characters and to unify the format;
Step 2: extract features from the sample data processed in Step 1 using the TF-IDF algorithm, and calculate the feature value of each sample;
Step 3: train a support vector machine on the feature values of the blacklist and whitelist samples, and obtain and save the trained classification model, which is used to distinguish blacklist data from whitelist data;
(2) Data identification
Step 4: decode the network address of the received access data into the encoding format used in Step 1, and apply character processing to the decoded network address;
Step 5: extract features from the data processed in Step 4 using the TF-IDF algorithm, and calculate the feature value of the data;
Step 6: identify the access data with the classification model according to its feature value, and judge whether it belongs to the blacklist data.
2. The Web malicious attack identification method according to claim 1, characterized in that the character processing is: convert all letters uniformly to uppercase or to lowercase, and replace all Chinese characters and digits with one specific character, the specific character being different from every character in the network address other than Chinese characters and digits.
3. The Web malicious attack identification method according to claim 1, characterized in that the feature value is calculated as follows:
(1) Set the word length to s, and split the data, in order, into multiple words of length s;
(2) Calculate the term frequency TF of each word: TF = 1 + ln(N), where N is the number of times the word occurs in the data;
(3) Calculate the inverse document frequency IDF of each word: IDF = 1 + ln(p/q), where p is the total number of data items and q is the number of data items containing the word;
(4) Calculate the TF-IDF feature value of the data:
4. The Web malicious attack identification method according to claim 3, characterized in that the word length is s = 3.
5. The Web malicious attack identification method according to claim 1, characterized in that in Step 3 the 1000 sample data items with the largest TF-IDF values from Step 2 are selected as the training data for the support vector machine algorithm.
6. A Web malicious attack identification terminal device, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 5.
7. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
CN201811619182.5A 2018-12-28 2018-12-28 Web malicious attack identification method, terminal device and storage medium Pending CN109714341A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811619182.5A CN109714341A (en) 2018-12-28 2018-12-28 A kind of Web hostile attack identification method, terminal device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811619182.5A CN109714341A (en) 2018-12-28 2018-12-28 A kind of Web hostile attack identification method, terminal device and storage medium

Publications (1)

Publication Number Publication Date
CN109714341A true CN109714341A (en) 2019-05-03

Family

ID=66258816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811619182.5A Pending CN109714341A (en) 2018-12-28 2018-12-28 A kind of Web hostile attack identification method, terminal device and storage medium

Country Status (1)

Country Link
CN (1) CN109714341A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110266675A (en) * 2019-06-12 2019-09-20 成都积微物联集团股份有限公司 A kind of xss attack automated detection method based on deep learning
CN110347724A (en) * 2019-07-12 2019-10-18 深圳众赢维融科技有限公司 Abnormal behaviour recognition methods, device, electronic equipment and medium
CN111740957A (en) * 2020-05-21 2020-10-02 江苏信息职业技术学院 Automatic XSS attack detection method based on FP-tree optimization
CN111970251A (en) * 2020-07-28 2020-11-20 西安万像电子科技有限公司 Data processing method and server
CN112968872A (en) * 2021-01-29 2021-06-15 成都信息工程大学 Malicious flow detection method, system and terminal based on natural language processing
CN113079127A (en) * 2020-01-03 2021-07-06 台达电子工业股份有限公司 Generation and application method of attack recognition data model
WO2021159575A1 (en) * 2020-02-12 2021-08-19 网宿科技股份有限公司 Machine learning technique based whitelist self-learning method and device
CN113783889A (en) * 2021-09-22 2021-12-10 南方电网数字电网研究院有限公司 Firewall control method for linkage access of network layer and application layer and firewall thereof
CN113886815A (en) * 2021-10-14 2022-01-04 北京华清信安科技有限公司 SQL injection attack detection method based on machine learning
CN113904834A (en) * 2021-09-30 2022-01-07 北京华清信安科技有限公司 XSS attack detection method based on machine learning
CN114048740A (en) * 2021-09-28 2022-02-15 马上消费金融股份有限公司 Sensitive word detection method and device and computer readable storage medium
CN114080783A (en) * 2019-07-03 2022-02-22 沙特阿拉伯石油公司 System and method for securely communicating selective data sets between terminals supporting multiple applications
CN115987620A (en) * 2022-12-21 2023-04-18 北京天云海数技术有限公司 Method and system for detecting web attack
CN116980235A (en) * 2023-09-25 2023-10-31 成都数智创新精益科技有限公司 Artificial intelligence-based interception method for WEB illegal request

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103577755A (en) * 2013-11-01 2014-02-12 浙江工业大学 Malicious script static detection method based on SVM (support vector machine)
CN105553974A (en) * 2015-12-14 2016-05-04 中国电子信息产业集团有限公司第六研究所 Prevention method of HTTP slow attack
CN105740460A (en) * 2016-02-24 2016-07-06 中国科学技术信息研究所 Webpage collection recommendation method and device
CN107872452A (en) * 2017-10-25 2018-04-03 东软集团股份有限公司 A kind of recognition methods of malicious websites, device, storage medium and program product
CN108229156A (en) * 2017-12-28 2018-06-29 阿里巴巴集团控股有限公司 URL attack detection methods, device and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Gan Hong, Pan Dan: "Analysis and Research on Malicious URL Identification Based on SVM and TF-IDF", Computer and Modernization *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110266675A (en) * 2019-06-12 2019-09-20 成都积微物联集团股份有限公司 A kind of xss attack automated detection method based on deep learning
CN110266675B (en) * 2019-06-12 2022-11-04 成都积微物联集团股份有限公司 Automatic detection method for xss attack based on deep learning
CN114080783A (en) * 2019-07-03 2022-02-22 沙特阿拉伯石油公司 System and method for securely communicating selective data sets between terminals supporting multiple applications
CN110347724A (en) * 2019-07-12 2019-10-18 深圳众赢维融科技有限公司 Abnormal behaviour recognition methods, device, electronic equipment and medium
CN113079127A (en) * 2020-01-03 2021-07-06 台达电子工业股份有限公司 Generation and application method of attack recognition data model
CN113079127B (en) * 2020-01-03 2023-06-02 台达电子工业股份有限公司 Method for generating and applying attack recognition data model
WO2021159575A1 (en) * 2020-02-12 2021-08-19 网宿科技股份有限公司 Machine learning technique based whitelist self-learning method and device
CN111740957A (en) * 2020-05-21 2020-10-02 江苏信息职业技术学院 Automatic XSS attack detection method based on FP-tree optimization
CN111970251A (en) * 2020-07-28 2020-11-20 西安万像电子科技有限公司 Data processing method and server
CN112968872A (en) * 2021-01-29 2021-06-15 成都信息工程大学 Malicious flow detection method, system and terminal based on natural language processing
CN112968872B (en) * 2021-01-29 2023-04-18 成都信息工程大学 Malicious flow detection method, system and terminal based on natural language processing
CN113783889A (en) * 2021-09-22 2021-12-10 南方电网数字电网研究院有限公司 Firewall control method for linkage access of network layer and application layer and firewall thereof
CN114048740B (en) * 2021-09-28 2022-10-28 马上消费金融股份有限公司 Sensitive word detection method and device and computer readable storage medium
CN114048740A (en) * 2021-09-28 2022-02-15 马上消费金融股份有限公司 Sensitive word detection method and device and computer readable storage medium
CN113904834A (en) * 2021-09-30 2022-01-07 北京华清信安科技有限公司 XSS attack detection method based on machine learning
CN113886815A (en) * 2021-10-14 2022-01-04 北京华清信安科技有限公司 SQL injection attack detection method based on machine learning
CN115987620A (en) * 2022-12-21 2023-04-18 北京天云海数技术有限公司 Method and system for detecting web attack
CN115987620B (en) * 2022-12-21 2023-11-07 北京天云海数技术有限公司 Method and system for detecting web attack
CN116980235A (en) * 2023-09-25 2023-10-31 成都数智创新精益科技有限公司 Artificial intelligence-based interception method for WEB illegal request

Similar Documents

Publication Publication Date Title
CN109714341A (en) A kind of Web hostile attack identification method, terminal device and storage medium
Li et al. Significant permission identification for machine-learning-based android malware detection
Uwagbole et al. Applied machine learning predictive analytics to SQL injection attack detection and prevention
Han et al. Learning to predict severity of software vulnerability using only vulnerability description
CN104077396B (en) Method and device for detecting phishing website
CN106295333B (en) method and system for detecting malicious code
KR101858620B1 (en) Device and method for analyzing javascript using machine learning
CN111783132A (en) SQL sentence security detection method, device, equipment and medium based on machine learning
CN113098887A (en) Phishing website detection method based on website joint characteristics
Ban et al. Integration of multi-modal features for android malware detection using linear SVM
CN110191096A (en) A kind of term vector homepage invasion detection method based on semantic analysis
CN110263539A (en) A kind of Android malicious application detection method and system based on concurrent integration study
CN106446124A (en) Website classification method based on network relation graph
CN112131249A (en) Attack intention identification method and device
CN106874760A (en) A kind of Android malicious code sorting techniques based on hierarchy type SimHash
CN103488707A (en) Method of searching for candidate classes based on greedy strategy and heuristic algorithm
Si et al. Malware detection using automated generation of yara rules on dynamic features
Hao et al. SCScan: A SVM-based scanning system for vulnerabilities in blockchain smart contracts
Zhu et al. Making smart contract classification easier and more effective
CN112016317A (en) Sensitive word recognition method and device based on artificial intelligence and computer equipment
CN116975865A (en) Malicious Office document detection method, device, equipment and storage medium
Parmar et al. A review on data balancing techniques and machine learning methods
Fettaya et al. Detecting malicious PDF using CNN
CN107239704A (en) Malicious web pages find method and device
Kalouptsoglou et al. Software vulnerability prediction: A systematic mapping study

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190503
