CN111314291A - Website security detection method and device and storage medium - Google Patents

Info

Publication number
CN111314291A
Authority
CN
China
Prior art keywords
detection result
result
website
target website
security
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010041463.8A
Other languages
Chinese (zh)
Inventor
顾泽宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202010041463.8A
Publication of CN111314291A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1483 Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)

Abstract

The disclosure relates to a website security detection method and device and a storage medium. The website security detection method may include: detecting a target website by using a positive sample model to obtain a first detection result; detecting the target website by using a negative sample model to obtain a second detection result; and determining the security detection result of the target website according to the first detection result and the second detection result. In the embodiments of the application, the target website is detected with both the positive sample model and the negative sample model, and the detection results of the two models are then combined to determine the security of the target website. Compared with detecting the target website with a single model, this reduces problems such as poor detection accuracy caused by scenarios that a single model does not cover, and improves the detection accuracy for the target website.

Description

Website security detection method and device and storage medium
Technical Field
The present disclosure relates to the field of information technology, and in particular, to a method and an apparatus for detecting website security, and a storage medium.
Background
As the network security situation grows increasingly severe, improving network security becomes ever more urgent. The related art uses various models for the security detection of websites. However, a sample data set must be collected before a model is trained, and in a real network environment attack modes are diverse, so a sample data set can hardly cover all malicious websites. One current type of attack is the zero-day attack, also called a zero-time-difference attack, in which a security vulnerability is maliciously exploited immediately after it is discovered. Colloquially, the associated malicious program appears on the same day that the security patch and the flaw are made public. Such attacks tend to be sudden and highly destructive. As a result, website security detection in the related art suffers from coverage gaps or poor accuracy.
Disclosure of Invention
The disclosure provides a website security detection method and device and a storage medium.
A first aspect of the embodiments of the present application provides a method for detecting website security, including:
detecting a target website by using a positive sample model to obtain a first detection result;
detecting the target website by using the negative sample model to obtain a second detection result;
and determining the security detection result of the target website according to the first detection result and the second detection result.
Based on the above scheme, the determining a security detection result of the target website according to the first detection result and the second detection result includes:
merging the first detection result and the second detection result according to a pessimistic merging algorithm to obtain the security detection result of the target website.
Based on the above scheme, the first detection result and the second detection result include any one of N indication values; wherein the security of the target website indicated by different indicated values is different; n is a positive integer equal to or greater than 2;
the merging the first detection result and the second detection result to obtain the security detection result of the target website according to a pessimistic merging algorithm includes:
when the indicated value indicating a safe website is smaller than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the security detection result of the target website according to the maximum of the indicated value contained in the first detection result and the indicated value contained in the second detection result;
or,
when the indicated value indicating a safe website is greater than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the security detection result of the target website according to the minimum of the indicated value contained in the first detection result and the indicated value contained in the second detection result.
Based on the above scheme, the merging the first detection result and the second detection result to obtain the security detection result of the target website according to a pessimistic merging algorithm includes at least one of the following:
according to the pessimistic merging algorithm, when the first detection result and the second detection result are both safety results, determining that the safety detection result of the target website is a safety result;
according to the pessimistic merging algorithm, when at least one of the first detection result and the second detection result is a risk result, determining that the security detection result of the target website is a risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safety result and the other is an unknown result, determining that the safety detection result of the target website is an unknown result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safe result and the other is a suspected risk result, determining that the safety detection result of the target website is a suspected risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is an unknown result and the other is a suspected risk result, determining that the security detection result of the target website is a suspected risk result.
Based on the scheme, the positive sample model is a logistic regression model; and/or the negative sample model is a Support Vector Machine (SVM).
A second aspect of the embodiments of the present application provides a device for detecting website security, including:
the first detection module is configured to detect the target website by using the positive sample model to obtain a first detection result;
the second detection module is configured to detect the target website by using the negative sample model to obtain a second detection result;
a determining module configured to determine a security detection result of the target website according to the first detection result and the second detection result.
Based on the above scheme, the determining module is configured to merge the first detection result and the second detection result according to a pessimistic merging algorithm to obtain the security detection result of the target website.
Based on the above scheme, the first detection result and the second detection result include any one of N indication values; wherein the security of the target website indicated by different indicated values is different; n is a positive integer equal to or greater than 2;
the determining module is configured to: when the indicated value indicating a safe website is smaller than the indicated value indicating a risk website, determine, according to the pessimistic merging algorithm, the security detection result of the target website according to the maximum of the indicated value contained in the first detection result and the indicated value contained in the second detection result; or, when the indicated value indicating a safe website is greater than the indicated value indicating a risk website, determine, according to the pessimistic merging algorithm, the security detection result of the target website according to the minimum of the indicated value contained in the first detection result and the indicated value contained in the second detection result.
Based on the above scheme, the determining module is configured to perform at least one of:
according to the pessimistic merging algorithm, when the first detection result and the second detection result are both safety results, determining that the safety detection result of the target website is a safety result;
according to the pessimistic merging algorithm, when at least one of the first detection result and the second detection result is a risk result, determining that the security detection result of the target website is a risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safety result and the other is an unknown result, determining that the safety detection result of the target website is an unknown result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safe result and the other is a suspected risk result, determining that the safety detection result of the target website is a suspected risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is an unknown result and the other is a suspected risk result, determining that the security detection result of the target website is a suspected risk result.
Based on the scheme, the positive sample model is a logistic regression model; and/or the negative sample model is a Support Vector Machine (SVM).
A third aspect of the embodiments of the present application provides a website security detection apparatus, including a processor, a memory, and an executable program that is stored in the memory and can be run by the processor, where the processor executes the steps of the website security detection method according to any technical solution of the first aspect when running the executable program.
A fourth aspect of the embodiments of the present application provides a storage medium, on which an executable program is stored, where the executable program, when executed by a processor, implements the steps of the website security detection method provided in any of the foregoing technical solutions of the first aspect.
The technical scheme provided by the embodiments of the disclosure can have the following beneficial effects: in the embodiments of the application, the target website is detected with both the positive sample model and the negative sample model, and the detection results of the two models are then combined to determine the security of the target website. Compared with detecting the target website with a single model, this reduces problems such as poor detection accuracy caused by scenarios that a single model does not cover, improves the detection accuracy for the target website, and improves the detection rate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flowchart illustrating a website security detection method according to an exemplary embodiment.
FIG. 2 is a flowchart illustrating a website security detection method according to an exemplary embodiment.
FIG. 3 is a block diagram illustrating a website security detection apparatus according to an exemplary embodiment.
FIG. 4 is a block diagram illustrating a website security detection apparatus according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
As shown in FIG. 1, the present embodiment provides a method for detecting website security, including:
S11: detecting a target website by using a positive sample model to obtain a first detection result;
S12: detecting the target website by using a negative sample model to obtain a second detection result;
S13: determining the security detection result of the target website according to the first detection result and the second detection result.
The website security detection method can be applied to a terminal or a server. For example, before the terminal sends an access request carrying a target website, the terminal may perform security detection on the website by using the method. Alternatively, the terminal sends the website to the server, and the server performs the security detection on it.
In the embodiments of the application, when the security of the target website is detected, the two models are used at the same time, each detecting the target website separately. The target website here may include: various types of Uniform Resource Locators (URLs).
The positive sample model and the negative sample model are two models that oppose each other, and for example, the positive sample model and the negative sample model are machine learning models or deep learning models that are independent of each other.
The positive sample model is a model obtained by using safe website training. The negative sample model is a model obtained by using unsafe website training.
In the embodiments of the present application, the positive sample model and the negative sample model may have the same or different model structures.
In some embodiments, the positive sample model and the negative sample model adopt different model structures, so that the characteristics of the positive samples and the characteristics of the negative samples are each highlighted by a different model structure, which helps ensure the accuracy with which each model judges whether the target website is a safe website. For example, the positive sample model may be a logistic regression model, and the negative sample model may be a Support Vector Machine (SVM). Of course, the positive sample model and the negative sample model may also both be convolutional neural networks that differ in the number of convolution layers or in the number of nodes contained in the convolution layers. In a specific implementation, the positive sample model and the negative sample model are not limited to the model structures exemplified above.
In some embodiments, the positive sample model may determine whether the target website is a safe website or not, or a probability of being a safe website, etc. according to a similarity between the safe website and the target website. The lower the probability of being a safe web site, the higher the probability of being a risky web site. The risk website here may be various malicious websites. The risk website herein includes, but is not limited to, at least one of:
a website suspected of social fraud;
a website suspected of information fraud;
a website suspected of false sales;
a web address containing a malicious file;
a website address of an illegal lottery website;
a website address of a pornographic website.
In other embodiments, the negative sample model may determine whether the target website is a risk website, or the probability that the target website is a risk website, according to the similarity between the risk websites and the target website.
In some embodiments, the sensitivity of the positive sample model to safe websites may be higher than the sensitivity of the negative sample model, and the sensitivity of the negative sample model to unsafe websites (risk websites) may be higher than the sensitivity of the positive sample model.
In some embodiments, S11 and S12 may be performed synchronously. In other embodiments, S11 may be performed first and S12 afterwards. For example, S12 is performed only if the first detection result obtained in S11 indicates that the target website is a safe website; this reduces the number of times S12 is executed, avoids unnecessary computation, and speeds up the output of the security detection result of the target website. When S12 is skipped, the final security detection result of the target website may be determined solely according to the first detection result of S11. Of course, S12 may instead be performed first and S11 afterwards; for example, S11 is performed only after S12 yields a second detection result indicating that the target website is a safe website, which reduces unnecessary executions of S11. When S11 is skipped, the final security detection result of the target website may be determined solely according to the second detection result of S12.
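The sequential variant can be sketched as follows. This is only a minimal illustration under stated assumptions: each model is represented by a hypothetical callable that takes a URL and returns either "safe" or "risky" (a two-valued result), and the toy models at the bottom are stand-ins invented for the example, not the models of the disclosure.

```python
from typing import Callable, List

# Hypothetical interface (an assumption): URL -> "safe" or "risky".
Detector = Callable[[str], str]


def detect_sequential(url: str, first: Detector, second: Detector) -> str:
    """Run `first`; run `second` only if `first` reports the URL as safe."""
    first_result = first(url)
    if first_result != "safe":
        # With two-valued results, a risky first result already decides a
        # pessimistic merge, so the second model is skipped.
        return first_result
    return second(url)


# Toy stand-in models, for illustration only.
calls: List[str] = []


def positive_model(url: str) -> str:
    calls.append("positive")
    return "safe" if url.startswith("https://") else "risky"


def negative_model(url: str) -> str:
    calls.append("negative")
    return "risky" if "free-prize" in url else "safe"


print(detect_sequential("https://example.test/free-prize", positive_model, negative_model))
```

Passing the models in the opposite order gives the variant in which S12 runs first and S11 runs only when needed.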
In some embodiments, as shown in FIG. 2, the method further comprises:
S14: when the security detection result shows that the target website is a risk website, executing a preset security filtering operation.
In some embodiments, the S14 may include:
when the target website is determined to be a risk website according to the security detection result of the target website, outputting a risk prompt, for example, prompting through a pop-up window that accessing the target website is risky;
and/or,
intercepting the network access request carrying the risk website, and issuing a notice stating the reason for the interception;
and/or,
recording the interception of the risk website, and subsequently accessing the risk website with a test terminal according to the interception record to confirm whether the website is indeed risky. A website confirmed to be risky can be used for online optimization training of the positive sample model and/or the negative sample model.
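The filtering operations of S14 might be organized as in the sketch below. The class name `SecurityFilter`, its fields, and the shape of the returned dictionary are hypothetical choices made for this illustration; the disclosure does not prescribe any particular interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SecurityFilter:
    """Hypothetical holder for the S14 operations (names are illustrative)."""

    interception_log: List[str] = field(default_factory=list)

    def handle(self, url: str, result: str) -> dict:
        """Return whether access is allowed and, if not, a notice for the user."""
        if result != "risky":
            return {"allow": True, "notice": None}
        # Record the interception so that a test terminal can re-check the URL
        # later and confirmed risky URLs can feed back into model training.
        self.interception_log.append(url)
        notice: Optional[str] = (
            f"Access to {url} was blocked: the URL was detected as risky."
        )
        return {"allow": False, "notice": notice}


security_filter = SecurityFilter()
print(security_filter.handle("http://bad.test", "risky")["notice"])
```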
The S13 may include: merging the first detection result and the second detection result according to a pessimistic merging algorithm to obtain the security detection result of the target website.
In this embodiment of the application, when the pessimistic merging algorithm merges the first detection result and the second detection result, the more pessimistic of the two results is used as the final security detection result of the target website. The pessimistic result here is the one that judges the target website to be an unsafe website.
The pessimistic merging algorithm means that the more pessimistic result is selected; that is, of the two results, the one indicating the higher probability that the target website is a risk website is taken as the final security detection result of the target website.
Thus, by merging the first detection result and the second detection result with a pessimistic merging algorithm, the detection result of the target website is pessimistic, or conservative. The target website is determined to be safe only when both the first detection result and the second detection result indicate that it is safe; otherwise it is determined to be unsafe. This reduces missed detections when the target website is a risk website, and reduces the threats to information security, terminal use security, property security related to the terminal, and the like that are caused by accessing risk websites.
Of course, in other embodiments, determining the final security detection result of the target website based on the first detection result and the second detection result is not limited to the pessimistic merging algorithm. For example, in some cases, the first detection result may be a first probability value representing that the target website is a safe website, and the second detection result may be a second probability value representing that the target website is a risk website. In S13, the first probability value and the second probability value may be substituted into a security value calculation formula to calculate a value representing the final security detection result of the target website, and whether the target website is a safe website is then determined according to that value. For example, the first probability value is multiplied by a first weight to obtain a first product, the second probability value is multiplied by a second weight to obtain a second product, and the difference between the first product and the second product is taken as the value indicating the final security result of the target website. This value is compared with a threshold to judge the security of the target website. Of course, this is only one example of S13, and the specific implementation is not limited thereto.
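The weighted alternative described above can be sketched as follows. The default weights of 1.0 and the threshold of 0.5 are assumptions chosen only for illustration; the disclosure does not give concrete values, and the function names are hypothetical.

```python
def weighted_security_score(p_safe: float, p_risk: float,
                            w_safe: float = 1.0, w_risk: float = 1.0) -> float:
    """First product minus second product, as described in the text.

    p_safe: first detection result, probability that the URL is safe.
    p_risk: second detection result, probability that the URL is risky.
    The weights are illustrative assumptions.
    """
    return w_safe * p_safe - w_risk * p_risk


def is_safe(p_safe: float, p_risk: float, threshold: float = 0.5,
            w_safe: float = 1.0, w_risk: float = 1.0) -> bool:
    """Compare the combined score with an (assumed) threshold."""
    return weighted_security_score(p_safe, p_risk, w_safe, w_risk) >= threshold


print(is_safe(0.9, 0.1))
print(is_safe(0.6, 0.4))
```

With unit weights the score lies between -1 and 1, so it behaves as a score rather than a strict probability; an implementation would need to choose the weights and the threshold together.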
In some embodiments, the first detection result and the second detection result comprise any one of N indication values; wherein the security of the target website indicated by different indicated values is different; and N is a positive integer equal to or greater than 2.
For example, a detection result may be any one of the N indication values, and different indication values indicate different levels of security. In some embodiments, the higher the security, the smaller the indication value. In other embodiments, the higher the security, the greater the indication value. For example, N is 3, indicating three security states: safe, unknown, and risky. As another example, N may be 4, indicating four security states: safe, unknown, possibly risky, and risky.
In some embodiments, the S13 may include:
when the indicated value indicating a safe website is smaller than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the security detection result of the target website according to the maximum of the indicated value contained in the first detection result and the indicated value contained in the second detection result. In this case, the larger the indicated value, the lower the security of the target website. Under the pessimistic merging algorithm, max is applied to the first detection result and the second detection result, and the security level corresponding to the resulting maximum value is taken as the final security detection result of the target website.
In other embodiments, the S13 may include:
when the indicated value indicating a safe website is greater than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the security detection result of the target website according to the minimum of the indicated value contained in the first detection result and the indicated value contained in the second detection result. In this case, the smaller the indicated value, the lower the security of the target website. Under the pessimistic merging algorithm, min is applied to the first detection result and the second detection result, and the security level corresponding to the resulting minimum value is taken as the final security detection result of the target website.
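The two encodings can be handled by a single helper, sketched below. The function name, the `safe_is_smaller` flag, and the example encoding (0 = safe, 1 = unknown, 2 = risky) are assumptions introduced for this illustration.

```python
def pessimistic_merge(first: int, second: int, safe_is_smaller: bool = True) -> int:
    """Return the less safe of two indication values.

    safe_is_smaller=True:  a smaller value means safer, so take the max.
    safe_is_smaller=False: a larger value means safer, so take the min.
    """
    return max(first, second) if safe_is_smaller else min(first, second)


# Assumed encoding with N = 3: 0 = safe, 1 = unknown, 2 = risky.
print(pessimistic_merge(0, 2))                          # merged value is 2 (risky)
# Reversed encoding: 2 = safe, 1 = unknown, 0 = risky.
print(pessimistic_merge(2, 0, safe_is_smaller=False))   # merged value is 0 (risky)
```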
In some embodiments, the S13 may include at least one of:
according to the pessimistic merging algorithm, when the first detection result and the second detection result are both safety results, determining that the safety detection result of the target website is a safety result;
according to the pessimistic merging algorithm, when at least one of the first detection result and the second detection result is a risk result, determining that the security detection result of the target website is a risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safety result and the other is an unknown result, determining that the safety detection result of the target website is an unknown result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safe result and the other is a suspected risk result, determining that the safety detection result of the target website is a suspected risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is an unknown result and the other is a suspected risk result, determining that the security detection result of the target website is a suspected risk result.
Specifically, it can be seen from table 1:
[TABLE 1: provided as images in the original publication (Figure BDA0002367908590000071, Figure BDA0002367908590000081); not reproduced here]
As can be seen from Table 1, the possible values of the first detection result and the second detection result are the four results "safe", "unknown", "possibly risky", and "risky". Under the pessimistic merging algorithm, the more pessimistic result is selected; that is, of the two results, the one indicating the higher probability that the target website is a risk website is taken as the final security detection result of the target website. Thus, if either the first detection result or the second detection result indicates that the target website is risky, the final detection result of the target website is necessarily risky. Only if both the first detection result and the second detection result indicate that the target website is a safe website is the final security detection result of the target website safe.
An "unknown" detection result here means that the positive or negative sample model cannot determine whether the current website is a safe website or a risk website.
A "possibly risky" detection result here means that the positive or negative sample model considers that the website is not a safe website, yet is not entirely unable to judge it: the website may be risky.
A "risky" detection result means that the website has been definitively determined to be a malicious website.
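The merging rules stated above can be enumerated programmatically. The sketch below derives every combination from an assumed ordering of the four results, from least to most pessimistic; it is built from the rules in the text and is not a reproduction of Table 1. The names `LEVELS`, `RANK`, and `merge_labels` are hypothetical.

```python
from itertools import product

# Assumed ordering, least to most pessimistic, consistent with the stated rules.
LEVELS = ["safe", "unknown", "possibly risky", "risky"]
RANK = {label: i for i, label in enumerate(LEVELS)}


def merge_labels(first: str, second: str) -> str:
    """Pessimistic merge: keep whichever result ranks as less safe."""
    return first if RANK[first] >= RANK[second] else second


# All 16 combinations of first and second detection results.
merge_table = {(a, b): merge_labels(a, b) for a, b in product(LEVELS, repeat=2)}

for (a, b), merged in merge_table.items():
    print(f"{a:>15} + {b:<15} -> {merged}")
```

Combinations that the text does not list explicitly, such as two "unknown" results, follow from the same ordering under this assumption.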
The positive sample model may be a logistic regression model. The S11 may include:
classifying the target website by using a linear logistic regression model to obtain a first classification value;
mapping the first classification value from a first space to a second space to obtain a second classification value, wherein the value range of the first space is larger than that of the second space, and the second classification value is the probability that the target website is a safe website; and obtaining the first detection result according to the second classification value.
For example, the value range of the first space may be negative infinity to positive infinity; the value range of the second space may be 0 to 1.
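A minimal sketch of these two steps is given below. It assumes that the mapping from the first space to the second space is the standard logistic (sigmoid) function, which is the usual choice in logistic regression but is not named in the text; the numeric features, weights, bias, and the 0.5 threshold are likewise hypothetical.

```python
import math
from typing import Sequence


def linear_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """First classification value: an unbounded real number."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def sigmoid(z: float) -> float:
    """Map a real number into the range 0 to 1 (numerically stable form)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def first_detection_result(features: Sequence[float], weights: Sequence[float],
                           bias: float, threshold: float = 0.5) -> str:
    """Second classification value is P(safe); threshold it into a result."""
    p_safe = sigmoid(linear_score(features, weights, bias))
    return "safe" if p_safe >= threshold else "risky"


# Hypothetical URL features and model parameters, for illustration only.
print(first_detection_result([1.0, 2.0], [0.5, 0.5], bias=-1.0))
```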
As shown in FIG. 3, the present embodiment provides a website security detection apparatus, which includes:
the first detection module 31 is configured to detect the target website by using the positive sample model, so as to obtain a first detection result;
the second detection module 32 is configured to detect the target website by using the negative sample model, so as to obtain a second detection result;
a determining module 33 configured to determine a security detection result of the target website according to the first detection result and the second detection result.
In some embodiments, the first detecting module 31, the second detecting module 32 and the determining module 33 may all be program modules; when executed by a processor, the program modules can obtain the first detection result, the second detection result and the security detection result.
In other embodiments, the first detecting module 31, the second detecting module 32 and the determining module 33 may all be combined software-hardware modules; a combined software-hardware module may include: various programmable arrays; the programmable arrays include, but are not limited to: complex programmable arrays or field programmable arrays.
In still other embodiments, the first detecting module 31, the second detecting module 32 and the determining module 33 may all be pure hardware modules; a pure hardware module may include: an application specific integrated circuit.
In some embodiments, the determining module 33 is configured to combine the first detection result and the second detection result to obtain a security detection result of the target website according to a pessimistic combining algorithm.
In some embodiments, the first detection result and the second detection result comprise any one of N indication values; wherein the security of the target website indicated by different indicated values is different; n is a positive integer equal to or greater than 2;
the determining module 33 is configured to: when the indicated value indicating a safe website is smaller than the indicated value indicating a risk website, determine, according to the pessimistic merging algorithm, the maximum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website; and when the indicated value indicating a safe website is greater than the indicated value indicating a risk website, determine, according to the pessimistic merging algorithm, the minimum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website.
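The maximum/minimum rule above can be sketched as follows. This is an illustration under stated assumptions: the function name `pessimistic_merge` and its parameters `safe_value` and `risk_value` are hypothetical and only make the ordering of the indicated values explicit.

```python
def pessimistic_merge(first: int, second: int, safe_value: int, risk_value: int) -> int:
    """Return the more pessimistic (riskier) of two indicated values."""
    if safe_value == risk_value:
        raise ValueError("safe and risk indicated values must differ")
    if safe_value < risk_value:
        # Larger values mean more risk, so the maximum is the pessimistic choice.
        return max(first, second)
    # Smaller values mean more risk, so the minimum is the pessimistic choice.
    return min(first, second)

# Encoding where safe is the smallest value (safe=0, risk=3): take the maximum.
print(pessimistic_merge(0, 3, safe_value=0, risk_value=3))
# Reversed encoding where safe is the largest value (safe=3, risk=0): take the minimum.
print(pessimistic_merge(3, 1, safe_value=3, risk_value=0))
```

In both encodings the merged value is never safer than either input, which is the property the term "pessimistic" refers to.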
In some embodiments, the determining module 33 is configured to perform at least one of:
according to the pessimistic merging algorithm, when the first detection result and the second detection result are both safety results, determining that the safety detection result of the target website is a safety result;
according to the pessimistic merging algorithm, when at least one of the first detection result and the second detection result is a risk result, determining that the security detection result of the target website is a risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safety result and the other is an unknown result, determining that the safety detection result of the target website is an unknown result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safe result and the other is a suspected risk result, determining that the safety detection result of the target website is a suspected risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is an unknown result and the other is a suspected risk result, determining that the security detection result of the target website is a suspected risk result.
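The five rules listed above can be written out case by case and compared with a plain maximum over an ordered encoding. The sketch below assumes the encoding safe 0, unknown 1, suspected risk 2, risk 3 used in the specific example of this description; the names `Result` and `merge_by_rules`, and the handling of two identical unknown or suspected-risk results (which the listed rules do not cover), are assumptions.

```python
from enum import IntEnum
from itertools import product

class Result(IntEnum):
    SAFE = 0
    UNKNOWN = 1
    SUSPECTED_RISK = 2
    RISK = 3

def merge_by_rules(a: Result, b: Result) -> Result:
    """Apply the listed merging rules one by one."""
    pair = {a, b}
    if Result.RISK in pair:                         # at least one risk result
        return Result.RISK
    if pair == {Result.SAFE}:                       # both safe
        return Result.SAFE
    if pair == {Result.SAFE, Result.UNKNOWN}:
        return Result.UNKNOWN
    if pair == {Result.SAFE, Result.SUSPECTED_RISK}:
        return Result.SUSPECTED_RISK
    if pair == {Result.UNKNOWN, Result.SUSPECTED_RISK}:
        return Result.SUSPECTED_RISK
    # Assumption: two identical results not covered above merge to themselves.
    return a

# Collect every input pair where the rules and max() disagree.
mismatches = [(a, b) for a, b in product(Result, repeat=2)
              if merge_by_rules(a, b) != max(a, b)]
print(mismatches)
```

Under this encoding the rule table and the maximum agree on all sixteen input pairs, so the list of mismatches is empty.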
In some embodiments, the detecting the target website by using the positive sample model to obtain a first detection result includes:
detecting the target website by using a logistic regression model to obtain a first detection value;
mapping the first detection value from a first space to a second space to obtain a second detection value, wherein the value range of the first space is larger than that of the second space, and the second detection value is a probability value that the target website is a safe website;
and obtaining the first detection result according to the second detection value.
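One possible way to obtain a detection result from the second detection value is to compare the probability against fixed thresholds. The description does not specify how this step is carried out, so everything in the sketch below is an assumption: the function name `classify_by_probability`, the threshold values 0.9, 0.5 and 0.2, and the four result labels.

```python
def classify_by_probability(p_safe: float,
                            safe_min: float = 0.9,
                            unknown_min: float = 0.5,
                            suspected_min: float = 0.2) -> str:
    """Turn the probability that a website is safe into a detection result.

    All thresholds are hypothetical placeholders, not values from the source.
    """
    if not 0.0 <= p_safe <= 1.0:
        raise ValueError("p_safe must be a probability between 0 and 1")
    if p_safe >= safe_min:
        return "safe"
    if p_safe >= unknown_min:
        return "unknown"
    if p_safe >= suspected_min:
        return "suspected risk"
    return "risk"

for p in (0.95, 0.6, 0.3, 0.05):
    print(p, classify_by_probability(p))
```

In practice such thresholds would have to be tuned on labeled data; the values here only show the shape of the step.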
In some embodiments, the positive sample model is a logistic regression model; and/or the negative sample model is a Support Vector Machine (SVM).
One specific example is provided below in connection with any of the embodiments described above:
Model training based only on negative samples suffers from two problems: the negative samples cover too few scenarios, and the small number of samples limits the accuracy of the detection model.
In this scheme, separate models are trained on the positive samples and on the negative samples, and during detection the two trained models are applied in sequence to judge the result, which addresses the problem caused by an insufficient number of samples. In addition, because the volume of positive sample data is large, the trained model can cover most characteristics of normal websites, so the insufficient scenario coverage of the negative samples is compensated from the opposite direction.
1) Training the positive sample model: logistic regression is selected to classify URLs on the basis of the positive sample training set. Logistic regression learns a 0/1 classification model from a sample set. It takes a linear combination of the features in the sample set as the independent variable, whose value ranges from negative infinity to positive infinity. A logistic function maps this variable to (0, 1), and the mapped value is interpreted as the probability that y = 1. Logistic regression is a generalized linear model suited to binary classification problems, so it is used to train this model.
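The logistic regression step can be illustrated with a small batch gradient descent loop. This is a sketch under stated assumptions rather than the training procedure of the scheme: the one-dimensional toy features, the labels, the learning rate, the number of epochs and all function names are hypothetical, and real URL features are not modeled.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(samples, labels, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Linear combination of features, then the logistic mapping to (0, 1).
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y
            for i in range(dim):
                grad_w[i] += err * x[i] / n
            grad_b += err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict_probability(w, b, x) -> float:
    """Probability that y = 1 for feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hypothetical toy data: one feature per sample, label 1 for larger feature values.
toy_samples = [[0.0], [1.0], [2.0], [3.0]]
toy_labels = [0, 0, 1, 1]
w, b = train_logistic_regression(toy_samples, toy_labels)
print(predict_probability(w, b, [0.0]), predict_probability(w, b, [3.0]))
```

The learned probability rises with the feature value, so samples resembling the label-1 group receive a probability above 0.5.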
2) Training the negative sample model: an SVM is selected to classify URLs on the basis of the negative sample training set. An SVM is a supervised learning method that represents instances as points in a space, mapped so that instances of different classes are separated by a margin. A new instance is mapped into the same space, and its class is predicted from the side of the margin on which it falls. The model is trained in this way to distinguish the categories to which URLs belong.
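The idea of separating two classes by a margin and predicting from the side on which a new point falls can be sketched with a linear SVM trained by subgradient descent on the regularized hinge loss. This is an assumed, simplified stand-in: the description does not state which SVM variant or kernel is used, and the two-dimensional toy points, the hyperparameters and the function names are hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * |w|^2 + mean hinge loss by batch subgradient descent.

    Labels must be +1 or -1.
    """
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:
                # Only points inside the margin contribute to the hinge subgradient.
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict_side(w, b, x) -> int:
    """Return +1 or -1 according to the side of the separating hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical toy points: two well-separated groups in the plane.
toy_points = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]]
toy_labels = [1, 1, -1, -1]
w, b = train_linear_svm(toy_points, toy_labels)
print(predict_side(w, b, [2.5, 2.5]), predict_side(w, b, [-2.5, -2.0]))
```

A new point is classified purely by the sign of its score, which corresponds to the side of the margin on which it falls.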
3) For each URL to be detected, the positive sample model is first used to obtain a judgment result, the negative sample model is then used for secondary verification, and finally the pessimistic merging algorithm is applied to obtain the final judgment result. The detailed calculation rules of the pessimistic merging algorithm are given in Table 1 above.
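The three-step detection flow can be sketched as a small function that calls the two models in order and merges their outputs. The sketch assumes the encoding safe 0, unknown 1, suspected risk 2, risk 3 from this example; the function `detect_url`, the two stub models and the example URLs are hypothetical placeholders and do not reflect how the trained models actually judge a URL.

```python
from typing import Callable

SAFE, UNKNOWN, SUSPECTED_RISK, RISK = 0, 1, 2, 3
Model = Callable[[str], int]

def detect_url(url: str, positive_model: Model, negative_model: Model) -> int:
    """Judge with the positive sample model, verify with the negative one, then merge."""
    first = positive_model(url)    # initial judgment
    second = negative_model(url)   # secondary verification
    # Pessimistic merge: safe is the smallest value, so the maximum is the riskier result.
    return max(first, second)

# Hypothetical stand-ins for the two trained models.
def stub_positive_model(url: str) -> int:
    return SAFE if url.startswith("https://") else UNKNOWN

def stub_negative_model(url: str) -> int:
    return RISK if "free-prize" in url else SAFE

for candidate in ("https://example.com/", "http://example.com/",
                  "https://example.com/free-prize"):
    print(candidate, detect_url(candidate, stub_positive_model, stub_negative_model))
```

Passing the models in as callables keeps the merging step independent of how each model is trained or served.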
If: safe = 0, unknown = 1, suspected risk = 2, and risk = 3,
Then: result = max{${val1}, ${val2}}, where val1 and val2 are the results returned by the positive sample model and the negative sample model, respectively.
With this scheme, the loss of detection accuracy caused by an insufficient number of samples is addressed, and the training data of the positive samples compensates, from the opposite direction, for the insufficient coverage of a model trained on negative samples. The scheme also has the following characteristics: 1) diverse detection types: the detected website types include social engineering fraud, information fraud, false sales, malicious files, lottery websites, pornographic websites, and the like; 2) high throughput: 25 million website detection requests per day can be supported; 3) low latency: the average response time of the service is within 100 ms; 4) high detection precision: the detection accuracy on labeled samples on the order of millions is above 97%.
The present application provides a website security detection apparatus, including a processor, a memory, and an executable program stored on the memory and capable of being executed by the processor, where the processor executes the website security detection method provided by any of the foregoing technical solutions when executing the executable program, for example, the method shown in fig. 1 and/or fig. 2.
Fig. 4 is a block diagram illustrating a website security detection apparatus 800 according to an exemplary embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 4, the apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power component 806 provides power to the various components of device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor assembly 814 may also detect a change in position of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Embodiments of the present application also provide a non-transitory computer-readable storage medium, which may be referred to as a storage medium for short. The computer-executable instructions stored in the storage medium, when executed by the processor, enable the mobile terminal to perform a website security detection method. The website security detection method can comprise the following steps: detecting a target website by using a positive sample model to obtain a first detection result; detecting the target website by using the negative sample model to obtain a second detection result; and determining the security detection result of the target website according to the first detection result and the second detection result.
In some embodiments, the determining a security detection result of the target website according to the first detection result and the second detection result includes: and according to a pessimistic combination algorithm, combining the first detection result and the second detection result to obtain a security detection result of the target website.
In some embodiments, the first detection result and the second detection result comprise any one of N indication values; wherein the security of the target website indicated by different indicated values is different; n is a positive integer equal to or greater than 2; the merging the first detection result and the second detection result to obtain the security detection result of the target website according to a pessimistic merging algorithm includes:
when the indicated value indicating a safe website is smaller than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the maximum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website; or, when the indicated value indicating a safe website is greater than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the minimum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website.
In some embodiments, the merging the first detection result and the second detection result to obtain the security detection result of the target website according to a pessimistic merging algorithm includes at least one of:
according to the pessimistic merging algorithm, when the first detection result and the second detection result are both safety results, determining that the safety detection result of the target website is a safety result;
according to the pessimistic merging algorithm, when at least one of the first detection result and the second detection result is a risk result, determining that the security detection result of the target website is a risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safety result and the other is an unknown result, determining that the safety detection result of the target website is an unknown result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safe result and the other is a suspected risk result, determining that the safety detection result of the target website is a suspected risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is an unknown result and the other is a suspected risk result, determining that the security detection result of the target website is a suspected risk result.
In some embodiments, the detecting the target website by using the positive sample model to obtain a first detection result includes:
detecting the target website by using a logistic regression model to obtain a first detection value;
mapping the first detection value from a first space to a second space to obtain a second detection value, wherein the value range of the first space is larger than that of the second space, and the second detection value is a probability value that the target website is a safe website;
and obtaining the first detection result according to the second detection value.
In some embodiments, the negative sample model is an SVM.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (12)

1. A website security detection method is characterized by comprising the following steps:
detecting a target website by using a positive sample model to obtain a first detection result;
detecting the target website by using the negative sample model to obtain a second detection result;
and determining the security detection result of the target website according to the first detection result and the second detection result.
2. The method according to claim 1, wherein the determining the security detection result of the target website according to the first detection result and the second detection result comprises:
and according to a pessimistic combination algorithm, combining the first detection result and the second detection result to obtain a security detection result of the target website.
3. The method of claim 2, wherein the first and second detection results comprise any one of N indication values; wherein the security of the target website indicated by different indicated values is different; n is a positive integer equal to or greater than 2;
the merging the first detection result and the second detection result to obtain the security detection result of the target website according to a pessimistic merging algorithm includes:
when the indicated value indicating a safe website is smaller than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the maximum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website;
or,
when the indicated value indicating a safe website is greater than the indicated value indicating a risk website, determining, according to the pessimistic merging algorithm, the minimum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website.
4. The method of claim 2, wherein the combining the first detection result and the second detection result according to a pessimistic combination algorithm to obtain the security detection result of the target website comprises at least one of:
according to the pessimistic merging algorithm, when the first detection result and the second detection result are both safety results, determining that the safety detection result of the target website is a safety result;
according to the pessimistic merging algorithm, when at least one of the first detection result and the second detection result is a risk result, determining that the security detection result of the target website is a risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safety result and the other is an unknown result, determining that the safety detection result of the target website is an unknown result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safe result and the other is a suspected risk result, determining that the safety detection result of the target website is a suspected risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is an unknown result and the other is a suspected risk result, determining that the security detection result of the target website is a suspected risk result.
5. The method of claim 1, wherein the positive sample model is a logistic regression model; and/or the negative sample model is a Support Vector Machine (SVM).
6. A website security detection apparatus, comprising:
the first detection module is configured to detect the target website by using the positive sample model to obtain a first detection result;
the second detection module is configured to detect the target website by using the negative sample model to obtain a second detection result;
a determining module configured to determine a security detection result of the target website according to the first detection result and the second detection result.
7. The apparatus according to claim 6, wherein the determining module is configured to combine the first detection result and the second detection result according to a pessimistic combining algorithm to obtain a security detection result of the target website.
8. The apparatus of claim 7, wherein the first and second detection results comprise any one of N indication values; wherein the security of the target website indicated by different indicated values is different; n is a positive integer equal to or greater than 2;
the determining module is configured to: when the indicated value indicating a safe website is smaller than the indicated value indicating a risk website, determine, according to the pessimistic merging algorithm, the maximum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website; or, when the indicated value indicating a safe website is greater than the indicated value indicating a risk website, determine, according to the pessimistic merging algorithm, the minimum of the indicated values contained in the first detection result and the second detection result as the security detection result of the target website.
9. The apparatus of claim 7, wherein the determining module is configured to perform at least one of:
according to the pessimistic merging algorithm, when the first detection result and the second detection result are both safety results, determining that the safety detection result of the target website is a safety result;
according to the pessimistic merging algorithm, when at least one of the first detection result and the second detection result is a risk result, determining that the security detection result of the target website is a risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safety result and the other is an unknown result, determining that the safety detection result of the target website is an unknown result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is a safe result and the other is a suspected risk result, determining that the safety detection result of the target website is a suspected risk result;
according to the pessimistic merging algorithm, when one of the first detection result and the second detection result is an unknown result and the other is a suspected risk result, determining that the security detection result of the target website is a suspected risk result.
10. The apparatus of claim 7, wherein the positive sample model is a logistic regression model; and/or the negative sample model is a Support Vector Machine (SVM).
11. A website security detection apparatus, comprising a processor, a memory and an executable program stored on the memory and capable of being executed by the processor, wherein the steps of the website security detection method according to any one of claims 1 to 5 are executed when the processor executes the executable program.
12. A storage medium on which an executable program is stored, wherein the executable program, when executed by a processor, implements the steps of the web site security detection method according to any one of claims 1 to 5.
CN202010041463.8A 2020-01-15 2020-01-15 Website security detection method and device and storage medium Pending CN111314291A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010041463.8A CN111314291A (en) 2020-01-15 2020-01-15 Website security detection method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010041463.8A CN111314291A (en) 2020-01-15 2020-01-15 Website security detection method and device and storage medium

Publications (1)

Publication Number Publication Date
CN111314291A true CN111314291A (en) 2020-06-19

Family

ID=71161420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010041463.8A Pending CN111314291A (en) 2020-01-15 2020-01-15 Website security detection method and device and storage medium

Country Status (1)

Country Link
CN (1) CN111314291A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200619)