WO2015058616A1 - Method and apparatus for identifying a malicious website - Google Patents

Method and apparatus for identifying a malicious website

Info

Publication number
WO2015058616A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
malicious
url
feature character
character
Prior art date
Application number
PCT/CN2014/088251
Other languages
English (en)
French (fr)
Inventor
刘健
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2015058616A1
Priority to US15/136,771 (published as US20160241589A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/16 Implementing security features at a particular protocol layer
    • H04L63/168 Implementing security features at a particular protocol layer above the transport layer

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a method and apparatus for identifying a malicious website.
  • the other type is based on the method of web page feature recognition. For example: to identify whether the page contains suspicious keywords and other characteristics.
  • a security software developer is required to perform a large amount of analysis on a malicious URL sample, extract key malicious page features, and add corresponding feature determination logic to the authentication program.
  • from the time a malicious feature first appears on a small number of sites until detection logic for it is finally released, anywhere from weeks to months can pass. As a result, the period between a malicious feature appearing and being discovered is long.
  • the embodiment of the invention provides a method and a device for identifying a malicious website, which are used to solve the above problems.
  • a method for identifying a malicious website including:
  • determining whether the frequency of a first feature character obtained by the feature extraction is higher in the first feature character set than in the second feature character set and, if so, adding the first feature character to the malicious feature library; the feature characters in the malicious feature library are used to identify malicious websites.
  • a device for identifying a malicious website comprising:
  • a specimen obtaining unit configured to acquire a uniform resource locator (URL) that has been determined to be a malicious website, and a URL that has been determined to be a secure website;
  • a feature extraction unit configured to perform feature extraction on a URL of the malicious website acquired by the specimen obtaining unit to obtain a first feature character set, and perform feature extraction on a URL of the secure website to obtain a second feature character set;
  • a feature discriminating unit configured to determine whether the frequency of a first feature character obtained by the feature extraction unit is higher in the first feature character set than in the second feature character set and, if so, to add the first feature character to the malicious feature library; the feature characters in the malicious feature library are used to identify malicious websites.
  • a non-transitory computer readable storage medium having computer executable instructions stored thereon. When the executable instructions are executed in a computer, the following steps are performed:
  • determining whether the frequency of a first feature character obtained by the feature extraction is higher in the first feature character set than in the second feature character set and, if so, adding the first feature character to the malicious feature library; the feature characters in the malicious feature library are used to identify malicious websites.
  • the embodiment of the present invention performs feature character extraction based on a URL, and determines a specific feature character from the extracted feature characters, and adds it to the malicious feature database to facilitate recognition of the malicious website.
  • the new malicious features in the URL are extracted into the malicious feature database by the comparison method, thereby shortening the period from the appearance of the new malicious feature to the discovery.
  • FIG. 1 is a schematic flowchart of a method according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a method according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a system according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an identification device according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of an identification device according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an identification device according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a terminal and a server system according to an embodiment of the present invention.
  • once discriminative features (such as 90sec in the above example) have been collected into a malicious feature library, an unknown URL can first be matched against the library and considered malicious if the match succeeds.
  • the embodiment of the invention provides a method for identifying a malicious website.
  • the method can be implemented in a cloud security server or other server on the network side. As shown in FIG. 1 , the method includes steps 101 to 103.
  • In step 101, URLs that have been determined to belong to malicious websites and URLs that have been determined to belong to secure websites are acquired.
  • the URLs of malicious websites in this step may be those verified within a period of time before the current time point, and the URLs of secure websites may likewise be those verified within a period of time before the current time point.
  • the number of URLs acquired for each domain name can be limited to a predetermined number, to reduce the problem of domain name concentration.
  • the URLs used in this step can be obtained from the database of the security server or by other means, which is not limited in this embodiment.
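The per-domain limit described for step 101 could be sketched as follows. This is only an illustration: the function name `cap_per_domain`, the use of the URL host name as a stand-in for the "domain name", and the sample URLs are all assumptions that do not appear in the source.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def cap_per_domain(urls, max_per_domain):
    """Keep at most max_per_domain URLs per host name, preserving input order."""
    kept = []
    seen = defaultdict(int)  # host name -> number of URLs kept so far
    for url in urls:
        # Assumption: the host name is used as the "domain name" of the URL.
        host = urlsplit(url).hostname or ""
        if seen[host] < max_per_domain:
            seen[host] += 1
            kept.append(url)
    return kept


# Hypothetical sample: three URLs from one host, one from another.
sample = [
    "http://a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://b.example/1",
]
capped = cap_per_domain(sample, max_per_domain=2)
```

With a cap of 2, the third URL from `a.example` is dropped, so a single heavily sampled host cannot dominate the later frequency counts.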
  • In step 102, feature extraction is performed on the URLs of the malicious websites to obtain a first feature character set, and on the URLs of the secure websites to obtain a second feature character set.
  • feature extraction may be performed by splitting each URL at every character that is neither a digit nor an English letter.
  • In step 103, it is determined whether the frequency of a first feature character obtained by the feature extraction is higher in the first feature character set than in the second feature character set; if so, the first feature character is added to the malicious feature library. The feature characters in the malicious feature library are used to identify malicious websites.
  • feature character extraction is performed based on the URL, and specific feature characters are determined from the extracted feature characters, and added to the malicious feature library to facilitate recognition of the malicious website.
  • the newly emerged malicious features in the URL are extracted into the malicious feature database by the method of comparison, thereby shortening the period from the appearance of the new malicious feature to the discovery.
  • the embodiment of the present invention further provides a method for determining whether the frequency of the first feature character in the first feature character set is higher than the frequency in the second feature character set.
  • This method is used to determine distinguishing feature characters. It should be noted that it is also possible to use other methods to determine distinguishing feature characters.
  • determining that the frequency of the first feature character is higher in the first feature character set than in the second feature character set may be implemented as: obtaining the relative frequency of each feature character, where the relative frequency is the ratio of the feature character's frequency in the first feature character set to its frequency in the second feature character set; and checking whether the relative frequency of the first feature character is higher than a predetermined threshold, or whether it ranks within a set range among the relative frequencies of all feature characters.
  • the embodiment of the present invention further provides a specific implementation for verifying the extracted feature characters. Note that each feature character may be verified individually, or a newly determined batch of feature characters may be verified together.
  • the following embodiment provides an example of individual verification: before adding the first feature character to the malicious feature library, the method further includes using the first feature character to test URLs that have been determined to belong to secure websites and, if the false alarm rate is lower than a predetermined threshold, adding the first feature character to the malicious feature library. The malicious feature library may also be used to test URLs that have been determined to belong to secure websites; if the false alarm rate is higher than a predetermined threshold, the predetermined relative frequency threshold is increased, or the set range is narrowed, and whether to add the first feature character to the malicious feature library is re-determined.
  • page features of the website may also be used for security identification.
  • using page features is only one way of performing security identification; there are many others, which the embodiments of the present invention do not list exhaustively.
  • identifying a URL first with the malicious feature library and then with a further method of security identification can further improve security.
  • this further step can also provide a basis for updating the malicious feature library, but using other methods for security identification is not an essential step of the embodiment.
  • In step 201, recent malicious URL samples and secure URL samples are collected.
  • In step 202, the feature vocabulary of each URL is extracted according to a predetermined rule.
  • the embodiment of the present invention does not limit the extraction rule used in this step, and the extraction rule may be adjusted according to actual needs.
  • the extracted feature vocabulary set is {http, www, test, com, 8080, index, php, id, 123, anchor}.
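A minimal sketch of one possible extraction rule for step 202, assuming the rule is to split the URL at every character that is neither a digit nor an English letter. The function name `extract_features` and the example URL are hypothetical; the URL was chosen only so that its tokens match the vocabulary set listed above.

```python
import re


def extract_features(url):
    """Split a URL at every run of characters that are not ASCII letters or digits."""
    return [token for token in re.split(r"[^0-9A-Za-z]+", url) if token]


# Hypothetical URL, not taken from the source document.
example_url = "http://www.test.com:8080/index.php?id=123#anchor"
features = extract_features(example_url)
```

Empty strings produced by leading or trailing separators are discarded, so only non-empty words reach the later counting step.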
  • In step 203, the number of occurrences of each feature word in the malicious URL samples and in the secure URL samples is counted separately, and the relative frequency f of each feature word is computed.
  • N(w) is the number of occurrences of w in the malicious URL samples
  • N(w)/N is the probability that w appears in the malicious URL samples
  • M(w) is the number of occurrences of w in the secure URL samples
  • M(w)/M is the probability that w appears in the secure URL samples
  • the relative frequency is the ratio of the first probability to the second, that is, how many times more likely a word is to appear in a malicious URL than in a secure URL. The greater the relative frequency, the better the word distinguishes malicious URLs from secure URLs.
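The counting in step 203 might look like the sketch below. It reads the relative frequency as f(w) = (N(w)/N) / (M(w)/M), and it assumes that N and M are the numbers of malicious and secure URL samples and that a word is counted once per URL; none of these details is spelled out in the source. Treating a word that never appears in the secure samples as having infinite relative frequency is a further assumption, as are all names and sample URLs.

```python
import math
import re
from collections import Counter


def tokens(url):
    """Distinct words of a URL, split at non-alphanumeric characters."""
    return {t for t in re.split(r"[^0-9A-Za-z]+", url) if t}


def relative_frequencies(malicious_urls, secure_urls):
    """Map each word seen in malicious URLs to (N(w)/N) / (M(w)/M)."""
    n_total, m_total = len(malicious_urls), len(secure_urls)
    n_count = Counter(w for u in malicious_urls for w in tokens(u))
    m_count = Counter(w for u in secure_urls for w in tokens(u))
    freqs = {}
    for word, n_w in n_count.items():
        p_malicious = n_w / n_total
        p_secure = m_count[word] / m_total
        # Assumption: a word absent from the secure samples gets infinity.
        freqs[word] = p_malicious / p_secure if p_secure > 0 else math.inf
    return freqs


# Hypothetical samples; "90sec" is the discriminative word mentioned in the text.
malicious = ["http://x.example/90sec/a.php", "http://y.example/90sec/b.php"]
secure = ["http://x.example/index.php", "http://z.example/home"]
f = relative_frequencies(malicious, secure)
```

In this toy sample `http` appears equally often on both sides and gets a relative frequency of 1, while `90sec` appears only in malicious URLs and stands out.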
  • In step 204, the feature words are sorted by relative frequency in descending order, and the most distinguishing feature vocabulary set is selected.
  • the n feature words with the highest relative frequency may be selected; alternatively, a relative frequency threshold F may be set, and only feature words whose relative frequency exceeds F are selected.
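The two selection strategies (top n, or a threshold F) can be sketched as below. The function names and the relative frequency values are made up for illustration.

```python
def select_top_n(freqs, n):
    """Return the n words with the highest relative frequency, best first."""
    ranked = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:n]]


def select_above_threshold(freqs, threshold):
    """Return the words whose relative frequency exceeds the threshold F."""
    return {word for word, value in freqs.items() if value > threshold}


# Hypothetical relative frequencies.
freqs = {"90sec": 40.0, "php": 2.0, "http": 1.0, "www": 0.8}
top_two = select_top_n(freqs, 2)
above = select_above_threshold(freqs, 1.5)
```

Both calls pick the same two words here; on real data the top-n rule fixes the size of the set, while the threshold rule fixes its minimum quality.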
  • In step 205, the selected feature vocabulary set is used for identification.
  • that is, once the feature vocabulary set has been selected, a website to be detected can be determined to be malicious when its URL contains a word from the set.
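A small sketch of the identification in step 205, assuming that "contains the feature vocabulary" means that at least one word of the URL, split at non-alphanumeric characters, is in the selected set. A substring test would be another possible reading. The function name and URLs are hypothetical.

```python
import re


def is_malicious(url, malicious_features):
    """Return True if any word of the URL is in the malicious feature set."""
    url_tokens = {t for t in re.split(r"[^0-9A-Za-z]+", url) if t}
    return not url_tokens.isdisjoint(malicious_features)


feature_set = {"90sec"}
flagged = is_malicious("http://a.example/90sec/x.php", feature_set)
clean = is_malicious("http://a.example/index.php", feature_set)
```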
  • step 206 may further be included after step 205.
  • In step 206, the false positive rate obtained with the feature vocabulary set is tested to determine whether it is lower than a preset threshold. If so, the process proceeds to step 207; otherwise, it proceeds to step 208.
  • Step 206 may include: selecting a batch of secure URL samples (n1 in total) and running detection on them with the selected feature vocabulary set; if n2 of them are determined to be malicious, the false positive rate is n2/n1.
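The false positive rate n2/n1 from step 206 could be computed as in this sketch. It assumes a non-empty batch of secure URLs and token-level matching; the function names and the sample batch are hypothetical.

```python
import re


def is_flagged(url, features):
    """True if any word of the URL is in the feature set."""
    return any(t in features for t in re.split(r"[^0-9A-Za-z]+", url))


def false_positive_rate(secure_urls, features):
    """n2 / n1: the share of known-secure URLs that the feature set flags."""
    n1 = len(secure_urls)  # assumed to be greater than zero
    n2 = sum(1 for url in secure_urls if is_flagged(url, features))
    return n2 / n1


# Hypothetical batch of four secure URLs.
secure_batch = [
    "http://a.example/index.php",
    "http://b.example/login",
    "http://c.example/home",
    "http://d.example/about",
]
rate = false_positive_rate(secure_batch, {"login"})
```

Here one of the four secure URLs contains `login`, so a feature set containing that word has a false positive rate of 0.25 on this batch.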
  • In step 207, when the false positive rate is lower than the preset threshold, the feature vocabulary set is confirmed as selected.
  • In step 208, the feature vocabulary set is narrowed and the process returns to step 204.
  • the feature vocabulary set may be narrowed by decreasing n (or increasing the threshold F).
  • Steps 204, 205, 206, and 208 are executed cyclically until the false positive rate test is passed, whereupon the process proceeds to step 207.
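The loop over steps 204 to 208 might be sketched as follows, using the top-n selection rule and shrinking n by one on each failed test. The step size, the function names, and the sample data are assumptions; the source only says that n is decreased or F is increased.

```python
import re


def false_positive_rate(secure_urls, features):
    """Share of known-secure URLs containing at least one feature word."""
    flagged = sum(
        1 for url in secure_urls
        if any(t in features for t in re.split(r"[^0-9A-Za-z]+", url))
    )
    return flagged / len(secure_urls)


def select_features(freqs, secure_urls, n, max_fpr):
    """Narrow the top-n feature set until its false positive rate is below max_fpr."""
    ranked = sorted(freqs, key=freqs.get, reverse=True)  # step 204: sort
    while n > 0:
        features = set(ranked[:n])                       # step 204: select
        if false_positive_rate(secure_urls, features) < max_fpr:  # steps 205-206
            return features                              # step 207: accept
        n -= 1                                           # step 208: narrow
    return set()  # no non-empty set passed the test


# Hypothetical relative frequencies and secure URL batch.
freqs = {"90sec": 40.0, "login": 3.0, "php": 2.0}
secure_batch = [
    "http://a.example/index.php",
    "http://b.example/login",
    "http://c.example/home",
    "http://d.example/about",
]
selected = select_features(freqs, secure_batch, n=3, max_fpr=0.1)
```

Starting from three words, the sets containing `php` and `login` flag secure URLs and are rejected; only the single most distinguishing word survives.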
  • the URL authentication method of the website is as shown in FIG. 3, and may include steps 301 to 306.
  • In step 301, a URL to be detected is obtained.
  • In step 302, it is checked whether the webpage at the URL to be detected is accessible. If it is accessible, the process proceeds to step 304; otherwise, it proceeds to step 303.
  • In step 303, because the URL to be detected is inaccessible, its status is set to unknown.
  • In step 304, because the URL to be detected is accessible, its URL features are extracted and matched against the current malicious feature library to determine whether the match succeeds (that is, whether an extracted feature character exists in the malicious feature library). If so, the process proceeds to step 305; otherwise, it proceeds to step 306.
  • In step 305, the status of the URL is set to malicious.
  • In step 306, the page detection logic is entered, and whether the page corresponding to the URL is malicious is further determined from the page features.
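The flow of steps 301 to 306 could be sketched as below, reading a successful match against the malicious feature library as leading to the malicious status and an unsuccessful one as leading to page detection. The `Status` values, the `classify` function, and the `is_accessible` callable are hypothetical; a real accessibility check and the page detection logic are outside this sketch.

```python
import re
from enum import Enum


class Status(Enum):
    UNKNOWN = "unknown"                    # step 303
    MALICIOUS = "malicious"                # step 305
    NEEDS_PAGE_CHECK = "needs page check"  # step 306


def classify(url, malicious_features, is_accessible):
    """Return the status of a URL following the step 302 to 306 decisions."""
    if not is_accessible(url):             # step 302
        return Status.UNKNOWN
    url_tokens = {t for t in re.split(r"[^0-9A-Za-z]+", url) if t}
    if url_tokens & malicious_features:    # step 304
        return Status.MALICIOUS
    return Status.NEEDS_PAGE_CHECK


library = {"90sec"}
result_hit = classify("http://a.example/90sec/x.php", library, lambda u: True)
result_miss = classify("http://a.example/index.php", library, lambda u: True)
result_down = classify("http://a.example/90sec/x.php", library, lambda u: False)
```

Passing the accessibility check in as a callable keeps the sketch free of network access; in practice it would be replaced by an actual request to the page.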
  • An embodiment of the present invention provides a malicious website identification system, whose architecture is shown in the corresponding figure. The system includes a client and a server, where the server side includes a detection system, a malicious feature library, a feature extraction system, a malicious URL library, and a secure URL library.
  • the client may be a terminal device on which client software such as an instant messaging tool or a computer management tool is installed.
  • the client is used to send URLs accessed by the user to the detection system of the server;
  • the detection system is configured to check the URL sent by the client against the current malicious feature library and, if no malicious URL feature is matched, to further perform a determination based on page features; malicious URLs identified by the detection system or by other means such as manual identification are stored in the malicious URL library, and URLs identified as safe are stored in the secure URL library;
  • the feature extraction system is configured to periodically compare the features of the malicious URL library with the samples in the secure URL library, and find out the features with high discrimination, so as to continuously supplement and update the current malicious feature database.
  • An embodiment of the present invention further provides a device for identifying a malicious website.
  • the device includes a specimen obtaining unit 501, a feature extraction unit 502, and a feature discriminating unit 503.
  • the specimen obtaining unit 501 is configured to acquire URLs that have been determined to belong to malicious websites and URLs that have been determined to belong to secure websites.
  • the URLs of malicious websites may be those verified within a period of time before the current time point, and the URLs of secure websites may likewise be those verified within a period of time before the current time point.
  • the number of URLs acquired for each domain name can be limited to a predetermined number, which can reduce the problem of domain name concentration.
  • back-end cloud servers, such as those of computer management tools, store security information for a large number of URLs. The specimen obtaining unit can therefore obtain the relevant URLs from the database of the security server.
  • the feature extraction unit 502 is configured to perform feature extraction on the URLs of the malicious websites acquired by the specimen obtaining unit 501 to obtain a first feature character set, and to perform feature extraction on the URLs of the secure websites to obtain a second feature character set.
  • the feature discriminating unit 503 is configured to determine whether the frequency of a first feature character obtained by the feature extraction unit 502 is higher in the first feature character set than in the second feature character set and, if so, to add the first feature character to the malicious feature library; the feature characters in the malicious feature library are used to identify malicious websites.
  • the embodiment of the present invention further provides a method for determining whether the frequency of the first feature character in the first feature character set is higher than the frequency in the second feature character set.
  • This method is used to determine distinguishing feature characters. It should be noted that it is also possible to use other methods to determine distinguishing feature characters.
  • the feature discriminating unit 503 is configured to obtain the relative frequency of each feature character, where the relative frequency is the ratio of the feature character's frequency in the first feature character set to its frequency in the second feature character set.
  • if the relative frequency of the first feature character is higher than a predetermined threshold, or ranks within a set range among the relative frequencies of all feature characters, the first feature character is added to the malicious feature library.
  • the embodiment of the present invention further provides a specific implementation for verifying the extracted feature characters. Note that each feature character may be verified individually, or a newly determined batch of feature characters may be verified together.
  • the following embodiment provides an example of individual verification:
  • the feature discriminating unit 503 is further configured to, before adding the first feature character to the malicious feature library, use the first feature character to test URLs that have been determined to belong to secure websites and, if the false alarm rate is lower than a predetermined threshold, add the first feature character to the malicious feature library.
  • the above identification device further includes:
  • a feature library control unit 601 configured to use the malicious feature library to test URLs that have been determined to belong to secure websites and, if the false alarm rate is higher than a predetermined threshold, to increase the predetermined relative frequency threshold or narrow the set range, and to re-determine whether to add the first feature character to the malicious feature library.
  • the feature extraction unit 502 may be configured to perform feature extraction by splitting each URL at every character that is neither a digit nor an English letter.
  • page features of the website may also be used for security identification.
  • using page features is only one way of performing security identification; there are many others, which the embodiments of the present invention do not list exhaustively.
  • identifying a URL first with the malicious feature library and then with a further method of security identification can further improve security.
  • this further step can also provide a basis for updating the malicious feature library, but using other methods for security identification is not an essential step of the embodiment.
  • the above identification device further includes:
  • a page identification unit 701 configured to use page features for security identification if the URL to be identified has been checked against the malicious feature library, the result is secure, and the URL is accessible.
  • the embodiment of the present invention further provides another device for identifying a malicious website.
  • as shown in FIG. 8, for convenience of description, only the parts related to the embodiment of the present invention are shown. For specific technical details that are not disclosed here, refer to the foregoing description of the embodiments of the present invention.
  • the identification device may be any terminal device, including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS) terminal, a vehicle-mounted computer, and the like.
  • the following description takes a mobile phone as an example of the identification device.
  • Server 900 is also illustrated in Figure 8, it being understood that server 900 is not part of the identification device.
  • FIG. 8 is a block diagram showing a partial structure of a mobile phone related to a terminal provided by an embodiment of the present invention.
  • the mobile phone includes components such as a radio frequency (RF) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a wireless fidelity (WiFi) module 870, a processor 880, and a power supply 890.
  • the RF circuit 810 can be used for receiving and transmitting signals while information is sent or received or during a call. Specifically, downlink information received from the base station is passed to the processor 880 for processing, and uplink data is sent to the base station.
  • RF circuits include, but are not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • the RF circuit 810 can also communicate with networks and other devices via wireless communication, which may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and the like.
  • the memory 820 can be used to store software programs and modules, and the processor 880 executes various functional applications and data processing of the mobile phone by running software programs and modules stored in the memory 820.
  • the memory 820 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the applications required for at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created through use of the mobile phone (such as audio data or a phone book).
  • memory 820 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
  • the input unit 830 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the handset 800.
  • the input unit 830 may include a touch panel 831 and other input devices 832.
  • the touch panel 831, also called a touch screen, can collect touch operations performed by the user on or near it (such as operations performed with a finger, a stylus, or a similar object on or near the touch panel 831) and drive the corresponding connection device according to a preset program.
  • the touch panel 831 can include two parts: a touch detection device and a touch controller.
  • the touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 880, and can receive and execute commands sent by the processor 880.
  • the touch panel 831 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • the input unit 830 may also include other input devices 832.
  • other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 840 can be used to display information input by the user or information provided to the user as well as various menus of the mobile phone.
  • the display unit 840 can include a display panel 841.
  • the display panel 841 can be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • the touch panel 831 can cover the display panel 841. When the touch panel 831 detects a touch operation on or near it, it passes the operation to the processor 880 to determine the type of the touch event, and the processor 880 then provides a corresponding visual output on the display panel 841 according to the type of the touch event.
  • although the touch panel 831 and the display panel 841 are described here as two independent components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 831 can be integrated with the display panel 841 to implement the input and output functions of the mobile phone.
  • the handset 800 can also include at least one type of sensor 850, such as a light sensor, motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 841 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 841 and/or the backlight when the mobile phone is moved to the ear.
  • the accelerometer sensor can detect the magnitude of acceleration in all directions (usually three axes) and, when stationary, the magnitude and direction of gravity.
  • it can be used in applications that identify the attitude of the mobile phone (such as switching between landscape and portrait orientation, related games, and magnetometer attitude calibration) and in functions related to vibration recognition (such as a pedometer or tap detection). The mobile phone can also be configured with other sensors such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, which are not described further here.
  • An audio circuit 860, a speaker 861, and a microphone 862 can provide an audio interface between the user and the handset.
  • On the one hand, the audio circuit 860 can convert received audio data into an electrical signal and transmit it to the speaker 861, which converts it into a sound signal for output. On the other hand, the microphone 862 converts collected sound into an electrical signal, which the audio circuit 860 receives and converts into audio data; the audio data is then output to the processor 880 for processing and either sent to, for example, another mobile phone via the RF circuit 810 or output to the memory 820 for further processing.
  • The mobile phone can provide the user with wireless broadband Internet access through a wireless communication module, such as the WiFi module 870 shown in FIG. 8, which helps the user send and receive email, browse web pages, and access streaming media.
  • Although FIG. 8 shows the WiFi module 870, it is not an essential part of the mobile phone 800 and may be omitted as needed without changing the essence of the invention.
  • The processor 880 is the control center of the mobile phone. It connects the parts of the phone through various interfaces and lines, and it performs the phone's functions and processes data by running or executing the software programs and/or modules stored in the memory 820 and invoking the data stored there, thereby monitoring the phone as a whole.
  • the processor 880 may include one or more processing units; preferably, the processor 880 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It will be appreciated that the above described modem processor may also not be integrated into the processor 880.
  • The mobile phone 800 also includes a power source 890 (such as a battery) that supplies power to the components.
  • Preferably, the power source is logically connected to the processor 880 through a power management system, which manages charging, discharging, and power consumption.
  • the mobile phone 800 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • the processor 880 included in the terminal further has the following functions:
  • The processor 880 is configured to receive user input through the input unit 830 and thereby obtain a URL as the URL to be identified; to send the URL to be identified to the server 900 through a transmitting device such as the RF circuit 810 or the WiFi module 870; and to receive the identification result returned by the server 900 through the RF circuit 810 or the WiFi module 870.
  • The identification result can also be displayed on the display unit 840.
  • The server 900 is configured to: acquire URLs that have been determined to belong to malicious websites and URLs that have been determined to belong to secure websites; perform feature extraction on the URLs of the malicious websites to obtain a first feature character set, and on the URLs of the secure websites to obtain a second feature character set; add a first feature character obtained by the feature extraction to the malicious feature library if its frequency in the first feature character set is higher than its frequency in the second feature character set; and receive the URL to be identified from the mobile phone 800, extract its feature characters, and match them against the malicious feature library. If a feature character of the URL to be identified exists in the malicious feature library, the server determines that the URL is malicious and sends a malicious-alert message to the mobile phone 800. If the URL belongs to a secure website, a safety message can likewise be sent to the mobile phone 800.
  • To keep the sample set used by the server up to date, the URLs of malicious websites may be limited to those verified within a period of time before the current time point; likewise, the URLs of secure websites may be those verified within a period of time before the current time point.
  • The number of URLs obtained for each domain name can be limited to a predetermined number, which reduces the problem of samples being concentrated in a few domain names.
  • The server 900 can obtain the relevant URLs from the database of a security server.
  • The server 900 may perform feature extraction, for example, by using every character that is neither a digit nor an English letter as a separator.
  • Optionally, the embodiment of the present invention further provides a method for determining whether the frequency of the first feature character in the first feature character set is higher than its frequency in the second feature character set.
  • This method is used to determine distinguishing feature characters. Other methods may also be used to determine them.
  • The server 900 is configured to obtain the relative frequency of each feature character.
  • The relative frequency is the ratio of the frequency of the feature character in the first feature character set to its frequency in the second feature character set. If the relative frequency of the first feature character is higher than a predetermined threshold, or ranks within a set range among the relative frequencies of all feature characters, the first feature character is added to the malicious feature library.
  • The embodiment of the present invention further provides a specific way to verify the extracted feature characters. Each feature character may be verified individually, or a newly determined batch of feature characters may be verified together.
  • The following embodiment gives an example of individual verification:
  • Before adding the first feature character to the malicious feature library, the server 900 can also use the first feature character to test URLs that have been determined to belong to secure websites; if the false-positive rate is lower than a predetermined threshold, the first feature character is added to the malicious feature library.
  • The server 900 can further use the malicious feature library to test URLs that have been determined to belong to secure websites; if the false-positive rate is higher than a predetermined threshold, it raises the predetermined relative-frequency threshold or narrows the set range, and re-determines whether to add the first feature character to the malicious feature library.
  • When no feature character of a website's URL matches the malicious feature library, page features may additionally be used to identify whether the website is secure.
  • Using page features is only one way of performing security identification; there are many others, and the embodiments of the present invention cannot list them exhaustively.
  • Performing a further security identification by other means after identification with the URL-based malicious feature library can further improve security.
  • This step can also provide a basis for updating the malicious feature library, but further security identification by other means is not an essential step of this embodiment.
  • The server 900 may further be configured to perform security identification using page features if identification of the URL to be identified with the malicious feature library yields a result of secure and the URL to be identified is accessible.
  • In the above embodiments of the identification apparatus, the units are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be implemented.
  • The specific names of the functional units are also only for convenience in distinguishing them from each other and are not intended to limit the scope of protection of the present invention.
  • the storage medium may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

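Embodiments of the present invention disclose a method and an apparatus for identifying a malicious website. The method includes: acquiring Uniform Resource Locators (URLs) that have been determined to belong to malicious websites and URLs that have been determined to belong to secure websites; performing feature extraction on the URLs of the malicious websites to obtain a first feature character set, and on the URLs of the secure websites to obtain a second feature character set; and determining whether the frequency of a first feature character obtained by the feature extraction is higher in the first feature character set than in the second feature character set and, if so, adding the first feature character to a malicious feature library. The feature characters in the malicious feature library are used to identify malicious websites.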

Description

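Method and Apparatus for Identifying a Malicious Website
This application claims priority to Chinese Patent Application No. 201310503579.9, entitled "Method and Apparatus for Identifying a Malicious Website" and filed with the Chinese Patent Office on October 23, 2013, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of communications technologies, and in particular to a method and an apparatus for identifying a malicious website.
Background
The rapid development of Internet technology has made life increasingly convenient. Through the Internet, people can easily share and download all kinds of material, obtain important information, pay bills online, and so on. At the same time, the security situation on the Internet gives little cause for optimism: Trojans and viruses disguised as normal files spread unchecked, and phishing websites that imitate legitimate sites to steal account names and passwords are becoming ever more common.
The industry generally uses two kinds of schemes to identify and combat malicious websites. The first is based on user reports and manual review. For example, a user submits the Uniform Resource Locator (URL) of a suspicious website; once the website has been manually verified as malicious, its URL is added to a malicious URL list, and that list is then used in later identification to decide whether a website is malicious. This scheme has two drawbacks. First, the quality of manual review depends on the expertise of the reviewers. Second, because reviewers are limited in number, there is a long lag between the submission of a URL and its determination as malicious, so timely and effective URL evaluation cannot be guaranteed.
The second kind of scheme is based on recognizing web page features, for example, checking whether a page contains suspicious keywords or other such features. In this approach, security software developers must analyze a large number of malicious URL samples, extract the key malicious page features, and add the corresponding decision logic to the evaluation program. For websites that use a given malicious feature, the time from small-scale spread to the final release generally ranges from several weeks to several months. As a result, the period from the appearance of a malicious feature to its discovery is long.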
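Summary
Embodiments of the present invention provide a method and an apparatus for identifying a malicious website, to solve the above problems.
A method for identifying a malicious website includes:
acquiring Uniform Resource Locators (URLs) that have been determined to belong to malicious websites, and URLs that have been determined to belong to secure websites;
performing feature extraction on the URLs of the malicious websites to obtain a first feature character set, and performing feature extraction on the URLs of the secure websites to obtain a second feature character set; and
determining whether the frequency of a first feature character obtained by the feature extraction in the first feature character set is higher than its frequency in the second feature character set and, if it is, adding the first feature character to a malicious feature library, where the feature characters in the malicious feature library are used to identify malicious websites.
An apparatus for identifying a malicious website includes:
a sample acquisition unit, configured to acquire Uniform Resource Locators (URLs) that have been determined to belong to malicious websites, and URLs that have been determined to belong to secure websites;
a feature extraction unit, configured to perform feature extraction on the URLs of the malicious websites acquired by the sample acquisition unit to obtain a first feature character set, and to perform feature extraction on the URLs of the secure websites to obtain a second feature character set; and
a feature discrimination unit, configured to determine whether the frequency of a first feature character obtained by the feature extraction unit in the first feature character set is higher than its frequency in the second feature character set and, if it is, to add the first feature character to a malicious feature library, where the feature characters in the malicious feature library are used to identify malicious websites.
A non-transitory computer-readable storage medium stores computer-executable instructions that, when run on a computer, perform the following steps:
acquiring Uniform Resource Locators (URLs) that have been determined to belong to malicious websites, and URLs that have been determined to belong to secure websites;
performing feature extraction on the URLs of the malicious websites to obtain a first feature character set, and performing feature character extraction on the URLs of the secure websites to obtain a second feature character set; and
determining whether the frequency of a first feature character obtained by the feature extraction in the first feature character set is higher than its frequency in the second feature character set and, if it is, adding the first feature character to a malicious feature library, where the feature characters in the malicious feature library are used to identify malicious websites.
As can be seen from the above technical solutions, the embodiments of the present invention extract feature characters from URLs, determine specific feature characters among them, and add these to the malicious feature library so that malicious websites can be identified. Newly appearing malicious features in URLs are extracted into the malicious feature library by comparison, which shortens the period from the appearance of a new malicious feature to its discovery.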
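Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a system according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an identification apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an identification apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an identification apparatus according to an embodiment of the present invention; and
FIG. 8 is a schematic diagram of a terminal and server system according to an embodiment of the present invention.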
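Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Analysis of URLs detected as malicious shows that many of them contain similar fragments. This is because, after discovering a class of website vulnerability, attackers upload similar files in bulk to similar directories on the websites that have the vulnerability, producing URLs with similar paths or file names. For example, after a vulnerability in the site-building tool DedeCms was disclosed, attackers exploited it against a large number of sites and uploaded a file named 90sec.php to the plus directory, so that malicious URLs like the following spread widely across the network, as shown in Table 1: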
Table 1. Examples of malicious URLs
No. URL example
1 http://ixyy.web-103.com/plus/90sec.php
2 http://www.meiruoji.com/plus/90sec.php
3 http://www.hnhmjx.com/plus/90sec.php
4 http://www.33283328.com/plus/90sec.php
5 http://www.csenchi.com/plus/90sec.php
6 http://www.mlwhj.com/plus/90sec.php
********************/plus/90sec.php
n http://www.mvbocai.com/plus/90sec.php
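By analyzing the features of URLs verified as belonging to malicious websites over a period of time, distinguishing features (such as 90sec in the example above) can be detected automatically and added to the malicious feature library. An unknown URL can then first be matched against the malicious feature library, and if the match succeeds the URL can be deemed malicious.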
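An embodiment of the present invention provides a method for identifying a malicious website. The method can be implemented in a cloud security server or in another network-side server and, as shown in FIG. 1, includes steps 101 to 103.
In step 101, URLs that have been determined to belong to malicious websites and URLs that have been determined to belong to secure websites are acquired.
In this embodiment, to keep the data current, the URLs of malicious websites in this step may be those verified within a period of time before the current time point, and the URLs of secure websites in this step may likewise be those verified within a period of time before the current time point.
In addition, the number of URLs acquired for each domain name can be limited to a predetermined number, which reduces the problem of samples being concentrated in a few domain names.
A back-end cloud server, such as that of a computer management tool, stores security information for a massive number of URLs. The URLs needed in this step can therefore be obtained from the database of a security server, or in other ways; this embodiment imposes no limitation.
In step 102, feature extraction is performed on the URLs of the malicious websites to obtain a first feature character set, and on the URLs of the secure websites to obtain a second feature character set.
In this embodiment, feature extraction may be performed by using every character that is neither a digit nor an English letter as a separator.
It should be noted that feature extraction can be performed in many other ways. The example in this embodiment is only a preferred one suited to URL feature extraction and malicious website identification. Changing the feature extraction algorithm does not affect the implementation of the embodiments of the present invention, and a person skilled in the art can choose an algorithm according to the actual situation; the embodiments therefore do not limit the algorithm used. The example of splitting on non-digit, non-English-letter characters should not be understood as the only option.
In step 103, it is determined whether the frequency of a first feature character obtained by the feature extraction in the first feature character set is higher than its frequency in the second feature character set. If it is, the first feature character is added to the malicious feature library. The feature characters in the malicious feature library are used to identify malicious websites.
The above embodiment extracts feature characters from URLs, determines specific feature characters among them, and adds these to the malicious feature library so that malicious websites can be identified. In this embodiment, newly appearing malicious features in URLs are extracted into the malicious feature library by comparison, which shortens the period from the appearance of a new malicious feature to its discovery.
Optionally, an embodiment of the present invention further provides a method for determining whether the frequency of the first feature character in the first feature character set is higher than its frequency in the second feature character set. The method is used to determine distinguishing feature characters; other methods may also be used for this purpose. Specifically, the condition can be expressed as follows: the relative frequency of each feature character is obtained, where the relative frequency is the ratio of the frequency of the feature character in the first feature character set to its frequency in the second feature character set; the condition is met if the relative frequency of the first feature character is higher than a predetermined threshold, or ranks within a set range among the relative frequencies of all feature characters.
Optionally, an embodiment of the present invention further provides a specific way to verify the extracted feature characters. Each feature character may be verified individually, or a newly determined batch of feature characters may be verified together. The following gives an example of individual verification. Before the first feature character is added to the malicious feature library, the method may further include: using the first feature character to test URLs that have been determined to belong to secure websites and, if the false-positive rate is lower than a predetermined threshold, adding the first feature character to the malicious feature library. The malicious feature library may also be used to test URLs that have been determined to belong to secure websites; if the false-positive rate is higher than a predetermined threshold, the predetermined relative-frequency threshold is raised or the set range is narrowed, and it is re-determined whether to add the first feature character to the malicious feature library.
Optionally, when no feature character matching the malicious feature library is found in a website's URL, page features may additionally be used to identify whether the website is secure. A person skilled in the art will understand that using page features is only one way of performing security identification; there are many others, and the embodiments of the present invention cannot list them exhaustively. Performing a further security identification by other means after identification with the URL-based malicious feature library can further improve security, and this step can also provide a basis for updating the malicious feature library. However, further security identification by other means is not an essential step of this embodiment.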
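The following embodiment gives a more detailed example to further explain the method provided by the embodiments of the present invention. Referring to FIG. 2, it includes steps 201 to 208.
In step 201, recently observed malicious URL samples and secure URL samples are collected.
Assume there are N malicious URL samples and M secure URL samples.
Because malicious URLs make up only a small proportion of URLs on real networks (generally below 1%), sample selection can follow the same principle. For example, if there are ten thousand malicious URL samples, one million secure URLs can be selected. During sample selection, it is also possible to keep URLs from being concentrated under a small number of domain names, for example by selecting at most K URLs under each domain name.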
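The per-domain cap in step 201 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `cap_per_domain` is hypothetical, and the host name returned by Python's `urllib.parse.urlsplit` is assumed to be an acceptable stand-in for "domain name" (the text does not say whether subdomains are grouped).

```python
from collections import defaultdict
from urllib.parse import urlsplit


def cap_per_domain(urls, k):
    """Keep at most k URLs per host name, preserving input order.

    Assumption: the host name approximates the "domain name" in step 201.
    """
    kept = []
    seen = defaultdict(int)  # host name -> number of URLs kept so far
    for url in urls:
        host = urlsplit(url).hostname or ""
        if seen[host] < k:
            seen[host] += 1
            kept.append(url)
    return kept


samples = [
    "http://a.com/1",
    "http://a.com/2",
    "http://b.com/1",
    "http://a.com/3",
]
print(cap_per_domain(samples, 2))
```

With K = 2, the third URL under a.com is dropped while the others are kept in their original order.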
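In step 202, URL feature words are extracted according to a predetermined rule.
The embodiments of the present invention do not limit the extraction rule used in this step, and the rule can be adjusted as needed.
For example, characters that are neither digits nor English letters can be used as separators to extract feature words. For the following sample URL:
http://www.test.com:8080/index.php?id=123#anchor
the extracted set of feature words is {http, www, test, com, 8080, index, php, id, 123, anchor}.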
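The splitting rule in step 202 can be sketched in a few lines. The function name `extract_features` is hypothetical, and "English letters" is assumed here to mean ASCII letters A to Z in either case.

```python
import re


def extract_features(url):
    """Split a URL on every run of characters that are not ASCII digits or letters."""
    return [token for token in re.split(r"[^0-9A-Za-z]+", url) if token]


print(extract_features("http://www.test.com:8080/index.php?id=123#anchor"))
# ['http', 'www', 'test', 'com', '8080', 'index', 'php', 'id', '123', 'anchor']
```

The result is returned as a list so that the order of appearance is kept; wrapping it in `set(...)` gives the feature word set described in the text.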
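In step 203, the number of occurrences of each feature word in the malicious URL samples and in the secure URL samples is counted, and the two are compared to obtain the relative frequency f of each feature word.
For a word w, the relative frequency f(w) is calculated as:
f(w) = (N(w)/N) / (M(w)/M), when M(w) > 0;
f(w) = (N(w)/N) / (1/M), when M(w) = 0.
Here, N(w) is the number of occurrences of w in the malicious URL samples, and N(w)/N is the probability that w appears in a malicious URL sample; M(w) is the number of occurrences of w in the secure URL samples, and M(w)/M is the probability that w appears in a secure URL sample. The relative frequency expresses how many times more likely the word is to appear in malicious URLs than in secure URLs. The larger the relative frequency, the better the word distinguishes malicious URLs from secure ones.
Suppose that for the word "http", N = 100, N("http") = 95, M = 10000, and M("http") = 9500;
then f("http") = (95/100) / (9500/10000) = 1.
This shows that "http" is equally likely to appear in secure and malicious URLs and therefore does not distinguish them.
Suppose that for the word "8080", N = 100, N("8080") = 10, M = 10000, and M("8080") = 50;
then f("8080") = (10/100) / (50/10000) = 20.
This shows that "8080" is 20 times more likely to appear in malicious URLs than in secure URLs and is therefore strongly distinguishing.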
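The formula for f(w) and the two worked examples can be checked with a short sketch. The function name `relative_frequency` is hypothetical; the expression is rearranged into a single division, (N(w) × M) / (N × max(M(w), 1)), which is algebraically the same as the two-case formula and avoids intermediate rounding.

```python
def relative_frequency(n_w, n, m_w, m):
    """Relative frequency f(w) of a word.

    n_w: occurrences of w in the malicious samples, n: number of malicious samples
    m_w: occurrences of w in the secure samples,    m: number of secure samples
    When m_w is 0 the denominator falls back to 1/m, as in the formula above.
    """
    return (n_w * m) / (n * max(m_w, 1))


print(relative_frequency(95, 100, 9500, 10000))  # 1.0   ("http": not distinguishing)
print(relative_frequency(10, 100, 50, 10000))    # 20.0  ("8080": strongly distinguishing)
print(relative_frequency(3, 100, 0, 10000))      # 300.0 (word never seen in secure samples)
```

The third call uses made-up counts to show the M(w) = 0 branch, where the word is treated as if it had appeared once among the secure samples.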
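In step 204, the feature words are sorted by relative frequency in descending order, and the most distinguishing set of feature words is selected.
For example, the n feature words with the highest relative frequency can be selected; alternatively, a relative-frequency threshold F can be set and only the feature words exceeding it selected.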
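Both selection rules in step 204 can be sketched as below, assuming the relative frequencies are already available as a dictionary from word to f(w). The function names and the sample scores are illustrative only.

```python
def select_top_n(scores, n):
    """Return the n words with the highest relative frequency, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:n]]


def select_above(scores, threshold):
    """Return the words whose relative frequency exceeds the threshold F, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, freq in ranked if freq > threshold]


scores = {"http": 1.0, "8080": 20.0, "90sec": 300.0}  # illustrative values
print(select_top_n(scores, 2))     # ['90sec', '8080']
print(select_above(scores, 20.0))  # ['90sec']
```

The threshold comparison is strict because the text says only words exceeding F are selected; a word whose relative frequency equals F is left out.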
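In step 205, the selected set of feature words is used for identification.
In this step, once the set of feature words has been selected, a URL under test that contains one of the feature words can be judged to be a malicious URL.
Step 205 may be followed by step 206.
In step 206, the false-positive rate obtained with the above set of feature words is measured, and it is determined whether the rate is lower than a preset threshold. If it is, the process proceeds to step 207; otherwise it proceeds to step 208.
Step 206 may include: selecting a batch of secure URL samples (n1 in total) and testing them with the selected set of feature words; if n2 of them are judged malicious, the false-positive rate is n2/n1.
In step 207, because the false-positive rate is lower than the set threshold, it is determined that this set of feature words can be used.
In step 208, the feature set is narrowed and the process returns to step 204.
The feature set may be narrowed by reducing the cutoff n (or increasing the threshold F). Steps 204, 205, 206, and 208 are repeated until the false-positive test is passed and step 207 is reached.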
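The loop over steps 204 to 208 can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation: the function names are hypothetical, the word list is assumed to be already sorted by descending relative frequency, and the cutoff n is reduced by one per iteration (the text only says that n is reduced).

```python
import re


def tokens(url):
    """Feature words of a URL: maximal runs of ASCII digits and letters."""
    return {t for t in re.split(r"[^0-9A-Za-z]+", url) if t}


def false_positive_rate(feature_words, safe_urls):
    """n2 / n1: share of known-secure URLs that contain any selected feature word."""
    words = set(feature_words)
    flagged = sum(1 for url in safe_urls if tokens(url) & words)
    return flagged / len(safe_urls)


def shrink_until_acceptable(ranked_words, safe_urls, max_rate):
    """Drop the lowest-ranked word until the false-positive rate is below max_rate."""
    n = len(ranked_words)
    while n > 0 and false_positive_rate(ranked_words[:n], safe_urls) >= max_rate:
        n -= 1  # step 208: narrow the feature set, then re-test
    return ranked_words[:n]


safe = ["http://a.com/index.php", "http://b.com:8080/", "http://c.com/", "http://d.com/"]
print(false_positive_rate(["90sec", "8080", "php"], safe))           # 0.5
print(shrink_until_acceptable(["90sec", "8080", "php"], safe, 0.3))  # ['90sec', '8080']
```

In this made-up sample, the full three-word set flags two of four secure URLs; dropping "php" lowers the rate to 0.25, which is below the 0.3 limit, so the loop stops there.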
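The above embodiment extracts feature characters from URLs, determines specific feature characters among them, and adds these to the malicious feature library so that malicious websites can be identified. In this embodiment, newly appearing malicious features in URLs are extracted into the malicious feature library by comparison, which shortens the period from the appearance of a new malicious feature to its discovery.
After the feature words have been added to the malicious feature library, a website's URL is evaluated as shown in FIG. 3, in a process that may include steps 301 to 306.
In step 301, the URL to be checked is obtained.
In step 302, after the URL to be checked has been obtained, it is probed to see whether the web page is accessible. If it is accessible, the process proceeds to step 304; otherwise it proceeds to step 303.
In step 303, because the URL to be checked is not accessible, the status of the URL is set to unknown.
In step 304, because the URL to be checked is accessible, the URL features are extracted and matched against the current malicious feature library to determine whether the match succeeds (that is, whether an extracted feature character exists in the malicious feature library). If it does, the process proceeds to step 305; otherwise it proceeds to step 306.
In step 305, the status of the URL is set to malicious.
In step 306, the page detection logic is entered, and page features are used to further determine whether the page corresponding to the URL is malicious.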
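The decision flow of steps 301 to 306 can be sketched as below. The sketch assumes that a match against the malicious feature library marks the URL as malicious and that only unmatched URLs go on to page inspection, which is how the system description has the detection system fall back to page features. The names `Verdict`, `classify_url`, and the injected `is_reachable` callable are all hypothetical; the source does not specify how reachability is probed.

```python
import re
from enum import Enum


class Verdict(Enum):
    UNKNOWN = "unknown"                    # step 303: page could not be reached
    MALICIOUS = "malicious"                # step 305: URL matched the feature library
    NEEDS_PAGE_CHECK = "needs page check"  # step 306: hand over to page-feature logic


def classify_url(url, malicious_words, is_reachable):
    """Classify a URL using only its own text and a caller-supplied reachability probe."""
    if not is_reachable(url):  # step 302
        return Verdict.UNKNOWN
    words = {t for t in re.split(r"[^0-9A-Za-z]+", url) if t}  # step 304
    if words & set(malicious_words):
        return Verdict.MALICIOUS
    return Verdict.NEEDS_PAGE_CHECK


library = {"90sec"}
print(classify_url("http://www.mlwhj.com/plus/90sec.php", library, lambda u: True))
print(classify_url("http://www.test.com/index.php", library, lambda u: True))
```

Passing the reachability check in as a function keeps the sketch free of network calls; a real deployment would substitute its own probe.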
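An embodiment of the present invention provides a malicious website identification system. As shown in the architecture diagram of FIG. 4, the system includes a client and a server, where the server side includes a detection system, a malicious feature library, a feature extraction system, a malicious URL library, and a secure URL library.
The client may be a terminal device on which a client application such as an instant messaging program or a computer management tool is installed.
The system as a whole operates as follows:
the client sends the URLs visited by the user to the detection system of the server;
the detection system evaluates the URLs sent by the client against the current malicious feature library of malicious URLs and, if no malicious URL feature is matched, goes on to evaluate other page features; URLs that the detection system identifies as malicious, together with URLs manually identified as malicious, are stored in the malicious URL library, and URLs identified as secure are stored in the secure URL library;
the feature extraction system periodically takes samples from the malicious URL library and the secure URL library, compares their features, and finds the highly distinguishing ones, thereby continually supplementing and updating the current malicious feature library.
The above embodiment extracts feature characters from URLs, determines specific feature characters among them, and adds these to the malicious feature library so that malicious websites can be identified. In this embodiment, newly appearing malicious features in URLs are extracted into the malicious feature library by comparison, which shortens the period from the appearance of a new malicious feature to its discovery.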
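An embodiment of the present invention further provides an apparatus for identifying a malicious website. As shown in FIG. 5, it includes a sample acquisition unit 501, a feature extraction unit 502, and a feature discrimination unit 503.
The sample acquisition unit 501 is configured to acquire URLs that have been determined to belong to malicious websites and URLs that have been determined to belong to secure websites.
In this embodiment, to keep the sample set used by the server up to date, the URLs of malicious websites may be those verified within a period of time before the current time point, and the URLs of secure websites may likewise be those verified within a period of time before the current time point. In addition, the number of URLs acquired for each domain name can be limited to a predetermined number, which reduces the problem of samples being concentrated in a few domain names. A back-end cloud server, such as that of a computer management tool, stores security information for a massive number of URLs, so the sample acquisition unit can obtain the relevant URLs from the database of a security server.
The feature extraction unit 502 is configured to perform feature extraction on the URLs of the malicious websites acquired by the sample acquisition unit 501 to obtain a first feature character set, and on the URLs of the secure websites to obtain a second feature character set.
The feature discrimination unit 503 is configured to determine whether the frequency of a first feature character obtained by the feature extraction unit 502 in the first feature character set is higher than its frequency in the second feature character set and, if it is, to add the first feature character to the malicious feature library. The feature characters in the malicious feature library are used to identify malicious websites.
The above embodiment extracts feature characters from URLs, determines specific feature characters among them, and adds these to the malicious feature library so that malicious websites can be identified. In this embodiment, newly appearing malicious features in URLs are extracted into the malicious feature library by comparison, which shortens the period from the appearance of a new malicious feature to its discovery.
Optionally, an embodiment of the present invention further provides a method for determining whether the frequency of the first feature character in the first feature character set is higher than its frequency in the second feature character set. The method is used to determine distinguishing feature characters; other methods may also be used for this purpose. Specifically, the feature discrimination unit 503 is configured to obtain the relative frequency of each feature character, where the relative frequency is the ratio of the frequency of the feature character in the first feature character set to its frequency in the second feature character set.
If the relative frequency of the first feature character is higher than a predetermined threshold, or ranks within a set range among the relative frequencies of all feature characters, the first feature character is added to the malicious feature library.
Optionally, an embodiment of the present invention further provides a specific way to verify the extracted feature characters. Each feature character may be verified individually, or a newly determined batch of feature characters may be verified together. The following gives an example of individual verification: the feature discrimination unit 503 is further configured to use the first feature character, before adding it to the malicious feature library, to test URLs that have been determined to belong to secure websites and, if the false-positive rate is lower than a predetermined threshold, to add the first feature character to the malicious feature library.
As shown in FIG. 6, the identification apparatus further includes:
a feature library control unit 601, configured to use the malicious feature library to test URLs that have been determined to belong to secure websites and, if the false-positive rate is higher than a predetermined threshold, to raise the predetermined relative-frequency threshold or narrow the set range, and to re-determine whether to add the first feature character to the malicious feature library.
Optionally, the feature extraction unit 502 may be configured to perform feature extraction by using every character that is neither a digit nor an English letter as a separator.
It should be noted that feature extraction can be performed in many other ways. The example in this embodiment is only a preferred one suited to URL feature extraction and malicious website identification. Changing the feature extraction algorithm does not affect the implementation of the embodiments of the present invention, and a person skilled in the art can choose an algorithm according to the actual situation; the embodiments therefore do not limit the algorithm used. The example of splitting on non-digit, non-English-letter characters should not be understood as the only option.
Optionally, when no feature character matching the malicious feature library is found in a website's URL, page features may additionally be used to identify whether the website is secure. A person skilled in the art will understand that using page features is only one way of performing security identification; there are many others, and the embodiments of the present invention cannot list them exhaustively. Performing a further security identification by other means after identification with the URL-based malicious feature library can further improve security, and this step can also provide a basis for updating the malicious feature library. However, further security identification by other means is not an essential step of this embodiment. As shown in FIG. 7, the identification apparatus further includes:
a page identification unit 701, configured to perform security identification using page features if identification of the URL to be identified with the malicious feature library yields a result of secure and the URL to be identified is accessible.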
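An embodiment of the present invention further provides another apparatus for identifying a malicious website, as shown in FIG. 8. For ease of description, only the parts related to the embodiment are shown; for technical details not disclosed here, refer to the method part of the embodiments. The identification apparatus may be any terminal device, including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS) terminal, or an in-vehicle computer. In this embodiment, a mobile phone is taken as the example.
FIG. 8 also shows a server 900; it will be understood that the server 900 is not part of the identification apparatus.
FIG. 8 is a block diagram of part of the structure of a mobile phone related to the terminal provided by the embodiment of the present invention. Referring to FIG. 8, the mobile phone includes a radio frequency (RF) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a wireless fidelity (WiFi) module 870, a processor 880, a power source 890, and other components. A person skilled in the art will understand that the structure shown in FIG. 8 does not limit the mobile phone, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The components of the mobile phone are described in detail below with reference to FIG. 8.
The RF circuit 810 can be used to receive and send signals while information is being sent or received or during a call. In particular, it receives downlink information from a base station and passes it to the processor 880 for processing, and it sends uplink data to the base station. Typically, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), and a duplexer. In addition, the RF circuit 810 can communicate with networks and other devices by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, and Short Messaging Service (SMS).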
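The memory 820 can be used to store software programs and modules. The processor 880 performs the various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 820. The memory 820 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (such as sound playback or image playback); the data storage area can store data created through use of the mobile phone (such as audio data or a phone book). In addition, the memory 820 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The input unit 830 can be used to receive entered digits or characters and to generate key signal inputs related to user settings and function control of the mobile phone 800. Specifically, the input unit 830 may include a touch panel 831 and other input devices 832. The touch panel 831, also called a touch screen, can collect the user's touch operations on or near it (such as operations performed on or near the touch panel 831 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected device according to a preset program. Optionally, the touch panel 831 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch and the signal produced by the touch operation, and passes the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends these to the processor 880; it can also receive and execute commands sent by the processor 880. The touch panel 831 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 831, the input unit 830 may include other input devices 832, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume keys and a power key), a trackball, a mouse, and a joystick.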
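The display unit 840 can be used to display information entered by the user or provided to the user, as well as the various menus of the mobile phone. The display unit 840 may include a display panel 841, which may optionally be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch panel 831 can cover the display panel 841. When the touch panel 831 detects a touch operation on or near it, it passes the operation to the processor 880 to determine the type of touch event, and the processor 880 then provides a corresponding visual output on the display panel 841 according to that type. Although in FIG. 8 the touch panel 831 and the display panel 841 are shown as two independent components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 831 can be integrated with the display panel 841 to implement those functions.
The mobile phone 800 may also include at least one sensor 850, such as a light sensor, a motion sensor, or another sensor. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 841 according to the ambient light, and the proximity sensor can turn off the display panel 841 and/or the backlight when the mobile phone is moved to the ear. As one kind of motion sensor, the accelerometer can detect the magnitude of acceleration in all directions (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). The mobile phone can also be fitted with a gyroscope, barometer, hygrometer, thermometer, infrared sensor, and other sensors, which are not described further here.
The audio circuit 860, a speaker 861, and a microphone 862 can provide an audio interface between the user and the mobile phone. On the one hand, the audio circuit 860 can convert received audio data into an electrical signal and transmit it to the speaker 861, which converts it into a sound signal for output. On the other hand, the microphone 862 converts collected sound into an electrical signal, which the audio circuit 860 receives and converts into audio data; the audio data is then output to the processor 880 for processing and either sent to, for example, another mobile phone via the RF circuit 810 or output to the memory 820 for further processing.
In addition, the mobile phone can provide the user with wireless broadband Internet access through a wireless communication module, such as the WiFi module 870 shown in FIG. 8, which helps the user send and receive email, browse web pages, and access streaming media. Although FIG. 8 shows the WiFi module 870, it is not an essential part of the mobile phone 800 and may be omitted as needed without changing the essence of the invention.
The processor 880 is the control center of the mobile phone. It connects the parts of the phone through various interfaces and lines, and it performs the phone's functions and processes data by running or executing the software programs and/or modules stored in the memory 820 and invoking the data stored there, thereby monitoring the phone as a whole. Optionally, the processor 880 may include one or more processing units. Preferably, the processor 880 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, and applications, and the modem processor mainly handles wireless communication. It will be understood that the modem processor need not be integrated into the processor 880.
The mobile phone 800 also includes a power source 890 (such as a battery) that supplies power to the components. Preferably, the power source is logically connected to the processor 880 through a power management system, which manages charging, discharging, and power consumption.
Although not shown, the mobile phone 800 may also include a camera, a Bluetooth module, and the like, which are not described further here.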
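In this embodiment of the present invention, the processor 880 included in the terminal also has the following functions:
The processor 880 is configured to receive user input through the input unit 830 and thereby obtain a URL as the URL to be identified; to send the URL to be identified to the server 900 through a transmitting device such as the RF circuit 810 or the WiFi module 870; and to receive the identification result returned by the server 900 through the RF circuit 810 or the WiFi module 870. The identification result can also be displayed on the display unit 840.
On the server side, the server 900 is configured to: acquire URLs that have been determined to belong to malicious websites and URLs that have been determined to belong to secure websites; perform feature extraction on the URLs of the malicious websites to obtain a first feature character set, and on the URLs of the secure websites to obtain a second feature character set; add a first feature character obtained by the feature extraction to the malicious feature library if its frequency in the first feature character set is higher than its frequency in the second feature character set; and receive the URL to be identified from the mobile phone 800, extract its feature characters, and match them against the malicious feature library. If a feature character of the URL to be identified exists in the malicious feature library, the server determines that the URL is malicious and sends a malicious-alert message to the mobile phone 800. If the URL belongs to a secure website, a safety message can likewise be sent to the mobile phone 800.
In this embodiment, to keep the sample set used by the server up to date, the URLs of malicious websites may be limited to those verified within a period of time before the current time point, and the URLs of secure websites may likewise be those verified within a period of time before the current time point. In addition, the number of URLs acquired for each domain name can be limited to a predetermined number, which reduces the problem of samples being concentrated in a few domain names. A back-end cloud server, such as that of a computer management tool, stores security information for a massive number of URLs, so the server 900 can obtain the relevant URLs from the database of a security server.
Optionally, the server 900 may perform feature extraction, for example, by using every character that is neither a digit nor an English letter as a separator.
It should be noted that feature extraction can be performed in many other ways. The example in this embodiment is only a preferred one suited to URL feature extraction and malicious website identification. Changing the feature extraction algorithm does not affect the implementation of the embodiments of the present invention, and a person skilled in the art can choose an algorithm according to the actual situation; the embodiments therefore do not limit the algorithm used. The example of splitting on non-digit, non-English-letter characters should not be understood as the only option.
Optionally, an embodiment of the present invention further provides a method for determining whether the frequency of the first feature character in the first feature character set is higher than its frequency in the second feature character set. The method is used to determine distinguishing feature characters; other methods may also be used for this purpose. Specifically, the server 900 may be configured to obtain the relative frequency of each feature character, where the relative frequency is the ratio of the frequency of the feature character in the first feature character set to its frequency in the second feature character set. If the relative frequency of the first feature character is higher than a predetermined threshold, or ranks within a set range among the relative frequencies of all feature characters, the first feature character is added to the malicious feature library.
Optionally, an embodiment of the present invention further provides a specific way to verify the extracted feature characters. Each feature character may be verified individually, or a newly determined batch of feature characters may be verified together. The following gives an example of individual verification: the server 900 may further be configured to use the first feature character, before adding it to the malicious feature library, to test URLs that have been determined to belong to secure websites and, if the false-positive rate is lower than a predetermined threshold, to add the first feature character to the malicious feature library.
Optionally, the server 900 may further be configured to use the malicious feature library to test URLs that have been determined to belong to secure websites and, if the false-positive rate is higher than a predetermined threshold, to raise the predetermined relative-frequency threshold or narrow the set range, and to re-determine whether to add the first feature character to the malicious feature library.
Optionally, when no feature character matching the malicious feature library is found in a website's URL, page features may additionally be used to identify whether the website is secure. A person skilled in the art will understand that using page features is only one way of performing security identification; there are many others, and the embodiments of the present invention cannot list them exhaustively. Performing a further security identification by other means after identification with the URL-based malicious feature library can further improve security, and this step can also provide a basis for updating the malicious feature library. However, further security identification by other means is not an essential step of this embodiment. The server 900 may further be configured to perform security identification using page features if identification of the URL to be identified with the malicious feature library yields a result of secure and the URL to be identified is accessible.
It is worth noting that, in the above embodiments of the identification apparatus, the units are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for convenience in distinguishing them from each other and are not intended to limit the scope of protection of the present invention.
In addition, a person of ordinary skill in the art will understand that all or some of the steps in the above method embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only preferred specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the embodiments of the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be that of the claims.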

Claims (13)

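  1. A method for identifying a malicious website, comprising:
    acquiring Uniform Resource Locators (URLs) that have been determined to belong to malicious websites, and URLs that have been determined to belong to secure websites;
    performing feature extraction on the URLs of the malicious websites to obtain a first feature character set, and performing feature character extraction on the URLs of the secure websites to obtain a second feature character set; and
    determining whether the frequency of a first feature character obtained by the feature extraction in the first feature character set is higher than its frequency in the second feature character set and, if it is, adding the first feature character to a malicious feature library, wherein the feature characters in the malicious feature library are used to identify malicious websites.
  2. The method according to claim 1, wherein determining whether the frequency of the extracted first feature character in the first feature character set is higher than its frequency in the second feature character set comprises:
    obtaining the relative frequency of the first feature character, the relative frequency being the ratio of the frequency of the first feature character in the first feature character set to its frequency in the second feature character set; and
    determining whether the relative frequency of the first feature character is higher than a predetermined threshold, or determining whether the relative frequency of the first feature character ranks within a set range among the relative frequencies of all feature characters.
  3. The method according to claim 1 or 2, further comprising, before adding the first feature character to the malicious feature library:
    using the first feature character to test URLs that have been determined to belong to secure websites and, if the false-positive rate is lower than a predetermined threshold, adding the first feature character to the malicious feature library.
  4. The method according to claim 2, further comprising:
    using the malicious feature library to test URLs that have been determined to belong to secure websites and, if the false-positive rate is higher than a predetermined threshold, raising the predetermined relative-frequency threshold or narrowing the set range, and re-determining whether to add the first feature character to the malicious feature library.
  5. The method according to claim 1 or 2, wherein performing feature extraction comprises:
    performing feature extraction by using characters that are neither digits nor English letters as separators.
  6. The method according to claim 1 or 2, wherein, when a URL to be identified is identified using the malicious feature library and the identification result is secure, the method further comprises:
    if the URL to be identified is accessible, performing security identification on the URL to be identified using page features.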
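  7. An apparatus for identifying a malicious website, comprising:
    a sample acquisition unit, configured to acquire Uniform Resource Locators (URLs) that have been determined to belong to malicious websites, and URLs that have been determined to belong to secure websites;
    a feature extraction unit, configured to perform feature extraction on the URLs of the malicious websites acquired by the sample acquisition unit to obtain a first feature character set, and to perform feature extraction on the URLs of the secure websites to obtain a second feature character set; and
    a feature discrimination unit, configured to determine whether the frequency of a first feature character obtained by the feature extraction in the first feature character set is higher than its frequency in the second feature character set and, if it is, to add the first feature character to a malicious feature library, wherein the feature characters in the malicious feature library are used to identify malicious websites.
  8. The identification apparatus according to claim 7, wherein
    the feature discrimination unit is configured to obtain the relative frequency of the first feature character, the relative frequency being the ratio of the frequency of the first feature character in the first feature character set to its frequency in the second feature character set; and
    to determine whether the relative frequency of the first feature character is higher than a predetermined threshold, or to determine whether the relative frequency of the first feature character ranks within a set range among the relative frequencies of all feature characters.
  9. The identification apparatus according to claim 7 or 8, wherein
    the feature discrimination unit is further configured to use the first feature character, before adding it to the malicious feature library, to test URLs that have been determined to belong to secure websites and, if the false-positive rate is lower than a predetermined threshold, to add the first feature character to the malicious feature library.
  10. The identification apparatus according to claim 8, further comprising:
    a feature library control unit, configured to use the malicious feature library to test URLs that have been determined to belong to secure websites and, if the false-positive rate is higher than a predetermined threshold, to raise the predetermined relative-frequency threshold or narrow the set range, and to re-determine whether to add the first feature character to the malicious feature library.
  11. The identification apparatus according to claim 7 or 8, wherein
    the feature extraction unit is configured to perform feature extraction by using characters that are neither digits nor English letters as separators.
  12. The identification apparatus according to claim 7 or 8, further comprising: a page identification unit, configured to perform security identification using page features if identification of a URL to be identified with the malicious feature library yields a result of secure and the URL to be identified is accessible.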
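  13. A non-transitory computer-readable storage medium storing computer-executable instructions that, when run on a computer, perform the following steps:
    acquiring Uniform Resource Locators (URLs) that have been determined to belong to malicious websites, and URLs that have been determined to belong to secure websites;
    performing feature extraction on the URLs of the malicious websites to obtain a first feature character set, and performing feature character extraction on the URLs of the secure websites to obtain a second feature character set; and
    determining whether the frequency of a first feature character obtained by the feature extraction in the first feature character set is higher than its frequency in the second feature character set and, if it is, adding the first feature character to a malicious feature library, wherein the feature characters in the malicious feature library are used to identify malicious websites.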
PCT/CN2014/088251 2013-10-23 2014-10-10 恶意网站的识别方法和装置 WO2015058616A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/136,771 US20160241589A1 (en) 2013-10-23 2016-04-22 Method and apparatus for identifying malicious website

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310503579.9 2013-10-23
CN201310503579.9A CN103530562A (zh) 2013-10-23 2013-10-23 一种恶意网站的识别方法和装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/136,771 Continuation US20160241589A1 (en) 2013-10-23 2016-04-22 Method and apparatus for identifying malicious website

Publications (1)

Publication Number Publication Date
WO2015058616A1 true WO2015058616A1 (zh) 2015-04-30

Family

ID=49932565

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/088251 WO2015058616A1 (zh) 2013-10-23 2014-10-10 恶意网站的识别方法和装置

Country Status (3)

Country Link
US (1) US20160241589A1 (zh)
CN (1) CN103530562A (zh)
WO (1) WO2015058616A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814643A (zh) * 2020-06-30 2020-10-23 杭州科度科技有限公司 黑灰url识别方法、装置、电子设备及介质
WO2020232902A1 (zh) * 2019-05-23 2020-11-26 平安科技(深圳)有限公司 异常对象识别方法、装置、计算设备和存储介质

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530562A (zh) * 2013-10-23 2014-01-22 腾讯科技(深圳)有限公司 一种恶意网站的识别方法和装置
CN104935494B (zh) * 2014-03-19 2019-04-23 腾讯科技(深圳)有限公司 信息处理方法和装置
CN105681257B (zh) * 2014-11-19 2020-01-14 腾讯科技(深圳)有限公司 一种基于即时通信交互平台的信息举报方法、装置、设备、系统及计算机存储介质
JP6386593B2 (ja) * 2015-02-04 2018-09-05 日本電信電話株式会社 悪性通信パターン抽出装置、悪性通信パターン抽出システム、悪性通信パターン抽出方法、および、悪性通信パターン抽出プログラム
CN106933860B (zh) * 2015-12-31 2020-12-01 北京新媒传信科技有限公司 恶意统一资源定位符识别方法和装置
CN107239701B (zh) * 2016-03-29 2020-06-26 腾讯科技(深圳)有限公司 识别恶意网站的方法及装置
CN106357618B (zh) * 2016-08-26 2020-10-16 北京奇虎科技有限公司 一种Web异常检测方法和装置
WO2018068664A1 (zh) 2016-10-13 2018-04-19 腾讯科技(深圳)有限公司 网络信息识别方法和装置
CN107741938A (zh) * 2016-10-13 2018-02-27 腾讯科技(深圳)有限公司 一种网络信息识别方法及装置
US10880330B2 (en) * 2017-05-19 2020-12-29 Indiana University Research & Technology Corporation Systems and methods for detection of infected websites
CN107526967B (zh) * 2017-07-05 2020-06-02 阿里巴巴集团控股有限公司 一种风险地址识别方法、装置以及电子设备
CN109544165B (zh) * 2017-09-21 2022-11-11 腾讯科技(深圳)有限公司 资源转移处理方法、装置、计算机设备和存储介质
US11503072B2 (en) * 2019-07-01 2022-11-15 Mimecast Israel Ltd. Identifying, reporting and mitigating unauthorized use of web code
CN110837619B (zh) * 2019-11-05 2022-07-12 北京锐安科技有限公司 一种网站审核的方法、装置、设备和存储介质
CN112182575A (zh) * 2020-09-27 2021-01-05 北京六方云信息技术有限公司 基于lstm的攻击数据集恶意片段标注方法及系统
CN113051876B (zh) * 2021-04-02 2024-04-23 杭州网易智企科技有限公司 恶意网址识别方法及装置、存储介质、电子设备
CN113315766B (zh) * 2021-05-26 2022-03-29 中国信息通信研究院 一种基于强化学习的恶意网址识别方法、系统和介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101692639A (zh) * 2009-09-15 2010-04-07 西安交通大学 一种基于url的不良网页识别方法
CN102790762A (zh) * 2012-06-18 2012-11-21 东南大学 基于url分类的钓鱼网站检测方法
CN102801697A (zh) * 2011-12-20 2012-11-28 北京安天电子设备有限公司 基于多url的恶意代码检测方法和系统
CN102932348A (zh) * 2012-10-30 2013-02-13 常州大学 一种钓鱼网站的实时检测方法及系统
CN103530562A (zh) * 2013-10-23 2014-01-22 腾讯科技(深圳)有限公司 一种恶意网站的识别方法和装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8521667B2 (en) * 2010-12-15 2013-08-27 Microsoft Corporation Detection and categorization of malicious URLs
CN102708186A (zh) * 2012-05-11 2012-10-03 上海交通大学 一种钓鱼网站的识别方法
CN103106365B (zh) * 2013-01-25 2015-11-25 中国科学院软件研究所 一种移动终端上的恶意应用软件的检测方法
CN103338211A (zh) * 2013-07-19 2013-10-02 腾讯科技(深圳)有限公司 一种恶意url鉴定方法及装置

Also Published As

Publication number Publication date
CN103530562A (zh) 2014-01-22
US20160241589A1 (en) 2016-08-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14855936

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 29/06/2016)

122 Ep: pct application non-entry in european phase

Ref document number: 14855936

Country of ref document: EP

Kind code of ref document: A1