CN108650250B - Illegal page detection method, system, computer system and readable storage medium - Google Patents


Info

Publication number
CN108650250B
Authority
CN
China
Prior art keywords
dom tree
page
depth
dom
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810390940.4A
Other languages
Chinese (zh)
Other versions
CN108650250A (en
Inventor
李忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxin Technology Group Co Ltd
Original Assignee
Qianxin Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxin Technology Group Co Ltd filed Critical Qianxin Technology Group Co Ltd
Priority to CN201810390940.4A priority Critical patent/CN108650250B/en
Publication of CN108650250A publication Critical patent/CN108650250A/en
Application granted granted Critical
Publication of CN108650250B publication Critical patent/CN108650250B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416: Event detection, e.g. attack signature detection

Abstract

The present disclosure provides an illegal page detection method, including: acquiring first frame characteristic information of a current page; acquiring second frame characteristic information of a preset page; comparing the first frame characteristic information with the second frame characteristic information to obtain the similarity between the current page and the preset page; and judging whether the current page is an illegal page or not based on the similarity. The present disclosure also provides an illegal page detection system, a computer system and a computer-readable storage medium.

Description

Illegal page detection method, system, computer system and readable storage medium
Technical Field
The present disclosure relates to an illegal page detection method, system, computer system and readable storage medium.
Background
A webshell is a command execution environment in the form of a web page file that can be used to manage a website server. Hackers often use various means to upload variant webshells to a web server and exploit this administrative capability to intrude into websites.
After uploading a webshell, a hacker must connect to it to carry out the intrusion, so whether hacker intrusion exists can be judged by detecting connection behavior directed at the webshell.
Because connecting to a webshell usually involves HTTP requests and HTTP responses, and these generally carry corresponding text features, the related art generally detects whether hacker intrusion exists based on those text features.
However, in the course of implementing the disclosed concept, the inventors found at least the following drawbacks in the related art: detection based on text features is easily bypassed by hackers through variant HTTP requests and HTTP responses, and, because there are so many such variants, staff must maintain a huge rule base.
Disclosure of Invention
One aspect of the present disclosure provides an illegal page detection method, including: acquiring first frame characteristic information of a current page; acquiring second frame characteristic information of a preset page; comparing the first frame characteristic information with the second frame characteristic information to obtain the similarity between the current page and the preset page; and judging whether the current page is an illegal page or not based on the similarity.
Optionally, the first frame feature information of the current page includes a depth of a first dom tree of the current page; the second frame characteristic information of the predetermined page comprises the depth of a second dom tree of the predetermined page; the comparing the first frame feature information with the second frame feature information to obtain the similarity between the current page and the predetermined page includes: and comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page.
Optionally, the predetermined page includes a plurality of predetermined pages; the depth of the second dom tree comprises a plurality of depths of the second dom tree; each predetermined page corresponds to the depth of a second dom tree; the comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page includes: determining a third dom tree of the plurality of second dom trees, wherein the third dom tree is similar to or identical to the first dom tree in type; and comparing the depth of the first dom tree with the depth of the third dom tree to obtain the similarity between the current page and the corresponding page of the third dom tree.
Optionally, determining a third dom tree of the plurality of second dom trees, which is similar to or the same as the first dom tree in type, includes: extracting a fourth dom tree meeting a preset depth from the first dom tree; extracting a fifth dom tree meeting the preset depth from each second dom tree to obtain a plurality of fifth dom trees; determining a target dom tree similar or identical to the fourth dom tree in the fifth dom trees; and determining a second dom tree corresponding to the target dom tree as the third dom tree.
Optionally, the predetermined page includes a plurality of predetermined pages; the depth of the second dom tree comprises a plurality of depths of the second dom tree; each predetermined page corresponds to the depth of a second dom tree; the comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page includes: and comparing the depth of the first dom tree with the depth of each of the plurality of second dom trees to obtain the similarity between the current page and the plurality of preset pages.
Optionally, the determining whether the current page is an illegal page based on the similarity includes: judging whether the similarity is greater than a similarity threshold value; and determining that the current page is an illegal page under the condition that the similarity is greater than the similarity threshold.
Another aspect of the present disclosure provides an illegal page detection system, including: the first obtaining module is used for obtaining first frame characteristic information of the current page; the second acquisition module is used for acquiring second frame characteristic information of the preset page; a comparison module, configured to compare the first frame feature information with the second frame feature information to obtain a similarity between the current page and the predetermined page; and the judging module is used for judging whether the current page is an illegal page or not based on the similarity.
Optionally, the first frame feature information of the current page includes a depth of a first dom tree of the current page; the second frame characteristic information of the predetermined page comprises the depth of a second dom tree of the predetermined page; and the comparison module is also used for comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page.
Optionally, the predetermined page includes a plurality of predetermined pages; the depth of the second dom tree comprises a plurality of depths of the second dom tree; each predetermined page corresponds to the depth of a second dom tree; the comparison module comprises: a first determining unit, configured to determine a third dom tree of the plurality of second dom trees, which is similar to or the same as the first dom tree in type; and the comparison unit is used for comparing the depth of the first dom tree with the depth of the third dom tree to obtain the similarity between the current page and the corresponding page of the third dom tree.
Optionally, the first determining unit includes: a first extraction subunit, configured to extract a fourth dom tree satisfying a preset depth from the first dom tree; a second extraction subunit, configured to extract a fifth dom tree satisfying the preset depth from each second dom tree to obtain a plurality of fifth dom trees; a first determining subunit, configured to determine a target dom tree, which is similar to or identical to the fourth dom tree, in the fifth dom trees; and a second determining subunit, configured to determine a second dom tree corresponding to the target dom tree as the third dom tree.
Optionally, the predetermined page includes a plurality of predetermined pages; the depth of the second dom tree comprises a plurality of depths of the second dom tree; each predetermined page corresponds to the depth of a second dom tree; the comparing module is further configured to compare the depth of the first dom tree with the depth of each of the plurality of second dom trees to obtain the similarity between the current page and the plurality of predetermined pages.
Optionally, the judging module includes: a judging unit, configured to judge whether the similarity is greater than a similarity threshold; and a second determining unit, configured to determine that the current page is an illegal page if the similarity is greater than the similarity threshold.
Another aspect of the present disclosure provides a computer system comprising: one or more processors; and a computer-readable storage medium storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the illegal page detection method as described above.
Another aspect of the present disclosure provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to implement the illegal page detection method as described above.
Drawings
For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario of the illegal page detection method and system according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of an illegal page detection method according to an embodiment of the present disclosure;
FIG. 3A schematically illustrates a flow chart for obtaining similarity of a current page and a predetermined page by comparing depths of a dom tree, according to an embodiment of the present disclosure;
FIG. 3B schematically shows a flow chart of determining a third dom tree according to an embodiment of the present disclosure;
FIG. 3C schematically shows a flow chart of determining whether a current page is an illegal page according to an embodiment of the present disclosure;
FIG. 3D schematically illustrates a flow diagram for connecting webshells according to an embodiment of the present disclosure;
FIG. 3E schematically illustrates a flow diagram of an illegal page detection method according to another embodiment of the present disclosure;
FIG. 4 schematically illustrates a block diagram of an illegal page detection system according to an embodiment of the present disclosure;
FIG. 5A schematically illustrates a block diagram of a comparison module according to an embodiment of the disclosure;
FIG. 5B schematically shows a block diagram of a determination unit according to an embodiment of the present disclosure;
FIG. 5C schematically illustrates a block diagram of a determination module according to an embodiment of the disclosure; and
FIG. 6 schematically illustrates a block diagram of a computer system suitable for implementing an illegal page detection method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" should be understood to include the possibilities of "A" or "B", or "A and B".
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.
Accordingly, the techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable medium having instructions stored thereon for use by or in connection with an instruction execution system. In the context of this disclosure, a computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the instructions. For example, the computer readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the computer readable medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.
An embodiment of the present disclosure provides a detection method, including: acquiring first frame characteristic information of a current page; acquiring second frame characteristic information of a preset page; comparing the first frame characteristic information with the second frame characteristic information to obtain the similarity between the current page and the preset page; and judging whether the current page is an illegal page or not based on the similarity.
Fig. 1 schematically illustrates an application scenario of the illegal page detection method and system according to the embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a scenario in which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, but does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
A webshell is a command execution environment in the form of a web page file that can be used to manage a website server. Hackers often use various means to upload variant webshells to a website server and exploit this administrative capability to intrude into the website. After uploading a webshell, a hacker must connect to it to carry out the intrusion, so whether hacker intrusion exists can be judged by detecting connection behavior directed at the webshell.
As shown in fig. 1, assume that a user 101 wants to connect to a webshell in a server 102 to manage or intrude on the server 102. In this case, whether hacker intrusion exists can be judged from the connection behavior directed at the webshell.
Currently, the related art detects whether hacker intrusion exists based on text features carried by the HTTP requests and HTTP responses involved in connecting to the webshell. However, detection based on text features is easily bypassed by hackers through variant HTTP requests and HTTP responses, and, because there are so many such variants, staff must maintain a huge rule base.
At this time, the connection behavior of the webshell may be detected by an embodiment of the present disclosure. Specifically, first frame feature information of a current page may be acquired; acquiring second frame characteristic information of a preset page; comparing the first frame characteristic information with the second frame characteristic information to obtain the similarity between the current page and the preset page; and judging whether the current page is an illegal page or not based on the similarity.
Fig. 2 schematically shows a flow chart of an illegal page detection method according to an embodiment of the present disclosure.
As shown in fig. 2, the illegal page detection method may include operations S201 to S204, in which:
In operation S201, first frame feature information of a current page is acquired.
In operation S202, second frame characteristic information of a predetermined page is acquired.
In operation S203, the first frame characteristic information and the second frame characteristic information are compared to obtain a similarity between the current page and the predetermined page.
In operation S204, it is determined whether the current page is an illegal page based on the similarity.
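Operations S201 to S204 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the frame feature is taken to be the ordered sequence of opening tag names, the similarity is computed with the standard library's difflib.SequenceMatcher as a stand-in measure, and the names frame_features, similarity, is_illegal and the threshold value 0.8 are hypothetical.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening tag names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def frame_features(html):
    # S201 / S202: reduce a page to its ordered tag sequence (assumed feature).
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def similarity(first, second):
    # S203: stand-in similarity in [0, 1]; difflib's ratio is an assumption here.
    return SequenceMatcher(None, first, second).ratio()


def is_illegal(current_html, predetermined_html, threshold=0.8):
    # S204: flag the page when the similarity exceeds the (hypothetical) threshold.
    first = frame_features(current_html)
    second = frame_features(predetermined_html)
    return similarity(first, second) > threshold
```

Page text and attribute values are ignored in this sketch, so changing a page title or label does not change the feature; only changes to the tag structure do.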
A webshell is a command execution environment in the form of a web page file, which may also be referred to as a web page backdoor (web backdoor for short), and may be used to manage a web server. Hackers often use various means to upload variant webshells to a website server and exploit this administrative capability to intrude into the website. However, whether an administrator manages the website or a hacker intrudes into it through the webshell, the webshell must first be connected successfully, and after the connection succeeds, the server where the webshell is located returns a management page. The administrator or hacker can then manage or intrude into the web server through the management page. Therefore, whether the current connection behavior directed at the webshell is a hacker intrusion can be judged by detecting whether the management page is an illegal page.
Currently, most of the webshells utilized by hackers are variants of certain parent webshells, and the variants mostly modify the header of the management page or add a small amount of functionality. As a result, the frame characteristic information of the management page of a variant webshell differs little from that of the management page of its parent webshell. The frame characteristic information of the management pages corresponding to the parent webshells can therefore be collected, and whether the current management page is an illegal page can be judged by comparing the similarity of the management pages.
In an embodiment of the present disclosure, the current page may include a current management page returned by the server, and the first frame feature information may be used to represent an architectural feature of the current page. The predetermined page may include a previously collected illegal page, such as an illegal management page, and the predetermined page may include one or more pages. The second frame characteristic information may be used to represent architectural characteristics of the predetermined page.
According to the embodiment of the disclosure, after the server returns the current page, the first frame characteristic information of the current page returned by the server can be acquired, and the second frame characteristic information of the pre-stored predetermined page can be acquired. The current page returned by the server may be returned to an external device, or may be displayed on a display connected to the server. The predetermined page may be stored in the server, or may be stored in an external storage device, which is not limited herein. In addition, the illegal page detection method of the embodiment of the disclosure can be applied to the server, and can also be applied to a standalone detection device.
According to the embodiment of the disclosure, the first frame characteristic information and the second frame characteristic information can be compared, and then the similarity between the current page and the preset page can be obtained. When the predetermined page includes a plurality of predetermined pages, the first frame feature information of the current page may be compared with the second frame feature information of each predetermined page, and then the similarity between the current page and each predetermined page may be obtained. Further, whether the current page is an illegal page or not can be judged based on the obtained similarity.
Unlike the embodiments of the present disclosure, the related art generally detects whether hacker intrusion exists based on text features, since connecting to a webshell usually involves HTTP requests and HTTP responses that carry corresponding text features. Specifically, the related art compares the text features carried in the HTTP request and the HTTP response with the text features in a pre-stored feature library, and if a match is found, the behavior is taken to be a webshell connection by an unauthorized party, such as a hacker. However, detection based on text features is easily bypassed by hackers through variant HTTP requests and HTTP responses, and, because there are so many such variants, staff must maintain a huge rule base.
Through the embodiment of the disclosure, because the frame characteristic information is relatively complex, whether the current page is an illegal page can still be accurately judged based on the similarity even if a hacker makes some adjustments to the frame characteristic information. Moreover, because the variant webshells utilized by hackers derive from only a few types of parent webshell, collecting the frame characteristic information of the management pages of the parent webshells as the second frame characteristic information reduces the maintenance workload of staff.
As an alternative embodiment, the first frame feature information of the current page may include a depth of a first dom tree of the current page; the second frame characteristic information of the predetermined page may include a depth of a second dom tree of the predetermined page; comparing the first frame characteristic information with the second frame characteristic information to obtain the similarity between the current page and the predetermined page, which may include: and comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page.
In the embodiment of the disclosure, a dom tree may refer to an html tag set arranged in an html page in sequence, and the dom tree may describe the characteristics of one html page to some extent.
According to an embodiment of the present disclosure, the first frame feature information may include a first dom tree, which may be a set of page tags arranged in order in the current page; the second frame characteristic information may include a second dom tree, which may be a set of page tags arranged in order in the predetermined page. In addition, the first frame feature information may further include the tags of the current page, which need not be arranged in order; the second frame characteristic information may further include the tags of the predetermined page, which likewise need not be arranged in order.
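As an illustration of a dom tree read as an ordered tag set, the sketch below collects the opening tags of a page in document order and also tracks the maximum nesting level. It uses Python's standard html.parser; the class name DomOutline, the function dom_outline, and the list of void elements are assumptions made for this example, and the disclosure does not prescribe a parser.

```python
from html.parser import HTMLParser

# HTML void elements have no closing tag, so they must not deepen the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DomOutline(HTMLParser):
    """Records the ordered tag sequence and the maximum nesting level."""

    def __init__(self):
        super().__init__()
        self.tags = []       # opening tags in document order
        self.max_depth = 0   # deepest nesting level seen so far
        self._stack = []     # currently open elements

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        self.max_depth = max(self.max_depth, len(self._stack) + 1)
        if tag not in VOID_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass


def dom_outline(html):
    parser = DomOutline()
    parser.feed(html)
    parser.close()
    return parser.tags, parser.max_depth
```

For a page consisting of a form with two inputs followed by a paragraph, all inside html and body, dom_outline returns the six tags in order together with a maximum nesting level of 4.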
In implementations of the present disclosure, since each dom tree has a certain depth, the first frame feature information may include the depth of the first dom tree, and the second frame feature information may include the depth of the second dom tree. Comparing the first frame characteristic information with the second frame characteristic information may include comparing a depth of the first dom tree with a depth of the second dom tree, and may further obtain a similarity between the current page and the predetermined page.
According to an embodiment of the present disclosure, comparing the depth of the first dom tree with the depth of the second dom tree may be calculating a Levenshtein ratio between the depth of the first dom tree and the depth of the second dom tree, and the similarity of the current page and the predetermined page may then be represented by the Levenshtein ratio. The Levenshtein ratio indicates the degree of similarity between two character strings: the closer its value is to 1, the more similar the two strings are, and a value equal to 1 indicates that the two strings are identical.
Through the embodiment of the disclosure, the similarity between the current page and the predetermined page can be determined by comparing the depth of the first dom tree with the depth of the second dom tree. Because the dom tree structure is complex, it is difficult for a hacker to evade detection by altering other features, so the detection rate of illegal pages can be improved.
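A Levenshtein-based ratio can be sketched as below. The disclosure does not give a formula, and libraries normalize the edit distance differently; this example assumes the normalization 1 - distance / max(len(a), len(b)), which equals 1 exactly when the two sequences are identical. The function names are hypothetical, and the inputs may be strings or lists of tag names.

```python
def levenshtein_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # delete item_a
                               current[j - 1] + 1,       # insert item_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def levenshtein_ratio(a, b):
    # Assumed normalization: 1.0 for identical inputs, 0.0 for nothing in common.
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest
```

Applied to tag sequences, a variant page that adds one tag to a five-tag parent page still scores high, which matches the observation that variants differ only slightly from their parents.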
As an alternative embodiment, the predetermined page may include a plurality of predetermined pages; the depth of the second dom tree may comprise a plurality of depths of the second dom tree; each predetermined page may correspond to the depth of a second dom tree; comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the predetermined page, which may include: determining a third dom tree of the plurality of second dom trees, wherein the type of the third dom tree is similar to or identical to that of the first dom tree; and comparing the depth of the first dom tree with the depth of the third dom tree to obtain the similarity between the current page and the corresponding page of the third dom tree.
Fig. 3A schematically illustrates a flowchart for obtaining similarity between a current page and a predetermined page by comparing depths of a dom tree according to an embodiment of the present disclosure.
As shown in fig. 3A, comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the predetermined page may include operations S301 and S302, where:
In operation S301, a third dom tree of the plurality of second dom trees is determined, which is similar or identical in type to the first dom tree.
In operation S302, the depth of the first dom tree is compared with the depth of the third dom tree to obtain the similarity between the current page and the corresponding page of the third dom tree.
In an embodiment of the present disclosure, when the predetermined page includes a plurality of predetermined pages, in order to reduce the amount of calculation, only the predetermined pages having a type similar to or the same as that of the current page may be compared.
Specifically, a third dom tree with a type similar to or the same as that of the first dom tree may be determined from the plurality of second dom trees, and the depth of the first dom tree and the depth of the third dom tree are compared, so that the similarity between the current page and the page corresponding to the third dom tree may be obtained. Comparing the depth of the first dom tree with the depth of the third dom tree may be calculating a Levenshtein ratio between the depth of the first dom tree and the depth of the third dom tree.
By the embodiment of the disclosure, only the depths of the dom trees with the same or similar types are compared, so that the calculation amount can be reduced, and the detection speed can be improved.
Fig. 3B schematically illustrates a flow chart of determining a third dom tree according to an embodiment of the present disclosure.
As shown in fig. 3B, determining a third dom tree of the plurality of second dom trees that is similar or identical to the first dom tree type may include operations S401 to S404, in which:
in operation S401, a fourth dom tree satisfying a preset depth is extracted from the first dom tree.
In operation S402, a fifth dom tree satisfying a preset depth is extracted from each second dom tree, resulting in a plurality of fifth dom trees.
In operation S403, a target dom tree of the fifth dom trees that is similar or identical to the fourth dom tree is determined.
In operation S404, a second dom tree corresponding to the target dom tree is determined as a third dom tree.
In an embodiment of the present disclosure, the fourth dom tree may refer to a portion of the first dom tree between a start position and a preset depth, and the fifth dom tree may refer to a portion of the second dom tree between a start position and a preset depth, wherein the preset depth is less than the depth of each of the dom trees.
According to an embodiment of the present disclosure, determining a third dom tree that is similar or identical in type to the first dom tree may include comparing the fourth dom tree with each fifth dom tree, taking each fifth dom tree that is similar or identical to the fourth dom tree as a target dom tree, and then determining the second dom tree corresponding to the target dom tree as the third dom tree.
Because the dom tree may refer to the sequentially arranged tag sets in the page, the fourth dom tree may refer to the sequentially arranged tag sets from the start position to the preset depth in the first dom tree, and the fifth dom tree may refer to the sequentially arranged tag sets from the start position to the preset depth in the second dom tree. Further, comparing the fourth dom tree with each fifth dom tree may be comparing a set of sequentially arranged tags from a start position to a preset depth in the first dom tree with a set of sequentially arranged tags from a start position to a preset depth in each second dom tree.
According to the embodiment of the disclosure, a similar or identical dom tree in the fifth dom trees and the fourth dom tree is used as the target dom tree, and the second dom tree corresponding to the target dom tree is determined as the third dom tree, so that only the depth of the first dom tree and the depth of the third dom tree can be compared, the calculation amount of a system is reduced, and the detection speed of the system is improved.
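Operations S401 to S404 can be sketched as a prefix filter over tag sequences. This is only an illustration: the function name `select_third_trees`, the dictionary of named samples, and the choice to accept only identical prefixes (rather than "similar or identical" ones) are assumptions made to keep the example short.

```python
from typing import Dict, List, Sequence


def select_third_trees(
    first_tree: Sequence[str],
    second_trees: Dict[str, Sequence[str]],
    preset_depth: int,
) -> List[str]:
    """Return names of sample trees whose leading tags match the current tree."""
    fourth_tree = list(first_tree[:preset_depth])        # S401
    targets: List[str] = []
    for name, second_tree in second_trees.items():
        fifth_tree = list(second_tree[:preset_depth])    # S402
        if fifth_tree == fourth_tree:                    # S403 (identical only, by assumption)
            targets.append(name)                         # S404
    return targets


# Hypothetical sample library of second dom trees.
samples = {
    "sample_a": ["html", "head", "body", "form", "input"],
    "sample_b": ["html", "body", "table", "tr", "td"],
}
candidates = select_third_trees(["html", "head", "body", "div"], samples, 3)
```

Only the trees returned here would then go through the full depth comparison, which is where the reduction in computation comes from.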
As an alternative embodiment, the predetermined page may include a plurality of predetermined pages; the depth of the second dom tree may comprise a plurality of depths of the second dom tree; each predetermined page corresponds to the depth of a second dom tree; comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the predetermined page, which may include: and comparing the depth of the first dom tree with the depth of each of the plurality of second dom trees to obtain the similarity between the current page and a plurality of preset pages.
In an embodiment of the present disclosure, when the predetermined page includes a plurality of predetermined pages, the current page may be compared with each of the predetermined pages in order to improve accuracy.
Specifically, the depth of the first dom tree may be compared with the depth of each second dom tree to obtain the similarity between the current page and each predetermined page, so that the similarity may include multiple similarities. Comparing the depth of the first dom tree with the depth of each second dom tree may consist of calculating a Levenshtein ratio between the depth of the first dom tree and the depth of each second dom tree, thereby obtaining a plurality of Levenshtein ratios.
Through the embodiment of the disclosure, the first dom tree is compared with each second dom tree, omission or misjudgment can be prevented, and the detection rate of illegal pages is improved.
Fig. 3C schematically shows a flowchart for determining whether a current page is an illegal page according to an embodiment of the present disclosure.
As shown in fig. 3C, determining whether the current page is an illegal page based on the similarity may include operations S501 and S502, in which:
in operation S501, it is determined whether the similarity is greater than a similarity threshold.
In operation S502, in the case where the similarity is greater than the similarity threshold, it is determined that the current page is an illegal page.
In an embodiment of the present disclosure, the similarity may be represented by a Levenshtein ratio and the similarity threshold may be 0.9; in the case that the similarity is greater than 0.9, it may be determined that the current page is an illegal page. The threshold of 0.9 set in the embodiment of the present disclosure was obtained by comparison against more than 100,000 random normal pages.
According to the embodiment of the disclosure, comparing the depth of the first dom tree with the depth of the third dom tree yields a first similarity between the current page and the predetermined page. In this case, judging whether the current page is an illegal page based on the similarity may include judging whether the first similarity is greater than a first similarity threshold, and determining that the current page is an illegal page in the case that the first similarity is greater than the first similarity threshold. The first similarity may be a first Levenshtein ratio between the depth of the first dom tree and the depth of the third dom tree, and the first similarity threshold may be 0.9; if the first Levenshtein ratio is greater than 0.9, it may be determined that the current page is an illegal page.
According to the embodiment of the disclosure, comparing the depth of the first dom tree with the depth of each second dom tree yields a second similarity between the current page and each of the plurality of predetermined pages, so that a plurality of second similarities are obtained. In this case, judging whether the current page is an illegal page based on the similarity may include judging whether any of the plurality of second similarities is greater than a second similarity threshold, and determining that the current page is an illegal page if such a similarity exists. The plurality of second similarities may be a plurality of second Levenshtein ratios between the depth of the first dom tree and the depths of the plurality of second dom trees, and the second similarity threshold may be 0.9; in the case that any of the plurality of second Levenshtein ratios is greater than 0.9, it may be determined that the current page is an illegal page.
Through the embodiment of the disclosure, under the condition that the similarity is greater than the similarity threshold value, the current page can be determined to be an illegal page, and an alarm can be sent or prompt information can be displayed through a display screen, so that management personnel can timely stop the illegal invasion of hackers.
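The decision in operations S501 and S502 can be sketched as follows for both cases, a single similarity or several. The function name `is_illegal_page` is hypothetical; the 0.9 default and the strict "greater than" comparison follow the description above.

```python
from typing import Iterable


def is_illegal_page(similarities: Iterable[float], threshold: float = 0.9) -> bool:
    """S501/S502: the page is flagged if any similarity exceeds the threshold."""
    return any(similarity > threshold for similarity in similarities)


# One similarity (third dom tree case) or several (every second dom tree).
single_case = is_illegal_page([0.93])
multiple_case = is_illegal_page([0.41, 0.62, 0.95])
borderline_case = is_illegal_page([0.9])  # exactly 0.9 is not "greater than" 0.9
```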
The following describes the illegal page detection method provided by the present disclosure in detail by taking hacking as an example.
At present, hackers mainly control websites through webshells, which can be divided into small horses and big horses. A small horse has simple functions and short code, and is generally used to upload files or execute commands; a big horse has rich functions and can be used to control and manage a website. Therefore, during an intrusion, a hacker usually uploads a small horse first and then a big horse. The webshell referred to in the embodiments of the present disclosure may be a big horse.
After uploading a webshell, the hacker must eventually attempt to connect to it. Because this step is unavoidable, detecting the connection behavior of the webshell is critical. The process of connecting to a webshell is shown in Fig. 3D.
Figure 3D schematically illustrates a flow diagram for connecting webshells according to an embodiment of the present disclosure.
As shown in fig. 3D, wherein:
in operation S601, the hacker initiates a webshell request.
In operation S602, the server returns a login page.
In operation S603, the hacker inputs a previously set password.
In operation S604, it is detected whether the password is correct.
In operation S605, the server returns an administration page to the hacker.
In the embodiment of the disclosure, a hacker accesses the webshell uploaded to the server by other means in advance, then the server where the webshell is located returns a login page to the hacker, the hacker inputs a preset password, and if the password is correct, the server can return a management page of the webshell; if the password is incorrect, the server may continue to return to the login page.
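The request and response sequence of operations S601 to S605 can be summarized in a few lines. This is a simplified model for illustration only; the function name `webshell_response` and the page labels are hypothetical and do not come from the disclosure.

```python
from typing import Optional


def webshell_response(submitted_password: Optional[str], expected_password: str) -> str:
    """Return which page the server sends back for one request."""
    if submitted_password is None:
        return "login_page"        # S601 -> S602: first request, no password yet
    if submitted_password == expected_password:  # S604
        return "management_page"   # S605
    return "login_page"            # wrong password: the login page is returned again


first_visit = webshell_response(None, "secret")
wrong_attempt = webshell_response("guess", "secret")
successful_login = webshell_response("secret", "secret")
```

The management page returned in the last case is the response whose dom tree the detection method inspects.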
According to the embodiment of the disclosure, it can be seen that once a hacker connects a webshell, the hacker needs to access a web page (also called a management page) with a management interface, so that the behavior of the webshell connection can be detected according to the dom tree structure characteristics of the management page. It should be noted that the present disclosure is primarily directed to detecting big horses in webshells.
Fig. 3E schematically shows a flow chart of an illegal page detection method according to another embodiment of the present disclosure.
As shown in fig. 3E, wherein:
in operation S701, extracting a management page returned by http;
in operation S702, a dom tree of the management page is extracted;
in operation S703, calculating a Levenshtein ratio between the extracted dom tree and the dom trees in the sample library;
in operation S704, it is determined whether the Levenshtein ratio is greater than 0.9;
in operation S705, if yes, it is determined that the connection behavior is a webshell connection behavior;
in operation S706, if not, it is determined that the connection behavior is not a webshell.
In embodiments of the present disclosure, webshells commonly used by hackers may be collected and deduplicated. The dom trees of the management pages of these webshells are then extracted and stored in a sample library. After the server returns a management page, the dom tree of that management page may be extracted, and the Levenshtein ratio between the extracted dom tree and each dom tree in the sample library is calculated. The preset Levenshtein ratio threshold is 0.9. When the Levenshtein ratio between the extracted dom tree and a dom tree in the sample library is greater than 0.9, the webshell corresponding to the management page is judged to be a webshell in the sample library, and the connection behavior is therefore judged to be a webshell connection behavior, that is, the management page is an illegal page. Otherwise, the connection behavior is judged not to be a webshell connection behavior, that is, the management page is not an illegal page.
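The sample library and the matching step (S703 to S706) might be organized as in the sketch below. Several things here are assumptions: dom trees are stored as tuples of tag names, deduplication is done on identical tag sequences, and the Levenshtein ratio is supplied by the caller as a `ratio` function. All names are hypothetical, and `exact_match` is a placeholder standing in for a real Levenshtein ratio.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

TagSequence = Tuple[str, ...]


def build_sample_library(
    pages: Iterable[Tuple[str, Sequence[str]]],
) -> Dict[TagSequence, str]:
    """Map each distinct tag sequence to the first webshell name that produced it."""
    library: Dict[TagSequence, str] = {}
    for name, tags in pages:
        library.setdefault(tuple(tags), name)  # identical sequences are stored once
    return library


def is_webshell_connection(
    page_tags: Sequence[str],
    library: Dict[TagSequence, str],
    ratio: Callable[[Sequence[str], Sequence[str]], float],
    threshold: float = 0.9,
) -> bool:
    """S703-S706: flag the page if any sample exceeds the ratio threshold."""
    return any(ratio(page_tags, sample) > threshold for sample in library)


def exact_match(a: Sequence[str], b: Sequence[str]) -> float:
    """Placeholder similarity: 1.0 for identical sequences, otherwise 0.0."""
    return 1.0 if tuple(a) == tuple(b) else 0.0


library = build_sample_library([
    ("shell_one", ["html", "body", "form", "input"]),
    ("shell_one_copy", ["html", "body", "form", "input"]),
    ("shell_two", ["html", "body", "table", "tr"]),
])
flagged = is_webshell_connection(["html", "body", "form", "input"], library, exact_match)
```

Passing the ratio function in keeps the library handling independent of how the similarity is actually computed.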
According to an embodiment of the present disclosure, a dom tree of a management page may be extracted by a pre-programmed module. Specifically, all the tag nodes of the management page may be extracted and arranged in sequence, and further, the sequentially arranged tag nodes may be compressed to remove duplicate tag nodes, for example, remove duplicate table tag nodes, and use the remaining sequentially arranged tag nodes as a dom tree of the management page.
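A minimal sketch of such an extraction step, using Python's standard `html.parser` module, is given below. It assumes that "removing duplicate tag nodes" means collapsing consecutive repeats of the same tag; the disclosure does not specify this, and the names `TagCollector` and `extract_dom_tags` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class TagCollector(HTMLParser):
    """Collects start tags in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def extract_dom_tags(html: str) -> List[str]:
    """Return the page's tags in order, with consecutive repeats collapsed."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    compressed: List[str] = []
    for tag in collector.tags:
        # Assumption: only consecutive duplicates are removed.
        if not compressed or compressed[-1] != tag:
            compressed.append(tag)
    return compressed


page = (
    "<html><body><table><tr><td>a</td><td>b</td></tr></table>"
    "<table></table></body></html>"
)
tags = extract_dom_tags(page)
```

For the sample page, the two adjacent `td` tags collapse into one, while the second `table` is kept because it does not directly follow another `table`.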
FIG. 4 schematically illustrates a block diagram of an illegal page detection system according to an embodiment of the present disclosure.
As shown in fig. 4, the illegal page detection system 400 may include a first obtaining module 410, a second obtaining module 420, a comparing module 430 and a judging module 440, wherein:
the first obtaining module 410 is configured to obtain first frame feature information of a current page.
The second obtaining module 420 is configured to obtain second frame feature information of the predetermined page.
The comparing module 430 is configured to compare the first frame characteristic information with the second frame characteristic information to obtain a similarity between the current page and the predetermined page.
The determining module 440 is configured to determine whether the current page is an illegal page based on the similarity.
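One way to picture how the four modules fit together is the sketch below, in which each module is a pluggable function. The class name, field names, and the stand-in functions used in the example are all hypothetical; the disclosure does not prescribe any particular software structure.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Features = Sequence[str]


@dataclass
class IllegalPageDetectionSystem:
    get_current_features: Callable[[str], Features]           # first obtaining module 410
    get_predetermined_features: Callable[[], List[Features]]  # second obtaining module 420
    compare: Callable[[Features, Features], float]            # comparing module 430
    threshold: float = 0.9

    def judge(self, page: str) -> bool:                       # judging module 440
        current = self.get_current_features(page)
        return any(
            self.compare(current, sample) > self.threshold
            for sample in self.get_predetermined_features()
        )


# Stand-in functions for illustration only.
system = IllegalPageDetectionSystem(
    get_current_features=lambda page: page.split(),
    get_predetermined_features=lambda: [["html", "body", "form"]],
    compare=lambda a, b: 1.0 if list(a) == list(b) else 0.0,
)
result = system.judge("html body form")
```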
According to the embodiment of the disclosure, because the frame characteristic information is relatively complex, whether the current page is an illegal page can still be accurately judged based on the similarity even if a hacker makes some adjustments to the frame characteristic information. In addition, because the variant webshells used by hackers derive from a limited number of parent webshells, collecting the frame characteristic information of the management pages of the parent webshells as the second frame characteristic information can reduce the maintenance workload of staff.
As an alternative embodiment, the first frame feature information of the current page includes a depth of a first dom tree of the current page; the second frame characteristic information of the predetermined page comprises the depth of a second dom tree of the predetermined page; the comparison module is further used for comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page.
Through the embodiment of the disclosure, the similarity between the current page and the preset page can be determined by comparing the depth of the first dom tree with the depth of the second dom tree, and the dom tree structure is complex, so that a hacker can hardly bypass the dom tree through other characteristic means, and the detection rate of illegal pages can be improved.
As an alternative embodiment, the predetermined page includes a plurality of predetermined pages; the depth of the second dom tree comprises a plurality of depths of the second dom tree; each predetermined page corresponds to the depth of a second dom tree; the comparison module comprises: the first determining unit is used for determining a third dom tree which is similar to or the same as the first dom tree in the plurality of second dom trees; and the comparison unit is used for comparing the depth of the first dom tree with the depth of the third dom tree to obtain the similarity between the current page and the corresponding page of the third dom tree.
Fig. 5A schematically illustrates a block diagram of a comparison module according to an embodiment of the disclosure.
As shown in fig. 5A, the comparing module 430 may include a first determining unit 431 and a comparing unit 432, wherein:
the first determining unit 431 is configured to determine a third dom tree of the plurality of second dom trees, which is similar to or the same as the first dom tree in type.
The comparing unit 432 is configured to compare the depth of the first dom tree with the depth of the third dom tree to obtain a similarity between the current page and a corresponding page of the third dom tree.
By the embodiment of the disclosure, only the depths of the dom trees with the same or similar types are compared, so that the calculation amount can be reduced, and the detection speed can be improved.
Fig. 5B schematically illustrates a block diagram of a first determining unit according to an embodiment of the disclosure.
As shown in fig. 5B, the first determining unit 431 may include a first extracting sub-unit 4311, a second extracting sub-unit 4312, a first determining sub-unit 4313, and a second determining sub-unit 4314, wherein:
the first extraction subunit 4311 is configured to extract a fourth dom tree satisfying a preset depth from the first dom tree.
The second extracting subunit 4312 is configured to extract a fifth dom tree that meets a preset depth from each second dom tree, to obtain a plurality of fifth dom trees.
The first determining subunit 4313 is configured to determine a target dom tree, which is similar to or the same as the fourth dom tree, in the fifth dom trees.
The second determining subunit 4314 is configured to determine a second dom tree corresponding to the target dom tree as a third dom tree.
According to the embodiment of the disclosure, a similar or identical dom tree in the fifth dom trees and the fourth dom tree is used as the target dom tree, and the second dom tree corresponding to the target dom tree is determined as the third dom tree, so that only the depth of the first dom tree and the depth of the third dom tree can be compared, the calculation amount of a system is reduced, and the detection speed of the system is improved.
As an alternative embodiment, the predetermined page includes a plurality of predetermined pages; the depth of the second dom tree comprises a plurality of depths of the second dom tree; each predetermined page corresponds to the depth of a second dom tree; the comparison module is further configured to compare the depth of the first dom tree with the depth of each of the plurality of second dom trees to obtain similarity between the current page and the plurality of predetermined pages.
Through the embodiment of the disclosure, the first dom tree is compared with each second dom tree, omission or misjudgment can be prevented, and the detection rate of illegal pages is improved.
FIG. 5C schematically illustrates a block diagram of a judging module according to an embodiment of the disclosure.
As shown in fig. 5C, the judging module 440 may include a judging unit 441 and a second determining unit 442, wherein:
the judging unit 441 is used to judge whether the similarity is greater than a similarity threshold.
The second determining unit 442 is configured to determine that the current page is an illegal page if the similarity is greater than the similarity threshold.
Through the embodiment of the disclosure, under the condition that the similarity is greater than the similarity threshold value, the current page can be determined to be an illegal page, and an alarm can be sent or prompt information can be displayed through a display screen, so that management personnel can timely stop the illegal invasion of hackers.
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, any plurality of the first obtaining module 410, the second obtaining module 420, the comparing module 430, the judging module 440, the first determining unit 431, the comparing unit 432, the judging unit 441, the second determining unit 442, the first extracting sub-unit 4311, the second extracting sub-unit 4312, the first determining sub-unit 4313, and the second determining sub-unit 4314 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to the embodiment of the present disclosure, at least one of the first obtaining module 410, the second obtaining module 420, the comparing module 430, the judging module 440, the first determining unit 431, the comparing unit 432, the judging unit 441, the second determining unit 442, the first extracting sub-unit 4311, the second extracting sub-unit 4312, the first determining sub-unit 4313 and the second determining sub-unit 4314 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware such as any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware and firmware, or an appropriate combination of any of them. 
Alternatively, at least one of the first obtaining module 410, the second obtaining module 420, the comparing module 430, the judging module 440, the first determining unit 431, the comparing unit 432, the judging unit 441, the second determining unit 442, the first extracting sub-unit 4311, the second extracting sub-unit 4312, the first determining sub-unit 4313 and the second determining sub-unit 4314 may be at least partially implemented as a computer program module, which may perform corresponding functions when executed.
FIG. 6 schematically illustrates a block diagram of a computer system suitable for implementing an illegitimate page detection method according to an embodiment of the present disclosure. The computer system illustrated in FIG. 6 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure.
As shown in fig. 6, computer system 600 includes a processor 610 and a computer-readable storage medium 620. The computer system 600 may perform a method according to an embodiment of the disclosure.
In particular, the processor 610 may comprise, for example, a general purpose microprocessor, an instruction set processor and/or related chip set and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 610 may also include onboard memory for caching purposes. The processor 610 may be a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
Computer-readable storage medium 620 may be, for example, any medium that can contain, store, communicate, propagate, or transport the instructions. For example, a readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the readable storage medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.
The computer-readable storage medium 620 may include a computer program 621, which computer program 621 may include code/computer-executable instructions that, when executed by the processor 610, cause the processor 610 to perform a method according to an embodiment of the disclosure, or any variation thereof.
The computer program 621 may be configured with, for example, computer program code comprising computer program modules. For example, in an example embodiment, code in computer program 621 may include one or more program modules, including for example module 621A, module 621B, … …. It should be noted that the division and number of the modules are not fixed, and those skilled in the art may use suitable program modules or program module combinations according to actual situations, so that the processor 610 may execute the method according to the embodiment of the present disclosure or any variation thereof when the program modules are executed by the processor 610.
The present disclosure also provides a computer-readable medium, which may be embodied in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer readable medium carries one or more programs which, when executed, implement: acquiring first frame characteristic information of a current page; acquiring second frame characteristic information of a preset page; comparing the first frame characteristic information with the second frame characteristic information to obtain the similarity between the current page and the preset page; and judging whether the current page is an illegal page or not based on the similarity.
According to embodiments of the present disclosure, a computer readable medium may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, optical fiber cable, radio frequency signals, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that various combinations and/or associations of features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or associations are not expressly recited in the present disclosure. In particular, various combinations and/or associations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.
While the disclosure has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents. Accordingly, the scope of the present disclosure should not be limited to the above-described embodiments, but should be defined not only by the appended claims, but also by equivalents thereof.

Claims (10)

1. An illegal page detection method, comprising:
acquiring first frame characteristic information of a current page, wherein the first frame characteristic information of the current page comprises the depth of a first dom tree of the current page;
acquiring second frame characteristic information of a preset page, wherein the second frame characteristic information of the preset page comprises the depth of a second dom tree of the preset page, and the preset page is an illegal page collected in advance;
comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page; and
judging whether the current page is an illegal page or not based on the similarity;
wherein the predetermined page comprises a plurality of predetermined pages;
the depth of the second dom tree comprises a depth of a plurality of second dom trees;
each predetermined page corresponds to the depth of a second dom tree;
the step of comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page comprises the following steps:
determining a third dom tree of the plurality of second dom trees that is similar or identical in type to the first dom tree; and
comparing the depth of the first dom tree with the depth of the third dom tree to obtain the similarity between the current page and the corresponding page of the third dom tree.
2. The method of claim 1, wherein determining a third dom tree of the plurality of second dom trees that is of a similar or same type as the first dom tree comprises:
extracting a fourth dom tree meeting a preset depth from the first dom tree;
extracting a fifth dom tree meeting the preset depth from each second dom tree to obtain a plurality of fifth dom trees;
determining a target dom tree of the fifth plurality of dom trees that is similar or identical to the fourth dom tree; and
determining a second dom tree corresponding to the target dom tree as the third dom tree.
3. The method of claim 1, wherein:
the preset page comprises a plurality of preset pages;
the depth of the second dom tree comprises the depths of a plurality of second dom trees;
each preset page corresponds to the depth of one second dom tree;
the step of comparing the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page comprises:
comparing the depth of the first dom tree with the depth of each of the plurality of second dom trees to obtain the similarity between the current page and each of the plurality of preset pages.
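Claim 3 compares the current page's depth against every preset depth but does not give a similarity formula. The sketch below assumes one plausible choice, the ratio of the smaller depth to the larger, which yields 1.0 for equal depths and falls toward 0 as they diverge. The function names are illustrative.

```python
def depth_similarity(first_depth, second_depth):
    """Assumed measure: min/max ratio of two depths, in the range [0, 1]."""
    if first_depth <= 0 or second_depth <= 0:
        return 0.0
    return min(first_depth, second_depth) / max(first_depth, second_depth)


def similarities(first_depth, second_depths):
    """Compare the current page's depth with each preset page's depth."""
    return [depth_similarity(first_depth, depth) for depth in second_depths]
```

For example, `similarities(8, [8, 4, 16])` gives one score per preset page, with the first preset page scoring highest because its depth is equal.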
4. The method of claim 1, wherein the determining whether the current page is an illegal page based on the similarity comprises:
judging whether the similarity is greater than a similarity threshold value; and
determining that the current page is an illegal page if the similarity is greater than the similarity threshold value.
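The decision of claim 4 reduces to a strict comparison against a threshold. In the sketch below the default threshold of 0.9 is an arbitrary assumption (the claims give no value), and flagging the page when any preset page exceeds the threshold is likewise an assumed way of combining several similarities.

```python
def is_illegal_page(similarities, threshold=0.9):
    """Return True if any similarity is strictly greater than the threshold."""
    return any(score > threshold for score in similarities)
```

Note that a similarity exactly equal to the threshold does not trigger detection, which matches the "greater than" wording of the claim.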
5. An illegal page detection system, comprising:
a first obtaining module, configured to obtain first frame characteristic information of a current page, wherein the first frame characteristic information of the current page comprises the depth of a first dom tree of the current page;
a second obtaining module, configured to obtain second frame characteristic information of a preset page, wherein the second frame characteristic information of the preset page comprises the depth of a second dom tree of the preset page, and the preset page is an illegal page collected in advance;
a comparison module, configured to compare the depth of the first dom tree with the depth of the second dom tree to obtain the similarity between the current page and the preset page; and
a judging module, configured to judge whether the current page is an illegal page based on the similarity;
wherein the preset page comprises a plurality of preset pages;
the depth of the second dom tree comprises the depths of a plurality of second dom trees;
each preset page corresponds to the depth of one second dom tree;
the comparison module comprises:
a first determining unit, configured to determine, among the plurality of second dom trees, a third dom tree whose type is similar or identical to that of the first dom tree; and
a comparison unit, configured to compare the depth of the first dom tree with the depth of the third dom tree to obtain the similarity between the current page and the page corresponding to the third dom tree.
6. The system of claim 5, wherein the first determining unit comprises:
a first extraction subunit, configured to extract a fourth dom tree meeting a preset depth from the first dom tree;
a second extraction subunit, configured to extract a fifth dom tree meeting the preset depth from each second dom tree to obtain a plurality of fifth dom trees;
a first determining subunit, configured to determine a target dom tree, among the plurality of fifth dom trees, that is similar or identical to the fourth dom tree; and
a second determining subunit, configured to determine the second dom tree corresponding to the target dom tree as the third dom tree.
7. The system of claim 5, wherein:
the preset page comprises a plurality of preset pages;
the depth of the second dom tree comprises the depths of a plurality of second dom trees;
each preset page corresponds to the depth of one second dom tree;
the comparison module is further configured to compare the depth of the first dom tree with the depth of each of the plurality of second dom trees to obtain the similarity between the current page and each of the plurality of preset pages.
8. The system of claim 5, wherein the judging module comprises:
a judging unit, configured to judge whether the similarity is greater than a similarity threshold value; and
a second determining unit, configured to determine that the current page is an illegal page if the similarity is greater than the similarity threshold value.
9. A computer system, comprising:
one or more processors;
a computer-readable storage medium for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the illegal page detection method of any one of claims 1-4.
10. A computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to implement the illegal page detection method of any one of claims 1-4.
CN201810390940.4A 2018-04-27 2018-04-27 Illegal page detection method, system, computer system and readable storage medium Active CN108650250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810390940.4A CN108650250B (en) 2018-04-27 2018-04-27 Illegal page detection method, system, computer system and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810390940.4A CN108650250B (en) 2018-04-27 2018-04-27 Illegal page detection method, system, computer system and readable storage medium

Publications (2)

Publication Number Publication Date
CN108650250A (en) 2018-10-12
CN108650250B (en) 2021-07-23

Family

ID=63748251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810390940.4A Active CN108650250B (en) 2018-04-27 2018-04-27 Illegal page detection method, system, computer system and readable storage medium

Country Status (1)

Country Link
CN (1) CN108650250B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110191124B (en) * 2019-05-29 2022-02-22 安天科技集团股份有限公司 Web front-end development data-based website identification method and device and storage equipment
CN111597107B (en) * 2020-04-22 2023-04-28 北京字节跳动网络技术有限公司 Information output method and device and electronic equipment

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
CN101510887B (en) * 2009-03-27 2012-01-25 腾讯科技(深圳)有限公司 Method and device for identifying website
CN102129528B (en) * 2010-01-19 2013-05-15 北京启明星辰信息技术股份有限公司 WEB page tampering identification method and system
CN102316081A (en) * 2010-06-30 2012-01-11 北京启明星辰信息技术股份有限公司 Method and device for identifying similar webpage
JP5695586B2 (en) * 2012-02-24 2015-04-08 株式会社日立製作所 XML document search apparatus and program
US9509715B2 (en) * 2014-08-21 2016-11-29 Salesforce.Com, Inc. Phishing and threat detection and prevention
US9578048B1 (en) * 2015-09-16 2017-02-21 RiskIQ Inc. Identifying phishing websites using DOM characteristics
CN107204960B (en) * 2016-03-16 2020-11-24 阿里巴巴集团控股有限公司 Webpage identification method and device and server
CN107612908B (en) * 2017-09-15 2020-06-05 杭州安恒信息技术股份有限公司 Webpage tampering monitoring method and device

Non-Patent Citations (2)

Title
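"Sensitive Information Acquisition Based on Machine Learning"; Wenqian Shang; 2012 International Conference on Industrial Control and Electronics Engineering; 20121004; full text *
"Deep detection system for phishing web pages based on ensemble learning" (基于集成学习的钓鱼网页深度检测系统); Feng Qing et al.; Computer Systems & Applications (计算机系统应用); 20161015 (No. 10); full text *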

Also Published As

Publication number Publication date
CN108650250A (en) 2018-10-12

Similar Documents

Publication Publication Date Title
TWI706273B (en) Uniform resource locator (URL) attack detection method, device and electronic equipment
EP3561708B1 (en) Method and device for classifying uniform resource locators based on content in corresponding websites
US8949981B1 (en) Techniques for providing protection against unsafe links on a social networking website
CN109446819B (en) Unauthorized vulnerability detection method and device
US9208235B1 (en) Systems and methods for profiling web applications
CN106561025B (en) For providing the system and method for computer network security
US11704373B2 (en) Methods and systems for generating custom content using universal deep linking across web and mobile applications
US20130246943A1 (en) Central Logout from Multiple Websites
US11106754B1 (en) Methods and systems for hyperlinking user-specific content on a website or mobile applications
CN108268635B (en) Method and apparatus for acquiring data
CN110417718B (en) Method, device, equipment and storage medium for processing risk data in website
CN107508809B (en) Method and device for identifying website type
CN108650250B (en) Illegal page detection method, system, computer system and readable storage medium
WO2018085499A1 (en) Techniques for classifying a web page based upon functions used to render the web page
US9923896B2 (en) Providing access to a restricted resource via a persistent authenticated device network
CN109729095B (en) Data processing method, data processing device, computing equipment and media
US20190132356A1 (en) Systems and Methods to Detect and Notify Victims of Phishing Activities
CN111371778B (en) Attack group identification method, device, computing equipment and medium
CN107992738A (en) A kind of account logs in method for detecting abnormality, device and electronic equipment
US9972043B2 (en) Credibility enhancement for online comments and recommendations
US10042825B2 (en) Detection and elimination for inapplicable hyperlinks
US10621337B1 (en) Application-to-application device ID sharing
US11005877B2 (en) Persistent cross-site scripting vulnerability detection
CN104573486B (en) leak detection method and device
US9727726B1 (en) Intrusion detection using bus snooping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100088 Building 3 332, 102, 28 Xinjiekouwai Street, Xicheng District, Beijing

Applicant after: Qianxin Technology Group Co., Ltd.

Address before: 100016 15, 17 floor 1701-26, 3 building, 10 Jiuxianqiao Road, Chaoyang District, Beijing.

Applicant before: BEIJING QI'ANXIN SCIENCE & TECHNOLOGY CO., LTD.

GR01 Patent grant