CN113014601B - Communication detection method, device, equipment and medium - Google Patents

Publication number
CN113014601B
Authority
CN
China
Prior art keywords
communication
source addresses
access
resource identifier
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110325650.3A
Other languages
Chinese (zh)
Other versions
CN113014601A (en
Inventor
周运金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd filed Critical Sangfor Technologies Co Ltd
Priority to CN202110325650.3A priority Critical patent/CN113014601B/en
Publication of CN113014601A publication Critical patent/CN113014601A/en
Application granted granted Critical
Publication of CN113014601B publication Critical patent/CN113014601B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H04L63/1441 Countermeasures against malicious traffic

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The embodiments of the application disclose a communication detection method, device, equipment and medium. Initial log data acquired within a preset time is classified by resource identifier to obtain the log data corresponding to each resource identifier. The log data of every resource identifier is analyzed in the same way. Taking any one of the resource identifiers, namely the target resource identifier, as an example, statistics are performed on the log data of the target resource identifier to obtain behavior characteristic information. The behavior characteristic information may include resource information, based on which it is determined whether the communication corresponding to the target resource identifier is webshell communication. By analyzing the access behavior in the log data, webshell communication is detected effectively.

Description

Communication detection method, device, equipment and medium
Technical Field
The present application relates to the field of security defense technologies, and in particular, to a communication detection method, apparatus, device, and computer readable storage medium.
Background
A webshell is a script-based attack tool used for web intrusion. ASP (Active Server Pages) is a server-side scripting technology that can be used to create dynamic, interactive web pages and to build powerful web applications. PHP (Hypertext Preprocessor) is a general-purpose open source scripting language that is well suited to web development. After intruding into a website, a hacker mixes an ASP or PHP backdoor file with the normal web page files in the web directory of the website server, and can then access the ASP or PHP backdoor with a browser to obtain a command execution environment and thereby control the website server.
Traditionally, webshell communication is detected by analyzing the communication content. In attack and defense confrontations, webshell communication increasingly tends to be encrypted, and webshell tools that encrypt their communication, such as Behinder ("Ice Scorpion") and Godzilla, have become the first choice of attackers. Conventional methods that detect webshell communication based on communication content cannot decrypt the content and therefore cannot detect encrypted webshell communication.
It can be seen that how to effectively detect encrypted webshell communications is a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
An object of the embodiments of the present application is to provide a communication detection method, apparatus, device, and computer readable storage medium, which can implement effective detection on encrypted webshell communication.
In order to solve the above technical problems, an embodiment of the present application provides a communication detection method, including:
classifying the initial log data acquired in the preset time according to the resource identifiers to obtain log data corresponding to each resource identifier;
counting the log data of the target resource identifier to obtain behavior characteristic information; wherein the target resource identifier is any one of all the resource identifiers; the behavior characteristic information comprises resource information;
and determining whether the communication corresponding to the target resource identifier is webshell communication based on the behavior characteristic information.
Optionally, the behavior feature information further includes a type and number of access source addresses; the type of the access source address comprises an intranet source address and an extranet source address;
the determining whether the communication corresponding to the target resource identifier is webshell communication based on the behavior feature information includes:
and determining whether the communication corresponding to the target resource identifier is webshell communication or not based on the resource information, the type and the number of the access source addresses.
Optionally, the determining whether the communication corresponding to the target resource identifier is webshell communication based on the resource information, the type and the number of the access source addresses includes:
judging whether all the access source addresses are external network source addresses under the condition that the resource information meets the preset resource loading condition;
if all the access source addresses are external network source addresses, judging that the communication corresponding to the target resource identifier is webshell communication;
if not all the access source addresses are external network source addresses, judging whether the number of intranet source addresses among all the access source addresses is smaller than a preset threshold value;
if the number of intranet source addresses among all the access source addresses is smaller than the preset threshold value, determining that the communication corresponding to the target resource identifier is webshell communication.
Optionally, the behavior feature information further comprises page information and access information;
the determining whether the communication corresponding to the target resource identifier is webshell communication based on the behavior feature information includes:
according to the corresponding relation between the preset behavior characteristic information and the weight score, calculating to obtain a target score corresponding to the target resource identifier;
and if the target score meets a preset score condition, judging that the communication corresponding to the target resource identifier is webshell communication.
Optionally, the page information includes the number of page jumps and the page in-degree count; the access information comprises the IP access amount; the resource information includes the static resource loading amount.
Optionally, the calculating, according to the preset correspondence between the behavior feature information and the weight score, the target score corresponding to the target resource identifier includes:
querying a first score corresponding to the number of page jumps from a pre-established jump count score list;
querying a second score corresponding to the page in-degree count from a pre-established in-degree count score list;
inquiring a third score corresponding to the IP access amount from a pre-established access amount score list;
querying a fourth score corresponding to the static resource loading amount from a pre-established resource loading score list;
and determining a target score corresponding to the target resource identifier based on the first score, the second score, the third score and the fourth score.
Optionally, the method further comprises:
and deleting the file corresponding to the target resource identifier under the condition that the communication corresponding to the target resource identifier is determined to be webshell communication.
The embodiment of the application also provides a communication detection method, which comprises the following steps:
counting the type and the number of the access source addresses that access the target resource identifier; the type of the access source address comprises an intranet source address and an extranet source address;
if the type and the number of the access source addresses meet a preset access source distribution rule of webshell communication, judging that the communication corresponding to the target resource identifier is webshell communication.
Optionally, if the type and the number of the access source addresses meet a preset access source distribution rule of webshell communication, determining that the communication corresponding to the target resource identifier is webshell communication includes:
judging whether all the access source addresses are external network source addresses;
if all the access source addresses are external network source addresses, judging that the communication corresponding to the target resource identifier is webshell communication;
if not all the access source addresses are external network source addresses, judging whether the number of intranet source addresses among all the access source addresses is smaller than a preset threshold value;
if the number of intranet source addresses among all the access source addresses is smaller than the preset threshold value, determining that the communication corresponding to the target resource identifier is webshell communication.
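The access source distribution rule described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function names, the use of the RFC 1918 private ranges to stand in for "intranet source addresses", the threshold value of 2, and the treatment of a resource with no recorded accesses are all assumptions.

```python
import ipaddress

# Assumption: "intranet" means the RFC 1918 private IPv4 ranges.
INTRANET_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_intranet(address):
    """Return True if the address lies in one of the assumed intranet ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in INTRANET_NETWORKS)

def matches_webshell_source_rule(source_addresses, intranet_threshold=2):
    """Apply the two-step rule to the source addresses of one resource identifier.

    intranet_threshold is a hypothetical preset threshold.
    """
    sources = set(source_addresses)
    if not sources:
        return False  # assumption: nothing accessed the resource, nothing to judge
    intranet_count = sum(1 for addr in sources if is_intranet(addr))
    if intranet_count == 0:
        return True  # all access source addresses are external
    return intranet_count < intranet_threshold

print(matches_webshell_source_rule(["203.0.113.7"]))
print(matches_webshell_source_rule(["10.0.0.5", "10.0.0.6", "203.0.113.7"]))
```

The explicit network list is used instead of a generic "private address" check so that the boundary between intranet and extranet stays under the operator's control.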
The embodiment of the application also provides a communication detection device, which comprises a classification unit, a statistics unit and a determination unit;
the classifying unit is used for classifying the initial log data acquired in the preset time according to the resource identifiers to obtain the log data corresponding to each resource identifier;
the statistics unit is used for carrying out statistics on the log data of the target resource identifier to obtain behavior characteristic information; wherein the target resource identifier is any one of all the resource identifiers; the behavior characteristic information comprises resource information;
the determining unit is configured to determine, based on the behavior feature information, whether the communication corresponding to the target resource identifier is webshell communication.
Optionally, the behavior feature information further includes a type and number of access source addresses; the type of the access source address comprises an intranet source address and an extranet source address;
the determining unit is configured to determine whether the communication corresponding to the target resource identifier is webshell communication based on the resource information, the type and the number of the access source addresses.
Optionally, the determining unit includes a first judging subunit, a judging subunit and a second judging subunit;
the first judging subunit is configured to judge whether all the access source addresses are external network source addresses if the resource information meets a preset resource loading condition;
the judging subunit is configured to judge that the communication corresponding to the target resource identifier is webshell communication if all the access source addresses are external network source addresses;
the second judging subunit is configured to judge whether the number of intranet source addresses among all the access source addresses is less than a preset threshold if not all the access source addresses are external network source addresses;
and the judging subunit is further configured to judge that the communication corresponding to the target resource identifier is webshell communication if the number of intranet source addresses among all the access source addresses is smaller than the preset threshold.
Optionally, the behavior feature information further comprises page information and access information; the determining unit comprises a calculating subunit and a judging subunit;
the calculating subunit is used for calculating and obtaining a target score corresponding to the target resource identifier according to a preset corresponding relation between behavior characteristic information and the weight score;
and the judging subunit is configured to judge that the communication corresponding to the target resource identifier is webshell communication if the target score meets a preset score condition.
Optionally, the page information includes the number of page jumps and the page in-degree count; the access information comprises the IP access amount; the resource information includes the static resource loading amount.
Optionally, the calculating subunit is configured to query a first score corresponding to the number of page jumps from a pre-established jump count score list; query a second score corresponding to the page in-degree count from a pre-established in-degree count score list; query a third score corresponding to the IP access amount from a pre-established access amount score list; query a fourth score corresponding to the static resource loading amount from a pre-established resource loading score list; and determine a target score corresponding to the target resource identifier based on the first score, the second score, the third score and the fourth score.
Optionally, the device further comprises a deleting unit;
the deleting unit is configured to delete a file corresponding to the target resource identifier when it is determined that the communication corresponding to the target resource identifier is webshell communication.
The embodiment of the application also provides a communication detection device, which comprises a statistics unit and an analysis unit;
the statistics unit is used for counting the type and the number of the access source addresses that access the target resource identifier; the type of the access source address comprises an intranet source address and an extranet source address;
the analysis unit is configured to determine that the communication corresponding to the target resource identifier is webshell communication if the type and the number of the access source addresses meet a preset access source distribution rule of webshell communication.
Optionally, the analysis unit includes a first judgment subunit, a judgment subunit, and a second judgment subunit:
the first judging subunit is configured to judge whether all the access source addresses are external network source addresses;
the judging subunit is configured to judge that the communication corresponding to the target resource identifier is webshell communication if all the access source addresses are external network source addresses;
the second judging subunit is configured to judge whether the number of intranet source addresses among all the access source addresses is less than a preset threshold if not all the access source addresses are external network source addresses;
and the judging subunit is configured to judge that the communication corresponding to the target resource identifier is webshell communication if the number of intranet source addresses in all the access source addresses is less than a preset threshold.
The embodiment of the application also provides a communication detection device, which comprises:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the communication detection method as described in any one of the above.
Embodiments of the present application also provide a computer readable storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the communication detection method according to any of the above.
According to the above technical scheme, the initial log data acquired within the preset time is classified by resource identifier to obtain the log data corresponding to each resource identifier. Each webshell corresponds to a unique resource identifier, so classifying the log data by resource identifier gathers the log data of the same resource identifier together and allows the access behavior contained in the log data of each resource identifier to be analyzed. The log data of every resource identifier is analyzed in the same way. Taking any one of the resource identifiers, namely the target resource identifier, as an example, statistics are performed on the log data of the target resource identifier to obtain behavior characteristic information. The behavior characteristic information may include resource information, based on which it is determined whether the communication corresponding to the target resource identifier is webshell communication. This technical scheme bypasses analysis of the encrypted communication content: by analyzing the access behavior in the log data, it effectively detects encrypted webshell communication, and it can also effectively detect unencrypted webshell communication.
Drawings
For a clearer description of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a communication detection method provided in an embodiment of the present application;
Fig. 2 is a flowchart of a webshell communication dual detection method provided in an embodiment of the present application;
Fig. 3 is a flowchart of another communication detection method according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a communication detection device according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of another communication detection device according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a communication detection device according to an embodiment of the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments herein without making any inventive effort are intended to fall within the scope of the present application.
To help those skilled in the art better understand the present application, it is described in further detail below with reference to the drawings and the detailed description.
Next, a communication detection method provided in the embodiments of the present application will be described in detail. Fig. 1 is a flowchart of a communication detection method provided in an embodiment of the present application, where the method includes:
S101: classifying the initial log data acquired in the preset time according to the resource identifiers to obtain the log data corresponding to each resource identifier.
The value of the preset time can be set according to actual requirements and is not limited herein. For example, the preset time may be one day or one week.
The communication detection method provided by the embodiment of the application is suitable for various IT assets or devices, such as servers, terminal devices, network devices and the like. For ease of description, servers are used as examples.
An attacker accesses the server through the extranet equipment, establishes webshell communication with the server, and writes webshell files into a web service directory of the server in a file uploading or command writing mode.
Each webshell corresponds to a unique resource identifier. Accesses made by other devices to the server also have corresponding resource identifiers, so the initial log data recorded in the server often contains log data corresponding to a plurality of resource identifiers.
When a server records log data in time series, there is often a case where log data of a plurality of resource identifiers are recorded together. In the embodiment of the present application, in order to distinguish the access behaviors of different devices to a server, the initial log data may be classified according to the resource identifiers, so as to obtain log data corresponding to each resource identifier.
The initial log data refers to log data recorded by the server. The Hypertext Transfer Protocol (HTTP) is a network transport protocol that is widely used on the internet. Taking HTTP as an example, the log data recorded by the server is an HTTP log. In practical application, after the server reads the HTTP log data, it may group the log data by the resource identifiers it contains, so that the log data of the same resource identifier is gathered together and the access behavior contained in the log data of each resource identifier can be analyzed.
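Step S101 can be illustrated with a short sketch. It assumes that each HTTP log record has already been parsed into a dictionary, and that the resource identifier is the URL path with the query string removed; the field names `src_ip`, `url` and `referer` and the function name are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def classify_by_resource(records):
    """Group parsed log records by resource identifier (assumed: the URL path)."""
    grouped = defaultdict(list)
    for record in records:
        resource_id = urlsplit(record["url"]).path
        grouped[resource_id].append(record)
    return dict(grouped)

# Hypothetical records collected within the preset time.
logs = [
    {"src_ip": "203.0.113.7", "url": "/upload/x.php?cmd=1", "referer": ""},
    {"src_ip": "10.0.0.5", "url": "/index.php", "referer": ""},
    {"src_ip": "203.0.113.7", "url": "/upload/x.php", "referer": ""},
]
by_resource = classify_by_resource(logs)
for resource_id, entries in sorted(by_resource.items()):
    print(resource_id, len(entries))
```

Dropping the query string keeps repeated requests to the same script under one identifier even when the parameters differ.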
S102: and counting the log data of the target resource identifier to obtain behavior characteristic information.
The log data of every resource identifier is analyzed in the same way. The following description takes any one of the resource identifiers, namely the target resource identifier, as an example; the analysis of the log data of the other resource identifiers follows the analysis of the log data of the target resource identifier and is not repeated herein.
Compared with conventional, legitimate communication, the access behavior of webshell communication is isolated, and the log data contains information that reflects the access behavior of visitors.
S103: and determining whether the communication corresponding to the target resource identifier is webshell communication or not based on the behavior characteristic information.
Given the isolated distribution that characterizes webshell communication, the behavior characteristic information may include resource information.
The resource information indicates the number of files loaded. A typical webshell consists of a single file that already contains its own Cascading Style Sheets (CSS) and JavaScript (JS, an object-based, event-driven scripting language with security features), so it does not need to load other static resources. If the current communication is webshell communication, the static resource loading amount is therefore almost zero. Thus, in practical applications, the resource information may include the static resource loading amount.
In the embodiment of the present application, it may be determined whether the resource information satisfies a preset resource loading condition, where the resource loading condition may be that the static resource loading amount is smaller than or equal to a preset resource value. The specific value of the preset resource value can be set according to actual requirements, can be set to 0, and can also be set to a numerical value slightly larger than 0, and is not limited herein.
When the resource information satisfies a preset resource loading condition, that is, when the static resource loading amount of the resource information is smaller than or equal to a preset resource value, it may be determined that the communication corresponding to the target resource identifier is webshell communication.
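The resource loading condition can be sketched as below. The text does not state how the static resource loading amount is derived from the log, so this sketch assumes it is the number of requests for static files whose Referer header points at the target page; the extension list, field names and function names are likewise assumptions.

```python
from urllib.parse import urlsplit

# Assumed set of static resource extensions.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".ico")

def static_resource_load(target, records):
    """Count static file requests whose Referer path is the target resource."""
    load = 0
    for record in records:
        path = urlsplit(record["url"]).path.lower()
        referer_path = urlsplit(record.get("referer", "")).path
        if referer_path == target and path.endswith(STATIC_EXTENSIONS):
            load += 1
    return load

def meets_resource_loading_condition(load, preset_resource_value=0):
    """The condition from the text: load is at most the preset resource value."""
    return load <= preset_resource_value

records = [
    {"url": "/index.php", "referer": ""},
    {"url": "/static/site.css", "referer": "http://site.example/index.php"},
    {"url": "/static/app.js", "referer": "http://site.example/index.php"},
    {"url": "/upload/x.php", "referer": ""},
]
for target in ("/index.php", "/upload/x.php"):
    load = static_resource_load(target, records)
    print(target, load, meets_resource_loading_condition(load))
```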
According to the above technical scheme, the initial log data acquired within the preset time is classified by resource identifier to obtain the log data corresponding to each resource identifier. Each webshell corresponds to a unique resource identifier, so classifying the log data by resource identifier gathers the log data of the same resource identifier together and allows the access behavior contained in the log data of each resource identifier to be analyzed. The log data of every resource identifier is analyzed in the same way. Taking any one of the resource identifiers, namely the target resource identifier, as an example, statistics are performed on the log data of the target resource identifier to obtain behavior characteristic information. The behavior characteristic information may include resource information, based on which it is determined whether the communication corresponding to the target resource identifier is webshell communication. This technical scheme bypasses analysis of the encrypted communication content: by analyzing the access behavior in the log data, it effectively detects encrypted webshell communication, and it can also effectively detect unencrypted webshell communication.
In practical application, the behavior characteristic information can be various, and the behavior characteristic information can also comprise page information and access information besides resource information.
The page information covers jumps between pages, how pages are reached, and the like. Thus, in practical applications, the page information may include the number of page jumps and the page in-degree count.
The number of page jumps refers to the number of times a visitor jumps from the current page to another page. Because a webshell file is isolated, jumps from it to other pages hardly ever occur, so if the current communication is webshell communication, the number of page jumps is almost zero.
The page in-degree count refers to how many pages lead into the current page. Because a webshell file is isolated, if the current communication is webshell communication, the page in-degree count is almost zero.
The access information indicates how the target resource identifier is accessed. Since a webshell file is generally accessed only by the attacker, if the current communication is webshell communication, the IP access amount is small. The IP access amount refers to the number of source IP addresses that access the target resource identifier. Thus, in practical applications, the access information may include the IP access amount.
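The three statistics above can be derived from parsed HTTP log records roughly as follows. The text does not specify the data source for page jumps and in-degree, so this sketch assumes the Referer header is used: a request to another URL whose Referer is the target counts as a jump, and each distinct Referer page on a request to the target counts toward the in-degree. The field and function names are hypothetical, and for brevity the sketch does not separate page requests from static resource requests.

```python
from urllib.parse import urlsplit

def page_and_access_features(target, records):
    """Compute page jumps, page in-degree and IP access amount for one target."""
    jumps = 0
    referring_pages = set()
    source_ips = set()
    for record in records:
        path = urlsplit(record["url"]).path
        referer_path = urlsplit(record.get("referer", "")).path
        if path == target:
            source_ips.add(record["src_ip"])
            if referer_path and referer_path != target:
                referring_pages.add(referer_path)  # another page leads in
        elif referer_path == target:
            jumps += 1  # the visitor left the target for another URL
    return {
        "page_jumps": jumps,
        "page_in_degree": len(referring_pages),
        "ip_access": len(source_ips),
    }

records = [
    {"src_ip": "198.51.100.4", "url": "/index.php", "referer": ""},
    {"src_ip": "198.51.100.9", "url": "/news.php",
     "referer": "http://site.example/index.php"},
    {"src_ip": "198.51.100.9", "url": "/index.php",
     "referer": "http://site.example/news.php"},
    {"src_ip": "203.0.113.7", "url": "/upload/x.php", "referer": ""},
    {"src_ip": "203.0.113.7", "url": "/upload/x.php",
     "referer": "http://site.example/upload/x.php"},
]
print(page_and_access_features("/index.php", records))
print(page_and_access_features("/upload/x.php", records))
```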
In order to comprehensively consider the influence of all behavior characteristic information on the access behavior, a target score corresponding to the target resource identifier can be calculated according to a preset corresponding relation between the behavior characteristic information and the weight score. And if the target score meets the preset score condition, judging that the communication corresponding to the target resource identifier is webshell communication.
Taking the above four types of behavior characteristic information, namely the number of page jumps, the page in-degree count, the IP access amount and the static resource loading amount, as an example, in practical application a corresponding score list may be set for each type of behavior characteristic information.
For the number of page jumps, a jump count score list may be pre-established, which may include the score values corresponding to different values of the number of page jumps. Since the number of page jumps is almost zero if the current communication is webshell communication, the score value may be set highest when the number of page jumps is zero, may decrease as the number of page jumps increases, and may be set to zero once the number of page jumps exceeds a set page jump threshold.
For the page in-degree count, an in-degree count score list may be pre-established, which may include the score values corresponding to different values of the page in-degree count. Since the page in-degree count is almost zero if the current communication is webshell communication, the score value may be set highest when the page in-degree count is zero, may decrease as the page in-degree count increases, and may be set to zero once the page in-degree count exceeds a set page in-degree threshold.
For the IP access amount, an access amount score list may be pre-established, which may include the score values corresponding to different values of the IP access amount. Since the IP access amount is low if the current communication is webshell communication, the score value may be set highest when the IP access amount is 1, may decrease as the IP access amount increases, and may be set to zero once the IP access amount exceeds a set access threshold.
For the static resource loading amount, a resource loading score list may be pre-established, which may include the score values corresponding to different values of the static resource loading amount. Since the static resource loading amount is almost zero if the current communication is webshell communication, the score value may be set highest when the static resource loading amount is zero, may decrease as the static resource loading amount increases, and may be set to zero once the static resource loading amount exceeds a set threshold.
When calculating the target score corresponding to the target resource identifier, the standards for quantifying different types of behavior feature information differ, so in the embodiment of the application a separate score list may be set for each type of behavior feature information.
Taking behavior feature information that includes the page jump count, the page in-degree count, the IP access amount, and the static resource loading amount as an example, in practical applications a first score corresponding to the page jump count may be queried from the pre-established jump count score list; a second score corresponding to the page in-degree count may be queried from the pre-established in-degree count score list; a third score corresponding to the IP access amount may be queried from the pre-established access amount score list; a fourth score corresponding to the static resource loading amount may be queried from the pre-established resource loading score list; and the target score corresponding to the target resource identifier may be determined based on the first score, the second score, the third score, and the fourth score.
In practical applications, the sum of the first score, the second score, the third score and the fourth score may be used as the target score corresponding to the target resource identifier.
Considering that different types of behavior feature information influence the assessment of the access behavior to different degrees, the score values may be set differently when the score list for each type of behavior feature information is established. Assuming that, in descending order of influence on the access behavior, the types are the page in-degree count, the IP access amount, the page jump count, and the static resource loading amount, then on a ten-point scale the highest scores of the four types of behavior feature information may be set to 4, 3, 2, and 1, respectively.
Establishing a corresponding score list for each type of behavior feature information makes it possible to quantify every type of behavior feature information, so that the influence of all the behavior feature information on the access behavior can be considered comprehensively. In addition, the values in each score list can be set according to the degree to which the corresponding behavior feature information influences the access behavior, so that the quantification better fits the actual analysis requirements.
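The score lists described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the feature keys, the linear decay between the highest score and zero, and the threshold values are all hypothetical, since the text only requires that the score decrease as the value grows and become zero past the threshold. The highest scores 4, 3, 2, and 1 follow the ten-point example above.

```python
from typing import Dict, List


def build_score_list(max_score: float, best_value: int, threshold: int) -> List[float]:
    """Return a list whose index is a feature value and whose entry is its score.

    Assumption: the score decays linearly from max_score (at best_value)
    and is still positive at the threshold; larger values score zero.
    """
    span = threshold - best_value + 1
    scores = []
    for value in range(threshold + 1):
        if value <= best_value:
            scores.append(max_score)
        else:
            scores.append(max_score * (span - (value - best_value)) / span)
    return scores


def lookup_score(score_list: List[float], value: int) -> float:
    """Query a score list; values beyond the threshold score zero."""
    return score_list[value] if 0 <= value < len(score_list) else 0.0


# Hypothetical thresholds; the highest scores follow the 4/3/2/1 example.
SCORE_LISTS: Dict[str, List[float]] = {
    "page_in_degree": build_score_list(4.0, best_value=0, threshold=3),
    "ip_access": build_score_list(3.0, best_value=1, threshold=5),
    "page_jumps": build_score_list(2.0, best_value=0, threshold=3),
    "static_loads": build_score_list(1.0, best_value=0, threshold=3),
}


def target_score(features: Dict[str, int]) -> float:
    """Sum the four per-feature scores for one resource identifier."""
    return sum(lookup_score(SCORE_LISTS[name], features[name]) for name in SCORE_LISTS)
```

With these assumed lists, a resource identifier whose features all take their most webshell-like values reaches the full ten points, and one whose features all exceed their thresholds scores zero.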
If the target score meets the preset score condition, the communication corresponding to the target resource identifier can be judged to be webshell communication.
Taking as an example the case where a higher score indicates that the access behavior better matches the isolated nature of webshell communication, the preset score condition may be that the target score is greater than or equal to a first preset score; if so, the communication corresponding to the target resource identifier is determined to be webshell communication.
Taking as an example the case where a lower score indicates that the access behavior better matches the isolated nature of webshell communication, the preset score condition may be that the target score is smaller than a second preset score; if so, the communication corresponding to the target resource identifier is determined to be webshell communication.
By collecting behavior feature information at multiple levels, such as the page jump count, the page in-degree count, the IP access amount, and the static resource loading amount, the behavior features corresponding to a resource identifier can be reflected more comprehensively, which improves the accuracy of webshell communication identification.
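The two forms of the preset score condition can be sketched as a single check. The function and parameter names are hypothetical; the comparison operators follow the text (greater than or equal to the first preset score, or strictly smaller than the second preset score).

```python
def is_webshell_by_score(target_score: float,
                         higher_is_suspicious: bool,
                         preset_score: float) -> bool:
    """Apply the preset score condition to a target score.

    higher_is_suspicious=True: scores rise as behavior looks more isolated,
    so the condition is target_score >= preset_score (first preset score).
    higher_is_suspicious=False: scores fall as behavior looks more isolated,
    so the condition is target_score < preset_score (second preset score).
    """
    if higher_is_suspicious:
        return target_score >= preset_score
    return target_score < preset_score
```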
In the above description, effective detection of webshell communication is realized by analyzing the access behavior in the log data. In the embodiment of the present application, in order to further improve the accuracy of the detection result, access source analysis may be added on the basis of the above access behavior analysis, where the access source analysis may include analysis of the type and number of access source addresses. Fig. 2 is a flowchart of a webshell communication dual detection method provided in an embodiment of the present application, where the method includes:
S201: Classifying the initial log data acquired within the preset time according to the resource identifiers to obtain the log data corresponding to each resource identifier.
S202: Performing statistics on the log data of the target resource identifier to obtain behavior feature information.
For the implementation of S201-S202, refer to the description of S101-S102 above, which is not repeated here.
S203: Determining whether the communication corresponding to the target resource identifier is webshell communication based on the resource information and the type and number of the access source addresses.
The access source analysis of webshell communication may be to analyze the type of access source address and the number of access source addresses of each type.
The type of the access source address comprises an intranet source address and an extranet source address.
In practical applications, an attacker generally attacks from an extranet server, so the access source addresses are extranet source addresses. In addition, the attacker may already have entered the intranet and compromised a first springboard host, in which case the access source addresses include intranet source addresses in addition to extranet source addresses.
In the embodiment of the application, when the resource information meets the preset resource loading condition, it may be judged whether all access source addresses are extranet source addresses. If all access source addresses are extranet source addresses, the communication corresponding to the target resource identifier is determined to be webshell communication. If not all access source addresses are extranet source addresses, the access source addresses include both extranet source addresses and intranet source addresses, and it may further be judged whether the number of intranet source addresses among all access source addresses is smaller than a preset threshold; if it is, the communication corresponding to the target resource identifier is determined to be webshell communication.
The value of the preset threshold may be set according to the number of intranet source addresses existing during webshell communication, and the preset threshold is generally set to 1.
In the embodiment of the application, access source analysis is added on the basis of access behavior analysis, so that dual detection of webshell communication can be realized, false alarms are reduced, and the accuracy of webshell communication detection is further improved.
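The dual detection of S203 can be sketched as a resource check followed by the access source check. This is a minimal sketch: the names are hypothetical, and the form of the preset resource loading condition (a static resource loading amount no greater than a limit) is an assumption, since this part of the description does not spell it out.

```python
def dual_detect(static_load: int,
                extranet_count: int,
                intranet_count: int,
                intranet_threshold: int,
                max_static_load: int = 0) -> bool:
    """Return True if the communication is judged to be webshell communication.

    Assumption: the resource loading condition is static_load <= max_static_load.
    """
    # Step 1: the resource information must meet the resource loading condition.
    if static_load > max_static_load:
        return False
    # No access source addresses at all: nothing to judge (assumption).
    if extranet_count + intranet_count == 0:
        return False
    # Step 2: all access source addresses are extranet source addresses.
    if intranet_count == 0:
        return True
    # Step 3: otherwise compare the intranet address count with the threshold.
    return intranet_count < intranet_threshold
```

Note that because step 3 uses a strict "smaller than" comparison, a threshold of 1 never matches once any intranet source address is present; in this sketch, a deployment that wants to tolerate a single springboard host would pass a threshold of 2.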
In the embodiment of the present application, after determining that the communication corresponding to the target resource identifier is webshell communication, in order to reduce the influence of webshell communication on the server, the file corresponding to the target resource identifier may be deleted.
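A deletion step of this kind might look like the sketch below. It assumes, purely for illustration, that the resource identifier is a URL path that maps directly to a file under the web service directory; the function name and that mapping are hypothetical. The path check refuses identifiers that would resolve outside the web root.

```python
from pathlib import Path


def delete_webshell_file(web_root: Path, resource_id: str) -> bool:
    """Delete the file that a resource identifier maps to; return True on success.

    Assumption: resource_id is a URL path relative to web_root.
    """
    root = web_root.resolve()
    target = (root / resource_id.lstrip("/")).resolve()
    # Refuse anything that escapes the web root (e.g. "../" sequences).
    if root not in target.parents:
        return False
    if not target.is_file():
        return False
    target.unlink()
    return True
```

In practice an operator may prefer to quarantine the file rather than delete it outright; that choice is outside what the description specifies.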
Fig. 3 is a flowchart of another webshell communication dual detection method provided in an embodiment of the present application, where the method includes:
S301: Counting the type and number of the access source addresses that access the target resource identifier.
The communication detection method provided by the embodiment of the application is suitable for various IT assets or devices, such as servers, terminal devices, network devices and the like. For ease of description, servers are used as examples.
An attacker accesses the server through the extranet equipment, establishes webshell communication with the server, and writes webshell files into a web service directory of the server in a file uploading or command writing mode.
Each webshell has a uniquely corresponding resource identifier. The analysis process of the access source is the same for every resource identifier, so the description below takes any one of the resource identifiers, namely the target resource identifier, as an example; for the other resource identifiers, refer to the access source analysis of the target resource identifier, which is not repeated here.
In embodiments of the present application, the analysis of the access source may include an analysis of the type and number of access source addresses.
The type of the access source address may include an intranet source address and an extranet source address.
S302: if the type and the number of the access source addresses meet the access source distribution rule of the preset webshell communication, determining that the communication corresponding to the target resource identifier is webshell communication.
The access source distribution rule of webshell communication may be a constraint on the type of access source address and the number of access source addresses of each type.
In practical applications, an attacker generally attacks from an extranet server, so the access source addresses are extranet source addresses. In addition, the attacker may already have entered the intranet and compromised a first springboard host, in which case the access source addresses include intranet source addresses in addition to extranet source addresses.
In the embodiment of the application, it may be judged whether all access source addresses are extranet source addresses. If all access source addresses are extranet source addresses, the communication corresponding to the target resource identifier is determined to be webshell communication. If not all access source addresses are extranet source addresses, the access source addresses include both extranet source addresses and intranet source addresses, and it may further be judged whether the number of intranet source addresses among all access source addresses is smaller than a preset threshold; if it is, the communication corresponding to the target resource identifier is determined to be webshell communication.
The value of the preset threshold may be set according to the number of intranet source addresses existing during webshell communication, and the preset threshold is generally set to 1.
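Steps S301 and S302 can be sketched starting from raw source addresses. This is a minimal sketch with hypothetical function names. Treating an address as an intranet source address when Python's `ipaddress` module reports it as private is an assumption made for illustration; a real deployment would more likely compare against its own intranet ranges.

```python
import ipaddress
from typing import Iterable, Tuple


def count_source_types(addresses: Iterable[str]) -> Tuple[int, int]:
    """S301: count distinct intranet and extranet source addresses.

    Assumption: "intranet" means ipaddress reports the address as private.
    """
    distinct = set(addresses)
    intranet = sum(1 for a in distinct if ipaddress.ip_address(a).is_private)
    return intranet, len(distinct) - intranet


def matches_webshell_source_rule(addresses: Iterable[str],
                                 intranet_threshold: int) -> bool:
    """S302: apply the access source distribution rule."""
    intranet, extranet = count_source_types(addresses)
    if intranet + extranet == 0:
        return False  # no access sources to judge (assumption)
    if intranet == 0:
        return True  # all access source addresses are extranet addresses
    return intranet < intranet_threshold
```

For example, `matches_webshell_source_rule(["8.8.8.8", "10.0.0.5"], intranet_threshold=2)` treats one extranet source plus a single intranet springboard as matching the rule.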
According to the above technical solution, each webshell has a uniquely corresponding resource identifier, and the analysis process of the access source is the same for every resource identifier. Taking any one of the resource identifiers, namely the target resource identifier, as an example, the type and number of the access source addresses that access the target resource identifier are counted. The type of an access source address may be an intranet source address or an extranet source address. If the type and number of the access source addresses satisfy the preset access source distribution rule of webshell communication, the access source addresses of the target resource identifier match the characteristics of webshell communication, and the communication corresponding to the target resource identifier can be determined to be webshell communication. This technical solution bypasses analysis of the encrypted communication content: by analyzing the type and number of the access source addresses, it effectively detects encrypted webshell communication, and it can likewise effectively detect unencrypted webshell communication.
Fig. 4 is a schematic structural diagram of a communication detection device according to an embodiment of the present application, which includes a classification unit 41, a statistics unit 42, and a determination unit 43;
A classification unit 41, configured to classify the initial log data acquired in the preset time according to the resource identifiers, so as to obtain log data corresponding to each resource identifier;
a statistics unit 42, configured to perform statistics on log data of the target resource identifier to obtain behavior feature information; wherein the target resource identifier is any one of all the resource identifiers; the behavior characteristic information comprises resource information;
the determining unit 43 is configured to determine whether the communication corresponding to the target resource identifier is webshell communication based on the behavior feature information.
Optionally, the behavior feature information further includes a type and a number of access source addresses; the type of the access source address comprises an intranet source address and an extranet source address;
the determining unit is used for determining whether the communication corresponding to the target resource identifier is webshell communication or not based on the resource information and the type and the number of the access source addresses.
Optionally, the determining unit includes a first judging subunit, a first determining subunit, a second judging subunit, and a second determining subunit;
the first judging subunit is configured to judge whether all access source addresses are extranet source addresses when the resource information meets the preset resource loading condition;
the first determining subunit is configured to determine that the communication corresponding to the target resource identifier is webshell communication if all the access source addresses are extranet source addresses;
the second judging subunit is configured to judge whether the number of intranet source addresses among all the access source addresses is smaller than a preset threshold if not all the access source addresses are extranet source addresses;
and the second determining subunit is configured to determine that the communication corresponding to the target resource identifier is webshell communication if the number of intranet source addresses among all the access source addresses is smaller than the preset threshold.
Optionally, the behavior feature information further includes page information and access information; the determining unit comprises a calculating subunit and a judging subunit;
the calculating subunit is used for calculating and obtaining a target score corresponding to the target resource identifier according to the preset corresponding relation between the behavior characteristic information and the weight score;
and the judging subunit is used for judging that the communication corresponding to the target resource identifier is webshell communication if the target score meets the preset score condition.
Optionally, the page information includes the page jump count and the page in-degree count; the access information includes the IP access amount; the resource information includes the static resource loading amount.
Optionally, the calculating subunit is configured to query, from a pre-established jump count score list, a first score corresponding to the page jump count; query, from a pre-established in-degree count score list, a second score corresponding to the page in-degree count; query, from a pre-established access amount score list, a third score corresponding to the IP access amount; query, from a pre-established resource loading score list, a fourth score corresponding to the static resource loading amount; and determine the target score corresponding to the target resource identifier based on the first score, the second score, the third score, and the fourth score.
Optionally, the device further comprises a deleting unit;
and the deleting unit is used for deleting the file corresponding to the target resource identifier when the communication corresponding to the target resource identifier is determined to be webshell communication.
For the description of the features in the embodiment corresponding to Fig. 4, refer to the related description of the embodiments corresponding to Fig. 1 and Fig. 2, which is not repeated here.
According to the above technical solution, the initial log data acquired within the preset time is classified by resource identifier to obtain the log data corresponding to each resource identifier; each webshell has a uniquely corresponding resource identifier. Classifying the log data by resource identifier gathers the log data of the same resource identifier together, so that the access behavior contained in the log data of each resource identifier can be analyzed. The analysis process is the same for the log data of every resource identifier; taking any one of the resource identifiers, namely the target resource identifier, as an example, statistics are performed on its log data to obtain behavior feature information. The behavior feature information includes resource information, and whether the communication corresponding to the target resource identifier is webshell communication is determined based on the resource information. This technical solution bypasses analysis of the encrypted communication content: by analyzing the access behavior in the log data, it effectively detects encrypted webshell communication, and it can likewise effectively detect unencrypted webshell communication.
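The classification and statistics steps summarized above can be sketched as follows. The log record layout (dictionaries with `resource_id` and `src_ip` keys) and the function names are hypothetical, and reading the IP access amount as the number of distinct source addresses is only one possible interpretation of the term.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

LogRecord = Dict[str, str]


def classify_by_resource(records: Iterable[LogRecord]) -> Dict[str, List[LogRecord]]:
    """Group initial log records by their resource identifier."""
    groups: Dict[str, List[LogRecord]] = defaultdict(list)
    for record in records:
        groups[record["resource_id"]].append(record)
    return dict(groups)


def ip_access_amount(group: List[LogRecord]) -> int:
    """Assumed reading: number of distinct source addresses in one group."""
    return len({record["src_ip"] for record in group})


logs = [
    {"resource_id": "/index.php", "src_ip": "10.0.0.5"},
    {"resource_id": "/index.php", "src_ip": "10.0.0.6"},
    {"resource_id": "/upload/a.php", "src_ip": "8.8.8.8"},
    {"resource_id": "/upload/a.php", "src_ip": "8.8.8.8"},
]
by_resource = classify_by_resource(logs)
```

Each group can then be summarized independently, which mirrors the statement that the analysis process is the same for every resource identifier.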
Fig. 5 is a schematic structural diagram of another communication detection device according to an embodiment of the present application, including a statistics unit 51 and an analysis unit 52;
a statistics unit 51, configured to count the type and number of the access source addresses that access the target resource identifier; the type of the access source address comprises an intranet source address and an extranet source address;
the analyzing unit 52 is configured to determine that the communication corresponding to the target resource identifier is webshell communication if the type and the number of the access source addresses satisfy the access source distribution rule of the preset webshell communication.
Optionally, the analysis unit includes a first judging subunit, a determining subunit, and a second judging subunit:
the first judging subunit is configured to judge whether all the access source addresses are extranet source addresses;
the determining subunit is configured to determine that the communication corresponding to the target resource identifier is webshell communication if all the access source addresses are extranet source addresses;
the second judging subunit is configured to judge whether the number of intranet source addresses among all the access source addresses is smaller than a preset threshold if not all the access source addresses are extranet source addresses;
and the determining subunit is further configured to determine that the communication corresponding to the target resource identifier is webshell communication if the number of intranet source addresses among all the access source addresses is smaller than the preset threshold.
For the description of the features in the embodiment corresponding to Fig. 5, refer to the related description of the embodiment corresponding to Fig. 3, which is not repeated here.
According to the above technical solution, each webshell has a uniquely corresponding resource identifier, and the analysis process of the access source is the same for every resource identifier. Taking any one of the resource identifiers, namely the target resource identifier, as an example, the type and number of the access source addresses that access the target resource identifier are counted. The type of an access source address may be an intranet source address or an extranet source address. If the type and number of the access source addresses satisfy the preset access source distribution rule of webshell communication, the access source addresses of the target resource identifier match the characteristics of webshell communication, and the communication corresponding to the target resource identifier can be determined to be webshell communication. This technical solution bypasses analysis of the encrypted communication content: by analyzing the type and number of the access source addresses, it effectively detects encrypted webshell communication, and it can likewise effectively detect unencrypted webshell communication.
Fig. 6 is a schematic hardware structure of a communication detection device 60 according to an embodiment of the present application, including:
A memory 61 for storing a computer program;
a processor 62 for executing a computer program to implement the steps of the communication detection method as described in any of the embodiments above.
The embodiment of the application further provides a computer readable storage medium, and a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the communication detection method according to any embodiment are implemented.
The foregoing describes in detail a communication detection method, apparatus, device and computer readable storage medium provided in embodiments of the present application. In the description, the embodiments are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant details can be found in the description of the method. It should be noted that those skilled in the art can make various improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of the claims of the present application.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Claims (9)

1. A communication detection method, comprising:
Classifying the initial log data acquired in the preset time according to the resource identifiers to obtain log data corresponding to each resource identifier;
counting the log data of the target resource identifier to obtain behavior characteristic information; wherein the target resource identifier is any one of all the resource identifiers; the behavior characteristic information comprises resource information and also comprises the type and the number of access source addresses; the type of the access source address comprises an intranet source address and an extranet source address;
determining whether the communication corresponding to the target resource identifier is webshell communication based on the behavior feature information comprises:
judging whether all the access source addresses are external network source addresses or not under the condition that the resource information meets the preset resource loading condition;
if all the access source addresses are external network source addresses, judging that the communication corresponding to the target resource identifier is webshell communication;
if not all the access source addresses are extranet source addresses, judging whether the number of intranet source addresses in all the access source addresses is smaller than a preset threshold value;
if the number of intranet source addresses in all the access source addresses is smaller than a preset threshold, determining that the communication corresponding to the target resource identifier is webshell communication.
2. The communication detection method according to claim 1, wherein the behavior feature information further includes page information and access information;
the determining whether the communication corresponding to the target resource identifier is webshell communication based on the behavior feature information includes:
according to the corresponding relation between the preset behavior characteristic information and the weight score, calculating to obtain a target score corresponding to the target resource identifier;
and if the target score meets a preset score condition, judging that the communication corresponding to the target resource identifier is webshell communication.
3. The communication detection method according to claim 2, wherein the page information includes a page jump count and a page in-degree count; the access information comprises an IP access amount; the resource information includes a static resource loading amount.
4. The communication detection method according to claim 3, wherein the calculating the target score corresponding to the target resource identifier according to the preset correspondence between behavior feature information and weight score includes:
querying a first score corresponding to the page jump count from a pre-established jump count score list;
querying a second score corresponding to the page in-degree count from a pre-established in-degree count score list;
inquiring a third score corresponding to the IP access amount from a pre-established access amount score list;
querying a fourth score corresponding to the static resource loading amount from a pre-established resource loading score list;
and determining a target score corresponding to the target resource identifier based on the first score, the second score, the third score and the fourth score.
5. The communication detection method according to any one of claims 1 to 4, characterized by further comprising:
and deleting the file corresponding to the target resource identifier under the condition that the communication corresponding to the target resource identifier is determined to be webshell communication.
6. A communication detection method, comprising:
counting the type and the number of access source addresses that access the target resource identifier; the behavior characteristic information comprises the type and the number of access source addresses; the type of the access source address comprises an intranet source address and an extranet source address;
judging whether all the access source addresses are external network source addresses or not;
if all the access source addresses are external network source addresses, judging that the communication corresponding to the target resource identifier is webshell communication;
if not all the access source addresses are extranet source addresses, judging whether the number of intranet source addresses in all the access source addresses is smaller than a preset threshold value;
if the number of intranet source addresses in all the access source addresses is smaller than a preset threshold, determining that the communication corresponding to the target resource identifier is webshell communication.
7. A communication detection device, characterized by comprising a classification unit, a statistics unit and a determining unit; the determining unit comprises a first judging subunit, a first determining subunit, a second judging subunit and a second determining subunit;
the classifying unit is used for classifying the initial log data acquired in the preset time according to the resource identifiers to obtain the log data corresponding to each resource identifier;
the statistics unit is used for carrying out statistics on the log data of the target resource identifier to obtain behavior characteristic information; wherein the target resource identifier is any one of all the resource identifiers; the behavior characteristic information comprises resource information and also comprises the type and the number of access source addresses; the type of the access source address comprises an intranet source address and an extranet source address;
the first judging subunit is configured to judge whether all the access source addresses are extranet source addresses if the resource information meets a preset resource loading condition;
the first determining subunit is configured to determine that the communication corresponding to the target resource identifier is webshell communication if all the access source addresses are extranet source addresses;
the second judging subunit is configured to judge whether the number of intranet source addresses in all the access source addresses is smaller than a preset threshold if not all the access source addresses are extranet source addresses;
and the second determining subunit is configured to determine that the communication corresponding to the target resource identifier is webshell communication if the number of intranet source addresses in all the access source addresses is smaller than the preset threshold.
8. A communication detection apparatus, characterized by comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the communication detection method according to any one of claims 1 to 5 or claim 6.
9. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the communication detection method according to any of claims 1 to 5 or claim 6.
CN202110325650.3A 2021-03-26 2021-03-26 Communication detection method, device, equipment and medium Active CN113014601B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110325650.3A CN113014601B (en) 2021-03-26 2021-03-26 Communication detection method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN113014601A CN113014601A (en) 2021-06-22
CN113014601B CN113014601B (en) 2023-07-14

Family

ID=76407681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110325650.3A Active CN113014601B (en) 2021-03-26 2021-03-26 Communication detection method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113014601B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115225385B (en) * 2022-07-20 2024-02-23 深信服科技股份有限公司 Flow monitoring method, system, equipment and computer readable storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107231364A (en) * 2017-06-13 2017-10-03 深信服科技股份有限公司 A kind of website vulnerability detection method and device, computer installation and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105812196A (en) * 2014-12-30 2016-07-27 中国移动通信集团公司 WebShell detection method and electronic device
CN108206802B (en) * 2016-12-16 2020-11-17 华为技术有限公司 Method and device for detecting webpage backdoor
CN111031025B (en) * 2019-12-07 2022-04-29 杭州安恒信息技术股份有限公司 Method and device for automatically detecting and verifying Webshell
CN111800405A (en) * 2020-06-29 2020-10-20 深信服科技股份有限公司 Detection method, detection device and storage medium

Also Published As

Publication number Publication date
CN113014601A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
EP3588898B1 (en) Defense against apt attack
CN109831465B (en) Website intrusion detection method based on big data log analysis
US8438386B2 (en) System and method for developing a risk profile for an internet service
CN105635126B (en) Malice network address accesses means of defence, client, security server and system
US9147067B2 (en) Security method and apparatus
CN106992981B (en) Website backdoor detection method and device and computing equipment
US20180191765A1 (en) Method and apparatus for calculating risk of cyber attack
CN107992738B (en) Account login abnormity detection method and device and electronic equipment
US20190222587A1 (en) System and method for detection of attacks in a computer network using deception elements
WO2016121348A1 (en) Anti-malware device, anti-malware system, anti-malware method, and recording medium in which anti-malware program is stored
CN113518077A (en) Malicious web crawler detection method, device, equipment and storage medium
CN109067794B (en) Network behavior detection method and device
CN106685899A (en) Method and device for identifying malicious access
US20140330759A1 (en) System and method for developing a risk profile for an internet service
CN112784281A (en) Safety assessment method, device, equipment and storage medium for industrial internet
CN113014601B (en) Communication detection method, device, equipment and medium
US8364776B1 (en) Method and system for employing user input for website classification
EP3688950B1 (en) Intrusion detection
CN111131166A (en) User behavior prejudging method and related equipment
CN114363002B (en) Method and device for generating network attack relation diagram
CN107438053B (en) Domain name identification method and device and server
CN114500122A (en) Specific network behavior analysis method and system based on multi-source data fusion
Bo et al. Tom: A threat operating model for early warning of cyber security threats
CN115964582B (en) Network security risk assessment method and system
CN116094847B (en) Honeypot identification method, honeypot identification device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant