CN115398861A - Abnormal file detection method and related product - Google Patents

Abnormal file detection method and related product

Info

Publication number
CN115398861A
Authority
CN
China
Prior art keywords
target
file
access
extracting
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202080099571.9A
Other languages
Chinese (zh)
Other versions
CN115398861B (en)
Inventor
蔡杰
沈璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Shenzhen Huantai Technology Co Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Shenzhen Huantai Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd, Shenzhen Huantai Technology Co Ltd
Publication of CN115398861A
Application granted
Publication of CN115398861B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H04L63/0236 Filtering by address, protocol, port number or service, e.g. IP-address or URL
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiments of the present application disclose an abnormal file detection method and a related product. The method includes: acquiring inbound and outbound traffic data of all hosts within a preset range, and extracting a target access relationship from the traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address; determining an access relationship graph according to the target access relationship; extracting a target suspicious file according to the access relationship graph; and detecting the target suspicious file to obtain a detection result, and outputting the detection result. The embodiments of the present application can improve the efficiency of abnormal file detection.

Description

Abnormal file detection method and related product
Technical Field
The application relates to the field of computers, in particular to an abnormal file detection method and a related product.
Background
A webshell is a command execution environment in the form of a web page file such as asp, php, jsp or cgi, and is also called a web page backdoor. After intruding into a website, a hacker mixes an asp or php backdoor file with the normal web page files under the WEB directory of the website server, and can then access the backdoor with a browser to obtain a command execution environment and thereby control the website server.
Disclosure of Invention
The embodiment of the application provides an abnormal file detection method and a related product, and abnormal file detection efficiency can be improved.
In a first aspect, an abnormal file detection method in an embodiment of the present application is applied to an electronic device, and includes:
acquiring inbound and outbound traffic data of all hosts within a preset range, and extracting a target access relationship from the traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address;
determining an access relation graph according to the target access relation;
extracting a target suspicious file according to the access relation graph;
and detecting the target suspicious file to obtain a detection result, and outputting the detection result.
In a second aspect, an embodiment of the present application provides an abnormal file detection apparatus, which is applied to an electronic device and includes an acquiring unit, a determining unit, an extracting unit and a detecting unit, wherein
the acquiring unit is configured to acquire inbound and outbound traffic data of all hosts within a preset range, and extract a target access relationship from the traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address;
the determining unit is configured to determine an access relationship graph according to the target access relationship;
the extracting unit is configured to extract a target suspicious file according to the access relationship graph;
the detecting unit is configured to detect the target suspicious file to obtain a detection result, and output the detection result.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for executing the steps in the first aspect of the embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, where the computer program enables a computer to perform some or all of the steps described in the first aspect of the embodiment of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps as described in the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1A is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 1B is a schematic diagram of an architecture for implementing an abnormal file detection method according to an embodiment of the present application;
FIG. 1C is a schematic flowchart of an abnormal file detection method disclosed in an embodiment of the present application;
FIG. 1D is a schematic illustration of an exemplary access relationship graph disclosed in an embodiment of the present application;
FIG. 1E is a schematic illustration of another access relationship graph disclosed in an embodiment of the present application;
FIG. 1F is a schematic flow diagram of a naive Bayes classification procedure disclosed in an embodiment of the present application;
FIG. 2 is a schematic flowchart of another abnormal file detection method disclosed in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of another electronic device disclosed in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an abnormal file detection apparatus according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the foregoing drawings are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The electronic devices involved in the embodiments of the present application may include various handheld devices, vehicle-mounted devices, wearable devices (smartwatches, wireless headsets), computing devices or other processing devices connected to a wireless modem, and various forms of User Equipment (UE), Mobile Stations (MSs), terminal devices, and so on, which have wireless communication functions. For convenience of description, the above-mentioned devices are collectively referred to as electronic devices. The electronic device can also be a server, a gateway or a smart home device.
The smart home device may be at least one of the following: a smart speaker, a smart camera, a smart rice cooker, a smart wheelchair, a smart massage chair, smart furniture, a smart dishwasher, a smart television, a smart refrigerator, a smart electric fan, a smart heater, a smart clothes hanger, a smart lamp, a smart router, a smart switch panel, a smart humidifier, a smart air conditioner, a smart door, a smart window, a smart cooktop, a smart disinfection cabinet, a smart toilet, a floor sweeping robot and the like, which is not limited herein.
The following describes embodiments of the present application in detail.
Referring to FIG. 1A, FIG. 1A is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 100 may include a control circuit, which may include a storage and processing circuit 110. The storage circuitry in the storage and processing circuit 110 may be memory such as hard disk drive memory, non-volatile memory (e.g., flash memory or other electronically programmable read-only memory used to form solid state drives), volatile memory (e.g., static or dynamic random access memory), and the like, and the embodiments of the present application are not limited in this respect. The processing circuitry in the storage and processing circuit 110 may be used to control the operation of the electronic device 100. The processing circuitry may be implemented based on one or more microprocessors, microcontrollers, baseband processors, power management units, audio codec chips, application specific integrated circuits, display driver integrated circuits, and the like.
The storage and processing circuitry 110 may be used to run software in the electronic device 100, such as an internet browsing application, a Voice Over Internet Protocol (VOIP) telephone call application, an email application, a media playing application, operating system functions, and so forth. The software may be used to perform control operations such as, for example, camera-based image capture, ambient light measurement based on an ambient light sensor, proximity sensor measurement based on a proximity sensor, information display functions implemented based on status indicators such as status indicator lights of light emitting diodes, touch event detection based on a touch sensor, functions associated with displaying information on multiple (e.g., layered) displays, operations associated with performing wireless communication functions, operations associated with collecting and generating audio signals, control operations associated with collecting and processing button press event data, and other functions in the electronic device 100, without limitation.
The electronic device 100 may also include input-output circuitry 150. The input-output circuit 150 may be used to enable the electronic device 100 to input and output data, i.e., to allow the electronic device 100 to receive data from an external device and also to allow the electronic device 100 to output data from the electronic device 100 to the external device. The input-output circuit 150 may further include a sensor 170. The sensors 170 may include ambient light sensors, proximity sensors based on light and capacitance, touch sensors (e.g., based on optical touch sensors and/or capacitive touch sensors, where the touch sensors may be part of a touch display screen or used independently as a touch sensor structure), acceleration sensors, gravity sensors, and other sensors, among others.
Input-output circuitry 150 may also include one or more displays, such as display 130. Display 130 may include one or a combination of liquid crystal displays, organic light emitting diode displays, electronic ink displays, plasma displays, displays using other display technologies. Display 130 may include an array of touch sensors (i.e., display 130 may be a touch display screen). The touch sensor may be a capacitive touch sensor formed by a transparent touch sensor electrode (e.g., an Indium Tin Oxide (ITO) electrode) array, or may be a touch sensor formed using other touch technologies, such as acoustic wave touch, pressure sensitive touch, resistive touch, optical touch, and the like, and embodiments of the present application are not limited thereto.
The audio component 140 may be used to provide audio input and output functionality for the electronic device 100. The audio components 140 in the electronic device 100 may include a speaker, a microphone, a buzzer, a tone generator, and other components for generating and detecting sound.
The communication circuit 120 may be used to provide the electronic device 100 with the capability to communicate with external devices. The communication circuit 120 may include analog and digital input-output interface circuits, and wireless communication circuits based on radio frequency signals and/or optical signals. The wireless communication circuitry in communication circuitry 120 may include radio-frequency transceiver circuitry, power amplifier circuitry, low noise amplifiers, switches, filters, and antennas. For example, the wireless communication circuitry in communication circuitry 120 may include circuitry to support Near Field Communication (NFC) by transmitting and receiving near field coupled electromagnetic signals. For example, the communication circuit 120 may include a near field communication antenna and a near field communication transceiver. The communications circuitry 120 may also include a cellular telephone transceiver and antenna, a wireless local area network transceiver circuitry and antenna, and so forth.
The electronic device 100 may further include a battery, power management circuitry, and other input-output units 160. The input-output unit 160 may include buttons, joysticks, click wheels, scroll wheels, touch pads, keypads, keyboards, cameras, light emitting diodes and other status indicators, and the like.
A user may input commands through input-output circuitry 150 to control operation of electronic device 100, and may use output data of input-output circuitry 150 to enable receipt of status information and other outputs from electronic device 100.
In the related art, webshell detection methods fall mainly into three categories: static detection, dynamic detection and log analysis. Static detection judges whether a file is a webshell by matching static attributes of the file, such as signatures, feature values and dangerous functions. The common static detection method is rule matching, for example YARA matching, and the completeness of the rules strongly affects detection accuracy and the false-negative rate. At present, the common approach to static detection in large enterprises is strong and weak feature matching: a file that hits a strong feature rule is regarded as definitely a webshell, while a file that hits a weak feature rule is handed over for manual review to decide whether it is a false positive or a webshell. Dynamic detection uploads a file to a server; the characteristics the file exhibits when executed are called dynamic features, and the server judges whether the file is a webshell by monitoring these dynamic features. Log detection relies on the fact that a webshell leaves access records and data submission records in the web log; a request model can be built from a large number of log files in order to detect abnormal files.
In the related art, all three detection methods have obvious drawbacks. The quality of static detection depends heavily on the rules, and its false-negative and false-positive rates are high; it has almost no ability to detect 0day webshells or webshells that have been encrypted, obfuscated or mutated, and the strong and weak feature matching scheme currently adopted by large enterprises consumes a large amount of manpower. Dynamic detection monitors the file while it runs; it improves accuracy, but it occupies more CPU and memory and is slower than static detection. Dynamic detection often monitors a host directory in real time, and for a directory in which many files change frequently, a large number of files need to be uploaded, which not only greatly increases resource consumption but also causes more false positives. Once the access volume of a website reaches a certain magnitude, the results of log detection have high reference value, but some false positives remain, and the throughput and efficiency of detection are low for large volumes of access logs.
Based on this, referring to FIG. 1B, FIG. 1B provides a system architecture for implementing the method of the embodiments of the present application. Taking the case where the electronic device is a server as an example, the method may be applied to a server that communicates with the host side and thereby obtains the suspicious files (webshell target files) uploaded from the host side.
The embodiments of the present application can be applied to sentinels of host security products, and the system is divided into a host side and a server side (as shown in FIG. 1B). The host side may run a file upload module that receives the instructions sent back by the server and uploads the specified files. The server side may include three parts: a suspicious file separation engine, an opcode extraction engine, and a machine learning engine.
In a specific implementation, the server may run the stages of the suspicious file detection process serially.
The suspicious file separation engine can be deployed on an independent server. It can acquire the inbound and outbound traffic data of all hosts from a gateway, extract the access relationships among the files and the IPs, draw an access relationship graph, separate the suspicious files from the access relationship graph, and send instructions to the hosts that hold the suspicious files, prompting them to send the suspicious files to the server side. The host side is only responsible for receiving the instructions of the server side and uploading files under the specified directory to the server, so the method consumes almost no host-side resources and the normal operation of the host services is ensured to the greatest extent. After the server receives a suspicious file sent by a host, the opcode extraction engine acquires the suspicious file, extracts the opcode of the executable file, and sends the suspicious file to the machine learning engine, which judges the suspicious file and produces the final result.
Specifically, an embodiment of the present application provides an abnormal file detection method, which is applied to an electronic device, and may include the following steps:
acquiring inbound and outbound traffic data of all hosts within a preset range, and extracting a target access relationship from the traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address;
determining an access relation graph according to the target access relation;
extracting a target suspicious file according to the access relation graph;
and detecting the target suspicious file to obtain a detection result, and outputting the detection result.
It can be seen that the abnormal file detection method described in the embodiments of the present application is applied to an electronic device: inbound and outbound traffic data of all hosts within a preset range is acquired, and a target access relationship is extracted from it, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address; an access relationship graph is determined according to the target access relationship; a target suspicious file is extracted according to the access relationship graph; and the target suspicious file is detected to obtain a detection result, which is output. In this way, the efficiency of abnormal file detection can be improved.
Referring to FIG. 1C, FIG. 1C is a schematic flowchart of an abnormal file detection method according to an embodiment of the present application. The abnormal file detection method described in this embodiment is applied to the electronic device shown in FIG. 1A or the system architecture shown in FIG. 1B, and includes:
101. acquiring inbound and outbound traffic data of all hosts within a preset range, and extracting a target access relationship from the traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address.
The preset range may be set by the user or by system default. For example, it may be a local area network, in which case the electronic device and all the hosts are in the same local area network; or it may be a city, in which case all the hosts are in the same city. When the electronic device is a server, it may obtain the inbound and outbound traffic data of all hosts within the preset range through a gateway; when the electronic device is not a server, for example when it is itself a gateway, it may obtain that traffic data directly.
In a specific implementation, the electronic device may obtain the traffic data of all hosts within the preset range and extract a target access relationship from it, where the target access relationship may be at least one of the following: an access relationship between files, and an access relationship between a file and an IP address. For example, a device may connect to or interact with other devices through an IP address, and a user may open a web page and then click a link on that page. The access sequence among the files and IP addresses can then be extracted to obtain the target access relationship.
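The extraction described above can be sketched as follows. This is only an illustration under stated assumptions: the `Request` record layout, the use of a referring page to derive file-to-file relationships, and the `ip:` / `file:` vertex prefixes are all hypothetical and are not specified in the application.

```python
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Request(NamedTuple):
    """One HTTP request seen at the gateway (hypothetical record layout)."""
    client_ip: str
    path: str                    # file requested on the web server
    referer_path: Optional[str]  # file the request was issued from, if any


Edge = Tuple[str, str]  # (source vertex, destination vertex)


def extract_access_relations(requests: Iterable[Request]) -> Set[Edge]:
    """Turn raw request records into directed access relationships."""
    edges: Set[Edge] = set()
    for req in requests:
        # IP -> file: the client address accessed this file.
        edges.add((f"ip:{req.client_ip}", f"file:{req.path}"))
        # file -> file: the request was triggered from another file.
        if req.referer_path and req.referer_path != req.path:
            edges.add((f"file:{req.referer_path}", f"file:{req.path}"))
    return edges


traffic = [
    Request("10.0.0.5", "/index.php", None),
    Request("10.0.0.5", "/news.php", "/index.php"),
    Request("203.0.113.9", "/upload/x.php", None),
]
relations = extract_access_relations(traffic)
```

Storing the relationships as a set of ordered pairs keeps the access direction and removes repeated accesses, which is all the later graph construction needs.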
In a possible example, the step 101 of acquiring the ingress and egress traffic data of all hosts within the preset range may include the following steps:
11. displaying a host distribution map on a display screen;
12. acquiring a touch track, and determining a closed area formed by the touch track;
13. and acquiring the inbound and outbound traffic data of all the hosts within the closed area.
The electronic device can receive a touch track produced by a touch operation of the user, for example with a stylus or a finger. The touch track can enclose a closed area, so the electronic device can acquire the traffic data of all hosts that lie within that closed area on the host distribution map.
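One way to decide which hosts fall inside the closed area is a point-in-polygon test. The sketch below uses the standard ray-casting test and assumes, hypothetically, that the touch track has been sampled into a list of screen coordinates and that each host has a known position on the distribution map; the application does not say how the area membership is computed.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a rightward ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def hosts_in_region(hosts: Dict[str, Point], track: List[Point]) -> List[str]:
    """Return the names of the hosts located inside the closed touch track."""
    return sorted(name for name, pos in hosts.items()
                  if point_in_polygon(pos, track))


# Hypothetical host positions on the distribution map and a square touch track.
host_positions = {"host-a": (1.0, 1.0), "host-b": (5.0, 5.0), "host-c": (2.0, 3.0)}
touch_track = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
selected = hosts_in_region(host_positions, touch_track)
```

Traffic data would then be collected only for the hosts in `selected`.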
102. And determining an access relation graph according to the target access relation.
Because access is directional, for example A accesses B or B accesses A, the electronic device may generate an access relationship graph according to the access relationship, and the access relationship graph may be a directed graph. Assuming v and w are two vertices of the access relationship graph, v -> w represents an edge pointing from v to w. In a directed graph, the relationship between two vertices falls into one of the following four cases:
1. no edge connects them;
2. there is an edge from v to w: v -> w;
3. there is an edge from w to v: w -> v;
4. both v -> w and w -> v exist, i.e. one bidirectional edge.
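The four cases can be checked mechanically when the directed graph is stored as a set of ordered pairs. The following is a minimal sketch; the function name `relation` and the returned labels are illustrative choices, not terms from the application.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (source vertex, destination vertex)


def relation(edges: Set[Edge], v: str, w: str) -> str:
    """Classify how two vertices of a directed graph are related."""
    forward = (v, w) in edges   # edge v -> w
    backward = (w, v) in edges  # edge w -> v
    if forward and backward:
        return "bidirectional"
    if forward:
        return "v->w"
    if backward:
        return "w->v"
    return "none"


edges = {("a1", "a2"), ("a2", "a1"), ("a3", "a4")}
```

For example, `relation(edges, "a1", "a2")` falls into case 4, `relation(edges, "a4", "a3")` into case 3, and `relation(edges, "a1", "a3")` into case 1.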
Further, FIG. 1D provides an access relationship graph. Each vertex of the access relationship graph can be a file or an IP address, and a1, a2, …, a10 are the vertices of the access relationship graph.
In one possible example, the step 102 of determining the access relationship graph according to the target access relationship may include the following steps:
21. extracting a first file identifier, a first IP address and an access direction in the target access relation;
22. screening the first file identifier and the first IP address to obtain a second file identifier and a second IP address;
23. and determining the access relation graph according to the second file identifier, the second IP address and the access direction.
In a specific implementation, the electronic device may extract the first file identifier, the first IP address, and the access direction from the target access relationship. Since some of the file identifiers and IP addresses may be known to be safe, the file identifiers and IP addresses may be screened to obtain the second file identifier and the second IP address. The electronic device may then determine the access relationship graph according to the second file identifier, the second IP address, and the access direction, that is, connect the second file identifier and the second IP address with arrows according to the access direction to obtain the access relationship graph.
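Steps 21 to 23 can be sketched as below. The sketch assumes that screening means dropping every relationship that touches an identifier in a caller-supplied set of trusted files and IP addresses; the `trusted` parameter, the `ip:` / `file:` prefixes and the adjacency-set representation are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (source identifier, destination identifier)


def build_access_graph(relations: Iterable[Edge],
                       trusted: Set[str]) -> Dict[str, Set[str]]:
    """Screen out trusted identifiers, then build a directed adjacency map."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in relations:
        # Step 22: skip relationships that involve a trusted file or IP.
        if src in trusted or dst in trusted:
            continue
        # Step 23: connect the remaining identifiers along the access direction.
        graph[src].add(dst)
        graph.setdefault(dst, set())  # keep vertices that have no outgoing edge
    return dict(graph)


relations = [
    ("ip:10.0.0.5", "file:/index.php"),
    ("file:/index.php", "file:/news.php"),
    ("ip:203.0.113.9", "file:/upload/x.php"),
    ("ip:10.0.0.8", "file:/health.php"),
]
graph = build_access_graph(relations, trusted={"ip:10.0.0.8"})
```

Here the monitoring address `ip:10.0.0.8` is treated as trusted, so neither it nor the file it alone accessed appears in the resulting graph.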
103. And extracting the target suspicious file according to the access relation graph.
In a specific implementation, taking a server as an example, the server may extract a target suspicious file according to the access relationship graph through a suspicious file separation engine, where the target suspicious file may be located in at least one of the hosts. The target suspicious file may be a webshell file.
In a specific implementation, the main principle of the suspicious file separation engine may be to screen out suspicious files by drawing a directed graph of the relationships among the files and the IPs. A directed graph differs from an undirected graph in that its edges are unidirectional: the two vertices connected by each edge form an ordered pair, and their adjacency is one-way. In a directed graph, a directed edge starts at a first vertex and points to a second vertex. The out-degree of a vertex is the total number of edges that start at that vertex, and the in-degree of a vertex is the total number of edges that point to it.
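In-degree and out-degree as defined above can be computed in one pass over the edge list. This is a minimal sketch; it assumes the edges are distinct ordered pairs.

```python
from collections import Counter
from typing import Iterable, Tuple

Edge = Tuple[str, str]  # (source vertex, destination vertex)


def degrees(edges: Iterable[Edge]) -> Tuple[Counter, Counter]:
    """Return (in_degree, out_degree) counters for a directed edge list."""
    in_deg: Counter = Counter()
    out_deg: Counter = Counter()
    for src, dst in edges:
        out_deg[src] += 1  # edge leaves src
        in_deg[dst] += 1   # edge arrives at dst
    return in_deg, out_deg


in_deg, out_deg = degrees([("a", "b"), ("a", "c"), ("b", "c")])
```

A `Counter` returns 0 for vertices it has never seen, so `in_deg["a"]` is 0 even though "a" only ever appears as a source.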
In a specific implementation, according to the characteristics of a webshell, an executable file planted on an intranet host by a hacker usually does not access other web files in the web directory but communicates directly with the hacker's IP address, whereas normal files usually access and interact with other files in the web directory. If the access relationships among the webshell, the other files, and the IP addresses are drawn as a directed graph, the vertex representing the webshell is separated from the groups in the resulting graph (its out-degree and in-degree are 1) because it interacts with a single object (the hacker's IP address), which separates the webshell from the normal files. As shown in FIG. 1E, each point in the graph represents a file or an IP address. Each file forms a group through its access relationships with other files or IP addresses: files with complex access relationships form large groups, and files with simpler access relationships form small groups. Because webshell files are independent, a webshell file ends up as a single point that has access relationships with only a few specific IP addresses. These single points (that is, the suspicious files) are finally uploaded to the cloud for further detection. A problem may arise in real-world applications: when the access volume of a file (such as an index file) is very large (for example, accesses from tens of millions of different IP addresses), a huge access cluster is formed when the graph is drawn, which may cause an especially large amount of computation during data processing and may even crash the server.
This problem can be solved by defining white dots. For example, the home page file may be defined as a white dot file, and any file or IP address that interacts with a white dot is treated as normal by default and is not displayed in the graph, so that only the files unrelated to any white dot need to be processed. White dots may also be set automatically: when the access volume of a web file reaches a certain threshold, the file is automatically turned into a white dot, which greatly reduces the amount of computation on the server.
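The white dot mechanism might be sketched as below. The functions `find_white_dots` and `prune_white_dots`, the threshold of 3, and the sample edges are hypothetical. The sketch also assumes one particular reading of the text: the access volume is measured as the number of distinct sources, and every vertex that interacts directly with a white dot is treated as normal.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (source, target), following the access direction


def find_white_dots(edges: Iterable[Edge], threshold: int) -> Set[str]:
    """Mark a vertex as a white dot once enough distinct sources access it."""
    sources = defaultdict(set)
    for src, dst in edges:
        sources[dst].add(src)
    return {vertex for vertex, srcs in sources.items() if len(srcs) >= threshold}


def prune_white_dots(edges: List[Edge], white: Set[str]) -> List[Edge]:
    """Drop white dots and every vertex that interacts directly with one."""
    normal = set(white)
    for src, dst in edges:
        if src in white or dst in white:
            normal.update((src, dst))
    return [(s, d) for s, d in edges if s not in normal and d not in normal]


edges = [
    ("198.51.100.1", "index.php"),
    ("198.51.100.2", "index.php"),
    ("198.51.100.3", "index.php"),
    ("index.php", "lib.php"),
    ("203.0.113.7", "shell.php"),
]
white = find_white_dots(edges, threshold=3)
remaining = prune_white_dots(edges, white)
```

Only the edges in `remaining` would need to be drawn and analysed, which is where the reduction in computation comes from.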
In a possible example, the step 103 of extracting the target suspicious file according to the access relationship graph may include the following steps:
31. determining the number of interactive objects of each vertex in the access relation graph according to the access relation graph to obtain a plurality of numerical values, wherein the access relation graph is a directed graph and comprises a plurality of vertices;
32. and selecting a target numerical value smaller than a preset threshold value from the plurality of numerical values, and acquiring a file corresponding to the target numerical value as the target suspicious file.
The preset threshold may be set by the user or by system default. The electronic device may determine, according to the access relationship graph, the number of interactive objects of each vertex in the graph to obtain a plurality of numerical values. The access relationship graph is a directed graph and includes a plurality of vertices; a vertex may be a file or an IP address, and an interactive object is likewise a vertex. A target numerical value smaller than the preset threshold may then be selected from the plurality of numerical values, and the file corresponding to the target numerical value is acquired as the target suspicious file.
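Steps 31 and 32 can be illustrated with a short sketch. It assumes that the "number of interactive objects" of a vertex is the number of distinct vertices it is connected to in either direction; the function names, the threshold of 2, and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def interaction_counts(edges: Iterable[Edge]) -> Dict[str, int]:
    """Count, for every vertex, the distinct vertices it interacts with."""
    neighbours: Dict[str, Set[str]] = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    return {vertex: len(peers) for vertex, peers in neighbours.items()}


def suspicious_files(edges: Iterable[Edge], files: Set[str], threshold: int) -> List[str]:
    """Return the files whose interaction count is below the preset threshold."""
    counts = interaction_counts(edges)
    # A file that never appears in the graph counts as having zero interactions.
    return sorted(f for f in files if counts.get(f, 0) < threshold)


edges = [
    ("index.php", "lib.php"),
    ("index.php", "db.php"),
    ("lib.php", "db.php"),
    ("203.0.113.7", "shell.php"),
]
files = {"index.php", "lib.php", "db.php", "shell.php"}
flagged = suspicious_files(edges, files, threshold=2)
```

In this toy graph the three normal files each interact with two other vertices, while `shell.php` interacts only with a single IP address and is therefore flagged.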
Further, in a possible example, in the step 32, acquiring a file corresponding to the target value as the target suspicious file may include the following steps:
321. sending an obtaining instruction to a target host corresponding to the target numerical value, wherein the obtaining instruction is used for obtaining at least one file related to the target host in the ingress and egress flow data;
322. receiving the at least one file fed back by the target host;
323. extracting the target suspect file from the at least one file.
In a specific implementation, the electronic device may send an obtaining instruction to a target host corresponding to a target numerical value, where the obtaining instruction may be used to obtain at least one file related to the target host in the ingress and egress traffic data, the target host may send the at least one file to the electronic device, and the electronic device may receive the at least one file fed back by the target host and may extract a target suspicious file from the at least one file.
104. Detecting the target suspicious file to obtain a detection result, and outputting the detection result.
In the embodiment of the application, the electronic device may detect the target suspicious file to obtain a detection result, where the detection result may indicate that the target suspicious file is an abnormal file, or that the target suspicious file is not an abnormal file.
In a possible example, the step 104 of detecting the target suspicious file to obtain a detection result may include the following steps:
41. acquiring a target operation code of the target suspicious file;
42. and inputting the target operation code into a preset machine learning model to obtain the detection result.
In a specific implementation, the target operation code is an opcode, that is, an operation code executed by the interpreter. The interpreter executes the optimized opcodes of the smallest basic unit, op_array, in sequence: after the current opcode is executed, the next opcode is prefetched, until the final RETURN, a special opcode, returns and exits. The opcode processing engine is dedicated to processing the files transmitted from the first part: it extracts the opcodes of the executable files, stores them, and transmits them to the cloud. At present, most types of executable files have corresponding opcode extraction plug-ins. Taking PHP as an example, PHP has the vld extension tool. Assuming that a file 1.php exists in the current directory, executing the command php -dvld.active=1 1.php generates the execution code, from which the opcodes of the PHP file can be extracted; the opcodes are then uploaded to the cloud and handed to the machine learning engine for processing.
In this embodiment of the application, the preset machine learning model may be at least one of the following: a neural network model, a genetic algorithm model, a bayesian classification algorithm, etc., wherein the neural network model is not limited herein and may be at least one of the following: a fully-connected neural network model, a recurrent neural network model, a convolutional neural network model, a spiking neural network model, etc., without limitation.
In a specific implementation, the electronic device may obtain a target operation code of the target suspicious file, specifically, may parse the target suspicious file to obtain the target operation code, and further, may input the target operation code to a preset machine learning model to obtain a detection result.
For example, the preset machine learning model may adopt a naive Bayes supervised learning algorithm, and the electronic device may use this algorithm to detect the extracted opcode.
The Bayes classification algorithm is the general name of a class of algorithms based on Bayes' theorem, and the naive Bayes algorithm is one of the most commonly used of them. In short, the principle of the naive Bayes algorithm is: for a given item to be classified, compute the probability of each class conditioned on the occurrence of the item, and assign the item to the class with the largest probability (the mathematical derivation is skipped here). The naive Bayes classification flow can be represented by FIG. 1F:
Steps S1 and S2 are the preparation stage. In this stage, the opcode features of a webshell need to be determined. Webshell opcodes generally appear in combinations; some specific combinations are rarely seen in normal files but appear frequently in webshells, so combinations of opcodes are used as features.
Steps S3 and S4 are the classifier training stage. The task of this stage is to calculate the frequency of each class in the training samples and the conditional probability estimate of each feature attribute partition for each class, and to record the results. The input is the feature attributes and the training samples, and the output is the classifier.
Steps S5 and S6 are the application stage. The opcodes extracted from the suspicious file uploaded by the host are fed into the trained model, and a judgment result is output.
Finally, the judgment result for the opcodes is mapped back to the original executable file, stored in a database, and reported to the host, and the host determines whether to act on the executable file.
In fig. 1F, X is a feature attribute, Y is a category, i is any category, and different features may correspond to different categories.
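The training and application stages described above can be sketched with a small multinomial naive Bayes classifier with Laplace smoothing, written in plain Python. This is an illustrative sketch only: the `NaiveBayes` class, the smoothing constant, and the four training samples are made up for this example and are not taken from the embodiment. Each feature is a hypothetical combination of two PHP opcode names.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


class NaiveBayes:
    """Multinomial naive Bayes over opcode-combination features."""

    def fit(self, samples: Sequence[List[str]], labels: Sequence[str], alpha: float = 1.0):
        self.classes = sorted(set(labels))
        self.vocab = {feature for sample in samples for feature in sample}
        self.prior: Dict[str, float] = {}
        self.likelihood: Dict[str, Dict[str, float]] = {}
        for cls in self.classes:
            docs = [s for s, y in zip(samples, labels) if y == cls]
            # Frequency of the class in the training samples (log prior).
            self.prior[cls] = math.log(len(docs) / len(samples))
            counts = Counter(feature for doc in docs for feature in doc)
            total = sum(counts.values()) + alpha * len(self.vocab)
            # Smoothed conditional probability of each feature given the class.
            self.likelihood[cls] = {
                f: math.log((counts[f] + alpha) / total) for f in self.vocab
            }
        return self

    def predict(self, sample: List[str]) -> str:
        def score(cls: str) -> float:
            # Features never seen in training are ignored.
            return self.prior[cls] + sum(
                self.likelihood[cls][f] for f in sample if f in self.vocab
            )
        return max(self.classes, key=score)


# Made-up training data: each sample is a list of opcode combinations.
train_x = [
    ["FETCH_R INCLUDE_OR_EVAL", "INCLUDE_OR_EVAL RETURN"],
    ["FETCH_R INCLUDE_OR_EVAL", "SEND_VAR DO_FCALL"],
    ["ASSIGN ECHO", "ECHO RETURN"],
    ["SEND_VAR DO_FCALL", "ASSIGN ECHO"],
]
train_y = ["webshell", "webshell", "normal", "normal"]
model = NaiveBayes().fit(train_x, train_y)
verdict = model.predict(["FETCH_R INCLUDE_OR_EVAL", "INCLUDE_OR_EVAL RETURN"])
```

Working in log space avoids numeric underflow when a file contributes many features; a real deployment would need far more training data than this toy set.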
Through the detection process, the webshell can be identified at a higher speed, with a lower host resource occupation and with a higher accuracy, and the three engines at the back end can be distributed and deployed on different servers, so that the detection efficiency of the webshell is improved to a certain extent.
Further, in a possible example, the step 42 of inputting the target operation code into a preset machine learning model to obtain the detection result may include the following steps:
411. extracting the characteristics of the target operation code to obtain target characteristic parameters;
412. and inputting the target characteristic parameters into the preset machine learning model to obtain the detection result.
In specific implementation, the electronic device can perform feature extraction on the target operation code to obtain target feature parameters, and then the target feature parameters can be input into a preset machine learning model to obtain a detection result.
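One possible form of the feature extraction in step 411 is to count combinations of consecutive opcodes (n-grams). The embodiment does not prescribe this particular scheme; the `opcode_ngrams` function and the sample opcode sequence are assumptions made for illustration.

```python
from collections import Counter
from typing import List


def opcode_ngrams(opcodes: List[str], n: int = 2) -> Counter:
    """Count each run of n consecutive opcodes in an opcode sequence."""
    return Counter(
        " ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


sequence = ["FETCH_R", "INCLUDE_OR_EVAL", "RETURN"]
features = opcode_ngrams(sequence)
```

The resulting counts can serve as the target feature parameters passed to the model; a sequence shorter than `n` simply yields no features.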
In the embodiment of the application, the detection method lets each engine play to its strengths and brings three major benefits. First, host resource usage is small: the detection system does not require a complicated webshell detection system to be deployed on the host, which greatly reduces the resources consumed by tasks unrelated to the business, ensures the normal operation of online services to the greatest extent, and makes the system well suited to large enterprises. Second, the method has a high detection rate for mutated, encrypted, and obfuscated webshells. The main reason is that, no matter how a file is mutated, encrypted, or obfuscated, it must ultimately execute a piece of risky code, and the opcode is the product of that code; extracting the opcode directly therefore yields a high detection rate regardless of how the file is mutated, encrypted, or obfuscated. Third, the method is fast and timely. Because the files are filtered at the traffic end, a webshell is detected by the system as soon as it is executed and is immediately handed to the detection engine for processing, which prevents the host from being further invaded by a hacker.
In addition, the abnormal file detection method provided by the embodiment of the application addresses the low accuracy of static detection and the high resource usage and low speed of dynamic detection. The method can identify webshells with higher accuracy while remaining fast and using few resources. To address the low accuracy of static detection, the embodiment of the application judges files by their extracted opcodes; opcodes are generated when a file is dynamically executed, so the approach achieves a good detection rate for encrypted, obfuscated, and mutated webshells. To address the large host resource usage of dynamic detection, the embodiment of the application places the main computation on the server: the server draws the directed graph and screens suspicious files according to host traffic, and the host only needs to upload the corresponding suspicious files to the server for judgment as instructed by the server. The whole process occupies almost no host resources and does not affect the services on the host. Moreover, because suspicious files are filtered before they enter the machine learning engine, the detection speed of the machine learning engine can be improved.
In a possible example, after determining the access relationship diagram according to the target access relationship in step 102 and before extracting the target suspicious file according to the access relationship diagram in step 103, the method may further include the following steps:
A1, detecting whether an isolated vertex exists in the access relation graph;
and A2, when the isolated vertex exists in the access relation graph, executing the step of extracting the target suspicious file according to the access relation graph.
The electronic device may detect whether the access relationship graph contains an isolated vertex. If an isolated vertex exists, an abnormal file may exist and step 103 may be executed; otherwise the system is considered safe and step 103 need not be executed.
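The check in steps A1 and A2 might look like the sketch below. It assumes one interpretation of "isolated vertex": a file vertex that has no access relationship with any other file and is connected only to IP addresses, matching the single points described for FIG. 1E. The helper names and sample data are hypothetical; `ipaddress` is the Python standard-library module.

```python
import ipaddress
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]  # (source, target)


def is_ip(vertex: str) -> bool:
    """Return True when the vertex label parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(vertex)
        return True
    except ValueError:
        return False


def isolated_files(edges: Iterable[Edge]) -> List[str]:
    """List the files that never exchange accesses with another file."""
    files = set()
    linked_to_file = set()
    for src, dst in edges:
        files.update(v for v in (src, dst) if not is_ip(v))
        if not is_ip(src) and not is_ip(dst):
            linked_to_file.update((src, dst))
    return sorted(files - linked_to_file)


edges = [
    ("index.php", "lib.php"),
    ("198.51.100.2", "index.php"),
    ("203.0.113.7", "shell.php"),
]
isolated = isolated_files(edges)
run_step_103 = bool(isolated)  # gate for the extraction step
```

When `isolated` is empty the extraction step can be skipped, which saves the per-vertex counting on graphs that show no sign of a webshell.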
In one possible example, before step 101, the following steps may be further included:
B1, detecting a network environment to obtain target network parameters;
B2, determining a target security level according to the target network parameters;
and B3, when the target security level is lower than a preset security level, executing the step of acquiring the incoming and outgoing flow data of all the hosts in the preset range.
The preset security level may be set by the user or by system default. The target network parameters may include one or more parameters of the network environment. In a specific implementation, a target network evaluation value may be determined from the target network parameters, and the target security level corresponding to the target network evaluation value may be determined according to a preset mapping relationship between network evaluation values and security levels. When the target security level is lower than the preset security level, step 101 is executed; otherwise step 101 is not executed. In this way, abnormal file detection is performed when network security is at risk.
In one possible example, before step 101, the following steps may be further included:
C1, acquiring target physiological state parameters of a user;
C2, determining a target emotion type corresponding to the target physiological state parameter;
and C3, when the target emotion type is a preset emotion type, executing the step of acquiring the flow data of all hosts in the preset range.
In this embodiment, the physiological status parameter may be various parameters for reflecting the physiological function of the user, and the physiological status parameter may be at least one of the following parameters: heart rate, blood pressure, blood temperature, blood lipid level, blood glucose level, thyroxine level, epinephrine level, platelet level, blood oxygen level, and the like, without limitation. The preset emotion type may be set by the user or by the system default. The preset emotion type may be at least one of: heaviness, crying, calmness, irritability, excitement, depression, and the like, without limitation.
In a specific implementation, the electronic device may acquire the target physiological state parameter of the user through a wearable device that is communicatively connected to the electronic device. Different physiological state parameters reflect different emotion types of the user, and a mapping relationship between physiological state parameters and emotion types may be stored in the electronic device in advance. The target emotion type corresponding to the target physiological state parameter may then be determined according to the mapping relationship. When the target emotion type is the preset emotion type, step 101 may be executed; otherwise step 101 may not be executed.
In a possible example, when the target physiological state parameter is a heart rate variation curve in a specified time period, the step C2 of determining the target emotion type corresponding to the target physiological state parameter may be implemented as follows:
C11, sampling the heart rate change curve to obtain a plurality of heart rate values;
C12, performing mean value operation according to the plurality of heart rate values to obtain an average heart rate value;
C13, determining a target heart rate grade corresponding to the average heart rate value;
C14, determining a target first emotion value corresponding to the target heart rate grade according to a mapping relation between a preset heart rate grade and the first emotion value;
C15, performing mean square error operation according to the plurality of heart rate values to obtain a target mean square error;
C16, determining a target second emotion value corresponding to the target mean square error according to a mapping relation between a preset mean square error and the second emotion value;
C17, determining a target weight value pair corresponding to the target heart rate level according to a mapping relation between a preset heart rate level and the weight value pair, wherein the weight value pair comprises a first weight value and a second weight value, the first weight value is a weight value corresponding to the first emotion value, and the second weight value is a weight value corresponding to the second emotion value;
C18, carrying out weighted operation according to the target first emotion value, the target second emotion value and the target weight value pair to obtain a final emotion value;
and C19, determining the target emotion type corresponding to the final emotion value according to a preset mapping relation between the emotion value and the emotion type.
The specified time period may be set by the user or by system default. A mapping relationship between preset heart rate levels and first emotion values, a mapping relationship between preset mean square errors and second emotion values, a mapping relationship between preset heart rate levels and weight pairs, and a mapping relationship between preset emotion values and emotion types may be stored in the electronic device in advance. A weight pair may include a first weight and a second weight, where the first weight is the weight corresponding to the first emotion value and the second weight is the weight corresponding to the second emotion value. The sum of the first weight and the second weight may be 1, and both weights take values in the range 0 to 1. In the embodiment of the application, the emotion can thus be evaluated through a heart rate variation curve.
In a specific implementation, the electronic device may sample the heart rate change curve, for example by uniform sampling or random sampling, to obtain a plurality of heart rate values, and may perform a mean value operation on the plurality of heart rate values to obtain an average heart rate value. A mapping relationship between heart rate values and heart rate levels may be stored in the electronic device in advance, and the target heart rate level corresponding to the average heart rate value is determined according to this mapping relationship. The target first emotion value corresponding to the target heart rate level is then determined according to the mapping relationship between preset heart rate levels and first emotion values. A mean square error operation is performed on the plurality of heart rate values to obtain a target mean square error, and the target second emotion value corresponding to the target mean square error is determined according to the mapping relationship between preset mean square errors and second emotion values.
Further, the electronic device may determine a target weight pair corresponding to the target heart rate level according to the mapping relationship between preset heart rate levels and weight pairs. The target weight pair may include a target first weight and a target second weight, where the target first weight is the weight corresponding to the target first emotion value and the target second weight is the weight corresponding to the target second emotion value. The electronic device may then perform a weighted operation according to the target first emotion value, the target second emotion value, the target first weight, and the target second weight to obtain a final emotion value. The specific calculation formula is as follows:
final emotion value = target first emotion value × target first weight + target second emotion value × target second weight
Then, the target emotion type corresponding to the final emotion value is determined according to the preset mapping relationship between emotion values and emotion types. The average heart rate reflects the user's heart rate level, and the mean square error of the heart rate reflects heart rate stability; reflecting the user's emotion through these two dimensions allows the user's emotion type to be determined accurately.
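Steps C11 to C19 can be walked through with a small numeric sketch. Every table below (heart rate levels, emotion values, weight pairs, emotion type boundaries) is a made-up placeholder, since the embodiment only states that such mappings are preset. The sketch also assumes that the "mean square error" is the mean squared deviation from the average heart rate, computed here with `statistics.pvariance`.

```python
from statistics import fmean, pvariance
from typing import List, Sequence, Tuple

INF = float("inf")

# Hypothetical preset mappings; each range table holds (exclusive upper bound, result).
HEART_RATE_LEVELS = [(60.0, "low"), (100.0, "medium"), (INF, "high")]
FIRST_EMOTION = {"low": 0.2, "medium": 0.5, "high": 0.9}
SECOND_EMOTION = [(25.0, 0.3), (100.0, 0.6), (INF, 0.9)]
WEIGHT_PAIRS = {"low": (0.6, 0.4), "medium": (0.5, 0.5), "high": (0.7, 0.3)}
EMOTION_TYPES = [(0.45, "calm"), (0.75, "excited"), (INF, "irritable")]


def lookup(table: Sequence[Tuple[float, object]], value: float):
    """Return the result of the first range whose upper bound exceeds value."""
    for upper, result in table:
        if value < upper:
            return result
    raise ValueError("value outside every range")


def emotion_from_heart_rates(samples: List[float]) -> Tuple[float, str]:
    mean = fmean(samples)                         # C12: average heart rate
    level = lookup(HEART_RATE_LEVELS, mean)       # C13: heart rate level
    first = FIRST_EMOTION[level]                  # C14: first emotion value
    spread = pvariance(samples)                   # C15: mean squared deviation
    second = lookup(SECOND_EMOTION, spread)       # C16: second emotion value
    w1, w2 = WEIGHT_PAIRS[level]                  # C17: weight pair, w1 + w2 == 1
    final = first * w1 + second * w2              # C18: weighted operation
    return final, lookup(EMOTION_TYPES, final)    # C19: emotion type


final_value, emotion = emotion_from_heart_rates([70, 72, 74, 72])
```

For the four sampled values the average is 72 and the mean squared deviation is 2, so with these placeholder tables the final emotion value is 0.5 × 0.5 + 0.3 × 0.5, approximately 0.4.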
It can be seen that the abnormal file detection method described in the embodiment of the present application is applied to an electronic device. The method obtains the ingress and egress traffic data of all hosts within a preset range and extracts a target access relationship from the ingress and egress traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address. An access relationship graph is determined according to the target access relationship, a target suspicious file is extracted according to the access relationship graph, and the target suspicious file is detected to obtain a detection result, which is output.
In accordance with the above, please refer to fig. 2, fig. 2 is a schematic flowchart of another abnormal file detection method provided in the embodiment of the present application, and the abnormal file detection method described in the embodiment is applied to the electronic device shown in fig. 1A or the system architecture shown in fig. 1B, and the method may include the following steps:
201. Carrying out network environment detection to obtain target network parameters.
202. Determining a target security level according to the target network parameters.
203. When the target security level is lower than a preset security level, acquiring the ingress and egress traffic data of all hosts in a preset range, and extracting a target access relationship from the ingress and egress traffic data, wherein the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address.
204. Determining an access relation graph according to the target access relation.
205. Extracting the target suspicious file according to the access relation graph.
206. Detecting the target suspicious file to obtain a detection result, and outputting the detection result.
The detailed description of steps 201 to 206 may refer to the abnormal file detection method shown in fig. 1C, and is not described herein again.
It can be seen that the abnormal file detection method described in the embodiment of the present application is applied to an electronic device. The method obtains the ingress and egress traffic data of all hosts within a preset range and extracts a target access relationship from the ingress and egress traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address. An access relationship graph is determined according to the target access relationship, a target suspicious file is extracted according to the access relationship graph, and the target suspicious file is detected to obtain a detection result, which is output.
The following is a device for implementing the above abnormal file detection method, specifically as follows:
In accordance with the above, please refer to FIG. 3. FIG. 3 shows an electronic device according to an embodiment of the present application, including: a processor and a memory; and one or more programs stored in the memory and configured to be executed by the processor, the programs including instructions for performing the following steps:
acquiring the ingress and egress traffic data of all hosts in a preset range, and extracting a target access relationship from the ingress and egress traffic data, wherein the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address;
determining an access relation graph according to the target access relation;
extracting a target suspicious file according to the access relation graph;
and detecting the target suspicious file to obtain a detection result, and outputting the detection result.
It can be seen that the electronic device described in the embodiment of the present application obtains the ingress and egress traffic data of all hosts within a preset range and extracts a target access relationship from the ingress and egress traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address. An access relationship graph is determined according to the target access relationship, a target suspicious file is extracted according to the access relationship graph, and the target suspicious file is detected to obtain a detection result, which is output.
In one possible example, in the detecting the target suspicious file to obtain a detection result, the program includes instructions for performing the following steps:
acquiring a target operation code of the target suspicious file;
and inputting the target operation code into a preset machine learning model to obtain the detection result.
In one possible example, in the aspect of inputting the target operation code into a preset machine learning model to obtain the detection result, the program includes instructions for performing the following steps:
extracting the characteristics of the target operation code to obtain target characteristic parameters;
and inputting the target characteristic parameters into the preset machine learning model to obtain the detection result.
In one possible example, in said extracting a target suspect file from said access relationship graph, said program comprises instructions for:
determining the number of interactive objects of each vertex in the access relation graph according to the access relation graph to obtain a plurality of numerical values, wherein the access relation graph is a directed graph and comprises a plurality of vertices;
and selecting a target numerical value smaller than a preset threshold value from the plurality of numerical values, and acquiring a file corresponding to the target numerical value as the target suspicious file.
In one possible example, in the aspect of obtaining the file corresponding to the target value as the target suspicious file, the program includes instructions for performing the following steps:
sending an acquisition instruction to a target host corresponding to the target numerical value, wherein the acquisition instruction is used for acquiring at least one file related to the target host in the ingress and egress flow data;
receiving the at least one file fed back by the target host;
extracting the target suspect file from the at least one file.
In one possible example, in the determining an access relationship graph from the target access relationship, the program includes instructions for:
extracting a first file identifier, a first IP address and an access direction in the target access relation;
screening the first file identifier and the first IP address to obtain a second file identifier and a second IP address;
and determining the access relation graph according to the second file identifier, the second IP address and the access direction.
In one possible example, in the obtaining of the incoming and outgoing flow data of all hosts within the preset range, the program includes instructions for:
displaying a host distribution map on a display screen;
acquiring a touch track, and determining a closed area formed by the touch track;
and acquiring the incoming and outgoing flow data of all the hosts in the range of the closed area.
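Selecting the hosts that fall inside the closed area formed by a touch track can be sketched with a standard ray-casting point-in-polygon test. The embodiment does not state how the containment test is performed, so this technique, the coordinate representation, and the names `inside`, `track`, and `hosts` are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def inside(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: a point is inside if a horizontal ray crosses the edges an odd number of times."""
    x, y = point
    hit = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the track
        if (y1 > y) != (y2 > y):       # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def hosts_in_area(hosts: Dict[str, Point], track: Sequence[Point]) -> List[str]:
    """Return the hosts whose map position lies inside the closed area."""
    return sorted(name for name, pos in hosts.items() if inside(pos, track))


track = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]  # sampled touch points
hosts = {"host-a": (5.0, 5.0), "host-b": (12.0, 3.0)}
selected = hosts_in_area(hosts, track)
```

Traffic data would then be collected only for the hosts in `selected`; points lying exactly on the boundary are not treated specially in this sketch.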
In one possible example, after the determining an access relationship graph from the target access relationships and before the extracting the target suspect file from the access relationship graph, the program further includes instructions for:
detecting whether isolated vertexes exist in the access relation graph or not;
and when the isolated vertex exists in the access relation graph, executing the step of extracting the target suspicious file according to the access relation graph.
In one possible example, the program further comprises instructions for performing the steps of:
carrying out network environment detection to obtain target network parameters;
determining a target security level according to the target network parameters;
and when the target security level is lower than a preset security level, executing the step of acquiring the flow data of all hosts in the preset range.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an abnormal file detection apparatus provided in this embodiment. The abnormal file detection device is applied to the electronic equipment shown in fig. 1A or the system architecture shown in fig. 1B, and comprises the following components: an acquisition unit 401, a determination unit 402, an extraction unit 403, and a detection unit 404, wherein,
the acquisition unit 401 is configured to obtain the ingress and egress traffic data of all hosts in a preset range, and extract a target access relationship from the ingress and egress traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address;
the determining unit 402 is configured to determine an access relationship graph according to the target access relationship;
the extracting unit 403 is configured to extract a target suspicious file according to the access relationship graph;
the detecting unit 404 is configured to detect the target suspicious file, obtain a detection result, and output the detection result.
It can be seen that the abnormal file detection apparatus described in the embodiment of the present application is applied to an electronic device. The apparatus obtains the ingress and egress traffic data of all hosts within a preset range and extracts a target access relationship from the ingress and egress traffic data, where the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address. An access relationship graph is determined according to the target access relationship, a target suspicious file is extracted according to the access relationship graph, and the target suspicious file is detected to obtain a detection result, which is output.
In a possible example, in terms of detecting the target suspicious file to obtain a detection result, the detection unit 404 is specifically configured to:
acquire a target operation code (opcode) of the target suspicious file;
and input the target operation code into a preset machine learning model to obtain the detection result.
In a possible example, in terms of inputting the target operation code into the preset machine learning model to obtain the detection result, the detection unit 404 is specifically configured to:
perform feature extraction on the target operation code to obtain target feature parameters;
and input the target feature parameters into the preset machine learning model to obtain the detection result.
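As a non-limiting illustration, the feature extraction and model input steps may be sketched as follows. The embodiment does not fix the feature type or the model; the sketch assumes opcode 2-gram counts as the feature parameters and uses a hand-weighted linear score as a stand-in for the preset machine learning model. The opcode names, weights, and threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def opcode_bigrams(opcodes: List[str]) -> Counter:
    """Count adjacent opcode pairs (2-grams); these serve as feature parameters."""
    return Counter(zip(opcodes, opcodes[1:]))


def score(features: Counter, weights: Dict[Bigram, float], bias: float = 0.0) -> float:
    """Stand-in linear model: weighted sum of feature counts plus a bias."""
    return bias + sum(weights.get(k, 0.0) * v for k, v in features.items())


def detect(opcodes: List[str], weights: Dict[Bigram, float], threshold: float = 1.0) -> str:
    """Return a detection result for one opcode sequence."""
    return "abnormal" if score(opcode_bigrams(opcodes), weights) >= threshold else "normal"


# Hypothetical weights; a trained model would supply these.
toy_weights = {("LOAD", "EVAL"): 1.5}
verdict = detect(["LOAD", "EVAL", "CALL"], toy_weights)
```

In practice the weights would come from training on labelled samples rather than being set by hand.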
In a possible example, in terms of extracting the target suspicious file according to the access relationship graph, the extraction unit 403 is specifically configured to:
determine, according to the access relationship graph, the number of objects that each vertex in the graph interacts with, to obtain a plurality of values, where the access relationship graph is a directed graph comprising a plurality of vertices;
and select, from the plurality of values, a target value smaller than a preset threshold, and acquire the file corresponding to the target value as the target suspicious file.
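As a non-limiting illustration, the counting and threshold selection may be sketched as follows. The sketch assumes that "the number of objects a vertex interacts with" means the number of distinct neighbours in either direction, and that file vertices can be told apart from IP vertices by a caller-supplied predicate; both are assumptions, as are all names and sample values.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (source vertex, destination vertex) of a directed edge


def interaction_counts(edges: Iterable[Edge]) -> Dict[str, int]:
    """Count, for every vertex, the distinct vertices it exchanges accesses with."""
    neighbours: Dict[str, Set[str]] = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    return {vertex: len(peers) for vertex, peers in neighbours.items()}


def below_threshold(
    counts: Dict[str, int], threshold: int, is_file: Callable[[str], bool]
) -> List[str]:
    """Select file vertices whose interaction count is smaller than the threshold."""
    return sorted(v for v, c in counts.items() if c < threshold and is_file(v))


toy_edges = [
    ("10.0.0.5", "/index.php"),
    ("10.0.0.6", "/index.php"),
    ("/index.php", "/lib.php"),
    ("10.0.0.9", "/shell.php"),
]
toy_counts = interaction_counts(toy_edges)
suspects = below_threshold(toy_counts, 2, lambda v: v.startswith("/"))
```

With this toy data the widely accessed `/index.php` is kept out of the suspect list, while files touched by a single peer are selected.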
In a possible example, in terms of acquiring the file corresponding to the target value as the target suspicious file, the extraction unit 403 is specifically configured to:
send an acquisition instruction to a target host corresponding to the target value, where the acquisition instruction is used to acquire at least one file related to the target host in the ingress and egress traffic data;
receive the at least one file fed back by the target host;
and extract the target suspicious file from the at least one file.
In a possible example, in terms of determining the access relationship graph according to the target access relationship, the determination unit 402 is specifically configured to:
extract a first file identifier, a first IP address, and an access direction from the target access relationship;
screen the first file identifier and the first IP address to obtain a second file identifier and a second IP address;
and determine the access relationship graph according to the second file identifier, the second IP address, and the access direction.
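As a non-limiting illustration, the screening and graph construction may be sketched as follows. The screening criterion is not specified in this passage; the sketch assumes that screening removes duplicate records and records whose IP address is on an ignore list. The record layout, the direction labels "in" and "out", and all sample values are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class AccessRecord(NamedTuple):
    file_id: str
    ip: str
    direction: str  # "in": the IP accesses the file; "out": the file accesses the IP


def screen(records: Iterable[AccessRecord], ignored_ips: Set[str]) -> List[AccessRecord]:
    """Drop duplicates and records from ignored IPs (assumed screening rule)."""
    seen: Set[AccessRecord] = set()
    kept: List[AccessRecord] = []
    for record in records:
        if record.ip in ignored_ips or record in seen:
            continue
        seen.add(record)
        kept.append(record)
    return kept


def build_graph(records: Iterable[AccessRecord]) -> Dict[str, Set[str]]:
    """Build a directed adjacency map; the access direction orients each edge."""
    graph: Dict[str, Set[str]] = {}
    for r in records:
        src, dst = (r.ip, r.file_id) if r.direction == "in" else (r.file_id, r.ip)
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())  # ensure the destination is also a vertex
    return graph


toy_records = [
    AccessRecord("/a.php", "10.0.0.5", "in"),
    AccessRecord("/a.php", "10.0.0.5", "in"),   # duplicate, removed by screening
    AccessRecord("/a.php", "8.8.8.8", "out"),
    AccessRecord("/b.php", "127.0.0.1", "in"),  # ignored IP, removed by screening
]
kept = screen(toy_records, ignored_ips={"127.0.0.1"})
graph = build_graph(kept)
```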
In a possible example, in terms of acquiring the ingress and egress traffic data of all hosts within the preset range, the acquisition unit 401 is specifically configured to:
display a host distribution map on a display screen;
acquire a touch track, and determine a closed area formed by the touch track;
and acquire the ingress and egress traffic data of all hosts within the closed area.
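As a non-limiting illustration, selecting the hosts that fall within the closed area may be sketched with a ray-casting point-in-polygon test. The sketch assumes that the touch track has already been sampled into an ordered list of screen coordinates forming a closed polygon and that each host has a known position on the distribution map; the host names and coordinates are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def inside(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: a point is inside if a ray to the right crosses an odd number of edges."""
    x, y = point
    hit = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def hosts_in_area(hosts: Dict[str, Point], track: Sequence[Point]) -> List[str]:
    """Return the hosts whose map position lies inside the closed touch track."""
    return sorted(name for name, pos in hosts.items() if inside(pos, track))


toy_track = [(0, 0), (10, 0), (10, 10), (0, 10)]
toy_hosts = {"host-a": (5, 5), "host-b": (15, 5), "host-c": (1, 9)}
selected = hosts_in_area(toy_hosts, toy_track)
```

Traffic data would then be collected only for the hosts in `selected`.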
In a possible example:
the detection unit 404 is further configured to detect whether an isolated vertex exists in the access relationship graph;
and when an isolated vertex exists in the access relationship graph, the extraction unit 403 performs the step of extracting the target suspicious file according to the access relationship graph.
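As a non-limiting illustration, the isolated-vertex check may be sketched as follows. The sketch assumes that the directed graph is stored as an adjacency map in which every vertex appears as a key, and that an isolated vertex is one with neither outgoing nor incoming edges; the sample graph is hypothetical.

```python
from typing import Dict, List, Set


def isolated_vertices(graph: Dict[str, Set[str]]) -> List[str]:
    """Return vertices that have no outgoing edges and no incoming edges."""
    has_incoming = {dst for targets in graph.values() for dst in targets}
    return sorted(v for v, targets in graph.items() if not targets and v not in has_incoming)


toy_graph = {
    "10.0.0.5": {"/a.php"},
    "/a.php": set(),
    "/orphan.php": set(),
}
isolated = isolated_vertices(toy_graph)
# The extraction step is only triggered when at least one isolated vertex exists.
should_extract = bool(isolated)
```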
In a possible example:
the detection unit 404 is further configured to perform network environment detection to obtain a target network parameter;
the determination unit 402 is further configured to determine a target security level according to the target network parameter;
and when the target security level is lower than a preset security level, the acquisition unit 401 performs the step of acquiring the ingress and egress traffic data of all hosts within the preset range.
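As a non-limiting illustration, the security-level gate may be sketched as follows. This passage does not define which network parameters are measured or how they map to a security level; the parameter names `encrypted` and `packet_loss`, the three-level scale, and the 5% loss cut-off are assumptions made only for this sketch.

```python
from typing import Dict


def security_level(params: Dict[str, float]) -> int:
    """Map hypothetical network parameters to a level from 1 (lowest) to 3 (highest)."""
    level = 3
    if not params.get("encrypted", 0):
        level -= 1  # unencrypted link lowers the level
    if params.get("packet_loss", 0.0) > 0.05:
        level -= 1  # unusually high packet loss lowers the level
    return level


def should_collect(params: Dict[str, float], preset_level: int = 3) -> bool:
    """Collect traffic data only when the measured level is below the preset level."""
    return security_level(params) < preset_level


healthy = {"encrypted": 1, "packet_loss": 0.0}
risky = {"encrypted": 0, "packet_loss": 0.1}
```

Under these assumptions, `should_collect(healthy)` skips collection and `should_collect(risky)` triggers it.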
It can be understood that the functions of each program module of the abnormal file detection apparatus in this embodiment may be implemented according to the method in the foregoing method embodiment. For the specific implementation process, refer to the related description of the foregoing method embodiment; details are not repeated here.
Embodiments of the present application also provide a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute part or all of the steps of any one of the abnormal file detection methods described in the above method embodiments.
Embodiments of the present application also provide a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute part or all of the steps of any one of the abnormal file detecting methods as described in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash disk, ROM, RAM, magnetic or optical disk, and the like.
The foregoing embodiments have been described in detail, and specific examples are used herein to explain the principles and implementations of the present application, where the above description of the embodiments is only intended to help understand the method and its core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, the specific implementation manner and the application scope may be changed, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (20)

  1. An abnormal file detection method, applied to an electronic device, the method comprising:
    acquiring ingress and egress traffic data of all hosts within a preset range, and extracting a target access relationship from the ingress and egress traffic data, wherein the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address;
    determining an access relation graph according to the target access relation;
    extracting a target suspicious file according to the access relation graph;
    and detecting the target suspicious file to obtain a detection result, and outputting the detection result.
  2. The method of claim 1, wherein the detecting the target suspicious file to obtain a detection result comprises:
    acquiring a target operation code of the target suspicious file;
    and inputting the target operation code into a preset machine learning model to obtain the detection result.
  3. The method of claim 2, wherein inputting the target operation code into a preset machine learning model to obtain the detection result comprises:
    extracting the characteristics of the target operation code to obtain target characteristic parameters;
    and inputting the target characteristic parameters into the preset machine learning model to obtain the detection result.
  4. The method according to any one of claims 1-3, wherein said extracting the target suspicious file according to the access relationship graph comprises:
    determining the number of interactive objects of each vertex in the access relation graph according to the access relation graph to obtain a plurality of numerical values, wherein the access relation graph is a directed graph and comprises a plurality of vertices;
    and selecting a target numerical value smaller than a preset threshold value from the plurality of numerical values, and acquiring a file corresponding to the target numerical value as the target suspicious file.
  5. The method of claim 4, wherein the acquiring a file corresponding to the target numerical value as the target suspicious file comprises:
    sending an acquisition instruction to a target host corresponding to the target numerical value, wherein the acquisition instruction is used for acquiring at least one file related to the target host in the ingress and egress traffic data;
    receiving the at least one file fed back by the target host;
    and extracting the target suspicious file from the at least one file.
  6. The method according to any one of claims 1-5, wherein the determining an access relationship graph from the target access relationship comprises:
    extracting a first file identifier, a first IP address and an access direction in the target access relation;
    screening the first file identifier and the first IP address to obtain a second file identifier and a second IP address;
    and determining the access relation graph according to the second file identifier, the second IP address and the access direction.
  7. The method according to any one of claims 1-6, wherein the acquiring ingress and egress traffic data of all hosts within a preset range comprises:
    displaying a host distribution map on a display screen;
    acquiring a touch track, and determining a closed area formed by the touch track;
    and acquiring the ingress and egress traffic data of all hosts within the closed area.
  8. The method according to any of claims 1-7, wherein after said determining an access relationship graph according to said target access relationship and before said extracting a target suspicious file according to said access relationship graph, said method further comprises:
    detecting whether an isolated vertex exists in the access relation graph;
    and when an isolated vertex exists in the access relation graph, executing the step of extracting the target suspicious file according to the access relation graph.
  9. The method according to any one of claims 1-8, further comprising:
    carrying out network environment detection to obtain target network parameters;
    determining a target security level according to the target network parameters;
    and when the target security level is lower than a preset security level, executing the step of acquiring the ingress and egress traffic data of all hosts within the preset range.
  10. An abnormal file detection device, applied to an electronic device, the device comprising: an acquisition unit, a determination unit, an extraction unit and a detection unit, wherein,
    the acquisition unit is configured to acquire ingress and egress traffic data of all hosts within a preset range, and extract a target access relationship from the ingress and egress traffic data, wherein the target access relationship is at least one of the following: an access relationship between files, and an access relationship between a file and an IP address;
    the determining unit is used for determining an access relation graph according to the target access relation;
    the extraction unit is used for extracting the target suspicious file according to the access relation graph;
    the detection unit is used for detecting the target suspicious file to obtain a detection result and outputting the detection result.
  11. The apparatus according to claim 10, wherein, in terms of detecting the target suspicious file to obtain a detection result, the detection unit is specifically configured to:
    acquire a target operation code of the target suspicious file;
    and input the target operation code into a preset machine learning model to obtain the detection result.
  12. The apparatus according to claim 11, wherein, in terms of inputting the target operation code into the preset machine learning model to obtain the detection result, the detection unit is specifically configured to:
    perform feature extraction on the target operation code to obtain target feature parameters;
    and input the target feature parameters into the preset machine learning model to obtain the detection result.
  13. The apparatus according to any one of claims 10-12, wherein, in terms of extracting the target suspicious file according to the access relation graph, the extraction unit is specifically configured to:
    determine, according to the access relation graph, the number of interactive objects of each vertex in the access relation graph to obtain a plurality of numerical values, wherein the access relation graph is a directed graph and comprises a plurality of vertices;
    and select a target numerical value smaller than a preset threshold value from the plurality of numerical values, and acquire a file corresponding to the target numerical value as the target suspicious file.
  14. The apparatus according to claim 13, wherein, in terms of acquiring the file corresponding to the target numerical value as the target suspicious file, the extraction unit is specifically configured to:
    send an acquisition instruction to a target host corresponding to the target numerical value, wherein the acquisition instruction is used for acquiring at least one file related to the target host in the ingress and egress traffic data;
    receive the at least one file fed back by the target host;
    and extract the target suspicious file from the at least one file.
  15. The apparatus according to any one of claims 10-14, wherein, in terms of determining the access relation graph according to the target access relationship, the determination unit is specifically configured to:
    extract a first file identifier, a first IP address and an access direction from the target access relationship;
    screen the first file identifier and the first IP address to obtain a second file identifier and a second IP address;
    and determine the access relation graph according to the second file identifier, the second IP address and the access direction.
  16. The apparatus according to any one of claims 10-15, wherein, in terms of acquiring the ingress and egress traffic data of all hosts within the preset range, the acquisition unit is specifically configured to:
    display a host distribution map on a display screen;
    acquire a touch track, and determine a closed area formed by the touch track;
    and acquire the ingress and egress traffic data of all hosts within the closed area.
  17. The apparatus according to any one of claims 10-16, wherein:
    the detection unit is further configured to detect whether an isolated vertex exists in the access relation graph;
    and when an isolated vertex exists in the access relation graph, the extraction unit executes the step of extracting the target suspicious file according to the access relation graph.
  18. An electronic device comprising a processor, a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-9.
  19. A computer-readable storage medium, characterized in that a computer program for electronic data exchange is stored, wherein the computer program causes a computer to perform the method according to any one of claims 1-9.
  20. A computer program product, characterized in that the computer program product comprises a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform the method of any of claims 1-9.
CN202080099571.9A 2020-05-07 2020-05-07 Abnormal file detection method and related product Active CN115398861B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/089033 WO2021223177A1 (en) 2020-05-07 2020-05-07 Abnormal file detection method and related product

Publications (2)

Publication Number Publication Date
CN115398861A true CN115398861A (en) 2022-11-25
CN115398861B CN115398861B (en) 2023-06-27

Family

ID=78467766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080099571.9A Active CN115398861B (en) 2020-05-07 2020-05-07 Abnormal file detection method and related product

Country Status (2)

Country Link
CN (1) CN115398861B (en)
WO (1) WO2021223177A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114363212B (en) * 2021-12-27 2023-12-26 绿盟科技集团股份有限公司 Equipment detection method, device, equipment and storage medium
CN114650187B (en) * 2022-04-29 2024-02-23 深信服科技股份有限公司 Abnormal access detection method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140215619A1 (en) * 2013-01-28 2014-07-31 Infosec Co., Ltd. Webshell detection and response system
CN107135199A (en) * 2017-03-29 2017-09-05 国家电网公司 The detection method and device at webpage back door
CN107294982A (en) * 2017-06-29 2017-10-24 深信服科技股份有限公司 Webpage back door detection method, device and computer-readable recording medium
US10237294B1 (en) * 2017-01-30 2019-03-19 Splunk Inc. Fingerprinting entities based on activity in an information technology environment
CN109831429A (en) * 2019-01-30 2019-05-31 新华三信息安全技术有限公司 A kind of Webshell detection method and device
CN110162973A (en) * 2019-05-24 2019-08-23 新华三信息安全技术有限公司 A kind of Webshell file test method and device
CN110807194A (en) * 2019-10-17 2020-02-18 新华三信息安全技术有限公司 Webshell detection method and device
CN110855661A (en) * 2019-11-11 2020-02-28 杭州安恒信息技术股份有限公司 WebShell detection method, device, equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108337218A (en) * 2017-07-20 2018-07-27 北京安天网络安全技术有限公司 A kind of method and system identifying webshell based on page access measure feature

Also Published As

Publication number Publication date
CN115398861B (en) 2023-06-27
WO2021223177A1 (en) 2021-11-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant