CN108229159B - Malicious code detection method and system - Google Patents

Publication number
CN108229159B
Authority
CN
China
Prior art keywords
data
detection
real
http
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611128576.1A
Other languages
Chinese (zh)
Other versions
CN108229159A (en)
Inventor
胡雪飞
冯泽
乔伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Antiy Mobile Security Co ltd
Original Assignee
Wuhan Antiy Mobile Security Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Antiy Mobile Security Co ltd
Priority to CN201611128576.1A
Publication of CN108229159A
Application granted
Publication of CN108229159B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/18: Protocol analysers
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161: Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
    • H04L69/162: Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields involving adaptations of sockets based mechanisms
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/163: In-band adaptation of TCP data exchange; In-band control procedures
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22: Parsing or analysis of headers

Abstract

The invention discloses a malicious code detection method and system. The method comprises the following steps: acquiring real-time traffic data on a mobile terminal device; parsing and filtering the acquired traffic data in real time and extracting its feature data; matching the extracted feature data against a preset malicious code rule feature library; and, if the feature data match, determining that malicious code is detected. The invention detects malicious code in data uploaded or downloaded over a network particularly well, and it can discover malicious behavior early and notify the user to handle it, thereby reducing losses.

Description

Malicious code detection method and system
Technical Field
The invention relates to the technical field of mobile terminal malicious code detection, in particular to a malicious code detection method and system.
Background
Conventional mobile terminal malware detection techniques fall broadly into two categories: static detection and dynamic detection.
(1) Static detection scans the program of the suspected malicious software, decompiles and disassembles it using reverse engineering techniques, and matches the resulting static data against a malicious code feature library to find malicious code. Because this technique is easy to implement on a mobile terminal, the mobile security products of the major security vendors currently adopt it. However, as malicious code has evolved, countermeasures such as code obfuscation and packing have appeared and make decompilation increasingly difficult. The proliferation of malicious code also makes the feature library for matching increasingly large, so that resource consumption grows and efficiency falls.
(2) Dynamic detection identifies malware by running the application and monitoring its calls to sensitive system resources. The dynamic method has some ability to discover unknown malicious applications, but it is relatively complex to implement on a mobile terminal, consumes substantial resources on the user terminal, and makes system stability difficult to guarantee.
Disclosure of Invention
The technical problem to be solved by the invention is the insufficient capability to discover unknown malicious software and the heavy resource consumption of the prior art; to this end, the invention provides a malicious code detection method and system.
The technical solution adopted by the invention to solve this technical problem is as follows:
The invention provides a malicious code detection method, which comprises the following steps:
acquiring real-time traffic data on a mobile terminal device;
parsing and filtering the acquired real-time traffic data in real time, and extracting feature data of the real-time traffic data;
matching the extracted feature data against a preset malicious code rule feature library; if the feature data match, malicious code is detected.
Further, when the real-time traffic data on the mobile terminal device is acquired, the application information of the application program corresponding to the real-time traffic data is also acquired; and when the extracted feature data matches the preset malicious code rule feature library, the application program containing the malicious code is identified.
Further, the method of the invention for matching the feature data against the rule feature library comprises a rapid detection method and a deep detection method.
Further, the rapid detection method of the present invention specifically includes:
acquiring the parsed data and processing it to obtain at least one of the following items of detection information: protocol type, IP address, port number, domain name, uri, url (composed of host and uri), url calculated value (urlhash), protocol method or command type, uriparam, httpparam, attached file type, filehash, user name, password, and mail subject;
matching the detection information obtained by processing against the corresponding types in the rapid detection rule feature library, and when the hit rate of the detection information within a single rule reaches a threshold, determining that malicious code is detected.
Further, the deep detection method of the present invention specifically includes:
adding at least one of the following items of detection information: the attached files of HTTP, FTP, and email, the hash value of filecontent, the length of filecontent, the mail body text, and the output result of rapid detection;
matching the added detection information, together with the detection information obtained in the rapid detection method, as the information to be examined by deep detection against the corresponding types in the deep detection rule feature library, and when the hit rate of the types within a single rule reaches a threshold, determining that malicious code is detected.
The invention provides a malicious code detection system, which comprises:
a traffic acquisition module, used for acquiring real-time traffic data on the mobile terminal device and the corresponding application information;
a traffic parsing module, used for parsing and filtering the acquired real-time traffic data in real time and extracting the feature data of the real-time traffic data;
a detection module, used for matching the extracted feature data against a preset malicious code rule feature library;
and an output control module, used for determining whether the feature data matches the malicious code information in the rule feature library and, if so, indicating that malicious code is detected.
Further, when the traffic acquisition module acquires real-time traffic data on the mobile terminal device, it is further configured to acquire the application information of the application program corresponding to the real-time traffic data; and when the extracted feature data matches the preset malicious code rule feature library, the output control module is used for identifying the application program containing the malicious code.
Further, the detection module of the invention comprises a rapid detection module and a deep detection module.
The invention has the following beneficial effects: because the malicious code detection method of the invention is based on the detection of real-time traffic data, it can effectively circumvent anti-detection techniques such as malicious code obfuscation and hardening, and it detects malicious code in data uploaded or downloaded over a network particularly well; it can discover malicious behavior early and notify the user to handle it, reducing the user's losses; and it can perform deep detection on uploaded and downloaded data and, combined with static detection technology, provide more comprehensive malicious code detection capability.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flow chart of a method of an embodiment of the present invention;
FIG. 2 is a flow chart of traffic capture according to an embodiment of the present invention;
FIG. 3 is a flow chart of traffic resolution for an embodiment of the present invention;
FIG. 4 is a schematic diagram of the system structure according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, the malicious code detection method according to the embodiment of the present invention includes the following steps:
S1, acquiring real-time traffic data on the mobile terminal device and the application information of the application program corresponding to that traffic;
The method for acquiring the real-time traffic data in step S1 specifically includes:
S11, determining whether the user has set a network card to monitor, and if so, executing step S12; if not, setting the monitoring mode to automatic mode, reading the information of each network card in turn, monitoring the currently active network card, and setting a timed task to monitor the state of the network cards;
S12, creating a socket, setting a filter, and establishing a ring buffer in non-swappable memory, wherein the filter is set for performance reasons to discard useless network data;
S13, mapping the ring buffer into user space, binding the socket to the monitored network card, and setting the monitoring mode of the network card;
S14, when the ring buffer contains readable traffic data, parsing the data in real time;
S15, if a network card switching signal is received, stopping acquisition of traffic data on the current network card and, after the network card has been switched, acquiring real-time traffic data again; when an exit signal is received, stopping the acquisition of real-time traffic data.
S2, parsing and filtering the acquired real-time traffic data in real time, and extracting the feature data of the real-time traffic data;
The application information corresponding to the real-time traffic data on the mobile terminal device can be acquired by either of the following methods:
Method 1: acquiring, in real time, the IP address and port showing network behavior and the corresponding application information by hooking the system API.
Method 2: acquiring the application information corresponding to the traffic by querying, in real time, the network log file output by the system.
The method for parsing the real-time traffic data in step S2 specifically includes:
S21, reading the network packet header of the real-time traffic data and identifying the transport layer protocol; if the protocol is TCP or UDP, executing step S22;
S22, matching the IP of the real-time traffic data against a preset IP whitelist; if the IP hits the whitelist, directly returning the corresponding name in the whitelist; otherwise, performing protocol parsing according to the transport layer protocol;
S23, if the transport layer protocol is TCP, performing TCP reassembly, identifying the application layer protocol, and extracting feature data;
if the application layer protocol is HTTP, extracting and recording at least one of IP, uri, port, host, HTTP method name, HTTP method parameters, and HTTP content header information;
if the application layer protocol is FTP, extracting and recording at least one of IP, port number, user name, password, FTP command type, and FTP file name;
if the application layer protocol is SMTP, POP3, or IMAP4, extracting and recording at least one of IP, port number, user name, password, sender, receiver, and subject;
S24, if the transport layer protocol is UDP, identifying and parsing the DNS application layer protocol, recording the queried domain names and IP addresses, and detecting the parsing result in real time.
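The HTTP branch of step S23 can be illustrated with a minimal Python sketch. It assumes the head of a reassembled HTTP request is already available as bytes; the function name `extract_http_features`, the output field names, and the `CONTENT_HEADERS` set are hypothetical and chosen only for illustration, not taken from the patent.

```python
# Entity headers treated here as "HTTP content header information" (an assumption).
CONTENT_HEADERS = {
    "content-type", "content-length", "content-language", "content-encoding",
    "content-location", "content-range", "content-md5",
}


def extract_http_features(request_head: bytes, dst_ip: str, dst_port: int) -> dict:
    """Extract the fields named in S23 from the head of a reassembled HTTP request."""
    # Keep only the request line and headers; the body (if any) is ignored here.
    head = request_head.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, uri, _version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()

    return {
        "ip": dst_ip,
        "port": dst_port,
        "method": method,
        "uri": uri,
        "host": headers.get("host", ""),
        "content_headers": {k: v for k, v in headers.items() if k in CONTENT_HEADERS},
    }
```

The sketch handles only well-formed requests; a real parser would also need to cope with truncated or malformed heads.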
In step S23, TCP reassembly first identifies the packets belonging to the same connection and then determines the order of the data at the local host and the remote end from the TCP sequence and acknowledgment numbers, thereby reassembling the stream. Because the detection environment differs for local and non-local traffic, packets of the same connection are identified by one of two methods:
For traffic on the local machine:
a. Acquire the IP address of the local machine and compare it with the source IP address and destination IP address in the packet to determine whether the local port of the connection is the source TCP port or the destination TCP port; then extract the local port number, the remote IP, and the remote port number.
b. For packets with the same local port number, further verify that the remote IP and port number are identical; if so, the packets are considered to belong to the same connection.
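As a rough illustration of the local-traffic case, the following Python sketch derives a connection key of (local port, remote IP, remote port) from a packet's addresses. The `Packet` type and the function name `local_connection_key` are hypothetical; packets that do not involve the local IP are simply ignored in this sketch.

```python
from typing import NamedTuple, Optional, Tuple


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def local_connection_key(pkt: Packet, local_ip: str) -> Optional[Tuple[int, str, int]]:
    """Return (local_port, remote_ip, remote_port), or None if the packet is not ours."""
    if pkt.src_ip == local_ip:
        # Outgoing packet: the local port is the source TCP port.
        return (pkt.src_port, pkt.dst_ip, pkt.dst_port)
    if pkt.dst_ip == local_ip:
        # Incoming packet: the local port is the destination TCP port.
        return (pkt.dst_port, pkt.src_ip, pkt.src_port)
    return None
```

Packets in both directions of one connection yield the same key, so the key can index a per-connection reassembly buffer.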
For traffic not on the local machine:
a. Concatenate the source IP, destination IP, source port number, and destination port number in a fixed order: either the smaller IP with its port number is placed first, or the larger IP with its port number is placed first.
b. Packets whose concatenated data are identical are judged to belong to the same connection.
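The non-local case and the ordering step of reassembly can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the smaller endpoint is placed first, only sequence numbers of one direction are used (the acknowledgment numbers mentioned above are not consulted), sequence-number wraparound is ignored, and gaps are not detected. The names `canonical_connection_key` and `reassemble` are hypothetical.

```python
import ipaddress
from typing import Iterable, Tuple


def canonical_connection_key(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Concatenate both endpoints with the smaller (ip, port) pair first."""
    a = (ipaddress.ip_address(src_ip), src_port)
    b = (ipaddress.ip_address(dst_ip), dst_port)
    low, high = sorted((a, b))
    return f"{low[0]}:{low[1]}-{high[0]}:{high[1]}"


def reassemble(segments: Iterable[Tuple[int, bytes]]) -> bytes:
    """Join (seq, payload) segments of one direction in sequence-number order."""
    stream = bytearray()
    next_seq = None
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if next_seq is None:
            next_seq = seq
        if seq + len(payload) <= next_seq:
            continue  # retransmission of bytes already seen
        if seq < next_seq:
            payload = payload[next_seq - seq:]  # trim the overlapping prefix
            seq = next_seq
        stream += payload
        next_seq = seq + len(payload)
    return bytes(stream)
```

Because the key is the same for both directions, a dictionary keyed by it can hold the segments of each connection until they are reassembled.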
S3, matching the extracted feature data against a preset malicious code rule feature library;
The method for matching the feature data against the rule feature library in step S3 includes a rapid detection method and a deep detection method.
The rapid detection method specifically comprises the following steps:
S311, acquiring the parsed data and processing it to obtain at least one of the following items of detection information: protocol type, IP address, port number, domain name, the uniform resource identifier (uri) in the HTTP header, the url composed of host and uri, the calculated value of the url (urlhash), the protocol method or command type, the parameters following the uri in the HTTP header (uriparam), the entity header fields in the HTTP request header and their contents, the attached file type (filetype), the Content-MD5 value in the HTTP entity header, user name, password, and mail subject;
The method for processing the parsed data in step S311 specifically includes:
a. combining the uri with the host to output the url, and calculating the hash value of the url, urlhash;
b. identifying the file type of an HTTP file from the uri suffix, the content-type, or the file header, and marking it as filetype;
c. calculating a hash of the content of each protocol's attached file, and marking it as filehash;
d. extracting the parameter content that follows the path in the HTTP uri, and marking it as uriparam.
S312, matching the detection information obtained by processing against the corresponding types in the rapid detection rule feature library, and when the hit rate of the types within a single rule reaches a threshold, determining that malicious code is detected.
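The derived fields of step S311 (url, urlhash, filetype, filehash, and the uri parameters) can be sketched in Python as below. The patent does not name the hash algorithm, so MD5 is an assumption here; the lookup tables and the names `guess_filetype` and `derive_rapid_features` are likewise hypothetical and deliberately tiny.

```python
import hashlib
import posixpath
from typing import Optional
from urllib.parse import urlsplit

# Illustrative lookup tables only; a real deployment would need far larger ones.
SUFFIX_TYPES = {".apk": "apk", ".zip": "zip", ".dex": "dex"}
CONTENT_TYPES = {"application/vnd.android.package-archive": "apk", "application/zip": "zip"}
MAGIC_TYPES = ((b"PK\x03\x04", "zip"), (b"dex\n", "dex"), (b"\x7fELF", "elf"))


def guess_filetype(path: str, content_type: Optional[str], body: bytes) -> str:
    """Try the uri suffix, then the content-type, then the file header."""
    suffix = posixpath.splitext(path)[1].lower()
    if suffix in SUFFIX_TYPES:
        return SUFFIX_TYPES[suffix]
    mime = (content_type or "").split(";")[0].strip().lower()
    if mime in CONTENT_TYPES:
        return CONTENT_TYPES[mime]
    for magic, name in MAGIC_TYPES:
        if body.startswith(magic):
            return name
    return "unknown"


def derive_rapid_features(host: str, uri: str,
                          content_type: Optional[str] = None, body: bytes = b"") -> dict:
    url = host + uri                      # step a: url = host + uri
    parts = urlsplit(uri)
    return {
        "url": url,
        "urlhash": hashlib.md5(url.encode("utf-8")).hexdigest(),       # assumed MD5
        "filetype": guess_filetype(parts.path, content_type, body),    # step b
        "filehash": hashlib.md5(body).hexdigest() if body else None,   # step c
        "uriparam": parts.query,                                       # step d
    }
```

The order of the three file-type checks follows the order in which the text lists them; whether the original method prefers one source over another is not stated.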
The deep detection method specifically comprises the following steps:
S321, adding at least one of the following items of detection information: the attached files of HTTP, FTP, and email, the hash value of filecontent, the length of filecontent, the mail body text, and the output result of rapid detection;
The method for processing the parsed data in step S321 specifically includes:
a. scanning files sent over HTTP, files downloaded over FTP, and files received by mail with a static engine, and outputting the virus name if a virus is detected;
b. for HTTP POST attachments, FTP STOR attachments, and SMTP mail content, performing the following:
i. decompressing or transcoding data marked with a Content-Encoding format, determining the content format of the new file, and updating filetype;
ii. decompressing content whose filetype is a known compression format;
iii. marking the new file content as filecontent;
iv. calculating a new filehash;
c. matching the final filecontent, filetype, and filehash, together with the rapid detection data, against the deep detection rules.
S322, matching the added detection information, together with the detection information obtained in the rapid detection method, as the information to be examined by deep detection against the corresponding types in the deep detection rule feature library, and when the hit rate of the types within a single rule reaches a threshold, determining that malicious code is detected.
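Steps i to iv of the deep detection preprocessing can be sketched in Python as follows. The sketch is limited to gzip and deflate as Content-Encoding values and to gzip as the only "known compression format"; MD5 as the hash and the names `sniff_filetype`, `decode_content`, and `prepare_deep_features` are assumptions made for illustration.

```python
import gzip
import hashlib
import zlib
from typing import Optional

MAGIC_TYPES = ((b"\x1f\x8b", "gzip"), (b"PK\x03\x04", "zip"), (b"\x7fELF", "elf"))


def sniff_filetype(data: bytes) -> str:
    """Identify a file type from its leading magic bytes."""
    for magic, name in MAGIC_TYPES:
        if data.startswith(magic):
            return name
    return "unknown"


def decode_content(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Step i: undo the transfer coding named by Content-Encoding."""
    enc = (content_encoding or "").strip().lower()
    if enc in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if enc == "deflate":
        return zlib.decompress(body)
    return body


def prepare_deep_features(body: bytes, content_encoding: Optional[str] = None) -> dict:
    filecontent = decode_content(body, content_encoding)
    filetype = sniff_filetype(filecontent)           # step i: update filetype
    if filetype == "gzip":                           # step ii: known compression format
        filecontent = gzip.decompress(filecontent)
        filetype = sniff_filetype(filecontent)
    return {
        "filecontent": filecontent,                          # step iii
        "filetype": filetype,
        "filehash": hashlib.md5(filecontent).hexdigest(),    # step iv (assumed MD5)
        "filelength": len(filecontent),
    }
```

Archive formats such as zip hold several members and would need per-member handling, which this sketch leaves out.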
And S4, if the feature data is consistent with the malicious code information in the rule feature library, the malicious code is detected.
Since the application information of each application corresponding to the real-time traffic data is acquired in S1, the application having the malicious code can be detected at the same time as the malicious code is detected.
The malicious code detection system of the embodiment of the invention comprises:
the traffic acquisition module is used for acquiring real-time traffic data on the mobile terminal device and the corresponding application information;
the traffic parsing module is used for parsing and filtering the acquired real-time traffic data in real time and extracting the feature data of the real-time traffic data;
the detection module is used for matching the extracted feature data with a preset malicious code rule feature library;
and the output control module is used for judging whether the characteristic data is matched with the malicious code information in the rule characteristic library or not, and if so, indicating that the malicious code is detected.
It can be understood that, if the traffic acquisition module further acquires the application information of each application corresponding to the real-time traffic data, the output control module may also detect the application having the malicious code after detecting the malicious code.
Wherein, the detection module comprises a rapid detection module and a deep detection module.
The rapid detection module is used for acquiring the parsed data and processing it to obtain at least one of the following items of detection information: protocol type, IP address, port number, domain name, the uniform resource identifier (uri) in the HTTP header, the url composed of host and uri, the calculated value of the url (urlhash), the protocol method or command type, the parameters following the uri in the HTTP header (uriparam), the entity header fields in the HTTP request header and their contents, the attached file type (filetype), the Content-MD5 value in the HTTP entity header, user name, password, and mail subject; and for matching the detection information obtained by processing against the corresponding types in the rapid detection rule feature library, wherein malicious code is detected when the hit rate of the types within a single rule reaches a threshold.
The deep detection module is used for processing the parsed data and adding at least one of the following items of detection information: the attached files of HTTP, FTP, and email, the hash value of filecontent, the length of filecontent, the mail body text, and the names output by rapid detection. The added detection information, together with the detection information obtained in the rapid detection module, serves as the information to be examined by deep detection and is matched against the corresponding types in the deep detection rule feature library; malicious code is detected when the hit rate of the types within a single rule reaches a threshold. Since the application information of the application corresponding to the real-time traffic data has been acquired, the application containing the malicious code can be identified once the malicious code is detected.
In another embodiment of the present invention, a malicious code detection method includes the following steps:
a) Capture real-time traffic data.
b) Parse and filter the captured traffic data in real time and extract feature data.
c) Match the feature data against the rule feature library, and output the detected malicious code name on a hit.
The real-time traffic capture process is as follows:
1) If the user has set the network card to monitor, skip this step. Otherwise, set the monitoring mode to automatic mode: read the information of each network card in turn, determine which network card is currently active, and monitor it; if no network card is active, monitor wlan0 by default. Set a timed task to monitor the state of the network cards, and send a network card switching signal if the active network card changes.
2) Create a socket.
3) Set a BPF filter.
4) Establish a ring buffer in non-swappable memory.
5) Map the ring buffer into user space with the mmap function.
6) Bind the socket to the monitored network card.
7) Set the network card monitoring mode.
8) Check whether the ring buffer contains readable traffic data and, if so, pass the data to the parsing module.
9) If an exit signal is received, exit the capture program; if a network card switching signal is received, stop capturing on the current network card and re-enter the monitoring flow.
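The automatic interface selection in step 1) might look roughly like the following Python sketch. It assumes a Linux system where each interface's state can be read from `/sys/class/net/<name>/operstate`, and it skips the loopback interface; both choices, and the names `read_operstates`, `choose_interface`, and `interface_changed`, are assumptions rather than details given in the patent.

```python
from pathlib import Path
from typing import Dict, Optional


def read_operstates(root: str = "/sys/class/net") -> Dict[str, str]:
    """Read the operational state of every interface listed under sysfs (assumed source)."""
    states: Dict[str, str] = {}
    base = Path(root)
    if not base.is_dir():
        return states
    for entry in sorted(base.iterdir()):
        try:
            states[entry.name] = (entry / "operstate").read_text().strip()
        except OSError:
            continue  # interface vanished or state not readable
    return states


def choose_interface(user_choice: Optional[str], states: Dict[str, str],
                     default: str = "wlan0") -> str:
    """Prefer the user's setting, then the first active interface, then wlan0."""
    if user_choice:
        return user_choice
    for name, state in states.items():
        if name != "lo" and state == "up":
            return name
    return default


def interface_changed(current: str, states: Dict[str, str]) -> bool:
    """What the timed task would check before sending a switching signal."""
    return choose_interface(None, states) != current
```

A timed task could call `interface_changed(current, read_operstates())` periodically and raise the switching signal when it returns true.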
The real-time traffic parsing process is as follows:
1) Read the tpacket_hdr network packet header.
2) Read the Ethernet header to distinguish IPv4 from IPv6.
3) Read the IP header and identify the transport layer protocol; proceed to the next step if the protocol is TCP or UDP.
4) Match the IP against the IP whitelist; on a hit, directly return the corresponding whitelist name. Otherwise, enter the protocol parsing that corresponds to the transport layer protocol. The matching is performed on the non-local IP, and binary search is used to improve efficiency.
5) For the TCP protocol, perform TCP reassembly and identify the application layer protocol (HTTP, FTP, SMTP, IMAP4, POP3, etc.). From the header information exchanged when each protocol connection is established, the following information is extracted and submitted to the detection module for rapid detection.
For the HTTP protocol, extract part of the HTTP header content: the url, port, host, HTTP method name (GET, POST, etc.), and HTTP method parameters (including Content-Type, Content-Length, Content-Language, Content-Encoding, Content-Location, Content-Range, Content-MD5, etc.).
For the FTP protocol, extract and record the IP, port number, user name, password, FTP command type (LIST, STOR, RETR), FTP file name, and similar content.
For the SMTP, POP3, and IMAP4 protocols, extract the IP, port number, user name, password, sender, receiver, subject, and similar content.
Submit the extracted information to the detection module to enter the rapid detection branch. After the HTTP and FTP attached files and the SMTP, POP3, and IMAP4 mail contents have been completely received, submit the parsed contents and their attached files to the detection module to enter the deep detection branch.
6) For the UDP protocol, identify and parse the DNS application layer protocol, record the queried domain names and IP addresses, and submit the parsing result to the detection module.
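Steps 2) to 4) above can be sketched in Python as follows: the Ethernet header distinguishes IPv4 from IPv6, the IP header yields the transport protocol and addresses, and a sorted whitelist is searched with binary search. The sketch assumes untagged Ethernet frames and does not follow IPv6 extension headers; `parse_frame` and `IpWhitelist` are hypothetical names.

```python
import bisect
import ipaddress
import struct
from typing import Dict, Optional, Tuple

ETH_P_IP, ETH_P_IPV6 = 0x0800, 0x86DD
TRANSPORTS = {6: "TCP", 17: "UDP"}


def parse_frame(frame: bytes) -> Optional[Tuple[int, str, str, str]]:
    """Return (ip_version, transport, src_ip, dst_ip), or None if not TCP/UDP over IP."""
    if len(frame) < 14:
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype == ETH_P_IP and len(frame) >= 34:
        ip = frame[14:34]
        version, proto, src, dst = 4, ip[9], ip[12:16], ip[16:20]
    elif ethertype == ETH_P_IPV6 and len(frame) >= 54:
        ip = frame[14:54]
        version, proto, src, dst = 6, ip[6], ip[8:24], ip[24:40]
    else:
        return None
    transport = TRANSPORTS.get(proto)
    if transport is None:
        return None
    return version, transport, str(ipaddress.ip_address(src)), str(ipaddress.ip_address(dst))


class IpWhitelist:
    """Sorted whitelist searched with bisect (binary search)."""

    def __init__(self, entries: Dict[str, str]):
        pairs = sorted((self._key(ip), name) for ip, name in entries.items())
        self._keys = [k for k, _ in pairs]
        self._names = [n for _, n in pairs]

    @staticmethod
    def _key(ip: str) -> Tuple[int, int]:
        addr = ipaddress.ip_address(ip)
        return (addr.version, int(addr))  # version keeps IPv4 and IPv6 apart

    def lookup(self, ip: str) -> Optional[str]:
        key = self._key(ip)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._names[i]
        return None
```

Sorting once at load time keeps each lookup at logarithmic cost, which matters when every packet's remote IP is checked.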
As shown in FIG. 4, the malicious code detection system includes the following modules:
a) The traffic capture module, used for capturing real-time traffic data.
b) The traffic parsing module, used for parsing the real-time traffic data and extracting the feature data.
c) The detection module, used for matching the feature data against the rule feature library in order to find malicious code.
d) The output control module, used for controlling program exit, outputting the names of detected malicious code, handling the malicious code, and the like.
The workflow of the detection module is as follows:
1. The rapid detection process further processes the data passed from the parsing module:
a) Combine the uri with the host to output the url, and calculate the hash value of the url, urlhash.
b) Identify the file type of an HTTP file from the uri suffix, the content-type, or the file header, and mark it as filetype.
c) Calculate a hash of the content of each protocol's attached file, and mark it as filehash.
d) Extract the parameter content that follows the path in the HTTP uri, and mark it as uriparam.
All data items and their types after processing are described as follows:
[Table provided as images in the original: BDA0001175693230000101, BDA0001175693230000111]
The information is matched against the rapid rule base by type. When the hit rate of the types in a single rule reaches a certain proportion, the rule is considered hit, and the virus name is output to the control module for corresponding processing.
2. The deep detection process is as follows:
a) Use a static engine to scan files sent by the HTTP server, files downloaded over FTP, and attachments of received mail; if a virus is detected, submit the virus name to the control module for corresponding processing.
b) For HTTP POST attachments, FTP STOR attachments, and SMTP mail content, perform the following detection flow, which consists of the following steps:
i. Decompress or transcode data marked with a Content-Encoding format, determine the content format of the new file, and update filetype.
ii. Decompress content whose filetype is a known compression format.
iii. Mark the new file content as filecontent.
iv. Calculate a new filehash.
c) Match the final filecontent, filetype, and filehash, together with the rapid detection data, against the deep detection rules.
d) Compared with rapid detection, deep detection adds the following types of detection data:
[Table provided as an image in the original: BDA0001175693230000121]
As with rapid detection, when the hit rate of a rule reaches a certain proportion, the rule is considered hit, and the detected name is output to the control module for subsequent processing.
3. Rapid detection and deep detection are two branches; the user can choose, as needed, to perform rapid detection only, deep detection only, or a combination of both.
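The hit-rate rule and the choice between the two branches can be sketched in Python as follows. The rule format (a name plus a dictionary of expected values per type), exact-equality matching, and the names `rule_hit_rate`, `detect`, and `run_detection` are assumptions; the patent's actual rule tables are only available as images.

```python
from typing import Dict, List


def rule_hit_rate(conditions: Dict[str, object], features: Dict[str, object]) -> float:
    """Fraction of a rule's typed conditions that the extracted features satisfy."""
    if not conditions:
        return 0.0
    hits = sum(1 for key, expected in conditions.items() if features.get(key) == expected)
    return hits / len(conditions)


def detect(features: Dict[str, object], rules: List[dict], threshold: float) -> List[str]:
    """Return the names of all rules whose hit rate reaches the threshold."""
    return [rule["name"] for rule in rules
            if rule_hit_rate(rule["conditions"], features) >= threshold]


def run_detection(mode: str, rapid_features: dict, deep_features: dict,
                  rapid_rules: List[dict], deep_rules: List[dict],
                  threshold: float = 1.0) -> List[str]:
    """mode is 'rapid', 'deep', or 'combined' (hypothetical setting names)."""
    names: List[str] = []
    if mode in ("rapid", "combined"):
        names += detect(rapid_features, rapid_rules, threshold)
    if mode in ("deep", "combined"):
        # Deep detection examines the rapid features together with the added ones.
        names += detect({**rapid_features, **deep_features}, deep_rules, threshold)
    return names
```

With a threshold below 1.0, a rule can fire even when some of its conditions are not met, which is how the "certain proportion" wording above is read here.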
In another embodiment of the present invention, the real-time flow capturing process is shown in fig. 2 below, and the specific implementation steps are as follows:
1) if the user sets the monitor network card port, this step is skipped. If not, the monitoring mode is set to be an automatic mode, the information of each network card is sequentially read in the automatic mode, the currently opened network card is judged, the currently opened network card is monitored, and if no network card is opened, the wlan0 network card monitoring is carried out by default.
2) And setting a timing task to monitor the state of the network card, and sending a network card switching signal if the network card is found to be opened to change.
3) A socket is created. If link-layer information needs to be captured, the socket type is set to SOCK_RAW; if no link-layer information is needed, it is set to SOCK_DGRAM and the kernel provides a dummy header.
4) The BPF filter rule is parsed, and the BPF filter is attached with setsockopt (SOL_SOCKET, SO_ATTACH_FILTER).
5) If the system supports hardware timestamps, hardware timestamping is enabled via ioctl (SIOCSHWTSTAMP).
6) The bps of the monitored network card and the system page size are read, the sizes of the block data and frame data are calculated, a struct tpacket_req structure is filled in, and a ring buffer in non-swappable (unswappable) memory is established with setsockopt (SOL_PACKET, PACKET_RX_RING). The structure differs slightly between system versions; for example, the latest structure at present is struct tpacket_req3. The tpacket version is set with setsockopt (SOL_PACKET, PACKET_VERSION).
7) The ring buffer is mapped into user space with the mmap function so that it can be used there.
8) The bind function is used to bind the socket to the monitored network card (to all addresses).
9) The interface mode and promiscuous mode are set.
10) The status field of each frame is checked in turn (see the tp_status value of the structure tpacket_hdr). If the status is TP_STATUS_USER, the data pointer corresponding to the frame is passed to the traffic parsing module. The status is then set to TP_STATUS_KERNEL.
11) The poll function is called to poll the created socket, and step 10) is then repeated. The capture program exits when an exit signal is received; when a network-card switching signal is received, capture on the current network card stops and the process re-enters step 3).
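As a hedged illustration of step 10), the following Python sketch simulates the status handoff on an in-memory ring rather than on a real PACKET_RX_RING mapping, which requires Linux and elevated privileges. The frame size, the 8-byte status word layout, and the names drain_ring and handle are assumptions made for this example; only the TP_STATUS_KERNEL and TP_STATUS_USER values mirror the Linux headers.

```python
import struct

TP_STATUS_KERNEL = 0   # frame is owned by the kernel
TP_STATUS_USER = 1     # frame has been filled and is ready for user space

FRAME_SIZE = 64        # hypothetical frame size for the simulation
STATUS_FMT = "<Q"      # assumption: 8-byte status word at the start of each frame
STATUS_LEN = struct.calcsize(STATUS_FMT)


def drain_ring(ring, frame_nr, start, handle):
    """Hand every ready frame to `handle`, starting at frame index `start`.

    Returns the index of the first frame that is still owned by the kernel,
    which is where the caller should resume after the next poll().
    """
    idx = start
    for _ in range(frame_nr):
        off = idx * FRAME_SIZE
        (status,) = struct.unpack_from(STATUS_FMT, ring, off)
        if not status & TP_STATUS_USER:
            break  # nothing more to read; the caller would poll() here
        handle(bytes(ring[off + STATUS_LEN:off + FRAME_SIZE]))
        # Give the frame back to the kernel so it can be reused.
        struct.pack_into(STATUS_FMT, ring, off, TP_STATUS_KERNEL)
        idx = (idx + 1) % frame_nr
    return idx
```

In a real capture loop the buffer would be the mmap'ed ring from step 7), and the loop would alternate between poll on the socket and a call like drain_ring until an exit or switching signal arrives.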
The real-time traffic parsing process is shown in Fig. 3, and the specific flow is as follows:
1) The tpacket_hdr header of the network data packet is read.
2) The Ethernet header is read to distinguish IPv4 from IPv6.
3) The IP header is read and the transport-layer protocol is identified; if the protocol is TCP or UDP, processing continues with the next step.
4) The IP is matched against the IP whitelist; on a hit, the corresponding whitelist name is returned directly. Otherwise, protocol-specific parsing begins according to the transport-layer protocol. Matching is performed on the non-local IP, and binary search is used to improve efficiency.
5) For TCP, TCP reassembly is performed to identify application-layer protocols such as HTTP, FTP, SMTP, IMAP4, and POP3. The following information is extracted from the header information established for each protocol connection and submitted to the detection module for rapid detection.
For HTTP: part of the content of the HTTP headers, the url, port, host, HTTP method name (GET, POST, etc.), and HTTP method parameters (including Content-Type, Content-Length, Content-Language, Content-Encoding, Content-Location, Content-Range, Content-MD5, etc.) are extracted.
For FTP: the ip, port number, user name, password, FTP command type (LIST, STOR, RETR), FTP file name, and similar content are extracted and recorded.
For SMTP, POP3, and IMAP4: the ip, port number, user name, password, sender, recipient, subject, and so on are extracted.
The extracted information is submitted to the analysis module and enters the rapid detection branch. Once the HTTP and FTP attachment files and the contents of SMTP, POP3, and IMAP4 mails have been completely received, the parsed content and its attachment files are submitted to the analysis module and enter the deep detection branch.
6) For UDP, the DNS application-layer protocol is identified and parsed, the queried DNS domain names and IPs are recorded, and the parsing result is submitted to the detection module.
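Steps 2) to 4) of the parsing flow can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes untagged Ethernet frames, does not follow IPv6 extension headers, and the whitelist format (a mapping from IP string to name) and the function names classify, build_whitelist, and whitelist_name are hypothetical. The lookup uses the standard bisect module as the binary search.

```python
import bisect
import ipaddress
import struct

ETH_P_IP, ETH_P_IPV6 = 0x0800, 0x86DD   # EtherType values
IPPROTO_TCP, IPPROTO_UDP = 6, 17        # IP protocol numbers


def classify(frame):
    """Return (protocol, src, dst) for TCP/UDP over IPv4/IPv6, else None."""
    if len(frame) < 14:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETH_P_IP and len(frame) >= 34:
        proto, src, dst = frame[23], frame[26:30], frame[30:34]
    elif ethertype == ETH_P_IPV6 and len(frame) >= 54:
        # Only the Next Header field is read; extension headers are ignored.
        proto, src, dst = frame[20], frame[22:38], frame[38:54]
    else:
        return None
    if proto not in (IPPROTO_TCP, IPPROTO_UDP):
        return None
    return proto, ipaddress.ip_address(bytes(src)), ipaddress.ip_address(bytes(dst))


def build_whitelist(entries):
    """Turn {ip_string: name} into parallel sorted lists for binary search."""
    pairs = sorted(
        ((addr.version, int(addr)), name)
        for addr, name in ((ipaddress.ip_address(ip), n) for ip, n in entries.items())
    )
    return [key for key, _ in pairs], [name for _, name in pairs]


def whitelist_name(keys, names, addr):
    """Binary-search the whitelist; return the name on a hit, else None."""
    key = (addr.version, int(addr))
    i = bisect.bisect_left(keys, key)
    return names[i] if i < len(keys) and keys[i] == key else None
```

Keying on the pair (IP version, integer value) keeps IPv4 and IPv6 entries from colliding in the same sorted list.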
The detection module flow is as follows:
1. The rapid detection process is as follows. The data passed in by the analysis module is further processed:
a) The uri is combined with the host to produce the url, and the hash value of the url, urlhash, is calculated.
b) For HTTP files, the file type is identified from the uri suffix, the Content-Type, or the file header, and recorded as filetype.
c) A hash is computed over the content of each protocol's attachment file and recorded as filehash.
d) The parameters that follow the path are extracted from the HTTP uri and recorded as uriparam.
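Steps a) to d) might look like the sketch below. The patent does not name a hash algorithm, so SHA-256 is an assumption here, as are the function name quick_features and the dictionary layout. File-header sniffing from step b) is left out; the sketch falls back from the uri suffix to the Content-Type only.

```python
import hashlib
import posixpath
from urllib.parse import parse_qsl, urlsplit


def quick_features(host, uri, content_type=None, body=b""):
    """Derive url, urlhash, filetype, filehash and uriparam for one request."""
    url = host + uri                                   # step a)
    urlhash = hashlib.sha256(url.encode("utf-8")).hexdigest()

    parts = urlsplit(uri)
    suffix = posixpath.splitext(parts.path)[1].lstrip(".").lower()
    media_type = (content_type or "").split(";")[0].strip().lower()
    filetype = suffix or media_type or "unknown"       # step b), no header sniffing

    filehash = hashlib.sha256(body).hexdigest() if body else None   # step c)
    uriparam = parse_qsl(parts.query, keep_blank_values=True)       # step d)

    return {
        "url": url,
        "urlhash": urlhash,
        "filetype": filetype,
        "filehash": filehash,
        "uriparam": uriparam,
    }
```

For a request to host example.com with uri /a/b.apk?id=7, the derived filetype would be apk and uriparam would hold the single pair ("id", "7").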
All data and types after processing are described as follows:
[Table provided as images in the original publication: Figures BDA0001175693230000141, BDA0001175693230000151, BDA0001175693230000161]
This information is matched by type against the rapid-detection rule base. When the hit rate of each type in a single rule reaches a set proportion, the rule is considered matched, and the virus name is output to the control module for corresponding processing.
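One possible reading of the hit-rate criterion is sketched below: a rule lists an expected value per data type, and the rule is reported once the fraction of its types that match reaches a threshold. The rule format, the rule names, the 0.75 threshold, and the function names are all assumptions for illustration; the patent does not specify them.

```python
def hit_rate(conditions, features):
    """Fraction of a rule's typed conditions that the extracted features satisfy."""
    if not conditions:
        return 0.0
    hits = sum(1 for kind, expected in conditions.items()
               if features.get(kind) == expected)
    return hits / len(conditions)


def match_rules(rules, features, threshold=0.75):
    """Return the names of all rules whose hit rate reaches the threshold."""
    return [name for name, conditions in rules.items()
            if hit_rate(conditions, features) >= threshold]


# Hypothetical rule base: rule name -> {data type: expected value}.
EXAMPLE_RULES = {
    "Trojan.Example.A": {"host": "bad.example", "port": 8080,
                         "method": "POST", "filetype": "apk"},
    "Trojan.Example.B": {"host": "other.example", "port": 21},
}
```

A threshold below 1.0 lets a rule still fire when one field has been varied by the sender, at the cost of more false positives.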
2. The deep detection process comprises the following steps:
a) Files sent by the HTTP server, files downloaded over FTP, and attachments of received mails are scanned with a static engine; if a virus name is detected, it is submitted to the control module for corresponding processing.
b) For HTTP POST attachments, FTP STOR attachments, and SMTP mail content, the following detection flow is carried out.
The step is divided into the following two steps:
i. Data marked with a Content-Encoding format is decompressed or transcoded, the content format of the resulting file is determined, and filetype is updated.
Files whose filetype is a known compression format are decompressed.
The content of the new file is recorded as filecontent.
A new filehash is calculated.
c) The final filecontent, filetype, and filehash, together with the rapid detection data, are matched against the deep detection rules.
d) Compared with rapid detection, deep detection adds the following detection data types:
[Table provided as images in the original publication: Figures BDA0001175693230000162, BDA0001175693230000171]
As with rapid detection, when the hit rate of a rule reaches a set ratio, the rule is considered matched, and the detected name is output to the control module for subsequent processing.
3. Rapid detection and deep detection are two separate branches; the user can choose to run rapid detection only, deep detection only, or both in combination, as needed.
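The decoding part of the deep detection branch (step 2 b) can be illustrated with the sketch below. It handles only the gzip and deflate content encodings, identifies the new content by a few well-known magic numbers, and recomputes the hash with SHA-256; the choice of hash, the magic-number table, and the names decode_body, sniff_type, and deep_features are assumptions. Unpacking of archive formats after decoding is not shown.

```python
import gzip
import hashlib
import zlib

# A few well-known file signatures; a real detector would use a larger table.
MAGIC = [
    (b"PK\x03\x04", "zip"),
    (b"\x7fELF", "elf"),
    (b"dex\n", "dex"),
    (b"%PDF", "pdf"),
    (b"\x1f\x8b", "gzip"),
]


def decode_body(body, content_encoding):
    """Undo the HTTP Content-Encoding if it is one this sketch understands."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        return zlib.decompress(body)
    return body


def sniff_type(data, default):
    """Identify the content by its leading bytes, falling back to `default`."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    return default


def deep_features(body, content_encoding, filetype):
    """Produce the updated filecontent, filetype, filehash and content length."""
    content = decode_body(body, content_encoding)
    return {
        "filecontent": content,
        "filetype": sniff_type(content, filetype),
        "filehash": hashlib.sha256(content).hexdigest(),
        "filelength": len(content),
    }
```

The resulting values would then be matched against the deep detection rules together with the rapid detection data.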
Because the scheme detects on real-time traffic data, it is not defeated by obfuscation of the malicious code and can complement other antivirus detection technologies; it is particularly effective against malicious code that uploads or downloads data over the network. Rapid detection discovers malicious behavior earlier, raises an alarm, and notifies the user to act, reducing losses. Deep detection inspects uploaded and downloaded data and, combined with static detection, provides a more comprehensive malicious code detection capability.
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (5)

1. A malicious code detection method, comprising the steps of:
acquiring real-time flow data on mobile terminal equipment;
analyzing and filtering the acquired real-time flow data in real time, and extracting characteristic data of the real-time flow data;
matching the extracted feature data with a preset malicious code rule feature library, and if the extracted feature data is matched with the preset malicious code rule feature library, judging that a malicious code exists;
the method for matching the feature data with the rule feature library comprises the following steps: a rapid detection method and a deep detection method;
the rapid detection method specifically comprises the following steps:
acquiring the analyzed data, and processing the data to obtain at least one of the following items of detection information: a protocol type, an ip address, a port number, a domain name, the uniform resource identifier (uri) in an http header, a url composed of host and uri, a hash value calculated from the url, a protocol method or command type, the parameters following the uri in the http header, an entity header field in a request message header in the http header and its content, an attached file type, the Content-MD5 value in the http entity header, a user name, a password and a mail subject;
the method for processing the analyzed data specifically comprises the following steps:
the uri and the host are combined to output the url, and the hash value urlhash of the url is calculated;
identifying the file type filetype of the http protocol file according to the uri;
calculating the file hash filehash of the file content of each protocol attachment;
extracting parameter content uriparam in uri of http;
detecting files sent by an http protocol, files downloaded by an FTP protocol and files received by mails by using a static engine, and outputting a detection result if malicious codes are detected;
decompressing or transcoding files in http post attachments, FTP STOR attachments and smtp mail content, determining the file format and updating the value of filetype; updating the value of filecontent according to the new file content, and updating the value of filehash according to the value of filecontent;
matching the detection information obtained by processing with the corresponding types in the rule feature library which is rapidly detected, and indicating that malicious codes are detected when the hit rate of each type in the single rule reaches a threshold value;
the deep detection method specifically comprises the following steps:
adding at least one of the following items of detection information: attachment files of http, FTP and email, the hash value of filecontent, the length of filecontent, the mail text, and the output result of rapid detection;
and matching the added detection information and the detection information obtained in the rapid detection method together as to-be-detected information of deep detection with corresponding types in a rule feature library of deep detection, and when the hit rate of each type in a single rule reaches a threshold value, indicating that malicious codes are detected.
2. The malicious code detection method according to claim 1, wherein when acquiring real-time traffic data on the mobile terminal device, application information of each application program corresponding to the real-time traffic data is also acquired; and when the extracted feature data is matched with a preset malicious code rule feature library, detecting the application program with malicious codes.
3. The malicious code detection method according to claim 1 or 2, wherein the method of acquiring real-time traffic data specifically includes:
after setting a monitoring network card, creating a socket, setting a filter, and establishing a ring buffer area in non-swappable memory;
mapping the ring buffer area into a user space, binding the socket to a monitoring network card, and setting a monitoring mode of the network card;
when the ring buffer area has readable flow data, analyzing the data in real time;
and stopping acquiring the real-time flow data after receiving the exit signal.
4. A malicious code detection system, comprising:
the flow acquisition module is used for acquiring real-time flow data on the mobile terminal equipment and corresponding application information thereof;
the flow analysis module is used for analyzing and filtering the acquired real-time flow data in real time and extracting the characteristic data of the real-time flow data;
the detection module is used for matching the extracted feature data with a preset malicious code rule feature library;
the output control module is used for judging whether the characteristic data is matched with the malicious code information in the rule characteristic library or not, and if so, the malicious code is detected;
the detection module comprises a rapid detection module and a deep detection module, the rapid detection module is used for realizing rapid detection, and the deep detection module is used for realizing deep detection;
the implementation method of the rapid detection module comprises the following steps:
acquiring the analyzed data, and processing the data to obtain at least one of the following items of detection information: a protocol type, an ip address, a port number, a domain name, the uniform resource identifier (uri) in an http header, a url composed of host and uri, a hash value calculated from the url, a protocol method or command type, the parameters following the uri in the http header, an entity header field in a request message header in the http header and its content, an attached file type, the Content-MD5 value in the http entity header, a user name, a password and a mail subject;
the method for processing the analyzed data specifically comprises the following steps:
the uri and the host are combined to output the url, and the hash value urlhash of the url is calculated;
identifying the file type filetype of the http protocol file according to the uri;
calculating the file hash filehash of the file content of each protocol attachment;
extracting parameter content uriparam in uri of http;
detecting files sent by an http protocol, files downloaded by an FTP protocol and files received by mails by using a static engine, and outputting a detection result if malicious codes are detected;
decompressing or transcoding files in http post attachments, FTP STOR attachments and smtp mail content, determining the file format and updating the value of filetype; updating the value of filecontent according to the new file content, and updating the value of filehash according to the value of filecontent;
matching the detection information obtained by processing with the corresponding types in the rule feature library which is rapidly detected, and indicating that malicious codes are detected when the hit rate of each type in the single rule reaches a threshold value;
the implementation method of the deep detection module comprises the following steps:
adding at least one of the following items of detection information: attachment files of http, FTP and email, the hash value of filecontent, the length of filecontent, the mail text, and the output result of rapid detection;
and matching the added detection information and the detection information obtained in the rapid detection method together as to-be-detected information of deep detection with corresponding types in a rule feature library of deep detection, and when the hit rate of each type in a single rule reaches a threshold value, indicating that malicious codes are detected.
5. The malicious code detection system according to claim 4, wherein the traffic acquisition module is further configured to acquire application information of each application program corresponding to the real-time traffic data when acquiring the real-time traffic data on the mobile terminal device; and when the extracted feature data is matched with a preset malicious code rule feature library, the output control module is used for detecting the application program with the malicious code.
CN201611128576.1A 2016-12-09 2016-12-09 Malicious code detection method and system Active CN108229159B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611128576.1A CN108229159B (en) 2016-12-09 2016-12-09 Malicious code detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611128576.1A CN108229159B (en) 2016-12-09 2016-12-09 Malicious code detection method and system

Publications (2)

Publication Number Publication Date
CN108229159A CN108229159A (en) 2018-06-29
CN108229159B true CN108229159B (en) 2022-04-01

Family

ID=62637162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611128576.1A Active CN108229159B (en) 2016-12-09 2016-12-09 Malicious code detection method and system

Country Status (1)

Country Link
CN (1) CN108229159B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109167783A (en) * 2018-08-31 2019-01-08 杭州迪普科技股份有限公司 A kind of method and apparatus identifying mail virus
CN109327453B (en) * 2018-10-31 2021-04-13 北斗智谷(北京)安全技术有限公司 Specific threat identification method and electronic equipment
CN112311721B (en) * 2019-07-25 2022-11-22 深信服科技股份有限公司 Method and device for detecting downloading behavior
CN112822150A (en) * 2020-08-19 2021-05-18 北京辰信领创信息技术有限公司 Method for detecting suspicious IP
CN113242252A (en) * 2021-05-21 2021-08-10 北京国联天成信息技术有限公司 Method and system for detecting and processing malicious codes in big data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103065089A (en) * 2012-12-11 2013-04-24 深信服网络科技(深圳)有限公司 Method and device for detecting webpage Trojan horses
US9092625B1 (en) * 2012-07-03 2015-07-28 Bromium, Inc. Micro-virtual machine forensics and detection
CN105337994A (en) * 2015-11-26 2016-02-17 晶赞广告(上海)有限公司 Malicious code detection method and device based on network flow

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Android平台下基于流量监测的安全软件设计与实现;李俊;《中国优秀硕士学位论文全文数据库信息科技辑》;20140915;正文第1.1.2节,3.1.1-3.1.2节,4.2节,4.3节,图3-3,图4-5 *

Also Published As

Publication number Publication date
CN108229159A (en) 2018-06-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant