CN110135160B - Software detection method, device and system - Google Patents

Publication number
CN110135160B
Authority
CN
China
Prior art keywords
api
sequence
software
thread
malicious software
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910353079.9A
Other languages
Chinese (zh)
Other versions
CN110135160A (en)
Inventor
徐国爱
徐国胜
孙博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201910353079.9A
Publication of CN110135160A
Application granted
Publication of CN110135160B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Stored Programmes (AREA)

Abstract

The invention provides a method, a device and a system for detecting software, wherein the method comprises the following steps: acquiring software data to be detected; screening out a system file thread from the software data; extracting an API sequence in the system file thread through a sandbox; truncating the API sequence, and recoding the truncated API sequence according to multiple threads to obtain an API multi-thread sequence; and detecting and analyzing the API multithreading sequence through a target detection model to obtain a software detection result. The method can realize more efficient and accurate malicious software detection and can obtain more various malicious software detection results.

Description

Software detection method, device and system
Technical Field
The invention relates to the technical field of information security, in particular to a method, a device and a system for detecting software.
Background
With the rapid development of information technology and the Internet, network security is receiving more and more attention. Malware, meaning a program such as a virus, a worm, or a Trojan horse that intentionally performs malicious tasks on a computer system, is undoubtedly the most harmful threat; it takes control by disrupting the normal software process. The conventional N-Gram method is widely used to detect malware from call sequences.
With the continuous development of malware countermeasure technology, malware detection has gradually moved from static detection to combined dynamic and static detection.
However, most software detection methods that use a static sequence extraction scheme rely on N-Gram feature extraction. Because dynamic sequences are multithreaded and extremely uneven in length, the detection results are inaccurate, the computational cost is high, and the receptive field is too small.
Disclosure of Invention
The invention provides a method, a device and a system for detecting software, which are used for realizing more efficient and accurate malicious software detection and obtaining more types of malicious software detection results.
In a first aspect, an embodiment of the present invention provides a software detection method, including:
acquiring software data to be detected;
screening out a system file thread from the software data;
extracting an API sequence in the system file thread through a sandbox;
truncating the API sequence, and recoding the truncated API sequence according to multiple threads to obtain an API multi-thread sequence;
and detecting and analyzing the API multithreading sequence through a target detection model to obtain a software detection result.
In a possible design, before the API multithreading sequence is detected and analyzed through the target detection model to obtain the software detection result, the method further includes:
acquiring malware data; the malware data includes: infectious viruses, Trojan horse programs, cryptocurrency mining programs, and ransomware;
screening out a system file thread of the malware from the malware data;
extracting the API sequence in the system file thread of the malware through a sandbox to obtain the API sequence of the malware;
truncating the API sequence of the malware, and re-encoding the truncated API sequence according to multiple threads to obtain an API multithreading sequence of the malware;
constructing an initial detection model; the initial detection model is a classification model based on dilated (hole) convolution and TextCNN;
and taking multi-class logloss as the evaluation target, iteratively training the initial detection model on the API multithreading sequence of the malware to obtain the target detection model.
In one possible design, extracting the API sequence in the system file thread through the sandbox includes:
dynamically executing the system file thread in the sandbox to obtain, for each API called by the file, the API name, the thread number, the return value, and the sequence number of the call within the thread.
In one possible design, truncating the API sequence and re-encoding the truncated API sequence according to multiple threads to obtain the API multithreading sequence includes:
when the number of API calls made by the file in a given thread exceeds a preset threshold, truncating the API sequence corresponding to that thread and retaining a preset number of API records to obtain a truncated API sequence;
and re-encoding the truncated API sequence according to multiple threads to obtain the API multithreading sequence.
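The truncation step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the tuple layout `(tid, index, api_name)` and the function name `truncate_per_thread` are assumptions, and the default limit of 5000 is the example value given later in the detailed description.

```python
from collections import defaultdict

# Example threshold from the detailed description; any preset value could be used.
MAX_CALLS_PER_THREAD = 5000


def truncate_per_thread(records, limit=MAX_CALLS_PER_THREAD):
    """Keep at most `limit` earliest API calls for each thread.

    `records` is an iterable of (tid, index, api_name) tuples, where `index`
    is the sequence number of the call within its thread (assumed layout).
    Returns a dict mapping each tid to its ordered, truncated list of API names.
    """
    by_thread = defaultdict(list)
    for tid, index, api_name in records:
        by_thread[tid].append((index, api_name))

    truncated = {}
    for tid, calls in by_thread.items():
        # Order inside one thread is defined by the per-thread index.
        calls.sort(key=lambda call: call[0])
        truncated[tid] = [name for _, name in calls[:limit]]
    return truncated
```

Threads with fewer calls than the limit pass through unchanged, so only unusually long threads are shortened.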
In one possible design, the detecting and analyzing the API multithreading sequence by the target detection model to obtain the software detection result includes:
detecting the API multithreading sequence through a target detection model, and judging whether software data corresponding to the API multithreading sequence is malicious software data or not;
if the data is the malicious software data, outputting a type tag of the malicious software data;
and if the data is not the malicious software data, prompting the software data to be safe.
In a second aspect, an apparatus for detecting software provided in an embodiment of the present invention includes:
the acquisition module is used for acquiring software data to be detected;
the screening module is used for screening out a system file thread from the software data;
the extraction module is used for extracting the API sequence in the system file thread through the sandbox;
the coding module is used for truncating the API sequence and recoding the truncated API sequence according to multiple threads to obtain an API multi-thread sequence;
and the obtaining module is used for detecting and analyzing the API multithreading sequence through the target detection model to obtain a software detection result.
In a possible design, before the API multithreading sequence is detected and analyzed through the target detection model to obtain the software detection result, the method further includes:
acquiring malware data; the malware data includes: infectious viruses, Trojan horse programs, cryptocurrency mining programs, and ransomware;
screening out a system file thread of the malware from the malware data;
extracting the API sequence in the system file thread of the malware through a sandbox to obtain the API sequence of the malware;
truncating the API sequence of the malware, and re-encoding the truncated API sequence according to multiple threads to obtain an API multithreading sequence of the malware;
constructing an initial detection model; the initial detection model is a classification model based on dilated (hole) convolution and TextCNN;
and taking multi-class logloss as the evaluation target, iteratively training the initial detection model on the API multithreading sequence of the malware to obtain the target detection model.
In one possible design, the extraction module is specifically configured to:
dynamically execute the system file thread in the sandbox to obtain, for each API called by the file, the API name, the thread number, the return value, and the sequence number of the call within the thread.
In one possible design, the encoding module is specifically configured to:
when the number of API calls made by the file in a given thread exceeds a preset threshold, truncate the API sequence corresponding to that thread and retain a preset number of API records to obtain a truncated API sequence;
and re-encode the truncated API sequence according to multiple threads to obtain the API multithreading sequence.
In one possible design, the obtaining module is specifically configured to:
detect the API multithreading sequence through the target detection model, and judge whether the software data corresponding to the API multithreading sequence is malware data;
if the data is malware data, output a type label of the malware data;
and if the data is not malware data, prompt that the software data is safe.
In a third aspect, a system for detecting software provided by an embodiment of the present invention includes a memory and a processor, where the memory stores instructions executable by the processor; the processor is configured to perform the software detection method of any one of the first aspect by executing the executable instructions.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for software detection according to any one of the first aspect.
The invention provides a method, a device and a system for detecting software, wherein the method comprises the following steps: acquiring software data to be detected; screening out a system file thread from the software data; extracting an API sequence in the system file thread through a sandbox; truncating the API sequence, and recoding the truncated API sequence according to multiple threads to obtain an API multi-thread sequence; and detecting and analyzing the API multithreading sequence through a target detection model to obtain a software detection result. The method can realize more efficient and accurate malicious software detection and can obtain more various malicious software detection results.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic view of an application scenario of the present invention;
FIG. 2 is a flowchart of a software detection method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a dilated (hole) convolution in a software detection method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a software detection apparatus according to a second embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a software detection system according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The following describes the technical solutions of the present invention and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic view of an application scenario of the present invention. As shown in Fig. 1, a software detection system 11 obtains software data 12 to be detected, screens out a system file thread from the software data to be detected, extracts the API sequence in the system file thread through a sandbox, truncates the API sequence, and re-encodes the truncated API sequence according to multiple threads to obtain an API multithreading sequence. The API multithreading sequence is then detected and analyzed through a target detection model to obtain a software detection result 13. In an optional embodiment, the software detection system includes the target detection model. Applying this detection method enables more efficient and accurate malware detection and yields more types of malware detection results.
Fig. 2 is a flowchart of a software detection method according to an embodiment of the present invention, and as shown in fig. 2, the software detection method may include:
S201, acquiring software data to be detected.
In an optional embodiment, the acquisition module may capture program information from the running state of the program. The running state may include the CPU instruction sequence executed by the program, system calls, Application Programming Interface (API) calls, or system services at a higher level of abstraction. In an optional embodiment, the software detection system uses API Monitor (API call monitoring software), which monitors and displays the calls made by an application and can trace any exported API, such as the Win32 API and third-party APIs. API Monitor displays rich information, which may include function names, call sequences, input and output parameters, and function return values, and it may predefine 82 DLLs (Dynamic Link Libraries) and prototypes of about 4000 APIs.
In this embodiment, the software detection system obtains dynamic information from the running state. Compared with static information, dynamic information has lower redundancy and can be captured in real time. In addition, dynamic information is not affected by packing (shell) encryption, and instruction-level modification techniques are ineffective against system calls or APIs at a higher level of system abstraction, so the information is less affected by such modifications.
S202, screening out system file threads from the software data.
In an optional embodiment, a system file thread is screened out from the software data. The system file thread may be a software file thread on the Windows platform or, after simple adaptation of the API sequence, a software file thread on the Android or Linux platform.
S203, extracting the API sequence in the system file thread through the sandbox.
Specifically, the sandbox dynamically executes the system file thread to obtain, for each API called by the file, the API name, the thread number, the return value, and the sequence number of the call within the thread.
In an optional embodiment, the software detection system may include different sandboxes. The sandbox dynamically executes the system file thread and extracts the API sequence in it, mainly retaining the name of each API called by the file, the thread number, the return value, and the sequence number of the call within the thread. In an optional embodiment, the software detection system uses an extraction module such as a sandbox, which can automatically and dynamically execute untrusted software data to be detected in an isolated environment and extract dynamic behaviors, such as process behavior, network behavior, and file behavior, while the software runs. This embodiment places no specific limitation on the sandbox; it only requires that the API sequence in the system file thread be extracted after the sandbox's dynamic analysis. For example, Cuckoo is an open-source automated malware analysis system written in Python that can trace and record all calls made by malware. The recorded file behaviors may include creating, modifying, deleting, reading, or downloading files during malware execution. A memory image of the malware can be obtained, the network traffic of the malware is recorded in PCAP (packet capture) format, and screenshots taken during malware execution can also be obtained. The software data can then be analyzed in depth according to the sandbox's dynamic execution results.
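As an illustration of the extraction step, the sketch below pulls the four retained fields out of a sandbox report that has already been loaded into a Python dictionary. The report layout used here (`behavior` → `processes` → `calls`, with `api`, `tid`, and `return_value` keys) is an assumption modeled on a Cuckoo-style JSON report and should be checked against the sandbox actually in use; the function name `extract_api_records` is hypothetical.

```python
def extract_api_records(report):
    """Collect API name, thread number, return value, and per-thread index.

    `report` is a dict in an assumed Cuckoo-style layout:
    report["behavior"]["processes"][k]["calls"] is a list of dicts with the
    keys "api", "tid", and "return_value".
    """
    records = []
    for process in report.get("behavior", {}).get("processes", []):
        next_index = {}  # per-thread call counter within this process
        for call in process.get("calls", []):
            tid = call["tid"]
            index = next_index.get(tid, 0)
            next_index[tid] = index + 1
            records.append(
                {
                    "api": call["api"],
                    "tid": tid,
                    "return_value": call.get("return_value"),
                    "index": index,  # sequence number of the call in its thread
                }
            )
    return records
```

Numbering the calls per thread rather than globally matches the description, in which the index only expresses ordering inside one thread.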
S204, truncating the API sequence, and re-encoding the truncated API sequence according to multiple threads to obtain an API multithreading sequence.
Specifically, when the number of API calls made by the file in a given thread exceeds a preset threshold, the API sequence corresponding to that thread is truncated and a preset number of API records is retained to obtain a truncated API sequence; the truncated API sequence is then re-encoded according to multiple threads to obtain the API multithreading sequence.
In this embodiment, because the sandbox's execution timestamps have limited precision, several API calls from the same thread or from different threads may share one index. The internal order within the same TID (thread identifier) can be guaranteed, but continuity cannot. When the file makes more than 5000 API calls in one thread (TID), the API sequence corresponding to that thread can be truncated, and the first 5000 API records of each TID are retained in order to obtain the truncated API sequence. The truncated API sequence is then re-encoded according to multiple threads to obtain the API multithreading sequence. In an optional embodiment, there is no ordering relationship between the TIDs of different threads, and within the same TID the indexes, from small to large, represent the calling order. In an optional embodiment, the software detection system may be extended with different encoding methods, or word vector techniques may be introduced for re-encoding.
In this embodiment, step S204 avoids the excessive computational overhead caused by overly long API sequences. Re-encoding the truncated API sequence yields a sequence with clear temporal order and strong correlation, so that a wider variety of malware detection results can be obtained.
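The text leaves the encoding method open, so the following is only one possible re-encoding, offered as a sketch under stated assumptions: API names are mapped to integer ids, threads are concatenated in ascending TID order with a separator id between them, and the result is padded or cut to a fixed length. The constants `PAD_ID` and `SEP_ID` and both function names are hypothetical.

```python
PAD_ID = 0  # hypothetical id used to pad short sequences
SEP_ID = 1  # hypothetical id marking the boundary between two threads


def build_vocab(thread_sequences):
    """Map every API name to an integer id, starting after the reserved ids."""
    names = sorted({name for seq in thread_sequences.values() for name in seq})
    return {name: i + 2 for i, name in enumerate(names)}


def encode_multithread(thread_sequences, vocab, max_len):
    """Encode {tid: [api_name, ...]} as one fixed-length list of integer ids.

    Order inside a thread is preserved. Threads are visited in ascending TID
    order purely for determinism, since TIDs carry no ordering of their own.
    """
    encoded = []
    for tid in sorted(thread_sequences):
        if encoded:
            encoded.append(SEP_ID)
        encoded.extend(vocab[name] for name in thread_sequences[tid])
    encoded = encoded[:max_len]
    return encoded + [PAD_ID] * (max_len - len(encoded))
```

A word-vector encoding, which the description also mentions, would replace the integer ids with learned embeddings while keeping the same per-thread ordering.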
S205, detecting and analyzing the API multithreading sequence through the target detection model to obtain a software detection result.
Specifically, the API multithreading sequence is detected through a target detection model, and whether software data corresponding to the API multithreading sequence is malicious software data or not is judged;
if the data is the malicious software data, outputting a type tag of the malicious software data;
and if the data is not the malicious software data, prompting the software data to be safe.
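The decision logic of step S205 can be illustrated with a short sketch. It assumes the target detection model outputs one probability per class and that one of those classes represents safe software; the label names and the function `detection_result` are hypothetical and not taken from the patent.

```python
def detection_result(probabilities, labels, safe_label="benign"):
    """Turn per-class probabilities into the prompt described in S205.

    `probabilities[i]` is the model's probability for `labels[i]`. The
    presence of a dedicated safe class is an assumption of this sketch.
    """
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if labels[best] == safe_label:
        return "software data is safe"
    return f"malware detected: {labels[best]}"
```

For example, with labels `["benign", "trojan", "ransomware"]`, the probabilities `[0.1, 0.7, 0.2]` produce a Trojan type label, while `[0.8, 0.1, 0.1]` produce the safety prompt.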
In an optional embodiment, the malware categories may include: infectious viruses, Trojan programs, cryptocurrency mining programs, DDoS (Distributed Denial of Service) Trojans, ransomware, and so on; the number of classifications may be as many as 600 million.
In this embodiment, analysis of the API calls and packet reconstruction shows that the malware data submits a copy of itself, which is typical behavior of the Vflooder Trojan family. The software data corresponding to the API multithreading sequence is therefore judged to be malware data. Vflooder is a special type of Flooder (worm) Trojan that can send a large amount of information to a target to interrupt the target's normal operation. The type label of the malware data is then output: worm Trojan. In an optional embodiment, when the target detection model detects the API multithreading sequence and the software detection system obtains standard, safe running data, the data is judged not to be malware data, and a prompt indicates that the software data is safe.
In an optional embodiment, before performing detection analysis on the API multithreading sequence through the target detection model to obtain a software detection result, the method further includes:
acquiring malware data; the malware data includes: infectious viruses, Trojan horse programs, cryptocurrency mining programs, and ransomware;
screening out a system file thread of the malware from the malware data;
extracting the API sequence in the system file thread of the malware through a sandbox to obtain the API sequence of the malware;
truncating the API sequence of the malware, and re-encoding the truncated API sequence according to multiple threads to obtain the API multithreading sequence of the malware;
constructing an initial detection model; the initial detection model is a classification model based on dilated (hole) convolution and TextCNN;
and taking multi-class logloss as the evaluation target, iteratively training the initial detection model on the API multithreading sequence of the malware to obtain the target detection model.
In an optional embodiment, the software detection system acquires malware data, which may include infectious viruses, Trojan programs, cryptocurrency mining programs, ransomware, and the like. The software detection system screens out the system file thread of the malware from the malware data. The API sequence in the system file thread of the malware is extracted through the sandbox to obtain the API sequence of the malware, including the API name, the thread number, the return value, and the sequence number of the call within the thread. The API sequence of the malware is truncated, and the truncated API sequence is re-encoded according to multiple threads to obtain the API multithreading sequence of the malware. This embodiment does not limit the encoding method; those skilled in the art may choose one according to actual needs, for example extending the system with different encoding methods or introducing word vector techniques for re-encoding.
An initial detection model is then constructed, which is a classification model based on dilated convolution and TextCNN. With multi-class logloss as the evaluation target, the initial detection model is iteratively trained on the API multithreading sequence of the malware to obtain the target detection model.
In an optional embodiment, the software detection system builds the initial detection model as a classification model based on dilated convolution (also known as hole or atrous convolution) and TextCNN. Dilated convolution introduces a new parameter into the convolution layer, the "dilation rate", which defines the spacing between the values that the convolution kernel samples when processing the data. Fig. 3 is a schematic diagram of a dilated convolution in the software detection method according to an embodiment of the present invention. Referring to Fig. 3, for a 3x3 2-dilated convolution, the actual convolution kernel size is still 3x3 but the hole is 1; that is, for a 7x7 image block, only the 9 red dots are convolved with the 3x3 kernel, and the remaining points are skipped. Equivalently, the kernel can be regarded as having a size of 7x7 in which only the 9 dots in the figure have non-zero weights and all other weights are 0. Although the convolution kernel is only 3x3, the receptive field of this convolution has increased to 7x7. In an optional embodiment, if the layer preceding the 2-dilated convolution is a 1-dilated convolution, each dot is itself the output of the 1-dilated convolution and has a 3x3 receptive field, so the 1-dilated and 2-dilated convolutions together achieve the effect of a 7x7 convolution; the receptive field of stacked dilated convolutions grows exponentially.
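The 7x7 figure quoted above can be checked with a few lines of arithmetic. The sketch below computes the receptive field, per side, of a stack of stride-1 dilated convolutions that all share one kernel size; the function name is hypothetical and stride 1 is an assumption of the sketch.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (per side) of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# A 1-dilated layer followed by a 2-dilated layer, both 3x3:
print(receptive_field(3, [1, 2]))     # 7
# Doubling the dilation at each layer roughly doubles the field:
print(receptive_field(3, [1, 2, 4]))  # 15
```

With dilations 1, 2, 4, and so on, the field after L layers is 2^(L+1) - 1, which is the exponential growth referred to in the text.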
In an optional embodiment, dilated convolution inserts spaces between the elements of the convolution kernel. A new hyper-parameter d is introduced, where (d - 1) is the number of spaces inserted between adjacent elements. Assuming the original convolution kernel size is k, the kernel size n after inserting the (d - 1) spaces is:

$$n = k + (k - 1)(d - 1)$$

Further, assuming the input size of the dilated convolution is i, the stride is s, and the padding is p, the feature map size o after the dilated convolution is:

$$o = \left\lfloor \frac{i + 2p - k - (k - 1)(d - 1)}{s} \right\rfloor + 1$$
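The size arithmetic for a dilated convolution can be written out as a small sketch. It uses the standard convolution output-size relation; the padding parameter `p` is an assumption that the surrounding text does not define, and both function names are hypothetical.

```python
def effective_kernel_size(k, d):
    """Kernel size after inserting (d - 1) spaces between adjacent elements."""
    return k + (k - 1) * (d - 1)


def output_size(i, k, d, s=1, p=0):
    """Feature map size for input size i, kernel k, dilation d, stride s.

    `p` is the padding added on each side (assumed; 0 means no padding).
    """
    n = effective_kernel_size(k, d)
    return (i + 2 * p - n) // s + 1


print(effective_kernel_size(3, 2))  # 5
print(output_size(7, 3, 2))         # 3
```

Setting d = 1 reduces both functions to the ordinary, non-dilated case.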
In an optional embodiment, TextCNN is a convolutional neural network for text classification, chosen for its simple structure and good results. By using hierarchical convolution kernels with dilated convolution, that is, by adding the two settings scaled_size = [1,2,3,4] and kernel_size = [2,3,4,5] to the convolution process, the TextCNN network can model a convolution window with a maximum length of 20.
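To make the combination of dilated convolution and TextCNN-style pooling concrete, the sketch below applies 1-D dilated filters to a sequence and keeps the maximum response of each filter. It is a simplified, dependency-free illustration: the sequence holds scalars instead of embedding vectors, the filter weights are fixed instead of learned, and all names are hypothetical.

```python
def dilated_conv1d(sequence, kernel, dilation):
    """Valid (unpadded) 1-D dilated convolution with stride 1."""
    span = (len(kernel) - 1) * dilation + 1  # window covered by the kernel
    outputs = []
    for start in range(len(sequence) - span + 1):
        total = 0.0
        for j, weight in enumerate(kernel):
            total += weight * sequence[start + j * dilation]
        outputs.append(total)
    return outputs


def textcnn_features(sequence, filters):
    """Max-over-time pooling of each (kernel, dilation) filter's feature map.

    Assumes the sequence is at least as long as every filter's span.
    """
    return [max(dilated_conv1d(sequence, kernel, d)) for kernel, d in filters]


seq = [1, 2, 3, 4, 5, 6, 7]
print(dilated_conv1d(seq, [1, 0, -1], 2))                        # [-4.0, -4.0, -4.0]
print(textcnn_features(seq, [([1, 1], 1), ([1, 0, -1], 2)]))     # [13.0, -4.0]
```

In a full model the pooled values from all kernel sizes and dilation rates would be concatenated and passed to a classification layer.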
In an optional embodiment, with multi-class logloss (log loss) as the evaluation target, the initial detection model is iteratively trained on the API multithreading sequence of the malware to obtain the target detection model. The logloss is:

$$\text{logloss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(P_{ij})$$

where M is the number of classes, N is the number of samples in the test set, y_ij indicates whether the i-th sample belongs to class j (1 if yes, 0 if no), and P_ij is the predicted probability that the i-th sample belongs to class j. The final logloss is kept to 6 decimal places.
According to the software detection method, more efficient and accurate malicious software detection can be achieved, and more various malicious software detection results can be obtained.
Fig. 4 is a schematic structural diagram of a software detection apparatus according to a second embodiment of the present invention, and as shown in fig. 4, the apparatus may include:
the acquiring module 31 is used for acquiring software data to be detected;
the screening module 32 is used for screening out system file threads from the software data;
the extraction module 33 is used for extracting the API sequence in the system file thread through the sandbox;
the encoding module 34 is configured to truncate the API sequence and re-encode the truncated API sequence according to multiple threads to obtain an API multithreading sequence;
and the obtaining module 35 is configured to perform detection analysis on the API multithreading sequence through the target detection model to obtain a software detection result.
In an optional embodiment, before performing detection analysis on the API multithreading sequence through the target detection model to obtain a software detection result, the method further includes:
acquiring malware data; the malware data includes: infectious viruses, Trojan horse programs, cryptocurrency-mining programs, and ransomware;
screening out a system file thread of the malware from the malware data;
extracting, through a sandbox, an API sequence in the system file thread of the malware to obtain the API sequence of the malware;
truncating the API sequence of the malware, and re-encoding the truncated API sequence according to multiple threads to obtain the API multithreading sequence of the malware;
constructing an initial detection model; the initial detection model is a classification model based on dilated convolution and TextCNN;
and taking the multi-class logloss as the evaluation target, and iteratively training the initial detection model on the API multithreading sequence of the malware to obtain the target detection model.
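The source names dilated (atrous, literally "hole") convolution and TextCNN but gives no layer sizes or hyperparameters. The sketch below only illustrates the two underlying operations on a plain numeric sequence: a one-dimensional convolution whose kernel taps are spaced `dilation` positions apart, followed by the max-over-time pooling used in TextCNN. The function names, and the omission of the embedding layer and of trained weights, are simplifying assumptions; this is not the patented model.

```python
def dilated_conv1d(seq, kernel, dilation=1):
    """Valid 1-D convolution whose kernel taps are `dilation` positions apart."""
    span = (len(kernel) - 1) * dilation + 1  # receptive field of one output value
    out = []
    for start in range(len(seq) - span + 1):
        out.append(sum(w * seq[start + k * dilation] for k, w in enumerate(kernel)))
    return out


def max_over_time(features):
    """TextCNN-style pooling: keep the strongest response of one filter."""
    return max(features)
```

With a three-tap kernel, `dilation=1` covers 3 consecutive positions while `dilation=2` covers 5, so a dilated filter sees a longer stretch of the API sequence without adding weights.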
In an alternative embodiment, the extracting module 33 is specifically configured to:
dynamically executing the system file thread in the sandbox to obtain, for each API called by the file, the API name, the API thread number, the API return value, and the sequence number of the API call within the thread.
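The four fields obtained from the sandbox can be held in a small record type. The sketch below assumes the sandbox report has already been flattened into dictionaries; the key names `api`, `tid`, `return_value` and `index`, the class `ApiRecord` and the helper `parse_records` are all hypothetical, since the source does not name a sandbox or a report format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ApiRecord:
    api: str    # API name
    tid: int    # API thread number
    ret: int    # API return value
    index: int  # sequence number of the call within its thread


def parse_records(rows: Iterable[dict]) -> List[ApiRecord]:
    """Build records from hypothetical sandbox rows, ordered by thread and call index."""
    records = [
        ApiRecord(r["api"], int(r["tid"]), int(r["return_value"]), int(r["index"]))
        for r in rows
    ]
    return sorted(records, key=lambda r: (r.tid, r.index))
```

Sorting by thread number and then by call index restores the per-thread call order even if the sandbox log interleaves threads.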
In an alternative embodiment, the encoding module 34 is specifically configured to:
when the number of API calls made by the file in a given thread exceeds a preset threshold, truncating the API sequence corresponding to that thread and storing a preset number of API records to obtain a truncated API sequence;
and re-encoding the truncated API sequence according to multiple threads to obtain the API multithreading sequence.
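The truncation and re-encoding steps can be sketched as follows. Several details are assumptions, because the source does not fix them: a single parameter `max_calls` serves as both the threshold and the number of records kept, "re-encoding according to multiple threads" is read as producing one integer sequence per thread, and API names are mapped to integers on first sight with 0 left free for padding. The API names in the usage note are illustrative only.

```python
from collections import defaultdict


def truncate_per_thread(calls, max_calls):
    """calls: iterable of (tid, index, api_name). Keep at most max_calls per thread."""
    by_thread = defaultdict(list)
    for tid, index, api in calls:
        by_thread[tid].append((index, api))
    truncated = {}
    for tid, items in by_thread.items():
        items.sort()  # restore the call order within the thread
        truncated[tid] = [api for _, api in items[:max_calls]]
    return truncated


def encode_multithread(truncated, vocab):
    """Return one integer sequence per thread, threads ordered by thread number."""
    return [
        # New API names get the next free id; id 0 is left unused for padding.
        [vocab.setdefault(api, len(vocab) + 1) for api in truncated[tid]]
        for tid in sorted(truncated)
    ]
```

For example, with calls `[(2, 0, "NtOpenFile"), (1, 1, "NtClose"), (1, 0, "NtOpenFile"), (1, 2, "NtClose")]` and `max_calls=2`, thread 1 keeps its first two calls and thread 2 keeps its single call before both are mapped to integer ids.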
In an alternative embodiment, the obtaining module 35 is specifically configured to:
detecting the API multithreading sequence through a target detection model, and judging whether software data corresponding to the API multithreading sequence is malicious software data or not;
if the data is the malicious software data, outputting a type tag of the malicious software data;
and if the data is not malware data, indicating that the software data is safe.
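The decision logic above can be sketched under the assumption that the target detection model outputs one probability per class and that one of the classes stands for benign software. The label list `LABELS`, the benign class, and the output strings are hypothetical; the source states only that a type tag is output for malware and that safe software is reported as safe.

```python
# Hypothetical class order; the source lists the four malware types but no label scheme.
LABELS = ["benign", "infectious virus", "trojan", "miner", "ransomware"]


def detection_result(probs, labels=LABELS, benign_label="benign"):
    """Pick the most probable class and turn it into a detection result."""
    best = max(range(len(probs)), key=probs.__getitem__)
    tag = labels[best]
    if tag == benign_label:
        return "software data is safe"
    return f"malware detected: {tag}"
```

Treating "benign" as one more class keeps the malicious-or-not decision and the type tag in a single argmax step; a two-stage design would be an equally valid reading of the text.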
The device for detecting software in this embodiment may execute the technical solution in the method shown in fig. 2, and for the specific implementation process and technical principle, reference is made to the relevant description in the method shown in fig. 2, which is not described herein again.
Fig. 5 is a schematic structural diagram of a software detection system according to a third embodiment of the present invention. As shown in Fig. 5, the software detection system 40 according to this embodiment may include: a processor 41 and a memory 42.
A memory 42 for storing a computer program (e.g., an application program, a functional module, etc. implementing the above-described software detection method), computer instructions, etc.;
the computer programs, computer instructions, and the like described above may be stored in partitions in one or more memories 42, and the computer programs, computer instructions, data, and the like may be called by the processor 41.
A processor 41 for executing the computer program stored in the memory 42 to implement the steps of the method according to the above embodiments.
Reference may be made in particular to the description relating to the preceding method embodiment.
The processor 41 and the memory 42 may be separate structures or may be integrated into a single structure. When the processor 41 and the memory 42 are separate structures, the memory 42 and the processor 41 may be coupled by a bus 43.
The system in this embodiment may execute the technical solution in the method shown in Fig. 2; for the specific implementation process and technical principle, reference is made to the relevant description of the method shown in Fig. 2, which is not repeated here.
In addition, embodiments of the present application further provide a computer-readable storage medium, in which computer-executable instructions are stored, and when at least one processor of the user equipment executes the computer-executable instructions, the user equipment performs the above-mentioned various possible methods.
Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. Additionally, the ASIC may reside in user equipment. Of course, the processor and the storage medium may also reside as discrete components in a communication device.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by program instructions directing related hardware. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above; the aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. A method of software detection, comprising:
acquiring software data to be detected;
screening out a system file thread from the software data;
dynamically executing the system file thread in a sandbox to obtain, for each API called by the file, the API name, the API thread number, the API return value and the sequence number of the API call within the thread; when the number of API calls made by the file in a given thread exceeds a preset threshold, truncating the API sequence corresponding to that thread, numbering the calls according to the order in which the APIs are called within the thread, and storing a preset number of API records to obtain a truncated API sequence; re-encoding the truncated API sequence according to multiple threads to obtain an API multithreading sequence;
acquiring malware data; the malware data includes: infectious viruses, Trojan horse programs, cryptocurrency-mining programs, and ransomware;
screening out a system file thread of the malware from the malware data;
extracting, through a sandbox, an API sequence in the system file thread of the malware to obtain the API sequence of the malware;
truncating the API sequence of the malware, and re-encoding the truncated API sequence according to multiple threads to obtain an API multithreading sequence of the malware;
constructing an initial detection model; the initial detection model is a classification model based on dilated convolution and TextCNN;
taking the multi-class logloss as the evaluation target, and iteratively training the initial detection model on the API multithreading sequence of the malware to obtain a target detection model;
and detecting and analyzing the API multithreading sequence through the target detection model to obtain a software detection result.
2. The method of claim 1, wherein the performing a detection analysis on the API multithreading sequence through the target detection model to obtain a software detection result comprises:
detecting the API multithreading sequence through a target detection model, and judging whether software data corresponding to the API multithreading sequence is malicious software data or not;
if the data is the malicious software data, outputting a type tag of the malicious software data;
and if the data is not malware data, indicating that the software data is safe.
3. An apparatus for software detection, comprising:
the acquisition module is used for acquiring software data to be detected;
the screening module is used for screening out a system file thread from the software data;
the extraction module is used for dynamically executing the system file thread in a sandbox to obtain, for each API called by the file, the API name, the API thread number, the API return value and the sequence number of the API call within the thread;
the encoding module is used for truncating the API sequence corresponding to a thread when the number of API calls made by the file in that thread exceeds a preset threshold, numbering the calls according to the order in which the APIs are called within the thread, storing a preset number of API records to obtain a truncated API sequence, and re-encoding the truncated API sequence according to multiple threads to obtain an API multithreading sequence;
the obtaining module is used for detecting and analyzing the API multithreading sequence through a target detection model to obtain a software detection result;
before the detecting and analyzing of the API multithreading sequence by the target detection model to obtain the software detection result, the method further includes:
acquiring malware data; the malware data includes: infectious viruses, Trojan horse programs, cryptocurrency-mining programs, and ransomware;
screening out a system file thread of the malware from the malware data;
extracting, through a sandbox, an API sequence in the system file thread of the malware to obtain the API sequence of the malware;
truncating the API sequence of the malware, and re-encoding the truncated API sequence according to multiple threads to obtain an API multithreading sequence of the malware;
constructing an initial detection model; the initial detection model is a classification model based on dilated convolution and TextCNN;
and taking the multi-class logloss as the evaluation target, and iteratively training the initial detection model on the API multithreading sequence of the malware to obtain the target detection model.
4. The apparatus of claim 3, wherein the obtaining module is specifically configured to:
detecting the API multithreading sequence through a target detection model, and judging whether software data corresponding to the API multithreading sequence is malicious software data or not;
if the data is the malicious software data, outputting a type tag of the malicious software data;
and if the data is not malware data, indicating that the software data is safe.
5. A system for software detection, comprising: a memory and a processor, wherein the memory stores instructions executable by the processor; and wherein the processor is configured to perform the method of software detection of claim 1 or 2 by executing the executable instructions.
6. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the method of software detection of claim 1 or 2.
CN201910353079.9A 2019-04-29 2019-04-29 Software detection method, device and system Active CN110135160B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910353079.9A CN110135160B (en) 2019-04-29 2019-04-29 Software detection method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910353079.9A CN110135160B (en) 2019-04-29 2019-04-29 Software detection method, device and system

Publications (2)

Publication Number Publication Date
CN110135160A (en) 2019-08-16
CN110135160B (en) 2021-11-30

Family

ID=67575625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910353079.9A Active CN110135160B (en) 2019-04-29 2019-04-29 Software detection method, device and system

Country Status (1)

Country Link
CN (1) CN110135160B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant