CN111581640A - Malicious software detection method, device and equipment and storage medium - Google Patents


Publication number
CN111581640A
Authority
CN
China
Prior art date
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number
CN202010256014.5A
Other languages
Chinese (zh)
Inventor
张若愚
Current Assignee
Beijing Lanyun Technologies Co ltd
Original Assignee
Beijing Lanyun Technologies Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Lanyun Technologies Co ltd
Priority to CN202010256014.5A
Publication of CN111581640A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Abstract

A malicious software detection method, a malicious software detection device, malicious software detection equipment and a storage medium are provided. The malicious software detection method includes the following steps: converting the software to be tested into an image; extracting feature information of the image; and processing the feature information through a pre-trained classifier model to obtain a detection result of the software to be tested. According to the scheme provided by this embodiment, feature extraction is more efficient than manual extraction; the trained classifier model can identify unknown and variant malware, and execution efficiency is higher than that of behavior-based malware detection methods.

Description

Malicious software detection method, device and equipment and storage medium
Technical Field
The present disclosure relates to internet technologies, and in particular, to a malware detection method, apparatus, device, and storage medium.
Background
With the development of the internet, malware has become one of the major threats to network security. Malicious software, also called malicious code or a malicious executable file, refers to a class of software that is installed and executed in a system without authorization to achieve an improper purpose. At present, various novel forms of malicious code, such as backdoors, Trojan horses, worms, and zombie programs, have gradually emerged, posing great challenges to computer and network security, causing great economic losses to enterprises and users, and seriously threatening national information security.
Rapidly and accurately classifying malicious code is one of the keys to preventing it. At present, malicious code classification mainly relies on technicians in the field extracting and analyzing feature codes (signatures) of malware based on experience; when the number of malware samples is large, the analysis process is complicated, efficiency is low, and judgment standards are not uniform.
Disclosure of Invention
The embodiment of the application provides a method, a device and equipment for detecting malicious software and a storage medium, and realizes the detection of the malicious software.
The application provides a malicious software detection method, which comprises the following steps:
converting the software to be tested into an image;
extracting feature information of the image;
and processing the characteristic information through a pre-trained classifier model to obtain a detection result of the software to be detected.
In an exemplary embodiment, the converting the software to be tested into the image includes: and converting the software to be tested into a two-dimensional image by using a binary gray-level image conversion algorithm.
In an exemplary embodiment, the characteristic information includes at least one of: the size of the image, the global gray level mean value of the image, the global gray level standard deviation of the image and the local characteristic information of the image; wherein the local feature information includes feature information of a partial region of the image.
In an exemplary embodiment, the local feature information includes at least one of: the gray mean pmu_k of partition k; the gray standard deviation psi_k of partition k, with k ranging from 1 to K; the mean mean(pmu) of all pmu_k; the standard deviation std(pmu) of all pmu_k; the maximum of all pmu_k; the relative position of the partition in which the maximum of all pmu_k is located; the minimum of all pmu_k; the relative position of the partition in which the minimum of all pmu_k is located; the mean mean(psi) of all psi_k; the standard deviation std(psi) of all psi_k; the maximum of all psi_k; the relative position of the partition in which the maximum of all psi_k is located; the minimum of all psi_k; the relative position of the partition in which the minimum of all psi_k is located; the mean of all pmu_k values less than mean(pmu) - 3*std(pmu); the standard deviation of all pmu_k values less than mean(pmu) - 3*std(pmu); the mean of all pmu_k values greater than mean(pmu) + 3*std(pmu); the standard deviation of all pmu_k values greater than mean(pmu) + 3*std(pmu); the mean of all psi_k values less than mean(psi) - 3*std(psi); the standard deviation of all psi_k values less than mean(psi) - 3*std(psi); the mean of all psi_k values greater than mean(psi) + 3*std(psi); and the standard deviation of all psi_k values greater than mean(psi) + 3*std(psi). Here the image is divided into K partitions from top to bottom with a preset sliding step, each partition comprising n rows, where K > 1 and n is less than the total number of rows of the image.
In an exemplary embodiment, n is 3, and the preset sliding step is one row.
In an exemplary embodiment, the classifier model is generated based on a random forest algorithm.
An embodiment of the present application provides a malware detection apparatus, including:
the training module is configured to train to obtain a classifier model;
the conversion module is configured to convert the software to be tested into an image;
a feature extraction module configured to extract feature information of the image;
and the detection module is configured to process the characteristic information through the classifier model to obtain a detection result of the software to be detected.
In an exemplary embodiment, the characteristic information includes at least one of: the size of the image; the global gray mean of the image; the global gray standard deviation of the image; the gray mean pmu_k of partition k; the gray standard deviation psi_k of partition k, with k ranging from 1 to K; the mean mean(pmu) of all pmu_k; the standard deviation std(pmu) of all pmu_k; the maximum of all pmu_k; the relative position of the partition in which the maximum of all pmu_k is located; the minimum of all pmu_k; the relative position of the partition in which the minimum of all pmu_k is located; the mean mean(psi) of all psi_k; the standard deviation std(psi) of all psi_k; the maximum of all psi_k; the relative position of the partition in which the maximum of all psi_k is located; the minimum of all psi_k; the relative position of the partition in which the minimum of all psi_k is located; the mean of all pmu_k values less than mean(pmu) - 3*std(pmu); the standard deviation of all pmu_k values less than mean(pmu) - 3*std(pmu); the mean of all pmu_k values greater than mean(pmu) + 3*std(pmu); the standard deviation of all pmu_k values greater than mean(pmu) + 3*std(pmu); the mean of all psi_k values less than mean(psi) - 3*std(psi); the standard deviation of all psi_k values less than mean(psi) - 3*std(psi); the mean of all psi_k values greater than mean(psi) + 3*std(psi); and the standard deviation of all psi_k values greater than mean(psi) + 3*std(psi). Here the image is divided into K partitions from top to bottom with a preset sliding step, each partition comprising n rows, where K > 1 and n is less than the total number of rows of the image.
The embodiment of the application provides a malicious software detection device, which comprises a memory and a processor, wherein the memory stores a program, and the program is read and executed by the processor to realize the malicious software detection method.
Embodiments of the present application provide a computer-readable storage medium storing one or more programs, which are executable by one or more processors to implement the above-mentioned malware detection method.
Compared with the related art, the embodiment of the application provides a malicious software detection method, which comprises the following steps: converting the software to be tested into an image; extracting feature information of the image; and processing the characteristic information through a pre-trained classifier model to obtain a detection result of the software to be detected. In the embodiment, the software to be detected is converted into image data, the features of the image are automatically extracted, and the trained model is used for detecting the software to be detected.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. Other advantages of the present application may be realized and attained by the instrumentalities and combinations particularly pointed out in the specification and the drawings.
Drawings
The accompanying drawings are included to provide an understanding of the present disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the examples serve to explain the principles of the disclosure and not to limit the disclosure.
Fig. 1 is a flowchart of a malware detection method according to an embodiment of the present application;
fig. 2 is a schematic diagram illustrating conversion of software to be tested into an image according to an embodiment of the present application;
FIG. 3 is a flowchart of a classifier model training method according to an embodiment of the present application;
fig. 4 is a block diagram of a malware detection apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of a malware detection apparatus according to an embodiment of the present application;
fig. 6 is a block diagram of a computer-readable storage medium provided in an embodiment of the present application.
Detailed Description
The present application describes embodiments, but the description is illustrative rather than limiting and it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible combinations of features are shown in the drawings and discussed in the detailed description, many other combinations of the disclosed features are possible. Any feature or element of any embodiment may be used in combination with or instead of any other feature or element in any other embodiment, unless expressly limited otherwise.
The present application includes and contemplates combinations of features and elements known to those of ordinary skill in the art. The embodiments, features and elements disclosed in this application may also be combined with any conventional features or elements to form a unique inventive concept as defined by the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive aspects to form yet another unique inventive aspect, as defined by the claims. Thus, it should be understood that any of the features shown and/or discussed in this application may be implemented alone or in any suitable combination. Accordingly, the embodiments are not limited except as by the appended claims and their equivalents. Furthermore, various modifications and changes may be made within the scope of the appended claims.
Further, in describing representative embodiments, the specification may have presented the method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described. Other orders of steps are possible as will be understood by those of ordinary skill in the art. Therefore, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. Further, the claims directed to the method and/or process should not be limited to the performance of their steps in the order written, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the embodiments of the present application.
Traditional antivirus software mainly uses feature code (signature) technology, statically scanning file contents and matching them against a feature library. Feature code extraction is the core of this detection method: code segments with malware features are obtained by reverse-analyzing known malware. This approach requires manual feature extraction, consumes a lot of manpower, cannot detect unknown or variant malware, and requires frequent feature library updates.
Another approach is behavior-based malware detection, which monitors the behavior of a program (API calls, etc.) as it runs, and alerts on and suspends malicious behavior when the program's behavior triggers predefined rules. Compared with feature code detection, its false alarm rate and missed alarm rate are higher, malware can be detected only after it has run in the system, and detection efficiency is low.
Machine Learning (ML) is one of the important methods for implementing artificial intelligence, and mainly studies how to make a computer simulate or implement human Learning behaviors to acquire new knowledge or skills. Most machine learning techniques learn through data or experience to achieve the purposes of improving algorithm performance or problem solving effect and the like. Machine learning is a multidisciplinary comprehensive cross field of knowledge relating to probability theory, statistics, algorithm complexity theory and the like, and related achievements are widely applied to information retrieval, recommendation systems, network security, fraud detection, medical diagnosis and the like.
The embodiment of the application provides a malicious software detection method based on machine learning, the method automatically extracts features by converting software to be detected into two-dimensional image data, the extraction efficiency of the features is higher than that of manual extraction, training is performed by using a machine learning algorithm, a trained model can identify unknown and variant malicious software, and the execution efficiency is higher than that of a behavior-based malicious software detection method.
As shown in fig. 1, an embodiment of the present application provides a malware detection method, including:
step 101, converting software to be tested into an image;
step 102, extracting characteristic information of the image;
Step 103, processing the feature information through a pre-trained classifier model to obtain a detection result of the software to be tested. The detection result is, for example, whether the software to be tested is malware.
In this embodiment, the software to be tested is converted into image data and the image features are extracted automatically, so feature extraction is more efficient than manual extraction; the trained classifier model can identify unknown and variant malware, and execution efficiency is higher than that of behavior-based malware detection methods.
In one embodiment, the converting the software to be tested into the image includes: and converting the software to be tested into a two-dimensional image by using a binary gray-level image conversion algorithm.
In an embodiment, for a given executable file (i.e., the software to be tested), the binary file is read byte by byte, each byte interpreted as an unsigned integer value (ranging from 0 to 255), and the values are combined into rows of fixed length J (in this embodiment J = 256), finally generating an r x J matrix m for the whole file. This matrix is visualized as a grayscale image as shown in fig. 2. It should be noted that 256 is only an example; other values greater or less than 256 may be taken as needed.
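The byte-to-image conversion described above can be sketched as follows (a minimal illustration; the function name and the choice to drop trailing bytes that do not fill a complete row are assumptions, since the patent does not specify the handling):

```python
import numpy as np

def software_to_image(data: bytes, width: int = 256) -> np.ndarray:
    """Interpret a binary file's bytes as unsigned integers (0-255)
    and stack them into an r x J grayscale matrix with J = width.
    Trailing bytes that do not fill a complete row are dropped
    (an assumption; the patent does not state how they are handled)."""
    values = np.frombuffer(data, dtype=np.uint8)
    r = len(values) // width
    return values[: r * width].reshape(r, width)
```

The resulting matrix can then be saved or displayed directly as a grayscale image.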
Malware coding patterns differ from normal coding patterns, and the difference is reflected in the image feature statistics of the converted grayscale image. Therefore, in an exemplary embodiment, the feature information includes at least one of the following: the size of the image, the global gray mean of the image, the global gray standard deviation of the image, and local feature information of the image, where the local feature information includes feature information of partial regions of the image.
The size of the image is, for example, the number of rows r of the image; since the number of columns J is fixed, r can represent the size of the image.
The global gray mean of the image is

mu = (1/(r*J)) * Σ_{i=1..r} Σ_{j=1..J} m(i, j)

where r is the number of rows of the image and J is the number of columns of the image.
The global gray standard deviation of the image is

si = sqrt( (1/(r*J)) * Σ_{i=1..r} Σ_{j=1..J} (m(i, j) - mu)^2 )
With a sliding step of 1 and a window size of n rows (in this embodiment n = 3), the image is divided into p = r - n + 1 partitions from top to bottom: rows 1 to n form the first partition, rows 2 to n + 1 form the second partition, and so on, and the gray mean and gray standard deviation of each partition are calculated. The step size of 1 and window size of 3 used in this embodiment are examples; other step sizes and window sizes may be set, and partitioning may be performed in other manners.
The local feature information includes at least one of:
gray average of partitions
Figure BDA0002437354760000073
Gray scale standard deviation of partitions
Figure BDA0002437354760000074
Where k denotes the kth partition.
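The sliding-window partition statistics above can be computed as in this sketch (the function and variable names are illustrative, not from the patent):

```python
import numpy as np

def partition_stats(img: np.ndarray, n: int = 3):
    """Slide a window of n rows down the image with step 1,
    producing p = r - n + 1 partitions; return the arrays
    pmu (per-partition gray mean) and psi (per-partition
    gray standard deviation)."""
    r = img.shape[0]
    p = r - n + 1
    pmu = np.empty(p)
    psi = np.empty(p)
    for k in range(p):
        block = img[k:k + n, :].astype(float)  # rows k .. k+n-1
        pmu[k] = block.mean()
        psi[k] = block.std()
    return pmu, psi
```

Each partition overlaps its neighbors by n - 1 rows, so local coding anomalies are captured at every row offset.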
Thus, the gray means of the partitions form an array pmu[p] and the gray standard deviations form an array psi[p]. On this basis, to capture the coding characteristics of the partitions, the following features are considered:
the mean of the array pmu, representing the partition coding law: mpmu = mean(pmu), where mean denotes the mean function;
the standard deviation of the array pmu, representing the partition coding law: spmu = std(pmu), where std denotes the standard deviation function;
the mean of the array psi, representing the partition coding fluctuation law: mpsi = mean(psi);
the standard deviation of the array psi, representing the partition coding fluctuation law: spsi = std(psi);
the maximum of the partition coding law: maxpmu = max(pmu), where max denotes the maximum function;
the minimum of the partition coding law: minpmu = min(pmu), where min denotes the minimum function;
the maximum of the partition coding fluctuation law: maxpsi = max(psi);
the minimum of the partition coding fluctuation law: minpsi = min(psi);
the relative position of the partition coding law maximum: rmaxpmu = find(maxpmu)/p, where the find function returns the index of the value in the original array, i.e. the number of the partition in which the maximum is located, and p is the number of partitions (likewise below);
the relative position of the partition coding law minimum: rminpmu = find(minpmu)/p;
the relative position of the partition coding fluctuation law maximum: rmaxpsi = find(maxpsi)/p;
the relative position of the partition coding fluctuation law minimum: rminpsi = find(minpsi)/p.
Furthermore, considering that the coding of malicious activity may exist only locally, partitions whose coding deviates from the global coding law should also be considered. The following features can therefore also be used:
the mean ma1 and standard deviation sa1 of all values in pmu less than mean(pmu) - 3*std(pmu), and the mean ma2 and standard deviation sa2 of all values in pmu greater than mean(pmu) + 3*std(pmu);
the mean mb1 and standard deviation sb1 of all values in psi less than mean(psi) - 3*std(psi), and the mean mb2 and standard deviation sb2 of all values in psi greater than mean(psi) + 3*std(psi).
In a grayscale image, texture is coded differently from most regions of the image and can therefore serve as a feature for identifying the image. In the distribution of image codes, texture appears as outliers of the overall distribution, and according to the three-sigma rule of statistics such outliers can be characterized by mean ± 3*std; the statistical indexes of partition codes falling outside mean(pmu) ± 3*std(pmu) can thus be used as effective features for identifying the image.
In conclusion, all the indexes form a feature vector for describing the texture features of the gray level image: [ r, mu, si, mpmu, spmu, mpsi, spsi, maxpmu, minpmu, maxpsi, minpsi, rmaxpmu, rminpmu, rmaxpsi, rminpsi, ma1, sa1, ma2, sa2, mb1, sb1, mb2, sb2]
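As a sketch, the 23-dimensional feature vector above can be assembled like this (the fallback of 0.0 when no values fall outside mean ± 3*std, and the function name, are assumptions; the patent does not state how empty outlier sets are handled):

```python
import numpy as np

def texture_features(img: np.ndarray, pmu: np.ndarray, psi: np.ndarray) -> np.ndarray:
    """Build the feature vector [r, mu, si, mpmu, spmu, mpsi, spsi,
    maxpmu, minpmu, maxpsi, minpsi, rmaxpmu, rminpmu, rmaxpsi,
    rminpsi, ma1, sa1, ma2, sa2, mb1, sb1, mb2, sb2]."""
    p = len(pmu)
    feats = [img.shape[0],                          # r: number of rows
             float(img.mean()), float(img.std()),   # mu, si
             pmu.mean(), pmu.std(), psi.mean(), psi.std(),
             pmu.max(), pmu.min(), psi.max(), psi.min(),
             (np.argmax(pmu) + 1) / p, (np.argmin(pmu) + 1) / p,
             (np.argmax(psi) + 1) / p, (np.argmin(psi) + 1) / p]
    for arr in (pmu, psi):                          # three-sigma outlier stats
        m, s = arr.mean(), arr.std()
        low = arr[arr < m - 3 * s]
        high = arr[arr > m + 3 * s]
        feats += [low.mean() if low.size else 0.0,
                  low.std() if low.size else 0.0,
                  high.mean() if high.size else 0.0,
                  high.std() if high.size else 0.0]
    return np.array(feats, dtype=float)
```

The relative positions use 1-based partition numbers divided by p, matching the find(...)/p features described above.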
It should be noted that the above features are merely examples, and other features may be used as necessary.
In an embodiment, the classifier model is trained based on a machine learning algorithm, for example generated by a random forest algorithm; generating the classifier model includes training the classifier model and validating the classifier model.
Random forest belongs to the bagging family of Ensemble Learning algorithms; the implementation process is as follows:
using Bootstrap sampling, randomly drawing n1 samples with replacement from the original training set; repeating the sampling k1 times to generate k1 training sets;
for the k1 training sets, respectively training k1 Classification And Regression Tree (CART) decision tree models (the decision tree model can be chosen according to the specific problem, e.g. an ID3 or C4.5 decision tree model);
for a single decision tree model, assuming the number of training sample features is M, randomly selecting m features out of the M at each split and choosing the best of these m features to split on according to the Gini index (if the algorithm is ID3/C4.5, the splitting criterion is the information gain or information gain ratio, i.e. the best feature to split on is selected according to the information gain or information gain ratio);
forming a random forest from the generated decision trees to serve as the classifier model. For a classification problem, the final classification result is determined by voting among the classifiers of the multiple decision trees.
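Using scikit-learn (an assumption; the patent does not name an implementation, only Python or Java as possible languages), the random-forest classifier can be sketched on synthetic stand-in data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in data: 200 software samples, each a 23-dimensional
# texture feature vector; label 1 = black (malicious), 0 = white (benign).
X = rng.normal(size=(200, 23))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# k1 trees, each considering at most sqrt(M) features per split,
# with the Gini index as the splitting criterion.
clf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                             criterion="gini", random_state=0)
clf.fit(X, y)
```

clf.predict then returns, for each sample, the majority vote of the trees, matching the voting step described above.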
After training, optimization is performed, including:
performing cross validation on the trained classifier model and calculating the accuracy, detection rate, false alarm rate and AUC (Area Under the Curve) value;
adjusting the number k1 of decision trees in the classifier model and the maximum number of features m of a single decision tree, retraining, and recalculating the above indexes (accuracy, detection rate, false alarm rate and AUC value) until they are optimal.
The above malware detection method can be implemented in Python, Java, or the like.
As shown in fig. 3, the training process of the classifier model includes:
Step 301, converting the collected samples, including normal software (marked as white samples) and malware (marked as black samples), into two-dimensional texture images using the binary-to-grayscale-image conversion algorithm;
step 302, extracting texture features of the texture image;
in this embodiment, features are extracted at both the global and local level to characterize the overall coding law and the local coding law of the software, including, for example, the size of the image, the global gray mean, the global gray standard deviation, the local gray means, and the local gray standard deviations;
step 303, dividing a data set consisting of the feature vectors and the class labels (white samples and black samples) into a test set and a training set;
in this example, a ten-fold cross-assay method was used: randomly dividing the data into 10 parts, taking 9 parts as a training set in turn, and taking the remaining 1 part as a test set; it should be noted that the training set and the test set may be divided in other ways, which is not limited in this application.
Step 304, training a classifier model by using a random forest algorithm;
in other embodiments, the classifier model may be trained using other supervised machine learning algorithms, such as the k-nearest neighbor algorithm (kNN), Support Vector Machine (SVM), logistic regression, decision trees, and so forth;
Step 305, verifying the classifier model with the test set using indexes such as accuracy, detection rate, false alarm rate and AUC; if the classifier model passes the verification, executing step 307, and if it does not pass, executing step 306;
step 306, adjusting parameters, and returning to step 304;
Step 307, training is finished.
As shown in fig. 4, at least one embodiment of the present application provides a malware detection apparatus, including:
a training module 401 configured to train to obtain a classifier model;
a conversion module 402 configured to convert the software to be tested into an image;
a feature extraction module 403 configured to extract feature information of the image;
the detection module 404 is configured to process the feature information through the classifier model to obtain a detection result of the software to be detected.
As shown in fig. 5, an embodiment of the present application provides a malware detection apparatus 50, which includes a memory 510 and a processor 520, where the memory 510 stores a program, and when the program is read and executed by the processor 520, the program implements the malware detection method according to any embodiment.
As shown in fig. 6, an embodiment of the present application provides a computer-readable storage medium 60, where the computer-readable storage medium 60 stores one or more programs 610, and the one or more programs 610 may be executed by one or more processors to implement the malware detection method according to any one of the embodiments.
It should be noted that the scheme provided by the embodiments of the present application is not limited to detecting malware; it can also be used for software classification in general.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and includes any information delivery media, as known to those skilled in the art.

Claims (10)

1. A malware detection method, comprising:
converting the software to be tested into an image;
extracting feature information of the image;
and processing the characteristic information through a pre-trained classifier model to obtain a detection result of the software to be detected.
2. The malware detection method of claim 1, wherein converting the software under test into an image comprises: and converting the software to be tested into a two-dimensional image by using a binary gray-level image conversion algorithm.
3. The malware detection method of claim 1, wherein the characteristic information comprises at least one of: the size of the image, the global gray level mean value of the image, the global gray level standard deviation of the image and the local characteristic information of the image; wherein the local feature information includes feature information of a partial region of the image.
4. The malware detection method of claim 3, wherein the local feature information includes at least one of: the gray mean pmu_k of partition k; the gray standard deviation psi_k of partition k, with k ranging from 1 to K; the mean mean(pmu) of all pmu_k; the standard deviation std(pmu) of all pmu_k; the maximum of all pmu_k; the relative position of the partition in which the maximum of all pmu_k is located; the minimum of all pmu_k; the relative position of the partition in which the minimum of all pmu_k is located; the mean mean(psi) of all psi_k; the standard deviation std(psi) of all psi_k; the maximum of all psi_k; the relative position of the partition in which the maximum of all psi_k is located; the minimum of all psi_k; the relative position of the partition in which the minimum of all psi_k is located; the mean of all pmu_k values less than mean(pmu) - 3*std(pmu); the standard deviation of all pmu_k values less than mean(pmu) - 3*std(pmu); the mean of all pmu_k values greater than mean(pmu) + 3*std(pmu); the standard deviation of all pmu_k values greater than mean(pmu) + 3*std(pmu); the mean of all psi_k values less than mean(psi) - 3*std(psi); the standard deviation of all psi_k values less than mean(psi) - 3*std(psi); the mean of all psi_k values greater than mean(psi) + 3*std(psi); and the standard deviation of all psi_k values greater than mean(psi) + 3*std(psi); wherein the image is divided into K partitions from top to bottom with a preset sliding step, each partition comprising n rows, K > 1, and n is less than the total number of rows of the image.
5. The malware detection method of claim 4, wherein n is 3, and the preset sliding step is one row.
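The partition statistics of claims 4 and 5 can be sketched as follows. The function name, the dictionary keys, and the random test image are illustrative assumptions; a window of n = 3 rows slides down the image one row at a time, each window position yielding one partition with gray mean pmu_k and gray standard deviation psi_k, which are then summarized. Only two of the claim's ±3σ tail statistics are shown; the remaining tail features follow the same pattern.

```python
import numpy as np

def partition_features(image: np.ndarray, n: int = 3) -> dict:
    """Slide a window of n rows down the image one row at a time,
    compute per-partition gray mean pmu_k and std psi_k, then summarize."""
    K = image.shape[0] - n + 1                 # number of partitions
    pmu = np.array([image[k:k + n].mean() for k in range(K)])
    psi = np.array([image[k:k + n].std() for k in range(K)])
    feats = {
        "mean_pmu": pmu.mean(), "std_pmu": pmu.std(),
        "max_pmu": pmu.max(), "min_pmu": pmu.min(),
        "mean_psi": psi.mean(), "std_psi": psi.std(),
        "max_psi": psi.max(), "min_psi": psi.min(),
        # cross features: psi of the partition holding the extreme pmu, and vice versa
        "psi_at_max_pmu": psi[pmu.argmax()], "psi_at_min_pmu": psi[pmu.argmin()],
        "pmu_at_max_psi": pmu[psi.argmax()], "pmu_at_min_psi": pmu[psi.argmin()],
    }
    # two of the claim's tail statistics beyond three standard deviations
    lo = feats["mean_pmu"] - 3 * feats["std_pmu"]
    hi = feats["mean_pmu"] + 3 * feats["std_pmu"]
    feats["mean_pmu_low_tail"] = pmu[pmu < lo].mean() if (pmu < lo).any() else 0.0
    feats["mean_pmu_high_tail"] = pmu[pmu > hi].mean() if (pmu > hi).any() else 0.0
    return feats

rng = np.random.default_rng(0)
f = partition_features(rng.integers(0, 256, size=(32, 16), dtype=np.uint8))
print(len(f))
```

Note the overlap: with a one-row step and n = 3, a 32-row image yields K = 30 partitions, so adjacent partitions share two rows.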
6. The malware detection method of any one of claims 1 to 5, wherein the classifier model is generated based on a random forest algorithm.
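Claim 6 names only the random forest algorithm. A minimal sketch using scikit-learn's `RandomForestClassifier` on synthetic two-class feature vectors (standing in for the image statistics above, not the patent's actual training data) might look like:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Toy feature vectors: "benign" samples cluster low, "malicious" cluster high.
X = np.vstack([rng.normal(0.2, 0.05, (50, 4)), rng.normal(0.8, 0.05, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([[0.21, 0.19, 0.20, 0.22], [0.79, 0.80, 0.81, 0.78]]))  # [0 1]
```

A random forest is a reasonable fit for this kind of low-dimensional, hand-crafted statistical feature vector: it needs little tuning, handles features on different scales, and exposes feature importances for inspection.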
7. A malware detection apparatus, comprising:
the training module is configured to train to obtain a classifier model;
the conversion module is configured to convert the software to be tested into an image;
a feature extraction module configured to extract feature information of the image;
and the detection module is configured to process the characteristic information through the classifier model to obtain a detection result of the software to be detected.
8. The malware detection device of claim 7, wherein the characteristic information comprises at least one of: the size of the image; the global gray-level mean of the image; the global gray-level standard deviation of the image; the gray-level mean pmu_k of partition k; the gray-level standard deviation psi_k of partition k, k = 1 to K; the mean mean(pmu) of all pmu_k; the standard deviation std(pmu) of all pmu_k; the maximum of all pmu_k; the psi_k of the partition in which the maximum of all pmu_k is located; the minimum of all pmu_k; the psi_k of the partition in which the minimum of all pmu_k is located; the mean mean(psi) of all psi_k; the standard deviation std(psi) of all psi_k; the maximum of all psi_k; the pmu_k of the partition in which the maximum of all psi_k is located; the minimum of all psi_k; the pmu_k of the partition in which the minimum of all psi_k is located; the mean of all pmu_k values less than mean(pmu) - 3 × std(pmu); the standard deviation of all pmu_k values less than mean(pmu) - 3 × std(pmu); the mean of all pmu_k values greater than mean(pmu) + 3 × std(pmu); the standard deviation of all pmu_k values greater than mean(pmu) + 3 × std(pmu); the mean of all psi_k values less than mean(psi) - 3 × std(psi); the standard deviation of all psi_k values less than mean(psi) - 3 × std(psi); the mean of all psi_k values greater than mean(psi) + 3 × std(psi); and the standard deviation of all psi_k values greater than mean(psi) + 3 × std(psi); wherein the image is divided from top to bottom into K partitions with a preset sliding step, each partition comprising n rows, K > 1, and n is less than the total number of rows of the image.
9. A malware detection apparatus comprising a memory and a processor, the memory storing a program that, when read and executed by the processor, implements the malware detection method according to any one of claims 1 to 6.
10. A computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the malware detection method of any one of claims 1 to 6.
CN202010256014.5A 2020-04-02 2020-04-02 Malicious software detection method, device and equipment and storage medium Pending CN111581640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010256014.5A CN111581640A (en) 2020-04-02 2020-04-02 Malicious software detection method, device and equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010256014.5A CN111581640A (en) 2020-04-02 2020-04-02 Malicious software detection method, device and equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111581640A true CN111581640A (en) 2020-08-25

Family

ID=72119166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010256014.5A Pending CN111581640A (en) 2020-04-02 2020-04-02 Malicious software detection method, device and equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111581640A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150052606A1 (en) * 2011-10-14 2015-02-19 Telefonica, S.A. Method and a system to detect malicious software
CN109359439A (en) * 2018-10-26 2019-02-19 北京天融信网络安全技术有限公司 Software detecting method, device, equipment and storage medium
CN110096878A (en) * 2019-04-26 2019-08-06 武汉智美互联科技有限公司 A kind of detection method of Malware
CN110572393A (en) * 2019-09-09 2019-12-13 河南戎磐网络科技有限公司 Malicious software traffic classification method based on convolutional neural network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHANG, Chenbin et al.: "Malware Classification Based on Gray-Scale Image Texture Fingerprints", Computer Science *
JIN, Yiling: "Malware Detection in Containers Based on Convolutional Neural Networks", Modern Computer *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112764791A (en) * 2021-01-25 2021-05-07 University of Jinan Incremental updating malicious software detection method and system
CN112764791B (en) * 2021-01-25 2023-08-08 University of Jinan Incremental update malicious software detection method and system

Similar Documents

Publication Publication Date Title
Chen Deep transfer learning for static malware classification
CN112491796A (en) Intrusion detection and semantic decision tree quantitative interpretation method based on convolutional neural network
CN111259219B (en) Malicious webpage identification model establishment method, malicious webpage identification method and malicious webpage identification system
CN112052451A (en) Webshell detection method and device
CN111753290A (en) Software type detection method and related equipment
CN112738092A (en) Log data enhancement method, classification detection method and system
CN112632609A (en) Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium
CN110704841A (en) Convolutional neural network-based large-scale android malicious application detection system and method
US11562133B2 (en) System and method for detecting incorrect triple
Smith et al. Supervised and unsupervised learning techniques utilizing malware datasets
CN115577357A (en) Android malicious software detection method based on stacking integration technology
CN110097120B (en) Network flow data classification method, equipment and computer storage medium
TW202240453A (en) Method and computer for learning correspondence between malicious behaviors and execution trace of malware and method for implementing neural network
CN111581640A (en) Malicious software detection method, device and equipment and storage medium
Jere et al. Principal component properties of adversarial samples
CN110808947B (en) Automatic vulnerability quantitative evaluation method and system
CN112016088A (en) Method and device for generating file detection model and method and device for detecting file
CN111144453A (en) Method and equipment for constructing multi-model fusion calculation model and method and equipment for identifying website data
CN111191238A (en) Webshell detection method, terminal device and storage medium
CN115567224A (en) Method for detecting abnormal transaction of block chain and related product
Mitsuhashi et al. Exploring optimal deep learning models for image-based malware variant classification
CN111695117B (en) Webshell script detection method and device
Lu et al. Multi-class malware classification using deep residual network with Non-SoftMax classifier
Juvonen et al. Anomaly detection framework using rule extraction for efficient intrusion detection
CN113076544A (en) Vulnerability detection method and system based on deep learning model compression and mobile device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200825