CN117496246A - Malicious software classification method based on convolutional neural network - Google Patents

Malicious software classification method based on convolutional neural network

Info

Publication number
CN117496246A
Authority
CN
China
Prior art keywords
image
malware
malicious software
classification method
gray
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311489175.9A
Other languages
Chinese (zh)
Inventor
魏林锋
黎庭威
何卓丰
王宇
宣建通
陈佳韩
李健
李学明
黄宇勤
丁振杨
欧燕
杨子怡
孙炜
唐英展
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan University
Original Assignee
Jinan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan University
Priority to CN202311489175.9A
Publication of CN117496246A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/096 Transfer learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/40 Image enhancement or restoration by the use of histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses a malware classification method based on a convolutional neural network. The method comprises the following steps: collecting a malware sample data set; converting each malware sample into a grayscale image; enhancing the local contrast of each grayscale image and the contrast between grayscale images while suppressing noise amplification; and inputting the enhanced grayscale image into an EfficientNet-B0 model to obtain a more refined feature vector, which is then regularized and passed to a Softmax function to determine the malware family to which the sample belongs. Because the method converts malware samples into grayscale images and uses the EfficientNet-B0 model, which achieves high accuracy with relatively few parameters, it can identify malware families efficiently, offers some ability to discover previously unknown malware attacks, and can be extended to malware identification scenarios on other platforms.

Description

Malicious software classification method based on convolutional neural network
Technical Field
The invention belongs to the fields of computer system security, network security and artificial intelligence security application, and particularly relates to a malicious software classification method based on a convolutional neural network.
Background
Malware is software that is installed and run on a user's device without the user's permission and that infringes the user's legitimate rights and interests. Because malware authors create variants by reusing core code, malware is easy to write, and platforms that automatically generate malware and its variants are common. As a result, malware is numerous and spreads widely; it poses a serious threat to enterprises, governments, financial institutions and the like, and can damage users' software and hardware systems and cause heavy economic losses. Since most malware is generated automatically or derived from existing malware, the most important step in detecting and defending against it is to classify each sample efficiently and assign it to a known malware family, after which existing detection and protection methods can be applied to prevent the attack. To address these problems, the invention provides a malware classification technique and method based on a convolutional neural network.
Current malware detection methods fall mainly into static analysis and dynamic analysis. Static analysis examines an application's code files without executing the application. Although static analysis provides the most comprehensive code coverage at higher speed and lower overhead, obfuscation and encryption techniques can degrade its performance and effectiveness. Dynamic analysis monitors an application running in a sandbox environment and collects its behavior information; it copes better with obfuscation and encryption, but requires longer analysis time and higher memory overhead. Furthermore, running an application in a sandbox does not cover all possible code execution paths and running scenarios, and some malware can detect the sandbox environment, so malicious behavior may not appear during dynamic analysis.
Image-based malware detection does not need to extract features from the original sample, generates images quickly, and performs better against malware that uses obfuscation and encryption techniques. Malware code consists of many bytes, and image-based detection maps each byte to one pixel, so code instructions are converted into a sequence of pixel values. Locating similar instruction sequences in different malware samples is then equivalent to identifying regions with similar pixel values in the corresponding images. However, similar instruction sequences of different samples belonging to the same malware family may lie at different positions in their files, which reduces the accuracy of the classification model. In addition, a traditional convolutional neural network has too many parameters, which leads to low classification efficiency and poor generalization in CNN-based malware classification.
In summary, methods that classify malware by converting samples into grayscale images and applying a convolutional neural network still need improvement in accuracy, generalization and efficiency.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing classification schemes and provides a malware classification method based on a convolutional neural network. The method first converts the malware sample into a grayscale image, which avoids the large time cost of feature extraction. It then applies contrast-limited adaptive histogram equalization (CLAHE) to the grayscale image; compared with ordinary histogram equalization, CLAHE enhances the local contrast of the image while avoiding amplification of image noise, and it also increases the contrast between images. Finally, the resulting image is input into a classifier. The classifier removes the last fully connected layer of the EfficientNet-B0 model, retains all layers before it, and adds a global average pooling layer and a Softmax layer after the retained layers. The global average pooling layer has fewer parameters than a fully connected layer and therefore acts as a regularizer, and the EfficientNet-B0 model achieves high accuracy with a small number of parameters and little computation. The method can therefore achieve higher accuracy with less detection time.
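The parameter saving claimed for global average pooling can be illustrated with a little arithmetic. The sketch below counts the weights of a classifier placed on a flattened feature map and of one placed after global average pooling. The 7×7×1280 feature-map size (EfficientNet-B0 at a 224×224 input) and the 25 output classes (the number of families in Malimg) are assumptions for illustration and are not stated in this document; the function names are hypothetical.

```python
def flattened_head_params(h: int, w: int, d: int, classes: int) -> int:
    """Weights plus biases of a fully connected layer on the flattened h*w*d map."""
    return h * w * d * classes + classes


def pooled_head_params(d: int, classes: int) -> int:
    """Weights plus biases of a classifier applied to the length-d pooled vector.

    The pooling operation itself has no trainable parameters.
    """
    return d * classes + classes


# Assumed sizes: 7 x 7 x 1280 feature map, 25 malware families.
flat_params = flattened_head_params(7, 7, 1280, 25)
pooled_params = pooled_head_params(1280, 25)
print(flat_params, pooled_params)  # 1568025 32025
```

Under these assumed sizes the pooled head needs roughly one forty-ninth of the weights, which is the sense in which pooling reduces parameters relative to a fully connected layer on the raw feature map.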
The technical scheme adopted by the invention is as follows:
A malware classification method based on a convolutional neural network, comprising the following steps:
S1) labeling the software samples in a malware data set; converting each byte of a software sample, in byte order, into a decimal number in [0, 255]; converting the decimal numbers into a first grayscale image; and dividing the first grayscale images into a training set and a test set by means of a cross-validation function, such that the proportion of each malware category in the training set and the test set is consistent with that in the original data set;
S2) image enhancement: processing the first grayscale image with contrast-limited adaptive histogram equalization to obtain a second grayscale image with enhanced local contrast;
S3) feature extraction: inputting the second grayscale image into an EfficientNet-B0 model and extracting features to output more refined feature vectors with stronger expressive power, thereby obtaining a third grayscale image;
S4) image classification: inputting the third grayscale image into the global average pooling layer to output a one-dimensional vector; inputting the one-dimensional vector into the Softmax layer, which converts it into a probability distribution in which each element lies between 0 and 1 and represents the probability that the sample belongs to a given malware family, and all elements sum to 1; and selecting the category with the highest probability as the prediction result.
In some examples of the malware classification method, the image enhancement specifically comprises:
a) Dividing the original image into a plurality of regions;
b) Calculating the cumulative distribution function (CDF) of the pixel values in each image region;
c) Determining whether the frequency of a pixel value in the image region is higher than a preset frequency threshold and, if so, performing a clipping operation with an image thresholding function and randomly reassigning the pixels that exceed the preset frequency threshold to values in the range [0, 255], so that no pixel value has a frequency above the threshold;
d) Transforming each region by interpolation, so that pixel values are correlated with one another, noise amplification is limited, and the contrast of the image is enhanced.
In some examples of the malware classification method, the EfficientNet-B0 model is built from a series of MBConv modules optimized with the squeeze-and-excitation method.
In some examples of the malware classification method, the image thresholding function is the image thresholding function of the Python CV2 library.
In some examples of the malware classification method, the third-party open-source Python library CV2 is used to convert the decimal numbers into the first grayscale image.
In some examples of the malware classification method, during the clipping operation, the portion of the pixel-value counts that exceeds the frequency threshold is divided equally among the 256 bins 0-255; if a portion remains that cannot be divided equally, it is inserted into the bins one by one at equal intervals until all of the excess has been allocated.
In some examples of the malware classification method, the length of a region is determined by the instruction length of the sample, and the width of the region is determined by the average height of all malware samples of the same series.
In some examples of the malware classification method, the malware data set is the Malimg data set, which contains multiple malware types.
In some examples of the malware classification method, malware is labeled with the open-source antivirus software ClamAV.
In some examples of the malware classification method, the cross-validation function is StratifiedKFold.
In some examples of the malware classification method, the preset frequency threshold is set with reference to known publications.
The beneficial effects of the invention are as follows:
Compared with existing classification techniques, which suffer from low accuracy, poor generalization and low efficiency, the invention has the following advantages:
(1) High classification efficiency: the method converts the malware sample into an image, which is fast and saves the time required for static and dynamic feature extraction, and the model has few parameters and little computation.
(2) High accuracy: the local contrast of the original image is enhanced and a model with high classification accuracy is used, which improves classification accuracy.
(3) Strong generalization: the fully connected layer of the EfficientNet-B0 model is removed, because its excessive parameters reduce the generalization capability of the model; the global average pooling layer simplifies the feature vector, acts as a regularizer, and strengthens the generalization capability of the model.
(4) Visualization: converting malware samples into images makes it easy to observe the differences between malware families intuitively.
Drawings
FIG. 1 is a flow chart of a convolutional neural network-based malware classification method of the present invention.
FIG. 2 is a flow chart of an image enhancement process of the convolutional neural network-based malware classification method of the present invention.
FIG. 3 is a flow chart of an image classification process of the convolutional neural network-based malware classification method of the present invention.
Detailed Description
A malware classification method based on a convolutional neural network, comprising the following steps:
S1) labeling the software samples in a malware data set; converting each byte of a software sample, in byte order, into a decimal number in [0, 255]; converting the decimal numbers into a first grayscale image; and dividing the first grayscale images into a training set and a test set by means of a cross-validation function, such that the proportion of each malware category in the training set and the test set is consistent with that in the original data set;
S2) image enhancement: processing the first grayscale image with contrast-limited adaptive histogram equalization to obtain a second grayscale image with enhanced local contrast;
S3) feature extraction: inputting the second grayscale image into an EfficientNet-B0 model and extracting features to output more refined feature vectors with stronger expressive power, thereby obtaining a third grayscale image;
S4) image classification: inputting the third grayscale image into the global average pooling layer to output a one-dimensional vector; inputting the one-dimensional vector into the Softmax layer, which converts it into a probability distribution in which each element lies between 0 and 1 and represents the probability that the sample belongs to a given malware family, and all elements sum to 1; and selecting the category with the highest probability as the prediction result.
There is no special requirement on the source of the malware data set, provided that the samples cover a full range of types and are easy to obtain. In some examples of the malware classification method, the malware data set is the Malimg data set, which contains multiple malware types.
Various labeling software may be used to label malware. In some examples of the malware classification method, considering the accessibility of the program, malware is labeled with the open-source antivirus software ClamAV.
The decimal numbers may be converted into the first grayscale image using various well-known algorithms. In some examples of the malware classification method, considering the accessibility of the program, the third-party open-source Python library CV2 is used to convert the decimal numbers into the first grayscale image.
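The byte-to-pixel conversion can be sketched in a few lines of plain Python. This is only an illustration of the idea that each byte becomes one gray level: the fixed row width of 256 pixels and the zero padding of the last row are assumptions (the document does not specify how the image width is chosen), and `bytes_to_gray_rows` is a hypothetical helper name. Writing the result to an image file would need an imaging library and is omitted here.

```python
from typing import List


def bytes_to_gray_rows(data: bytes, width: int = 256) -> List[List[int]]:
    """Map each byte (0..255) to one pixel and wrap the stream into rows."""
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = list(data)  # iterating over bytes yields ints in [0, 255]
    # Zero-pad the tail so the last row is complete (padding value is an assumption).
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([0] * (width - remainder))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# Five sample bytes wrapped into rows of four pixels.
rows = bytes_to_gray_rows(bytes([0, 127, 255, 16, 32]), width=4)
print(rows)  # [[0, 127, 255, 16], [32, 0, 0, 0]]
```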
There is no special requirement on the cross-validation function; in some examples of the malware classification method, it is StratifiedKFold.
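The idea behind a stratified split is that every fold keeps the class proportions of the full data set. The following is a minimal plain-Python sketch of that idea, not the scikit-learn implementation; `stratified_folds` is a hypothetical helper, and the two family labels are used only as sample values.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[str], k: int) -> List[List[int]]:
    """Assign sample indices to k folds so each fold mirrors the class mix."""
    by_class: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class round-robin so per-fold counts differ by at most one.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


labels = ["Allaple.A"] * 6 + ["Yuner.A"] * 3
folds = stratified_folds(labels, k=3)
test_idx = sorted(folds[0])                               # one fold held out
train_idx = sorted(i for f in folds[1:] for i in f)       # remaining folds
print(test_idx, train_idx)  # [0, 3, 6] [1, 2, 4, 5, 7, 8]
```

Each fold here holds two samples of the first family and one of the second, matching the 2:1 ratio of the whole set.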
In some examples of the malware classification method, the image enhancement specifically comprises:
a) Dividing the original image into a plurality of regions;
b) Calculating the cumulative distribution function (CDF) of the pixel values in each image region;
c) Determining whether the frequency of a pixel value in the image region is higher than a preset frequency threshold and, if so, performing a clipping operation with an image thresholding function and randomly reassigning the pixels that exceed the preset frequency threshold to values in the range [0, 255], so that no pixel value has a frequency above the threshold;
d) Transforming each region by interpolation, so that pixel values are correlated with one another, noise amplification is limited, and the contrast of the image is enhanced.
The cumulative distribution function CDF can be calculated as follows:
$$\mathrm{cdf}(i) = \sum_{j=0}^{i} n_j, \qquad 0 \le i < L$$
where L is the total number of gray levels, 256, and $n_j$ is the probability that pixel value j occurs in the image region.
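A minimal sketch of this calculation for one region follows, assuming the CDF is the running sum of the relative frequencies n_j of the gray levels. `region_cdf` is a hypothetical helper, and the example uses only 4 gray levels to keep the output short.

```python
from typing import List, Sequence


def region_cdf(pixels: Sequence[int], levels: int = 256) -> List[float]:
    """Return cdf(i) = n_0 + ... + n_i, with n_j the relative frequency of level j."""
    if not pixels:
        raise ValueError("region must contain at least one pixel")
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    cdf: List[float] = []
    running = 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf


cdf = region_cdf([0, 0, 1, 3], levels=4)
print(cdf)  # [0.5, 0.75, 0.75, 1.0]
```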
In some examples of the malware classification method, the EfficientNet-B0 model is built from a series of MBConv modules optimized with the squeeze-and-excitation method. Specifically, when constructing the EfficientNet-B0 model, the mobile inverted bottleneck convolution (MBConv) module of MobileNetV2 is used as the main building block, and on this basis a multi-objective neural architecture search is performed to determine the baseline network, the EfficientNet-B0 model. The MBConv module in the EfficientNet-B0 model is based on depthwise separable convolution and is optimized with the squeeze-and-excitation method of SENet. The EfficientNet-B0 model can be regarded as an efficient feature extractor: after a series of operations such as convolution, pooling and activation, the image with enhanced local contrast yields feature vectors that are more refined and have stronger expressive power.
The image thresholding function may be any of various known functions. In some examples of the malware classification method, considering the accessibility of the algorithm, the image thresholding function is that of the Python CV2 library.
In some examples of the malware classification method, during the clipping operation, the portion of the pixel-value counts that exceeds the frequency threshold is divided equally among the 256 bins 0-255; if a portion remains that cannot be divided equally, it is inserted into the bins one by one at equal intervals until all of the excess has been allocated.
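The redistribution rule described for the clipping operation can be sketched as follows. This is one plausible reading of the text, not a confirmed implementation: counts above the limit are removed, shared equally among all bins, and any remainder is handed out one count at a time at evenly spaced bins. `clip_and_redistribute` is a hypothetical helper, and the sketch makes a single pass, so a bin may end slightly above the limit after receiving its share.

```python
from typing import List, Sequence


def clip_and_redistribute(hist: Sequence[int], limit: int) -> List[int]:
    """Clip histogram bins at `limit`, then spread the excess over all bins."""
    bins = len(hist)
    clipped = [min(h, limit) for h in hist]
    excess = sum(h - c for h, c in zip(hist, clipped))
    share, leftover = divmod(excess, bins)
    out = [c + share for c in clipped]
    if leftover:
        # Hand out the remaining counts one at a time at evenly spaced bins.
        step = max(bins // leftover, 1)
        for k in range(leftover):
            out[(k * step) % bins] += 1
    return out


# Four bins for brevity; a real gray-level histogram would have 256.
new_hist = clip_and_redistribute([10, 0, 0, 2], limit=4)
print(new_hist)  # [6, 1, 2, 3]
```

The total number of counts is unchanged by the operation, so the region still describes the same number of pixels.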
In some examples of the malware classification method, the length of a region is determined by the instruction length of the sample, and the width of the region is determined by the average height of all malware samples of the same series.
In some examples of the malware classification method, the preset frequency threshold is set with reference to known publications.
In some examples of the malware classification method, during the image classification operation, the image is input into the EfficientNet-B0 model, which, after a series of operations such as convolution, pooling and activation, outputs a more refined feature vector with stronger expressive power. The global average pooling layer then provides regularization: it reduces the three-dimensional input of size w×h×d (width, height and depth) to a one-dimensional output of size 1×1×d, which is passed to the Softmax function. The Softmax function receives a vector z of K real numbers and converts it into a probability distribution over K outcomes, each probability being proportional to the exponential of the corresponding input; the function is:
$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$
The Softmax function first exponentiates each element of the input vector z, giving $e^{z_i}$ (where $z_i$ is the i-th element of z), and adds all of these values to obtain the exponential sum $\sum_{j=1}^{K} e^{z_j}$. It then normalizes each element by dividing its exponential by this sum, giving the output $\mathrm{Softmax}(z_i)$. Each element of the output vector represents the probability of one malware family.
It should be noted that, where no conflict arises, the embodiments of the invention and the features of the embodiments may be combined with each other. The invention is described in further detail below with reference to the drawings and specific embodiments.
The invention provides a malware classification method based on a convolutional neural network. The method converts the malware sample into a grayscale image, which avoids the large time cost of feature extraction; it then applies contrast-limited adaptive histogram equalization to enhance the local contrast of the grayscale image; finally, it inputs the resulting image into an EfficientNet-B0 model whose last fully connected layer has been removed and which is followed by a global average pooling layer and a Softmax layer, in order to determine which malware family the sample belongs to. As shown in FIGS. 1 to 3, the method specifically comprises the following steps:
Step one: obtain the public malware data set Malimg, label the data set with ClamAV, and convert each byte, in byte order, into a decimal number in [0, 255]. A third-party open-source Python library is used to convert the numbers into a two-dimensional grayscale image. The grayscale images are divided into a training set and a test set with the cross-validation function StratifiedKFold, which ensures that the proportion of each malware category in the training set and the test set is consistent with that in the original data set.
Step two, image enhancement:
The first step is to divide the grayscale image into regions;
the second step is to calculate the cumulative distribution function (CDF) of each image region, as follows:
$$\mathrm{cdf}(i) = \sum_{j=0}^{i} n_j, \qquad 0 \le i < L$$
where L is the total number of gray levels, 256, and $n_j$ is the probability that pixel value j appears in the image region;
the third step is to determine whether the frequency value cdf(i) of a pixel is higher than a preset frequency threshold and, if so, to perform a clipping operation with the image thresholding function of the CV2 library and randomly assign values in the range [0, 255] to part of those pixels, which ensures that the frequency value of the pixel does not exceed the clipping-limit threshold;
the fourth step is to transform each region by interpolation, in order to correlate the pixel values within each region, limit noise amplification and enhance the contrast of the image.
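The document does not say which interpolation is used in the fourth step. In common descriptions of contrast-limited adaptive histogram equalization, each pixel is mapped through the lookup tables of its four neighbouring regions and the results are blended bilinearly, which removes visible seams between regions. The sketch below assumes that variant; `blend_tile_mappings` and its arguments are hypothetical names.

```python
from typing import Sequence


def blend_tile_mappings(value: int,
                        top_left: Sequence[float], top_right: Sequence[float],
                        bottom_left: Sequence[float], bottom_right: Sequence[float],
                        fx: float, fy: float) -> float:
    """Bilinearly blend four per-region lookup tables for one pixel value.

    fx and fy, both in [0, 1], are the pixel's horizontal and vertical offsets
    between the centres of the neighbouring regions.
    """
    top = (1.0 - fx) * top_left[value] + fx * top_right[value]
    bottom = (1.0 - fx) * bottom_left[value] + fx * bottom_right[value]
    return (1.0 - fy) * top + fy * bottom


# Two-level lookup tables for brevity: the left regions map level 1 to 100,
# the right regions map it to 200.
left, right = [0.0, 100.0], [0.0, 200.0]
mapped = blend_tile_mappings(1, left, right, left, right, fx=0.5, fy=0.25)
print(mapped)  # 150.0
```

For practical use, OpenCV exposes the whole procedure through `cv2.createCLAHE`, which may be the more convenient route when the CV2 library is already in use.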
Step three: when constructing the EfficientNet-B0 model, the mobile inverted bottleneck convolution (MBConv) module of MobileNetV2 is used as the main building block, and on this basis a multi-objective neural architecture search is performed to determine the baseline network, the EfficientNet-B0 model. The MBConv module in the EfficientNet-B0 model is based on depthwise separable convolution and is optimized with the squeeze-and-excitation method of SENet. The EfficientNet-B0 model can be regarded as an efficient feature extractor: after a series of operations such as convolution, pooling and activation, the image with enhanced local contrast yields feature vectors that are more refined and have stronger expressive power.
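The squeeze-and-excitation step used inside each MBConv module can be illustrated on a tiny feature map. The sketch below follows the general SENet recipe (average each channel, pass the averages through a small bottleneck, and rescale the channels by the resulting gates). The ReLU in the bottleneck, the absence of biases, and the hand-picked weights are simplifying assumptions, and `squeeze_excite` is a hypothetical name rather than a library function.

```python
import math
from typing import List, Sequence, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [row][column][channel]


def squeeze_excite(feature_map: FeatureMap,
                   w_reduce: Sequence[Sequence[float]],
                   w_expand: Sequence[Sequence[float]]) -> Tuple[FeatureMap, List[float]]:
    """Rescale each channel of an h x w x d map by a gate in (0, 1)."""
    h, w, d = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    # Squeeze: one average per channel.
    squeezed = [sum(feature_map[y][x][c] for y in range(h) for x in range(w)) / (h * w)
                for c in range(d)]
    # Excite: bottleneck layer with ReLU, then expansion layer with sigmoid.
    hidden = [max(0.0, sum(wt * s for wt, s in zip(row, squeezed))) for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(wt * v for wt, v in zip(row, hidden))))
             for row in w_expand]
    scaled = [[[feature_map[y][x][c] * gates[c] for c in range(d)]
               for x in range(w)] for y in range(h)]
    return scaled, gates


# A 1 x 1 map with two channels and a one-unit bottleneck (illustrative weights).
scaled, gates = squeeze_excite([[[2.0, 4.0]]],
                               w_reduce=[[0.5, 0.5]],
                               w_expand=[[0.0], [1.0]])
```

Here the first gate is exactly 0.5 and the second is close to 0.95, so the second channel is passed through almost unchanged while the first is halved.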
Step four: the image is input into the EfficientNet-B0 model, which, after a series of operations such as convolution, pooling and activation, outputs a more refined feature vector with stronger expressive power. The global average pooling layer then provides regularization: it reduces the three-dimensional input of size w×h×d (width, height and depth) to a one-dimensional output of size 1×1×d, which is passed to the Softmax function. The Softmax function receives a vector z of K real numbers and converts it into a probability distribution over K outcomes, each probability being proportional to the exponential of the corresponding input; the function is:
$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$
The Softmax function first exponentiates each element of the input vector z, giving $e^{z_i}$ (where $z_i$ is the i-th element of z), and adds all of these values to obtain the exponential sum $\sum_{j=1}^{K} e^{z_j}$. It then normalizes each element by dividing its exponential by this sum, giving the output $\mathrm{Softmax}(z_i)$. Each element of the output vector represents the probability of one malware family.
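The classification step can be sketched in plain Python as global average pooling followed by Softmax. The tiny 2×2×2 feature map is made up for illustration, and the function names are hypothetical. The sketch feeds the pooled vector straight into Softmax, as the text describes; a practical model would usually place a learned layer between the two so that the vector length equals the number of malware families, which is an assumption beyond this document.

```python
import math
from typing import List, Sequence


def global_average_pool(feature_map: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Reduce an h x w x d feature map to a length-d vector by averaging over h and w."""
    h, w, d = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    return [sum(feature_map[y][x][c] for y in range(h) for x in range(w)) / (h * w)
            for c in range(d)]


def softmax(z: Sequence[float]) -> List[float]:
    """Softmax(z_i) = exp(z_i) / sum_j exp(z_j), shifted by max(z) for stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


fmap = [[[1.0, 0.0], [3.0, 2.0]],
        [[1.0, 0.0], [3.0, 2.0]]]      # h = 2, w = 2, d = 2
pooled = global_average_pool(fmap)      # [2.0, 1.0]
probs = softmax(pooled)
predicted = probs.index(max(probs))     # index of the most probable class
```

Subtracting the maximum before exponentiating does not change the result of the formula; it only avoids overflow for large inputs.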
Comparison of classification efficiency for different algorithms:
On the ImageNet data set, the EfficientNet-B0 model achieves higher accuracy than ResNet50 and DenseNet169 with the fewest parameters and the fewest FLOPs (floating-point operations). EfficientNets were also evaluated on 8 common transfer learning data sets; the results show that they reached the best accuracy at the time on 5 of those data sets with far fewer parameters, indicating that EfficientNets offer good accuracy, performance and transferability.
The above further describes the invention in detail and should not be taken as limiting its practice. Simple deductions or substitutions made by those skilled in the art without departing from the concept of the invention fall within the scope of the invention.

Claims (10)

1. A malware classification method based on a convolutional neural network, comprising the following steps:
S1) labeling the software samples in a malware data set; converting each byte of a software sample, in byte order, into a decimal number in [0, 255]; converting the decimal numbers into a first grayscale image; and dividing the first grayscale images into a training set and a test set by means of a cross-validation function, such that the proportion of each malware category in the training set and the test set is consistent with that in the original data set;
S2) image enhancement: processing the first grayscale image with contrast-limited adaptive histogram equalization to obtain a second grayscale image with enhanced local contrast;
S3) feature extraction: inputting the second grayscale image into an EfficientNet-B0 model and extracting features to output more refined feature vectors with stronger expressive power, thereby obtaining a third grayscale image;
S4) image classification: inputting the third grayscale image into the global average pooling layer to output a one-dimensional vector; inputting the one-dimensional vector into the Softmax layer, which converts it into a probability distribution in which each element lies between 0 and 1 and represents the probability that the sample belongs to a given malware family, and all elements sum to 1; and selecting the category with the highest probability as the prediction result.
2. The malware classification method according to claim 1, wherein the image enhancement specifically comprises:
a) dividing the original image into a plurality of regions;
b) calculating the cumulative distribution function (CDF) of the pixel values in each image region;
c) judging whether the frequency of a given pixel value in the image region is higher than a preset frequency threshold and, if so, performing a clipping operation using an image thresholding function and randomly assigning values in the range [0,255] to the pixels exceeding the preset frequency threshold, so as to ensure that the frequency of the pixel value no longer exceeds the threshold;
d) transforming each region using interpolation, so that the pixel values are related to one another, noise amplification is limited, and the contrast of the image is enhanced.
3. The malware classification method according to claim 1, wherein the EfficientNet-B0 model is built from a series of MBConv modules optimized with the squeeze-and-excitation method.
4. The malware classification method according to claim 2, wherein the image thresholding function is the image thresholding function of the Python cv2 library; and/or
the decimal numbers are converted into the first grayscale image using cv2, a third-party open-source library for Python.
5. The malware classification method according to claim 2 or 4, wherein, in the clipping operation, the portion by which the frequency of occurrence of a pixel value exceeds the frequency threshold is distributed equally over the values 0-255, 256 bins in total, and any remainder that cannot be distributed equally is inserted into the bins in order at equal intervals until all of the excess has been distributed to the corresponding bins.
6. The malware classification method according to claim 1, wherein the length of each region is determined by the instruction length of the sample, and the width of each region is determined by the average height of all malware samples in the same series.
7. The malware classification method according to claim 1, wherein the malware data set is the Malimg data set, which contains a plurality of malware types.
8. The malware classification method according to claim 1, wherein the malware is labeled using the open-source antivirus software ClamAV.
9. The malware classification method according to claim 1, wherein the cross-validation function is StratifiedKFold.
10. The malware classification method according to claim 1, wherein the preset frequency threshold is set with reference to known publications.
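
For illustration only, the byte-to-pixel conversion of step S1, the excess redistribution of claim 5, and the Softmax decision of step S4 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the image width, the zero-padding of the last row, and the clip limit are all hypothetical and do not come from the claims, and the sketch uses no cv2 or EfficientNet code.

```python
import math


def bytes_to_gray_rows(data: bytes, width: int) -> list:
    """Map each byte to a pixel value in [0, 255] and wrap into rows.

    Zero-padding the last row is an assumption made for this sketch.
    """
    pixels = list(data)  # iterating over bytes yields ints in 0..255
    pixels += [0] * ((-len(pixels)) % width)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def clip_histogram(hist: list, limit: int) -> list:
    """Clip each bin at `limit` and redistribute the excess over all bins.

    The excess is first shared equally among the bins; any remainder is
    handed out one count at a time at equal intervals. A single pass is
    made, so a bin may end slightly above `limit` after redistribution.
    """
    excess = sum(max(h - limit, 0) for h in hist)
    share, rest = divmod(excess, len(hist))
    clipped = [min(h, limit) + share for h in hist]
    if rest:
        step = len(hist) / rest  # spacing between bins that get +1
        for k in range(rest):
            clipped[int(k * step)] += 1
    return clipped


def softmax(scores: list) -> list:
    """Convert a score vector into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores: list) -> int:
    """Return the index of the most probable class."""
    probs = softmax(scores)
    return probs.index(max(probs))
```

The redistribution step preserves the total pixel count of the histogram, which is why the clipped histogram can still be turned into a valid cumulative distribution function for the region.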
CN202311489175.9A 2023-11-09 2023-11-09 Malicious software classification method based on convolutional neural network Pending CN117496246A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311489175.9A CN117496246A (en) 2023-11-09 2023-11-09 Malicious software classification method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN117496246A (en) 2024-02-02

Family

ID=89674045

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311489175.9A Pending CN117496246A (en) 2023-11-09 2023-11-09 Malicious software classification method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN117496246A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989583A (en) * 2021-09-03 2022-01-28 中电积至(海南)信息技术有限公司 Method and system for detecting malicious traffic of internet
CN114926680A (en) * 2022-05-13 2022-08-19 山东省计算中心(国家超级计算济南中心) Malicious software classification method and system based on AlexNet network model
WO2023193629A1 (en) * 2022-04-08 2023-10-12 华为技术有限公司 Coding method and apparatus for region enhancement layer, and decoding method and apparatus for area enhancement layer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨春雨: "基于纹理特征融合与深度学习的恶意软件分类", 中国优秀硕士学位论文全文数据库, no. 2021, 15 September 2021 (2021-09-15), pages 17 - 40 *

Similar Documents

Publication Publication Date Title
CN108985361B (en) Malicious traffic detection implementation method and device based on deep learning
CN111832019B (en) Malicious code detection method based on generation countermeasure network
CN109302410B (en) Method and system for detecting abnormal behavior of internal user and computer storage medium
CN109492395B (en) Method, device and storage medium for detecting malicious program
CN111259397B (en) Malware classification method based on Markov graph and deep learning
CN112491796A (en) Intrusion detection and semantic decision tree quantitative interpretation method based on convolutional neural network
Seneviratne et al. Self-supervised vision transformers for malware detection
CN112088378A (en) Image hidden information detector
CN113904861B (en) Encryption traffic safety detection method and device
Tran et al. Image-based unknown malware classification with few-shot learning models
AlGarni et al. An efficient convolutional neural network with transfer learning for malware classification
CN110705622A (en) Decision-making method and system and electronic equipment
CN111241550B (en) Vulnerability detection method based on binary mapping and deep learning
CN111291712B (en) Forest fire recognition method and device based on interpolation CN and capsule network
CN116644422A (en) Malicious code detection method based on malicious block labeling and image processing
CN112560034A (en) Malicious code sample synthesis method and device based on feedback type deep countermeasure network
Xin et al. Malicious code detection method based on image segmentation and deep residual network RESNET
CN116595525A (en) Threshold mechanism malicious software detection method and system based on software map
CN117496246A (en) Malicious software classification method based on convolutional neural network
CN116188439A (en) False face-changing image detection method and device based on identity recognition probability distribution
CN115567224A (en) Method for detecting abnormal transaction of block chain and related product
CN115828239A (en) Malicious code detection method based on multi-dimensional data decision fusion
CN115564970A (en) Network attack tracing method, system and storage medium
CN114638356A (en) Static weight guided deep neural network back door detection method and system
CN113553586A (en) Virus detection method, model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination