CN116996278B - Webpage detection method and device based on mining behavior of WASM module - Google Patents
- Publication number
- CN116996278B (application number CN202310900991.8A)
- Authority
- CN
- China
- Prior art keywords
- module
- target
- dimensional matrix
- local
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3684—Test management for test design, e.g. generating new test cases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/54—Extraction of image or video features relating to texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computer Security & Cryptography (AREA)
- Multimedia (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Mathematical Physics (AREA)
- Molecular Biology (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Hardware Design (AREA)
- Biophysics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a webpage detection method and device based on the mining behavior of a WASM module. The method comprises: acquiring a target website; extracting the binary file of the WASM module in the target website; converting the binary file into a visualized RGB image; and inputting the RGB image into a webpage detection model, which extracts features based on the image texture of the RGB image to generate target features and, from those target features, generates a detection result indicating whether mining behavior exists on the target website. Because detection is performed on the RGB image corresponding to the WASM module, the invention avoids the inaccuracy of detection performed at the bytecode level, detects mining behavior even when the code has been compiled into obfuscated form, and improves the accuracy of webpage detection.
Description
Technical Field
The invention relates to the technical field of webpage detection, in particular to a webpage detection method and device based on mining behavior of a WASM module.
Background
Hackers often embed a mining script in a web page so that the script runs while a user accesses and browses the page, until the user leaves it. Many websites use mining as a substitute for online advertising revenue, and a growing number of hackers profit by attacking websites and covertly embedding mining scripts in them. A new generation of mining malware uses obfuscated mining behavior: part of the JS script code is compiled into a WASM module in the webpage, or a WASM program is converted into C code, obfuscated with a C-language obfuscator, and recompiled into a WASM module in the webpage. The obfuscated Web program can fully access the data and functions of the webpage through the WebAssembly and JavaScript APIs. To detect such obfuscated mining behavior, the prior art typically uses a detector based on JS code analysis or a WASM instruction detector. Because these detectors work at the bytecode or opcode level, they have difficulty accurately determining whether a webpage contains malicious mining behavior once obfuscation is applied, so webpage detection accuracy is low.
Disclosure of Invention
The embodiments of the invention provide a webpage detection method and device based on the mining behavior of a WASM module, which effectively address the prior-art problem that detection performed essentially at the bytecode level or on opcodes has difficulty accurately determining whether a webpage contains malicious mining behavior when obfuscation techniques are used, resulting in low webpage detection accuracy.
The embodiment of the invention provides a webpage detection method based on a WASM module mining behavior, which comprises the following steps:
acquiring a target website;
extracting binary files of the WASM module in the target website;
converting the binary file into a visualized RGB image;
inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image; generating a detection result of the target website according to the target features; wherein the detection result is either that mining behavior exists or that mining behavior does not exist.
Preferably, the converting the binary file into a visualized RGB image includes:
extracting byte streams in the binary file, and generating a corresponding two-dimensional matrix from the byte streams;
calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix;
taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value;
and mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image.
Preferably, the calculating the local entropy value corresponding to the two-dimensional matrix includes:
for each value of the two-dimensional matrix, obtaining the sub-matrix of values lying within a preset range centered on that value, and calculating the local entropy value corresponding to each sub-matrix according to the following formula:
H_local=-a*∑(n/g)*log2(n/g)
wherein H_local is the local entropy value corresponding to each sub-matrix, a is a preset multiple, n is the frequency of each value in the sub-matrix, and g is the number of values in the sub-matrix;
Preferably, the calculating the global entropy value corresponding to the two-dimensional matrix includes:
calculating the global entropy value corresponding to the two-dimensional matrix according to the following formula:
L_local=-a*∑(n/N)*log2(n/N);
wherein L_local is the global entropy value corresponding to the two-dimensional matrix, a is a preset multiple, n is the frequency of each value in the two-dimensional matrix, and N is the total number of values in the two-dimensional matrix.
Preferably, the webpage detection model comprises a convolution layer, a pooling layer and a full connection layer;
after the webpage detection model performs feature extraction based on the image texture of the RGB image, generating target features, including:
the convolution layer in the webpage detection model extracts surface layer features and deep layer features of the RGB image based on the image textures of the RGB image, and sends the surface layer features and the deep layer features to the pooling layer;
and the pooling layer performs feature selection on the surface layer features and the deep layer features, and generates target features after downsampling the RGB image.
Preferably, the generating a detection result of the target website according to the target feature includes:
and classifying the target features through a full connection layer in the webpage detection model to generate a detection result of the target website.
Preferably, the generating of the web page detection model includes:
taking an RGB image corresponding to a sample WASM module as input of a convolutional neural network, taking an actual detection result of the sample WASM module as output of the convolutional neural network, and performing iterative training on the convolutional neural network;
and when the convergence of the convolutional neural network is detected, taking the convolutional neural network after training as the webpage detection model.
A webpage detection device based on the mining behavior of a WASM module includes: a website acquisition module, a binary file extraction module, an image conversion module and a detection result generation module;
the website acquisition module is used for acquiring a target website;
the binary file extraction module is used for extracting binary files of the WASM module in the target website;
the image conversion module is used for converting the binary file into a visualized RGB image;
the detection result generation module is used for inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image, and for generating a detection result of the target website according to the target features; wherein the detection result is either that mining behavior exists or that mining behavior does not exist.
The invention has the following beneficial effects:
the embodiments of the invention provide a webpage detection method and device based on the mining behavior of a WASM module, comprising: acquiring a target website; extracting the binary file of the WASM module in the target website; converting the binary file into a visualized RGB image; inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image; and generating a detection result of the target website according to the target features, wherein the detection result is either that mining behavior exists or that mining behavior does not exist. Compared with the prior art, the invention converts the binary file of the WASM module into a visualized RGB image and then extracts features from the image texture of that RGB image with the webpage detection model. Different binary files produce different images, so the RGB image of a WASM module with malicious mining behavior differs in texture features from that of a benign WASM module. Once the webpage detection model has extracted texture-based features, it can distinguish the detection results corresponding to different images. The invention can therefore detect whether a WASM module exhibits mining behavior from its RGB image, avoiding the inaccuracy of detection performed only at the bytecode level. Because the binary file of the WASM module is converted into a visualized RGB image for detection and the code of the WASM module is not itself inspected, mining behavior whose code has been compiled into obfuscated form can still be detected well, which improves the accuracy of webpage detection.
Drawings
Fig. 1 is a flow chart of a web page detection method based on a mining behavior of a WASM module according to an embodiment of the present invention.
Fig. 2 is a network architecture diagram of a web page detection model according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of classification accuracy of a web page detection model according to an embodiment of the invention.
Fig. 4 is a schematic diagram of a classification metric of a web page detection model according to an embodiment of the invention.
Fig. 5 is a schematic structural diagram of a web page detection device based on a mining behavior of a WASM module according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a schematic flow chart of a web page detection method based on a mining behavior of a WASM module according to an embodiment of the present invention, where the web page detection method includes:
step S1: acquiring a target website;
step S2: extracting binary files of the WASM module in the target website;
step S3: converting the binary file into a visualized RGB image;
step S4: inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image; generating a detection result of the target website according to the target features; wherein the detection result is either that mining behavior exists or that mining behavior does not exist.
For step S1, in a preferred embodiment, the webpage detection model is simple to use: once the model is trained, the URL of a website is input and the files in its WASM module are extracted for detection, which greatly reduces the difficulty of analysis and speeds it up.
For step S2, in a preferred embodiment, extracting the binary file of the WASM module in the target website includes:
acquiring the target website input by a user, and collecting the WASM module in the target website to obtain the WASM binary file to be detected.
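The source does not say how WASM modules are recognized among the resources a page loads. One plausible check, shown here as a minimal sketch, relies on the fact that every WebAssembly binary begins with the four-byte magic number `\0asm` followed by a four-byte version field. The function names and the `resources` mapping (URL to response body) are hypothetical and are not part of the patent.

```python
from typing import Dict

# First four bytes of every WebAssembly binary module.
WASM_MAGIC = b"\x00asm"


def is_wasm_binary(data: bytes) -> bool:
    """Return True if the payload has a WASM preamble (magic + version field)."""
    return len(data) >= 8 and data[:4] == WASM_MAGIC


def collect_wasm_binaries(resources: Dict[str, bytes]) -> Dict[str, bytes]:
    """Keep only the fetched resources whose bodies look like WASM binaries.

    `resources` is an assumed mapping of URL -> raw response body that some
    crawler or browser hook has already gathered for the target website.
    """
    return {url: body for url, body in resources.items() if is_wasm_binary(body)}
```

Checking the header rather than the file extension or MIME type is a design choice for this sketch: it still recognizes modules served under misleading names.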
For step S3, in a preferred embodiment, said converting said binary file into a visualized RGB image comprises:
extracting byte streams in the binary file, and generating a corresponding two-dimensional matrix from the byte streams;
calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix; specifically, for each value of the two-dimensional matrix, the sub-matrix of values lying within a preset range centered on that value is obtained, and the local entropy value corresponding to each sub-matrix is calculated according to the following formula:
H_local=-a*∑(n/g)*log2(n/g)
wherein H_local is the local entropy value corresponding to each sub-matrix, a is a preset multiple, n is the frequency of each value in the sub-matrix, and g is the number of values in the sub-matrix;
the global entropy value corresponding to the two-dimensional matrix is calculated according to the following formula:
L_local=-a*∑(n/N)*log2(n/N);
wherein L_local is the global entropy value corresponding to the two-dimensional matrix, a is a preset multiple, n is the frequency of each value in the two-dimensional matrix, and N is the total number of values in the two-dimensional matrix.
Taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value;
and mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image.
Specifically, in a preferred embodiment, firstly, the data of the binary file of the WASM module is stored in a one-dimensional array, and the length of the one-dimensional array is L;
converting the one-dimensional array into a two-dimensional matrix of L, and deleting redundant bytes;
the values of the three channels are calculated: each value of the two-dimensional matrix is taken as the R channel; 60 times the local entropy of each value is taken as the G channel; the calculation process of the local entropy comprises the following steps:
a submatrix of 9 values centered on the current matrix value is obtained, and the local entropy of the submatrix is calculated according to the following formula:
H_local=-60*∑(n/9)*log2(n/9);
where n is the frequency of each value in the submatrix. The formula multiplies the probability (n/9) of each value in the submatrix by its corresponding information quantity (-log2(n/9)) and then sums over the values in the submatrix to obtain the local entropy value H_local.
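The local entropy formula can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the 9-value submatrix is taken to be a 3 x 3 window, the sum runs over the distinct values in the window (the usual Shannon entropy), and at the matrix border the window is simply truncated and the probability is divided by the number of values actually present. The source does not specify border handling. The function name `local_entropy` is hypothetical.

```python
import math
from collections import Counter


def local_entropy(matrix, row, col, scale=60, radius=1):
    """Scaled Shannon entropy of the window centred on matrix[row][col].

    radius=1 gives a 3 x 3 window (9 values) for interior cells. Border
    cells use a truncated window, which is an assumption of this sketch.
    """
    window = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            if 0 <= r < len(matrix) and 0 <= c < len(matrix[0]):
                window.append(matrix[r][c])
    g = len(window)            # 9 for interior cells
    counts = Counter(window)   # n for each distinct value
    return -scale * sum((n / g) * math.log2(n / g) for n in counts.values())
```

With nine distinct values in the window the result is 60 * log2(9), roughly 190.2, so under these assumptions the G channel stays within the 0 to 255 range of an 8-bit image.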
60 times the global entropy of each value is taken as the B channel; the calculation process of the global entropy comprises the following steps:
L_local=-60*∑(n/N)*log2(n/N);
where n is the frequency of each value in the two-dimensional matrix and N is the total number of values in the two-dimensional matrix. The formula multiplies the probability (n/N) of each value in the two-dimensional matrix by its corresponding information quantity (-log2(n/N)) and then sums over the values in the two-dimensional matrix to obtain the global entropy value L_local.
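A corresponding sketch for the global entropy is given below. It assumes a single entropy value is computed for the whole matrix, with the sum running over distinct values; the function name `global_entropy` is hypothetical.

```python
import math
from collections import Counter


def global_entropy(matrix, scale=60):
    """Scaled Shannon entropy over every value of the two-dimensional matrix."""
    values = [v for row in matrix for v in row]
    total = len(values)        # N in the formula
    counts = Counter(values)   # n for each distinct value
    return -scale * sum((n / total) * math.log2(n / total) for n in counts.values())


# Two equally likely values carry exactly 1 bit of entropy.
print(global_entropy([[0, 0], [1, 1]]))  # 60.0
```

Note that byte data can reach 8 bits of entropy, so 60 times the global entropy can be as large as 480, which is above the maximum of an 8-bit channel. The source does not state how such values are handled.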
The above R, G, B values are mapped into three channels to form an RGB image.
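The reshaping and channel-mapping steps above can be sketched as follows. Several details are assumptions of this sketch rather than statements from the source: the two-dimensional matrix is taken to be square with side floor(sqrt(L)), the trailing bytes that do not fit are the "redundant bytes" that get dropped, and channel values are rounded and clamped to 0..255. Both function names are hypothetical.

```python
import math


def bytes_to_square_matrix(data: bytes):
    """Reshape a byte stream into a side x side matrix, dropping trailing bytes.

    The square shape is an assumption; the source only says the array of
    length L becomes a two-dimensional matrix and redundant bytes are deleted.
    """
    side = math.isqrt(len(data))
    return [list(data[r * side:(r + 1) * side]) for r in range(side)]


def to_rgb_pixels(r_plane, g_plane, b_value):
    """Combine byte values (R), local entropy (G) and global entropy (B).

    Values are rounded and clamped to 0..255, which is an assumption of
    this sketch, not something the source specifies.
    """
    def clamp(v):
        return max(0, min(255, int(round(v))))

    return [
        [(clamp(r), clamp(g), clamp(b_value)) for r, g in zip(r_row, g_row)]
        for r_row, g_row in zip(r_plane, g_plane)
    ]
```

The resulting nested list of (R, G, B) tuples can be handed to any imaging library to write an actual image file.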
For visual analysis of images, the invention provides a new method for converting a binary file into a visualized RGB image. Combining local entropy with global entropy retains more local and global feature information, so that more accurate classification results can be output when image features are later extracted for classification.
The invention visualizes the features of mining WASM modules in order to detect mining webpages. Compared with the existing WASM feature grayscale method, the image texture is clearer, the difference in image contour between benign and malicious samples is larger, and the overall performance is good.
For step S4, in a preferred embodiment, inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image, and generating a detection result of the target website according to the target features, comprises the following steps:
As shown in Fig. 2, the webpage detection model of the invention comprises convolution layers, pooling layers and fully connected layers; the network structure adopts three convolution layers, three pooling layers and two fully connected layers.
The convolution layer in the webpage detection model extracts surface layer features and deep layer features of the RGB image based on the image textures of the RGB image, and sends the surface layer features and the deep layer features to the pooling layer;
the pooling layer performs feature selection on the surface layer features and the deep layer features, and generates target features after downsampling the RGB image;
and finally, classifying the target features through a full connection layer in the webpage detection model to generate a detection result of the target website.
Schematically, the convolution layer is the core of the convolutional neural network underlying the webpage detection model and is mainly used to extract the surface-layer and deep-layer features of the image. In the invention, the first convolution layer uses 20 kernels of size 3 x 3, the second 50 kernels of size 3 x 3, and the third 100 kernels of size 3 x 3. The pooling layer is mainly used for feature selection and for downsampling the image. The pooling layers in the invention have size 2 x 2 and stride 2. After feature extraction and feature selection by the convolution and pooling layers, the features are sent to the fully connected layers, whose main function is to reduce the influence of spatial position in the image on the features and to classify the samples. The invention uses two fully connected layers, of size 1 x 256 and 1 x 2 respectively, and the last of these outputs the classification of the image;
in a preferred embodiment, the output of each layer of the convolutional neural network is non-linearly transformed by the ReLU function, so as to improve the accuracy of the output classification result.
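To make the layer sizes concrete, the following sketch traces the feature-map dimensions through the network. It assumes a 100 x 100 input (the normalized size mentioned in the training description), convolutions with stride 1 and no padding, and a conv, pool, conv, pool, conv, pool ordering. None of these three details is stated in the text, and the function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output side length of a convolution (stride 1, no padding assumed)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output side length of a 2 x 2 pooling layer with stride 2."""
    return (size - kernel) // stride + 1


def trace_shapes(size=100, channels=(20, 50, 100)):
    """List (layer, channels, height, width) after each conv and pool layer."""
    shapes = []
    for ch in channels:
        size = conv_out(size)
        shapes.append(("conv", ch, size, size))
        size = pool_out(size)
        shapes.append(("pool", ch, size, size))
    return shapes


for shape in trace_shapes():
    print(shape)
```

Under these assumptions the last pooling layer yields 100 feature maps of 10 x 10, so 10,000 values would be flattened and fed to the 256-unit fully connected layer and then to the 2-unit output layer. A different padding choice would change these numbers.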
In a preferred embodiment, the generation of the web page detection model of the present invention includes:
taking an RGB image corresponding to a sample WASM module as input of a convolutional neural network, taking an actual detection result of the sample WASM module as output of the convolutional neural network, and performing iterative training on the convolutional neural network;
and when the convergence of the convolutional neural network is detected, taking the convolutional neural network after training as the webpage detection model.
Specifically, the RGB images corresponding to the sample WASM modules are normalized to a uniform size of 100 x 100 using a bilinear interpolation algorithm and stored, and the normalized sample images are then input into the convolutional neural network model for training. The RGB images corresponding to benign and malicious sample WASM modules are divided into a training set and a test set in a ratio of 8:2 to train the convolutional neural network. Schematically, the classification accuracy of the trained convolutional neural network and its classification metrics are shown in Fig. 3 and Fig. 4. In a preferred embodiment, after parameter tuning of the final model, the learning rate is set to 0.0011.
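The 8:2 division can be sketched as a shuffled split. The shuffle, the fixed seed and the function name `split_samples` are assumptions of this sketch; the source only gives the ratio.

```python
import random


def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle the labelled samples and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical usage with ten (image_id, label) pairs.
train, test = split_samples([(i, i % 2) for i in range(10)])
print(len(train), len(test))  # 8 2
```

A stratified split, which keeps the benign-to-malicious ratio equal in both sets, would be a reasonable alternative when the classes are imbalanced.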
After model training is finished, a webpage URL can be provided to the detection model or system. The system extracts the WASM binary file of the provided webpage, inputs the corresponding image into the trained model, and determines whether the WASM binary file is malicious or benign, which indicates whether or not the corresponding webpage exhibits mining behavior.
Compared with the existing WASM feature grayscale method, the invention extracts the WASM module binary file containing the mining opcodes, converts it into a visualized RGB image, and performs detection at the image level with the webpage detection model. The method is not easily affected by obfuscation techniques, so it detects new WASM-based mining webpages more effectively: because malicious and benign WASM modules differ in texture features once converted into visualized RGB images, the trained webpage detection model can identify from the image, with higher accuracy, whether mining behavior exists. The final model performance therefore meets expectations, with better overall detection performance and more accurate detection results.
As shown in Fig. 5, on the basis of the above embodiments of the webpage detection method based on the mining behavior of a WASM module, the invention correspondingly provides a device embodiment;
an embodiment of the present invention provides a web page detection device based on a WASM module mining behavior, including: the system comprises a website acquisition module, a binary file extraction module, an image conversion module and a detection result generation module;
the website acquisition module is used for acquiring a target website;
the binary file extraction module is used for extracting binary files of the WASM module in the target website;
the image conversion module is used for converting the binary file into a visualized RGB image;
the detection result generation module is used for inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image, and for generating a detection result of the target website according to the target features; wherein the detection result is either that mining behavior exists or that mining behavior does not exist.
It should be noted that the above-described device embodiments are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiment provided by the invention, a connection between modules indicates that they have a communication connection, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the invention without creative effort.
It will be clearly understood by those skilled in the art that, for convenience and brevity, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.
Claims (5)
1. A webpage detection method based on a WASM module mining behavior is characterized by comprising the following steps:
acquiring a target website;
extracting binary files of the WASM module in the target website;
extracting byte streams in the binary file, and generating a corresponding two-dimensional matrix from the byte streams;
calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix; taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value; mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image;
inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image; generating a detection result of the target website according to the target features; wherein the detection result is either that mining behavior exists or that mining behavior does not exist;
the calculating the local entropy value corresponding to the two-dimensional matrix comprises the following steps:
obtaining a plurality of sub-matrices, each consisting of the target values lying within a preset range of a central value of the two-dimensional matrix, and calculating a local entropy value corresponding to each sub-matrix according to the following formula:
H_local = -a*;
wherein H_local is the local entropy value corresponding to each sub-matrix, a is a preset multiple, i is the calculated times, b is the total number of numerical values in the sub-matrix,for the frequency of each numerical value in the submatrix, g is the number of target numerical values;
the calculating the global entropy value corresponding to the two-dimensional matrix comprises the following steps:
calculating a global entropy value corresponding to the two-dimensional matrix according to the following formula:
L_local=-a*;
wherein L_local is the global entropy corresponding to the two-dimensional matrix, a is a preset multiple, i is the calculated times,for the frequency of each numerical value in the submatrix, N is the total number of numerical values of the two-dimensional matrix.
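The image conversion steps of claim 1 can be sketched as follows. The two entropy formulas are not legible in this text, so the sketch assumes a scaled Shannon entropy, H = -a * sum((c_i / n) * log2(c_i / n)), where c_i is the count of the i-th distinct value and n is the number of values considered. The matrix width, the window radius, the choice a = 255/8 (which maps the 0 to 8 bit entropy range of byte data onto 0 to 255) and all function names are illustrative assumptions rather than details taken from the patent.

```python
import math
from collections import Counter


def bytes_to_matrix(data, width=16):
    """Reshape a byte stream into rows of `width` values, zero-padding the last row."""
    if not data:
        raise ValueError("empty byte stream")
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def scaled_entropy(values, a):
    """Assumed form: `a` times the Shannon entropy (in bits) of the value distribution."""
    n = len(values)
    counts = Counter(values)
    return -a * sum((c / n) * math.log2(c / n) for c in counts.values())


def local_entropy_map(matrix, radius=1, a=255 / 8):
    """Entropy of the sub-matrix within `radius` of each central value (assumed window shape)."""
    rows, cols = len(matrix), len(matrix[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [matrix[y][x]
                      for y in range(max(0, r - radius), min(rows, r + radius + 1))
                      for x in range(max(0, c - radius), min(cols, c + radius + 1))]
            row.append(scaled_entropy(window, a))
        out.append(row)
    return out


def to_rgb(matrix, radius=1, a=255 / 8):
    """R = byte value, G = local entropy, B = global entropy, each clamped to 0..255."""
    def clamp(x):
        return max(0, min(255, round(x)))

    g_map = local_entropy_map(matrix, radius, a)
    b_val = clamp(scaled_entropy([v for row in matrix for v in row], a))
    return [[(v, clamp(g), b_val) for v, g in zip(row, g_row)]
            for row, g_row in zip(matrix, g_map)]


# Usage: the first bytes of a WebAssembly binary, reshaped into a 2 x 4 matrix.
image = to_rgb(bytes_to_matrix(b"\x00asm\x01\x00\x00\x00", width=4))
```

Because the global entropy is a single number for the whole matrix, the B channel is constant across the image in this sketch, while the G channel varies with the local byte distribution.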
2. The web page detection method based on the mining behavior of the WASM module as claimed in claim 1, wherein the web page detection model comprises a convolution layer, a pooling layer and a full connection layer;
after the webpage detection model performs feature extraction based on the image texture of the RGB image, generating target features, including:
the convolution layer in the webpage detection model extracts surface layer features and deep layer features of the RGB image based on the image textures of the RGB image, and sends the surface layer features and the deep layer features to the pooling layer;
and the pooling layer performs feature selection on the surface layer features and the deep layer features, and generates target features after downsampling the RGB image.
3. The method for detecting web pages based on mining behavior of WASM module according to claim 2, wherein the generating the detection result of the target website according to the target feature comprises:
and classifying the target features through a full connection layer in the webpage detection model to generate a detection result of the target website.
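The convolution, pooling and full connection stages of claims 2 and 3 can be illustrated with a minimal forward pass. This is a sketch under simplifying assumptions: a single convolution kernel stands in for the surface-layer and deep-layer feature extraction, the activation functions (ReLU and sigmoid), the 0.5 threshold and every weight passed in are hypothetical, and the function names are not from the patent.

```python
import math


def conv2d(image, kernel, bias=0.0):
    """Valid convolution of an H x W x C image with a k x k x C kernel, followed by ReLU."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - k + 1):
        row = []
        for c in range(w - k + 1):
            s = bias
            for i in range(k):
                for j in range(k):
                    for ch, weight in enumerate(kernel[i][j]):
                        s += weight * image[r + i][c + j][ch]
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool(fmap, size=2):
    """Downsample a feature map by keeping the maximum of each size x size block."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def detect(image, kernel, fc_weights, fc_bias, threshold=0.5):
    """Convolution -> pooling -> full connection layer -> binary detection result."""
    pooled = max_pool(conv2d(image, kernel))
    features = [v for row in pooled for v in row]  # the "target features"
    score = sigmoid(fc_bias + sum(w * x for w, x in zip(fc_weights, features)))
    return "mining behavior exists" if score >= threshold else "mining behavior does not exist"
```

A trained model would stack several convolution layers and learn the kernel and full connection weights from labeled samples; here they are supplied by the caller only to show how data flows between the layers.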
4. The web page detection method based on the mining behavior of the WASM module as claimed in claim 1, wherein the generating of the web page detection model comprises:
taking an RGB image corresponding to a sample WASM module as input of a convolutional neural network, taking an actual detection result of the sample WASM module as output of the convolutional neural network, and performing iterative training on the convolutional neural network;
and when the convergence of the convolutional neural network is detected, taking the convolutional neural network after training as the webpage detection model.
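The iterate-until-convergence procedure of claim 4 can be sketched as below. To stay short and dependency-free, the sketch trains a single logistic unit on pre-extracted feature vectors rather than a full convolutional neural network, and it treats "convergence" as the change in average loss between epochs falling below a tolerance. Both choices, along with the learning rate, tolerance and function names, are assumptions; the patent does not state its convergence criterion.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_until_converged(samples, labels, lr=0.5, tol=1e-6, max_epochs=10_000):
    """Gradient descent on cross-entropy loss; stops when the loss change drops below `tol`."""
    n_features = len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    prev_loss = float("inf")
    m = len(samples)
    for epoch in range(1, max_epochs + 1):
        grad_w, grad_b, loss = [0.0] * n_features, 0.0, 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            for j, xi in enumerate(x):
                grad_w[j] += (p - y) * xi
            grad_b += p - y
        loss /= m
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
        if abs(prev_loss - loss) < tol:  # assumed convergence test
            return weights, bias, epoch
        prev_loss = loss
    return weights, bias, max_epochs


# Usage: label 1 marks a sample with mining behavior, label 0 a sample without it.
weights, bias, epochs = train_until_converged([[0.0], [0.1], [0.9], [1.0]], [0, 0, 1, 1])
```

The returned parameters play the role of the trained webpage detection model; with a real convolutional network the same loop structure applies, with the gradients obtained by backpropagation.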
5. A webpage detection device based on mining behavior of a WASM module, characterized by comprising: a website acquisition module, a binary file extraction module, an image conversion module and a detection result generation module;
the website acquisition module is used for acquiring a target website;
the binary file extraction module is used for extracting binary files of the WASM module in the target website;
the image conversion module is used for extracting byte streams in the binary file and generating corresponding two-dimensional matrixes from the byte streams; calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix; taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value; mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image;
the detection result generation module is used for inputting the RGB image into a webpage detection model, so that the webpage detection model performs feature extraction based on image textures of the RGB image and generates target features; generating a detection result of the target website according to the target features; wherein the detection result comprises: mining behavior exists or mining behavior does not exist;
the calculating the local entropy value corresponding to the two-dimensional matrix comprises the following steps:
obtaining a plurality of sub-matrices, each consisting of the target values lying within a preset range of a central value of the two-dimensional matrix, and calculating a local entropy value corresponding to each sub-matrix according to the following formula:
H_local = -a*;
wherein H_local is the local entropy value corresponding to each sub-matrix, a is a preset multiple, i is the calculated times, b is the total number of numerical values in the sub-matrix,for the frequency of each numerical value in the submatrix, g is the number of target numerical values;
the calculating the global entropy value corresponding to the two-dimensional matrix comprises the following steps:
calculating a global entropy value corresponding to the two-dimensional matrix according to the following formula:
L_local=-a*;
wherein L_local is the global entropy corresponding to the two-dimensional matrix, a is a preset multiple, i is the calculated times,for the frequency of each numerical value in the submatrix, N is the total number of numerical values of the two-dimensional matrix.
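The four-module structure of claim 5 can be sketched as a small pipeline. The class and field names are hypothetical, the modules are passed in as plain callables, and reporting mining behavior when any one WASM binary is flagged is an assumed aggregation rule. The only external fact relied on is that a WebAssembly binary begins with the four bytes `\0asm`, which the extraction step uses to pick WASM binaries out of the resources fetched for a site.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

WASM_MAGIC = b"\x00asm"  # first four bytes of every WebAssembly binary module


def is_wasm_binary(blob: bytes) -> bool:
    """True if the blob starts with the WebAssembly magic number."""
    return blob[:4] == WASM_MAGIC


@dataclass
class WebpageDetectionDevice:
    # Website acquisition module (hypothetical signature: URL -> fetched resources).
    acquire_site: Callable[[str], Iterable[bytes]]
    # Image conversion module: WASM binary -> RGB image.
    to_image: Callable[[bytes], object]
    # Detection result generation module: RGB image -> True if mining is detected.
    classify: Callable[[object], bool]

    def extract_wasm(self, resources: Iterable[bytes]) -> list:
        """Binary file extraction module: keep only WASM binaries."""
        return [r for r in resources if is_wasm_binary(r)]

    def detect(self, url: str) -> str:
        binaries = self.extract_wasm(self.acquire_site(url))
        # Assumption: one flagged module is enough to flag the whole site.
        found = any(self.classify(self.to_image(b)) for b in binaries)
        return "mining behavior exists" if found else "mining behavior does not exist"
```

Keeping the modules as injected callables mirrors the claim's separation of concerns: each one can be replaced (for example, a different crawler or a retrained model) without touching the others.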
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310900991.8A CN116996278B (en) | 2023-07-21 | 2023-07-21 | Webpage detection method and device based on mining behavior of WASM module |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116996278A CN116996278A (en) | 2023-11-03 |
CN116996278B CN116996278B (en) | 2024-01-19 |
Family
ID=88522527
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310900991.8A Active CN116996278B (en) | 2023-07-21 | 2023-07-21 | Webpage detection method and device based on mining behavior of WASM module |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116996278B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108062478A (en) * | 2018-01-04 | 2018-05-22 | 北京理工大学 | The malicious code sorting technique that global characteristics visualization is combined with local feature |
CN108846284A (en) * | 2018-06-29 | 2018-11-20 | 浙江工业大学 | A kind of Android malicious application detection method based on bytecode image and deep learning |
US10162967B1 (en) * | 2016-08-17 | 2018-12-25 | Trend Micro Incorporated | Methods and systems for identifying legitimate computer files |
KR101922956B1 (en) * | 2018-08-07 | 2019-02-27 | (주)케이사인 | Method of detecting malware based on entropy count map of low dimensional number |
CN111585961A (en) * | 2020-04-03 | 2020-08-25 | 北京大学 | Webpage mining attack detection and protection method and device |
CN112214766A (en) * | 2020-10-12 | 2021-01-12 | 杭州安恒信息技术股份有限公司 | Method and device for detecting mining trojans, electronic device and storage medium |
WO2022089763A1 (en) * | 2020-10-30 | 2022-05-05 | Inlyse Gmbh | Method for detection of malware |
CN116208356A (en) * | 2022-10-27 | 2023-06-02 | 浙江大学 | Virtual currency mining flow detection method based on deep learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220311782A1 (en) * | 2021-03-24 | 2022-09-29 | Mayachitra, Inc. | Malware detection using frequency domain-based image visualization and deep learning |
Non-Patent Citations (3)
Title |
---|
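Malicious code detection based on multi-channel image deep learning; Jiang Kaolin et al.; Journal of Computer Applications; full text * |
Android malicious application detection based on bytecode images and deep learning; Chen Tieming et al.; Research and Development; full text * |
Obfuscation-resistant image texture feature description method for malicious code; Liu Yashu et al.; Journal on Communications; full text * |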
Also Published As
Publication number | Publication date |
---|---|
CN116996278A (en) | 2023-11-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11163991B2 (en) | Method and apparatus for detecting body | |
US20190139191A1 (en) | Image processing methods and image processing devices | |
CN107229918A (en) | A kind of SAR image object detection method based on full convolutional neural networks | |
CN114900126B (en) | Grounding test equipment and grounding test method for solar cell module | |
JP7163504B2 (en) | Image processing method and its apparatus, computer program and electronic equipment | |
Li et al. | Few-Shot Learning with Generative Adversarial Networks Based on WOA13 Data. | |
CN115511890B (en) | Analysis system for large-flow data of special-shaped network interface | |
CN113361367B (en) | Underground target electromagnetic inversion method and system based on deep learning | |
CN109978888B (en) | Image segmentation method, device and computer readable storage medium | |
Xie et al. | Bag-of-words feature representation for blind image quality assessment with local quantized pattern | |
CN112001362A (en) | Image analysis method, image analysis device and image analysis system | |
CN107480621B (en) | Age identification method based on face image | |
CN111145202B (en) | Model generation method, image processing method, device, equipment and storage medium | |
CN113920538A (en) | Object detection method, device, equipment, storage medium and computer program product | |
CN112419342A (en) | Image processing method, image processing device, electronic equipment and computer readable medium | |
CN116091414A (en) | Cardiovascular image recognition method and system based on deep learning | |
CN116797590A (en) | Mura defect detection method and system based on machine vision | |
CN113592769B (en) | Abnormal image detection and model training method, device, equipment and medium | |
Yang et al. | No‐reference image quality assessment via structural information fluctuation | |
Oga et al. | River state classification combining patch-based processing and CNN | |
CN107871128B (en) | High-robustness image recognition method based on SVG dynamic graph | |
CN116996278B (en) | Webpage detection method and device based on mining behavior of WASM module | |
CN116975864A (en) | Malicious code detection method and device, electronic equipment and storage medium | |
CN112380537A (en) | Method, device, storage medium and electronic equipment for detecting malicious software | |
CN116228520A (en) | Image compressed sensing reconstruction method and system based on transform generation countermeasure network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||