CN116996278B - Webpage detection method and device based on mining behavior of WASM module - Google Patents


Publication number
CN116996278B
Authority
CN
China
Prior art keywords
module
target
dimensional matrix
local
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310900991.8A
Other languages
Chinese (zh)
Other versions
CN116996278A (en)
Inventor
张瑜
王慧
潘小明
石元泉
陈桂宏
彭景惠
陈艺芳
黄炜艺
陈溢爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Electronic Information Product Inspection And Research Institute
Guangdong Polytechnic Normal University
Original Assignee
Zhejiang Electronic Information Product Inspection And Research Institute
Guangdong Polytechnic Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Electronic Information Product Inspection And Research Institute, Guangdong Polytechnic Normal University
Priority to CN202310900991.8A
Publication of CN116996278A
Application granted
Publication of CN116996278B
Active
Anticipated expiration


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441Countermeasures against malicious traffic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/54Extraction of image or video features relating to texture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a webpage detection method and device based on mining behavior of a WASM module, the method comprising the following steps: acquiring a target website; extracting the binary file of the WASM module in the target website; converting the binary file into a visualized RGB image; and inputting the RGB image into a webpage detection model, so that the webpage detection model extracts features based on the image texture of the RGB image to generate target features and generates, according to the target features, a detection result indicating that mining behavior exists or does not exist in the target website. By detecting mining behavior through the RGB image corresponding to the WASM module, the invention avoids the inaccuracy caused by detection at the bytecode level, can detect mining behavior even when the code has been compiled into obfuscated form, and improves the accuracy of webpage detection.

Description

Webpage detection method and device based on mining behavior of WASM module
Technical Field
The invention relates to the technical field of webpage detection, in particular to a webpage detection method and device based on mining behavior of a WASM module.
Background
Hackers often embed a mining script into a web page so that the script is executed when a user accesses and browses the page, forcing the user to exit the page. Many websites also use mining as a substitute for online advertising revenue, so more and more hackers seek large profits by attacking websites and covertly embedding mining scripts in them. A new generation of mining viruses uses obfuscated mining behavior: part of the JS script code is compiled into a WASM module in the web page, or a WASM program is converted into C code, the C code is obfuscated with a C-language obfuscator, and the obfuscated C code is recompiled into a WASM module in the web page. The obfuscated web program can fully access the data and functions of the web page through the WebAssembly and JavaScript APIs. For such obfuscated mining behavior, the prior art typically uses a detector based on JS code analysis or a WASM instruction detector. Because these detectors work at the bytecode level or on opcodes, they have difficulty determining accurately whether a web page contains malicious mining behavior when obfuscation is used, so the accuracy of webpage detection is low.
Disclosure of Invention
The embodiment of the invention provides a webpage detection method and device based on mining behavior of a WASM module, which can effectively solve the problem in the prior art that, because detection is essentially performed at the bytecode level or on opcodes, it is difficult to accurately detect whether a webpage contains malicious mining behavior when obfuscation techniques are used, so that the accuracy of webpage detection is low.
The embodiment of the invention provides a webpage detection method based on a WASM module mining behavior, which comprises the following steps:
acquiring a target website;
extracting binary files of the WASM module in the target website;
converting the binary file into a visualized RGB image;
inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after performing feature extraction based on the image texture of the RGB image; generating a detection result of the target website according to the target features; wherein the detection result is that mining behavior exists or that mining behavior does not exist.
Preferably, the converting the binary file into a visualized RGB image includes:
extracting byte streams in the binary file, and generating a corresponding two-dimensional matrix from the byte streams;
calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix;
taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value;
and mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image.
Preferably, the calculating the local entropy value corresponding to the two-dimensional matrix includes:
for each value of the two-dimensional matrix taken as a center, obtaining the sub-matrix formed by the values within a preset range around it, and calculating the local entropy value corresponding to each sub-matrix according to the following formula:
H_local=-a*∑(n/g)*log2(n/g)
wherein H_local is the local entropy value corresponding to each submatrix, a is a preset multiple, n is the frequency of each numerical value in the submatrix, and g is the number of the numerical values.
Preferably, the calculating the global entropy value corresponding to the two-dimensional matrix includes:
calculating the global entropy value corresponding to the two-dimensional matrix according to the following formula:
L_local=-a*∑(n/N)*log2(n/N);
wherein L_local is the global entropy value corresponding to the two-dimensional matrix, a is a preset multiple, n is the frequency of each numerical value in the submatrix, and N is the total number of the numerical values of the two-dimensional matrix.
Preferably, the webpage detection model comprises a convolution layer, a pooling layer and a full connection layer;
after the webpage detection model performs feature extraction based on the image texture of the RGB image, generating target features, including:
the convolution layer in the webpage detection model extracts surface layer features and deep layer features of the RGB image based on the image textures of the RGB image, and sends the surface layer features and the deep layer features to the pooling layer;
and the pooling layer performs feature selection on the surface layer features and the deep layer features, and generates target features after downsampling the RGB image.
Preferably, the generating a detection result of the target website according to the target feature includes:
and classifying the target features through a full connection layer in the webpage detection model to generate a detection result of the target website.
Preferably, the generating of the web page detection model includes:
taking an RGB image corresponding to a sample WASM module as input of a convolutional neural network, taking an actual detection result of the sample WASM module as output of the convolutional neural network, and performing iterative training on the convolutional neural network;
and when the convergence of the convolutional neural network is detected, taking the convolutional neural network after training as the webpage detection model.
A webpage detection device based on mining behavior of a WASM module includes: a website acquisition module, a binary file extraction module, an image conversion module and a detection result generation module;
the website acquisition module is used for acquiring a target website;
the binary file extraction module is used for extracting binary files of the WASM module in the target website;
the image conversion module is used for converting the binary file into a visualized RGB image;
the detection result generation module is used for inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after performing feature extraction based on the image texture of the RGB image, and for generating a detection result of the target website according to the target features; wherein the detection result is that mining behavior exists or that mining behavior does not exist.
The invention has the following beneficial effects:
The embodiment of the invention provides a webpage detection method and device based on mining behavior of a WASM module, comprising the following steps: acquiring a target website; extracting the binary file of the WASM module in the target website; converting the binary file into a visualized RGB image; inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after performing feature extraction based on the image texture of the RGB image; and generating a detection result of the target website according to the target features, wherein the detection result is that mining behavior exists or that mining behavior does not exist. Compared with the prior art, the invention converts the binary file of the WASM module into a visualized RGB image and then performs feature extraction based on the image texture of the RGB image through the webpage detection model. Because different binary files generate different images, the visualized RGB image of a WASM module with malicious mining behavior differs in texture features from that of a benign WASM module without such behavior. After the webpage detection model extracts image features based on image texture, it can therefore distinguish the detection results corresponding to different images. In this way the invention detects whether a WASM module has mining behavior through its corresponding RGB image and avoids the inaccuracy caused by detection performed only at the bytecode level. Because detection is performed on the visualized RGB image converted from the binary file of the WASM module rather than on the code of the WASM module, mining behavior whose code has been compiled into obfuscated form can also be detected well, which improves the accuracy of webpage detection.
Drawings
Fig. 1 is a flow chart of a web page detection method based on a mining behavior of a WASM module according to an embodiment of the present invention.
Fig. 2 is a network architecture diagram of a web page detection model according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of classification accuracy of a web page detection model according to an embodiment of the invention.
Fig. 4 is a schematic diagram of a classification metric of a web page detection model according to an embodiment of the invention.
Fig. 5 is a schematic structural diagram of a web page detection device based on a mining behavior of a WASM module according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a schematic flow chart of a web page detection method based on a mining behavior of a WASM module according to an embodiment of the present invention, where the web page detection method includes:
step S1: acquiring a target website;
step S2: extracting binary files of the WASM module in the target website;
step S3: converting the binary file into a visualized RGB image;
step S4: inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after performing feature extraction based on the image texture of the RGB image; generating a detection result of the target website according to the target features; wherein the detection result is that mining behavior exists or that mining behavior does not exist.
For step S1, in a preferred embodiment, the webpage detection model is simple to use: once the model has been trained, the URL of a website is input and the files of its WASM module are extracted for detection, which greatly reduces the difficulty of analysis and speeds it up.
For step S2, in a preferred embodiment, extracting the binary file of the WASM module in the target website includes:
acquiring the target website input by a user, and collecting the WASM module in the target website to obtain the WASM binary file to be detected.
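The patent does not say how the WASM module is collected from the page. As a minimal sketch under that caveat, the snippet below scans page or script text for quoted references ending in `.wasm` and checks downloaded bytes against the four-byte magic number `\0asm` that begins every WebAssembly binary. The function names and the regular expression are hypothetical; pages that assemble the module URL dynamically in JavaScript would need a real browser to observe the request.

```python
import re

WASM_MAGIC = b"\x00asm"  # first four bytes of every WebAssembly binary module

# Hypothetical heuristic: quoted strings ending in ".wasm" (optionally with a query).
WASM_REF_PATTERN = re.compile(r"""["']([^"'\s]+\.wasm)(?:\?[^"']*)?["']""")


def find_wasm_references(page_source):
    """Return the distinct .wasm paths or URLs referenced in page or script text."""
    seen = []
    for match in WASM_REF_PATTERN.finditer(page_source):
        ref = match.group(1)
        if ref not in seen:
            seen.append(ref)
    return seen


def is_wasm_binary(data):
    """Check the magic number that starts a WebAssembly binary."""
    return data[:4] == WASM_MAGIC
```

Downloading each reference (resolved against the page URL) and keeping only payloads for which `is_wasm_binary` returns true would yield the binaries passed on to step S3.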
For step S3, in a preferred embodiment, said converting said binary file into a visualized RGB image comprises:
extracting byte streams in the binary file, and generating a corresponding two-dimensional matrix from the byte streams;
calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix; specifically, for each value of the two-dimensional matrix taken as a center, the sub-matrix formed by the values within a preset range around it is obtained, and the local entropy value corresponding to each sub-matrix is calculated according to the following formula:
H_local=-a*∑(n/g)*log2(n/g)
wherein H_local is the local entropy value corresponding to each submatrix, a is a preset multiple, n is the frequency of each numerical value in the submatrix, and g is the number of the numerical values;
the global entropy value corresponding to the two-dimensional matrix is calculated according to the following formula:
L_local=-a*∑(n/N)*log2(n/N);
wherein L_local is the global entropy value corresponding to the two-dimensional matrix, a is a preset multiple, n is the frequency of each numerical value in the submatrix, and N is the total number of the numerical values of the two-dimensional matrix.
Taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value;
and mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image.
Specifically, in a preferred embodiment, firstly, the data of the binary file of the WASM module is stored in a one-dimensional array, and the length of the one-dimensional array is L;
converting the one-dimensional array into a two-dimensional matrix of L, and deleting redundant bytes;
the values of three channels are calculated: taking each numerical value of the two-dimensional matrix as an R channel; taking 60 times of the local entropy of each numerical value as a G channel; the calculation process of the local entropy comprises the following steps:
a 9-value submatrix centered on the current value of the matrix is obtained, and the local entropy of the submatrix is calculated according to the following formula:
H_local=-60*∑(n/9)*log2(n/9);
where n is the frequency of each value in the submatrix. The formula multiplies the probability (n/9) of each value in the submatrix by its corresponding information content (-log2(n/9)) and sums over all values in the submatrix to obtain the local entropy H_local.
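A short worked example may make the scale of the G channel concrete. It assumes the sum runs over the distinct values in the 9-value submatrix, as in ordinary Shannon entropy; the text does not state this explicitly.

```latex
\begin{aligned}
\text{all nine values equal } (n = 9):\quad
  & H_{local} = -60 \cdot \tfrac{9}{9}\log_2\tfrac{9}{9} = 0 \\
\text{three distinct values, each with } n = 3:\quad
  & H_{local} = -60 \cdot 3 \cdot \tfrac{3}{9}\log_2\tfrac{3}{9} = 60\log_2 3 \approx 95.1 \\
\text{nine distinct values, each with } n = 1:\quad
  & H_{local} = -60 \cdot 9 \cdot \tfrac{1}{9}\log_2\tfrac{1}{9} = 60\log_2 9 \approx 190.2
\end{aligned}
```

Under this reading the local entropy stays between 0 and about 190.2, so with the multiple 60 the G channel fits within the 0 to 255 range of an 8-bit image channel.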
Taking 60 times of global entropy of each numerical value as a B channel; the calculation process of the global entropy comprises the following steps:
L_local=-60*∑(n/N)*log2(n/N);
where n is the frequency of each value in the two-dimensional matrix and N is the total number of values in the two-dimensional matrix. The formula multiplies the probability (n/N) of each value in the two-dimensional matrix by its corresponding information content (-log2(n/N)) and sums over all values in the two-dimensional matrix to obtain the global entropy L_local.
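The global entropy formula can be sketched in a few lines of Python. This is an illustration only: the function name `global_entropy` is hypothetical, and the sum is assumed to run over the distinct values of the matrix.

```python
import math
from collections import Counter


def global_entropy(values, a=60):
    """Scaled Shannon entropy of all matrix values: -a * sum((n/N) * log2(n/N))."""
    total = len(values)              # N: total number of values
    if total == 0:
        return 0.0
    counts = Counter(values)         # n: frequency of each distinct value
    return -a * sum((n / total) * math.log2(n / total) for n in counts.values())


print(global_entropy([0, 0, 1, 1]))      # two equally likely values: 60.0
print(global_entropy(list(range(256))))  # every byte value once: 480.0
```

Because byte data can carry up to 8 bits of entropy, the scaled value can reach 480, which exceeds 255. The text does not say how such values are mapped into the B channel, so any clipping or rescaling is a choice left to the implementer.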
The above R, G, B values are mapped into three channels to form an RGB image.
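The whole conversion can be sketched end to end as follows. Several details are assumptions rather than statements from the text: the matrix is taken to be square with side floor(sqrt(L)), border pixels reuse the nearest in-range cell for their 3 x 3 window, a single global entropy value is shared by every pixel, and G and B are rounded and clipped to 255. All function names are hypothetical.

```python
import math
from collections import Counter


def scaled_entropy(values, a=60):
    """-a * sum((n/len) * log2(n/len)) over the distinct values."""
    total = len(values)
    return -a * sum((n / total) * math.log2(n / total)
                    for n in Counter(values).values())


def wasm_bytes_to_rgb(data, a=60):
    """Return a side x side grid of (R, G, B) tuples built from raw bytes."""
    side = math.isqrt(len(data))        # assumption: square matrix
    if side == 0:
        return []
    trimmed = data[: side * side]       # delete the redundant trailing bytes
    matrix = [list(trimmed[r * side:(r + 1) * side]) for r in range(side)]

    # Assumption: one global entropy value for all pixels, clipped to 255.
    b_value = min(255, round(scaled_entropy(trimmed, a)))

    image = []
    for r in range(side):
        row = []
        for c in range(side):
            # Assumption: 3x3 window; border pixels reuse the nearest valid cell.
            window = [
                matrix[min(max(r + dr, 0), side - 1)][min(max(c + dc, 0), side - 1)]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            ]
            g_value = min(255, round(scaled_entropy(window, a)))
            row.append((matrix[r][c], g_value, b_value))   # R is the raw byte
        image.append(row)
    return image
```

For example, `wasm_bytes_to_rgb(bytes(range(9)))` produces a 3 x 3 grid whose center pixel is `(4, 190, 190)`, since both the window and the whole matrix contain nine distinct values.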
For visual analysis of the image, the invention provides a new method for converting a binary file into a visualized RGB image. Combining local entropy with global entropy preserves more local and global feature information, so that a more accurate classification result can be output when the image features are later extracted for classification.
The invention visualizes the features of mining WASM modules in order to detect mining webpages. Compared with the existing WASM feature grayscale method, the image texture is clearer, the difference in image contour between benign and malicious samples is larger, and the overall performance is good.
For step S4, in a preferred embodiment, inputting the RGB image into the webpage detection model, so that the webpage detection model generates target features after feature extraction based on the image texture of the RGB image, and generating a detection result of the target website according to the target features, includes the following:
As shown in FIG. 2, the webpage detection model of the present invention comprises convolution layers, pooling layers and full connection layers; the network structure adopts three convolution layers, three pooling layers and two full connection layers.
The convolution layer in the webpage detection model extracts surface layer features and deep layer features of the RGB image based on the image textures of the RGB image, and sends the surface layer features and the deep layer features to the pooling layer;
the pooling layer performs feature selection on the surface layer features and the deep layer features, and generates target features after downsampling the RGB image;
and finally, classifying the target features through a full connection layer in the webpage detection model to generate a detection result of the target website.
Schematically, the convolution layer is the core of the convolutional neural network corresponding to the webpage detection model and is mainly used to extract the surface layer features and deep layer features of the image. In the invention, the first convolution layer uses 20 kernels of size 3 x 3, the second uses 50 kernels of size 3 x 3, and the third uses 100 kernels of size 3 x 3. The pooling layer is mainly used for feature selection and for downsampling the image; the pooling layers used in the invention have a size of 2 x 2 and a stride of 2. After feature extraction and feature selection by the convolution and pooling layers, the features are sent to the full connection layers. The full connection layers mainly reduce the influence of spatial position in the image on the features and classify the samples. The invention uses two full connection layers, of sizes 1 x 256 and 1 x 2 respectively, which finally output the classification of the image.
In a preferred embodiment, the output of each layer of the convolutional neural network is non-linearly transformed by the ReLU function, so as to improve the accuracy of the output classification result.
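To see what these layer sizes imply, the sketch below traces the feature-map shapes and counts the trainable parameters. It assumes a 100 x 100 three-channel input (the size given in the training description), stride-1 convolutions without padding, and one bias per kernel or output unit; none of these three details is stated for the network itself, and the function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size=100, in_channels=3):
    """Follow a square input through conv(3x3) -> pool(2x2, stride 2), three times."""
    size, channels, params, shapes = input_size, in_channels, 0, []
    for out_channels in (20, 50, 100):             # kernel counts from the text
        size = conv_out(size, kernel=3)            # assumption: stride 1, no padding
        params += (3 * 3 * channels + 1) * out_channels   # weights plus biases
        channels = out_channels
        size = conv_out(size, kernel=2, stride=2)  # 2x2 pooling with stride 2
        shapes.append((channels, size, size))
    flat = channels * size * size                  # flattened input to the first FC layer
    for width in (256, 2):                         # the two full connection layers
        params += (flat + 1) * width
        flat = width
    return shapes, params


shapes, params = trace_shapes()
print(shapes)   # [(20, 49, 49), (50, 23, 23), (100, 10, 10)]
print(params)   # 2615480
```

Under these assumptions nearly all of the roughly 2.6 million parameters sit in the first full connection layer, which maps the 10,000 flattened features to 256 units.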
In a preferred embodiment, the generation of the web page detection model of the present invention includes:
taking an RGB image corresponding to a sample WASM module as input of a convolutional neural network, taking an actual detection result of the sample WASM module as output of the convolutional neural network, and performing iterative training on the convolutional neural network;
and when the convergence of the convolutional neural network is detected, taking the convolutional neural network after training as the webpage detection model.
Specifically, the RGB images corresponding to the sample WASM modules are normalized to a uniform size of 100 x 100 by using a double interpolation algorithm and stored, and the normalized sample images are then input into the convolutional neural network model for training. The RGB images corresponding to benign sample WASM modules and malicious sample WASM modules are divided into a training set and a test set at a ratio of 8:2 to train the convolutional neural network. Schematically, the classification accuracy of the trained convolutional neural network and its classification metrics are shown in Fig. 3 and Fig. 4. In a preferred embodiment, after parameter tuning of the final model, the learning rate is finally determined to be 0.0011.
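The 8:2 division can be sketched as a shuffled split of labelled samples. The function name, the fixed seed and the `(image, label)` tuple layout are assumptions for illustration; the text does not say whether the split is random or stratified by class.

```python
import random


def split_8_2(samples, train_ratio=0.8, seed=0):
    """Shuffle labelled samples and split them into training and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # fixed seed (assumption) for repeatability
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample list: (image identifier, label) with 1 = mining, 0 = benign.
samples = [(f"img{i}", i % 2) for i in range(10)]
train, test = split_8_2(samples)
print(len(train), len(test))   # 8 2
```

Splitting the benign and malicious samples separately and concatenating the results would keep the class balance identical in both sets, which may matter when one class is much rarer than the other.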
After model training is finished, a webpage URL can be provided to the detection model or system. The system extracts the WASM binary file of the provided webpage, inputs the corresponding image into the trained model, and determines whether the WASM binary file is malicious or benign, thereby indicating whether or not the corresponding webpage has mining behavior.
Compared with the existing WASM feature grayscale method, the invention extracts the binary file of a WASM module containing mining opcodes, converts it into a visualized RGB image, and performs detection at the image level through the webpage detection model. The method is not easily affected by obfuscation techniques, so it detects new mining webpages based on WASM technology more effectively. Because the visualized RGB images of malicious and benign WASM modules differ in texture features, the trained webpage detection model can identify from the image more accurately whether mining behavior exists. The final model performance therefore meets expectations, with better overall detection performance and more accurate detection results.
As shown in Fig. 5, on the basis of the above embodiments of the webpage detection method based on mining behavior of a WASM module, the present invention correspondingly provides a device embodiment;
an embodiment of the present invention provides a web page detection device based on a WASM module mining behavior, including: the system comprises a website acquisition module, a binary file extraction module, an image conversion module and a detection result generation module;
the website acquisition module is used for acquiring a target website;
the binary file extraction module is used for extracting binary files of the WASM module in the target website;
the image conversion module is used for converting the binary file into a visualized RGB image;
the detection result generation module is used for inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after performing feature extraction based on the image texture of the RGB image, and for generating a detection result of the target website according to the target features; wherein the detection result is that mining behavior exists or that mining behavior does not exist.
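One way to picture how the four modules cooperate is as a small pipeline object with one pluggable function per module. This is only a structural sketch: the class and field names are hypothetical, and treating the site as a mining site when any one of its WASM binaries is classified as mining is an assumption the text does not state.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WebpageDetectionDevice:
    acquire_website: Callable[[], str]               # website acquisition module
    extract_binaries: Callable[[str], List[bytes]]   # binary file extraction module
    to_rgb_image: Callable[[bytes], object]          # image conversion module
    classify: Callable[[object], bool]               # detection model, True = mining

    def detect(self) -> bool:
        """Report mining behavior if any WASM binary of the site is classified as mining."""
        url = self.acquire_website()
        binaries = self.extract_binaries(url)
        return any(self.classify(self.to_rgb_image(b)) for b in binaries)
```

Keeping each module behind a plain callable lets the extraction step, the image conversion and the trained model be replaced or tested independently.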
It should be noted that the above-described apparatus embodiments are merely illustrative, and the units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the device provided by the invention, the connection relation between the modules represents that the modules have communication connection, and can be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
It will be clearly understood by those skilled in the art that, for convenience and brevity, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.

Claims (5)

1. A webpage detection method based on a WASM module mining behavior is characterized by comprising the following steps:
acquiring a target website;
extracting binary files of the WASM module in the target website;
extracting byte streams in the binary file, and generating a corresponding two-dimensional matrix from the byte streams;
calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix; taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value; mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image;
inputting the RGB image into a webpage detection model, so that the webpage detection model generates target features after performing feature extraction based on the image texture of the RGB image; generating a detection result of the target website according to the target features; wherein the detection result is that mining behavior exists or that mining behavior does not exist;
the calculating the local entropy value corresponding to the two-dimensional matrix comprises the following steps:
obtaining a plurality of sub-matrixes of target values of the central value of the two-dimensional matrix within a preset range, and calculating a local entropy value corresponding to each sub-matrix according to the following formula:
H_local = -a*
wherein H_local is the local entropy value corresponding to each sub-matrix, a is a preset multiple, i is the calculated times, b is the total number of numerical values in the sub-matrix,for the frequency of each numerical value in the submatrix, g is the number of target numerical values;
the calculating the global entropy value corresponding to the two-dimensional matrix comprises the following steps:
calculating a global entropy value corresponding to the two-dimensional matrix according to the following formula:
L_local=-a*
wherein L_local is the global entropy corresponding to the two-dimensional matrix, a is a preset multiple, i is the calculated times,for the frequency of each numerical value in the submatrix, N is the total number of numerical values of the two-dimensional matrix.
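The image-conversion steps of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the entropy formulas take the standard Shannon form with a base-2 logarithm scaled by the preset multiple `a`, and the names `bytes_to_matrix`, `local_entropy_map` and `to_rgb`, the matrix width, the window radius and the scaling of entropy to the 0..255 channel range are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def bytes_to_matrix(data: bytes, width: int = 16) -> list[list[int]]:
    """Reshape a byte stream into rows of `width` values (last row zero-padded)."""
    if not data:
        return []
    padded = data + bytes((-len(data)) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def entropy(values, a: float = 1.0) -> float:
    """Assumed form: -a * sum(p * log2(p)), p = frequency / total count."""
    total = len(values)
    counts = Counter(values)
    return -a * sum((n / total) * math.log2(n / total) for n in counts.values())


def local_entropy_map(matrix, radius: int = 1, a: float = 1.0):
    """Entropy of the window within `radius` of each cell, clipped at borders."""
    rows, cols = len(matrix), len(matrix[0])
    result = []
    for r in range(rows):
        line = []
        for c in range(cols):
            window = [matrix[y][x]
                      for y in range(max(0, r - radius), min(rows, r + radius + 1))
                      for x in range(max(0, c - radius), min(cols, c + radius + 1))]
            line.append(entropy(window, a))
        result.append(line)
    return result


def to_rgb(matrix, radius: int = 1, a: float = 1.0):
    """R = byte value, G = local entropy, B = global entropy (both scaled to 0..255)."""
    max_h = 8.0 * a  # upper bound for byte data: log2(256) = 8, times the multiple a

    def scale(h: float) -> int:
        return round(255 * h / max_h)

    blue = scale(entropy([v for row in matrix for v in row], a))
    local = local_entropy_map(matrix, radius, a)
    return [[(matrix[r][c], scale(local[r][c]), blue)
             for c in range(len(matrix[0]))]
            for r in range(len(matrix))]
```

In this sketch the B channel is constant across the image because a single global entropy is computed for the whole matrix, while the G channel varies with the neighbourhood of each byte.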
2. The webpage detection method based on the mining behavior of a WASM module according to claim 1, wherein the webpage detection model comprises a convolution layer, a pooling layer and a fully connected layer;
after the webpage detection model performs feature extraction based on the image texture of the RGB image, generating target features, including:
the convolution layer in the webpage detection model extracts surface layer features and deep layer features of the RGB image based on the image textures of the RGB image, and sends the surface layer features and the deep layer features to the pooling layer;
and the pooling layer performs feature selection on the surface layer features and the deep layer features, and generates target features after downsampling the RGB image.
3. The webpage detection method based on the mining behavior of a WASM module according to claim 2, wherein generating the detection result of the target website according to the target features comprises:
classifying the target features through the fully connected layer in the webpage detection model to generate the detection result of the target website.
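The layer sequence in claims 2 and 3 (convolution, pooling, then classification by a fully connected layer) can be illustrated with a small forward pass. This is only a sketch under stated assumptions: it uses a single convolution kernel with ReLU, whereas the claims refer to both shallow and deep features, which would normally require several stacked convolution layers. The function names, the kernel, the weights, the bias and the 0.5 threshold are hypothetical and untrained, and pixel values are assumed to be normalised to a small range before use.

```python
import math


def conv2d(image, kernel):
    """Valid convolution of an H x W x C image with a k x k x C kernel, then ReLU."""
    k = len(kernel)
    channels = len(kernel[0][0])
    out = []
    for r in range(len(image) - k + 1):
        row = []
        for c in range(len(image[0]) - k + 1):
            s = sum(image[r + i][c + j][ch] * kernel[i][j][ch]
                    for i in range(k) for j in range(k) for ch in range(channels))
            row.append(max(0.0, s))  # ReLU keeps only positive responses
        out.append(row)
    return out


def max_pool(fmap, size: int = 2):
    """Downsample a feature map by taking the maximum of each size x size block."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


def fully_connected(features, weights, bias):
    """Single output neuron with a sigmoid, read as P(mining behavior exists)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def detect(image, kernel, weights, bias, threshold: float = 0.5) -> str:
    """Convolution -> pooling -> flatten -> fully connected classification."""
    pooled = max_pool(conv2d(image, kernel))
    features = [v for row in pooled for v in row]
    p = fully_connected(features, weights, bias)
    return "mining behavior exists" if p >= threshold else "mining behavior does not exist"
```

A real model would learn the kernel and weights during training; here they are passed in explicitly so that the data flow between the three layer types stays visible.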
4. The web page detection method based on the mining behavior of the WASM module as claimed in claim 1, wherein the generating of the web page detection model comprises:
taking an RGB image corresponding to a sample WASM module as input of a convolutional neural network, taking an actual detection result of the sample WASM module as output of the convolutional neural network, and performing iterative training on the convolutional neural network;
and when the convergence of the convolutional neural network is detected, taking the convolutional neural network after training as the webpage detection model.
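The train-until-convergence loop of claim 4 can be sketched as below. To keep the example short and dependency-free, a single-layer logistic classifier over precomputed feature vectors stands in for the convolutional neural network; this is a deliberate substitution, not the model described in the claims. The convergence test (change in mean loss below `tol`), the learning rate and the function name `train_until_converged` are assumptions made for illustration.

```python
import math


def train_until_converged(samples, labels, lr=0.5, tol=1e-6, max_epochs=10_000):
    """Gradient descent on mean cross-entropy loss; stops when the loss stops changing.

    samples: list of feature vectors (stand-in for RGB images of sample WASM modules)
    labels:  1 if the sample shows mining behavior, else 0
    Returns (weights, bias, epochs_run, converged).
    """
    n = len(samples[0])
    m = len(samples)
    w, b = [0.0] * n, 0.0
    prev_loss = float("inf")
    for epoch in range(1, max_epochs + 1):
        grad_w, grad_b, loss = [0.0] * n, 0.0, 0.0
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            for i in range(n):
                grad_w[i] += (p - y) * x[i]
            grad_b += p - y
        loss /= m
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
        if abs(prev_loss - loss) < tol:  # convergence detected
            return w, b, epoch, True
        prev_loss = loss
    return w, b, max_epochs, False
```

The returned parameters play the role of the trained model; the `max_epochs` cap is a safeguard for the case where the loss never settles within the tolerance.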
5. A webpage detection device based on the mining behavior of a WASM module, characterized by comprising: a website acquisition module, a binary file extraction module, an image conversion module and a detection result generation module;
the website acquisition module is used for acquiring a target website;
the binary file extraction module is used for extracting binary files of the WASM module in the target website;
the image conversion module is used for extracting byte streams in the binary file and generating corresponding two-dimensional matrixes from the byte streams; calculating a local entropy value corresponding to the two-dimensional matrix and calculating a global entropy value corresponding to the two-dimensional matrix; taking the value in the two-dimensional matrix as an R channel value, taking the local entropy value as a G channel value and taking the global entropy value as a B channel value; mapping the R channel value, the G channel value and the B channel value into corresponding channels to generate a visualized RGB image;
the detection result generation module is used for inputting the RGB image into a webpage detection model, so that the webpage detection model extracts features based on the image texture of the RGB image and generates target features, and for generating a detection result for the target website according to the target features, wherein the detection result is either that mining behavior exists or that mining behavior does not exist;
the calculating the local entropy value corresponding to the two-dimensional matrix comprises the following steps:
obtaining a plurality of sub-matrices, each consisting of the target values within a preset range of a central value of the two-dimensional matrix, and calculating the local entropy value corresponding to each sub-matrix according to the following formula:
H_local = -a*
wherein H_local is the local entropy value corresponding to each sub-matrix, a is a preset multiple, i is the calculated times, b is the total number of numerical values in the sub-matrix,for the frequency of each numerical value in the submatrix, g is the number of target numerical values;
the calculating the global entropy value corresponding to the two-dimensional matrix comprises the following steps:
calculating a global entropy value corresponding to the two-dimensional matrix according to the following formula:
L_local=-a*
wherein L_local is the global entropy corresponding to the two-dimensional matrix, a is a preset multiple, i is the calculated times,for the frequency of each numerical value in the submatrix, N is the total number of numerical values of the two-dimensional matrix.
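The four modules of claim 5 can be wired together roughly as shown here. This is a hedged sketch: the class `WebpageDetectionDevice`, its field names and the helper `extract_wasm_binaries` are hypothetical, and the claims do not say how WASM binaries are recognised. The example assumes the site's resources have already been fetched into a mapping of URL to bytes and identifies WASM modules by the four-byte magic number `\0asm` that begins every WebAssembly binary.

```python
from dataclasses import dataclass
from typing import Callable

WASM_MAGIC = b"\x00asm"  # first four bytes of a WebAssembly binary module


def extract_wasm_binaries(resources: dict[str, bytes]) -> list[bytes]:
    """Binary file extraction: keep only resources that start with the WASM magic."""
    return [body for body in resources.values() if body.startswith(WASM_MAGIC)]


@dataclass
class WebpageDetectionDevice:
    # Each field corresponds to one module named in the claim.
    acquire_site: Callable[[str], dict[str, bytes]]              # website acquisition
    extract_binaries: Callable[[dict[str, bytes]], list[bytes]]  # binary file extraction
    to_image: Callable[[bytes], object]                          # image conversion
    classify: Callable[[object], bool]                           # detection result generation

    def detect(self, url: str) -> str:
        """Run the modules in order; report mining if any WASM module is flagged."""
        resources = self.acquire_site(url)
        for binary in self.extract_binaries(resources):
            if self.classify(self.to_image(binary)):
                return "mining behavior exists"
        return "mining behavior does not exist"
```

Passing the modules in as callables keeps the sketch independent of any particular crawler or model, so each stage can be replaced or stubbed out separately.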

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310900991.8A CN116996278B (en) 2023-07-21 2023-07-21 Webpage detection method and device based on mining behavior of WASM module

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310900991.8A CN116996278B (en) 2023-07-21 2023-07-21 Webpage detection method and device based on mining behavior of WASM module

Publications (2)

Publication Number Publication Date
CN116996278A (en) 2023-11-03
CN116996278B (en) 2024-01-19

Family

ID=88522527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310900991.8A Active CN116996278B (en) 2023-07-21 2023-07-21 Webpage detection method and device based on mining behavior of WASM module

Country Status (1)

Country Link
CN (1) CN116996278B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062478A (en) * 2018-01-04 2018-05-22 北京理工大学 The malicious code sorting technique that global characteristics visualization is combined with local feature
CN108846284A (en) * 2018-06-29 2018-11-20 浙江工业大学 A kind of Android malicious application detection method based on bytecode image and deep learning
US10162967B1 (en) * 2016-08-17 2018-12-25 Trend Micro Incorporated Methods and systems for identifying legitimate computer files
KR101922956B1 (en) * 2018-08-07 2019-02-27 (주)케이사인 Method of detecting malware based on entropy count map of low dimensional number
CN111585961A (en) * 2020-04-03 2020-08-25 北京大学 Webpage mining attack detection and protection method and device
CN112214766A (en) * 2020-10-12 2021-01-12 杭州安恒信息技术股份有限公司 Method and device for detecting mining trojans, electronic device and storage medium
WO2022089763A1 (en) * 2020-10-30 2022-05-05 Inlyse Gmbh Method for detection of malware
CN116208356A (en) * 2022-10-27 2023-06-02 浙江大学 Virtual currency mining flow detection method based on deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220311782A1 (en) * 2021-03-24 2022-09-29 Mayachitra, Inc. Malware detection using frequency domain-based image visualization and deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
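Malicious code detection based on multi-channel image deep learning; Jiang Kaolin (蒋考林) et al.; 《计算机应用》 (Journal of Computer Applications); full text *
Android malicious application detection based on bytecode images and deep learning; Chen Tieming (陈铁明) et al.; 《研究与开发》 (Research and Development); full text *
Obfuscation-resistant image texture feature description method for malicious code; Liu Yashu (刘亚姝) et al.; 《通信学报》 (Journal on Communications); full text *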

Also Published As

Publication number Publication date
CN116996278A (en) 2023-11-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant