CN114595451A - Graph convolution-based android malicious application classification method - Google Patents

Publication number
CN114595451A
Authority
CN
China
Prior art keywords
program block
malicious code
malicious
graph
code program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210158644.8A
Other languages
Chinese (zh)
Inventor
林飞
尹修恒
易永波
古元
毛华阳
华仲峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Act Technology Development Co ltd
Original Assignee
Beijing Act Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Act Technology Development Co ltd
Priority to CN202210158644.8A
Publication of CN114595451A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Virology (AREA)
  • Stored Programmes (AREA)

Abstract

A graph convolution-based android malicious application classification method, relating to the technical field of information, comprises the following steps: 1) decompiling the APK file; 2) extracting the calling relation of the malicious code program block and the instruction distribution characteristics of the malicious code program block; 3) selecting a key program block according to the importance degree associated with the malicious code program block; 4) performing dimensionality reduction and nonlinear transformation on the instruction distribution characteristics of the malicious code program block; 5) fusing the embedded characteristic of the calling relation graph of the malicious code program block with the instruction distribution characteristic of the transformed malicious code program block; 6) establishing a graph convolution neural network model; 7) sorting and screening nodes; 8) carrying out family classification on the malicious software by the convolutional neural network. The method constructs a graph from the calling relation and the program block characteristics, and establishes a graph neural network model that makes full use of the graph characteristics to identify and classify malicious applications, which effectively improves the identification accuracy.

Description

Graph convolution-based android malicious application classification method
Technical Field
The invention relates to the technical field of information.
Background
Android is a free and open source operating system whose development is led by Google (USA) and the Open Handset Alliance; it is mainly used on mobile devices such as smart phones and tablet computers. With the rapid development of the mobile internet, the number of android terminal users has grown greatly and malicious applications have proliferated, including the well-known Trojans, viruses, ransomware, adware, spyware, and the like. Mobile malware detection has become a hot topic in the field of network security.
At present, malicious software detection mainly adopts a static method and a dynamic method. Static analysis mainly extracts characteristics of an android program such as its permissions, API sequences, and code, while dynamic analysis needs to simulate the program running on a real device and capture the behavior characteristics of the software as the basis for analysis. Compared with dynamic analysis, static analysis is fast and occupies fewer resources; it mainly comprises a detection method based on feature codes and a detection method based on machine learning.
The feature code-based malicious software detection method mainly comprises the steps of extracting feature codes of target software to be detected, matching the feature codes with a known malicious software feature code recognition library, defining the target software as malicious software if matching is successful, and defining the target software as normal software if matching is not successful. Common feature codes mainly include digital signatures of android applications, common API functions and sensitive permissions of malicious software, and the like.
The detection method based on machine learning analyzes and extracts features of a program along different dimensions, represents each application as a multi-dimensional vector, and trains a machine learning classification algorithm on a training set, so that a classifier is constructed to predict whether an unknown sample is malicious. Machine learning classification algorithms include support vector machines, random forests, neural networks, and the like. Common feature dimensions include permissions, components, APIs, and APP display information, among others.
The identification method based on feature codes identifies malicious software by comparison against a malicious software feature code database, and is fast, accurate, and highly interpretable. However, it depends strongly on the feature code database; as novel malicious software continuously emerges, the database needs to be continuously updated, which consumes a large amount of labor and time. In addition, feature codes are easily changed through techniques such as obfuscation, so that malicious software evades detection.
The algorithm based on machine learning represents malicious software in a multi-dimensional vector mode through feature extraction, and then trains a classifier to recognize. The method can quickly find the variant application of known malicious families, and can deeply analyze the characteristics and screen important characteristics. However, the method only analyzes the static information of the programs and ignores the calling relation among the software programs.
The invention provides a malicious application identification method based on graph convolution, which is characterized in that Dalvik byte codes are obtained by decompiling APK files, program blocks are divided according to the Dalvik instruction execution sequence, each program block has different instructions, calling relations exist among the program blocks, a graph convolution neural network model is established, and malicious applications are classified.
Prior Art
Dalvik is a virtual machine specially designed by Google for the android operating system, and is deeply optimized. Dalvik bytecode has only two kinds of types: primitive types and reference types. Objects and arrays are both reference types, and the type descriptors in Dalvik follow the same rules as the descriptors in the JVM.
Dalvik bytecode has its own instruction set, similar to assembly language. A Dalvik instruction consists of an operation code and its operands. The Dalvik instructions in a function can be divided into basic blocks according to their execution order, each basic block consisting of several Dalvik instructions; Dalvik bytecode has 244 different instructions. Smali files store Dalvik bytecode; Smali supports annotations, debugging information, and line number information, supports the basic features of Java, and is generally used for reverse engineering of android programs.
The article "Android native code control flow chart extraction method based on symbolic execution" was published in the Journal of Network and Information Security, July 2017, Volume 3, Issue 7. The article presents a method that executes Dalvik bytecode using a symbolic substitution method in order to extract the program call graph and the instruction distribution characteristics within a program block.
Disclosure of Invention
In view of the defects of the prior art, the graph convolution-based android malicious application classification method provided by the invention comprises the following steps: 1) decompiling the APK file; 2) extracting the calling relation of the malicious code program block and the instruction distribution characteristics of the malicious code program block; 3) selecting a key program block according to the importance degree associated with the malicious code program block; 4) performing dimensionality reduction and nonlinear transformation on instruction distribution characteristics of the malicious code program block; 5) the embedded characteristic of the calling relation graph of the malicious code program block is fused with the instruction distribution characteristic of the transformed malicious code program block; 6) establishing a graph convolution neural network model; 7) sorting and screening nodes; 8) carrying out family classification on the malicious software by the convolutional neural network;
the graph convolution-based android malicious application classification method comprises the following specific implementation steps:
1) decompiling APK files
Decompiling the android application program by using an apktool tool to obtain Smali intermediate code;
2) extracting calling relation of malicious code program block and instruction distribution characteristic of malicious code program block
The Smali file stores Dalvik byte codes, and a known malicious code library is utilized to compare Smali intermediate codes to find out malicious codes; executing a Dalvik byte code by using a symbol substitution method, thereby extracting a program calling relationship, defining the calling relationship between a malicious code program block and a non-malicious code program block, marking and numbering the malicious code program block, and marking and numbering the non-malicious code program block which has a direct calling relationship with a malicious code; marking and numbering non-malicious code program blocks with indirect calling relation with malicious codes; 244 different instructions exist in the Dalvik byte code, and the instruction distribution characteristics of the malicious code program block are determined according to the instructions in the malicious code program block;
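The instruction distribution characteristic of step 2) can be pictured as an opcode histogram over one program block. The following is a minimal sketch, assuming the program block is available as a list of Smali text lines. The list `OPCODES` is a small illustrative subset rather than the full Dalvik instruction set, and the function name `instruction_distribution` is hypothetical; neither comes from the patent text.

```python
from collections import Counter

# Illustrative subset of Dalvik opcodes (an assumption for this sketch);
# the method described above uses one slot per instruction of the full set.
OPCODES = ["nop", "move", "const/4", "invoke-virtual", "invoke-static",
           "return-void", "if-eqz", "goto"]
OPCODE_INDEX = {name: idx for idx, name in enumerate(OPCODES)}


def instruction_distribution(smali_lines):
    """Count opcode occurrences in one program block, one slot per opcode."""
    counts = Counter()
    for line in smali_lines:
        stripped = line.strip()
        # Skip blank lines, directives (.line), labels (:cond_0), comments (#).
        if not stripped or stripped[0] in ".:#":
            continue
        opcode = stripped.split()[0]
        if opcode in OPCODE_INDEX:
            counts[opcode] += 1
    return [counts[name] for name in OPCODES]


block = [
    "    .line 12",
    "    const/4 v0, 0x0",
    "    invoke-virtual {p0}, Lcom/example/A;->b()V",
    "    if-eqz v0, :cond_0",
    "    invoke-virtual {p0}, Lcom/example/A;->c()V",
    "    :cond_0",
    "    return-void",
]
print(instruction_distribution(block))  # -> [0, 0, 1, 2, 0, 1, 1, 0]
```

With the full instruction set in `OPCODES`, the returned list would have one entry per Dalvik instruction, which is the vector that step 4) later reduces in dimension.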
3) selecting a key program block according to the importance degree associated with the malicious code program block;
the importance of each program block to the malicious code family is calculated:
$$TF_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
$$IDF_{i} = \log \frac{|D|}{|\{d \in D : i \in d\}|}$$
TF-IDF=TF*IDF;
wherein $n_{i,j}$ is the number of occurrences of program block i in malicious family j, $\sum_{k} n_{k,j}$ is the sum of the number of occurrences of all program blocks in the files of family j, $|D|$ is the total number of malicious applications, and $|\{d \in D : i \in d\}|$ is the number of malicious applications containing program block i;
after the importance of each program block to the malicious code family is calculated, the program blocks are ranked by that importance, with the ranking used as the threshold: program blocks ranked outside the first three quarters are eliminated, and the remaining program blocks form the malicious code program block calling relation graph;
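The selection in step 3) can be sketched in a few lines. The sketch below assumes the standard unsmoothed TF-IDF form, represents each malicious application as a set of program block identifiers, and rounds the three-quarters cut upward; the function names, the toy data, and the rounding choice are assumptions made for illustration only.

```python
import math
from collections import Counter


def tf_idf_scores(family_blocks, apps):
    """family_blocks: block ids occurring in one malicious family (with repeats).
    apps: list of sets, each holding the block ids of one malicious application."""
    counts = Counter(family_blocks)
    total = sum(counts.values())
    scores = {}
    for block_id, n in counts.items():
        tf = n / total
        containing = sum(1 for app in apps if block_id in app)
        # Assumption: unsmoothed IDF; a block found in no application scores 0.
        idf = math.log(len(apps) / containing) if containing else 0.0
        scores[block_id] = tf * idf
    return scores


def keep_top_three_quarters(scores):
    """Rank blocks by score (ties broken by id) and keep the first 3/4."""
    ranked = sorted(scores, key=lambda b: (-scores[b], b))
    keep = math.ceil(len(ranked) * 3 / 4)  # rounding up is an assumption
    return ranked[:keep]


family_blocks = ["b1", "b1", "b1", "b2", "b3", "b4"]
apps = [{"b1", "b2"}, {"b1"}, {"b3"}, {"b1", "b4"}]
scores = tf_idf_scores(family_blocks, apps)
print(keep_top_three_quarters(scores))
```

In this toy data, `b1` is frequent in the family but also appears in most applications, so its inverse document frequency is low and it is the block that falls outside the top three quarters.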
4) dimensionality reduction and non-linear transformation of malicious code program block instruction distribution characteristics
The Dalvik bytecode has 244 different instructions, so the obtained instruction distribution characteristic of a malicious code program block is a 244-dimensional vector $e_i$.
Dimensionality reduction and nonlinear transformation are performed on the instruction distribution characteristics of the malicious code program block:
$$E_i = g(W e_i)$$
wherein $E_i$ is the instruction distribution characteristic of the transformed malicious code program block, with vector dimension 100; W is a transformation weight parameter matrix obtained by model training; g is an activation function whose role is to apply a nonlinear transformation to the characteristic and thereby enhance the expression capability of the model. The RELU activation function is taken, with the function expression: f (x) = max (0, x);
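The transformation in step 4) amounts to a single linear layer followed by RELU. The following sketch assumes the form E = g(W e) without a bias term and uses small random weights as a stand-in for the trained matrix W; the helper names and the toy input vector are hypothetical.

```python
import random


def relu(x):
    """RELU activation: f(x) = max(0, x)."""
    return max(0.0, x)


def transform(e, W):
    """Compute E = g(W e) with g = RELU; W has one row per output dimension."""
    return [relu(sum(w * x for w, x in zip(row, e))) for row in W]


random.seed(0)
IN_DIM, OUT_DIM = 244, 100
# Stand-in for the trained weight matrix (an assumption of this sketch).
W = [[random.uniform(-0.1, 0.1) for _ in range(IN_DIM)] for _ in range(OUT_DIM)]

e_i = [0.0] * IN_DIM
e_i[3], e_i[10] = 2.0, 1.0  # toy instruction counts for one program block
E_i = transform(e_i, W)
print(len(E_i), min(E_i) >= 0.0)  # -> 100 True
```

In a trained model W would be learned jointly with the rest of the network; the sketch only shows the shapes involved (244 inputs, 100 outputs) and the effect of the activation.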
5) fusion of embedded characteristics of calling relation graph of malicious code program block and instruction distribution characteristics of transformed malicious code program block
Each node in the calling relation graph of the malicious code program block is embedded as a multidimensional vector $G_i$, the graph-embedded feature, whose vector dimension is the same as that of the instruction distribution transformation feature $E_i$; the fused feature is denoted $I_i$:
[Equation image 008: fusion of $G_i$ and $E_i$ into $I_i$ using two weight parameters]
wherein the two weight parameters [equation images 009 and 010] are obtained through model training; the vectors $I_i$ form the feature fusion matrix I, whose dimension is N x d, where N is the number of program block nodes and d is the dimension of node embedding;
6) establishing a graph convolution neural network model
Let A be the adjacency matrix of the calling relation graph of the malicious code program block: $A_{ij}$ = 1 indicates that a calling relationship exists between program block i and program block j, and $A_{ij}$ = 0 indicates that no calling relationship exists. The degree of the nodes of the calling relation graph is defined as D, where the degree of a node is the number of connections associated with that node;
for feature extraction of the graph, a multi-layer neural network structure is used, and each layer is calculated with the mapping function
$$H^{l+1} = f(H^{l}, A)$$
wherein $H^{l}$ is the graph-embedded expression of layer l, and $H^{0}$ is the initialization node expression matrix of layer 0, namely the feature fusion matrix I; f is a non-linear function. The graph convolution network can be regarded as a recursive computation of a plurality of graph convolutions, and the nonlinear function of each layer can be expressed as:
$$f(H^{l}, A) = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{l} W^{l}\right)$$
wherein $\tilde{A} = A + E_N$, $E_N$ is the N-dimensional identity matrix, which is added to the adjacency matrix to add self-connections; $\tilde{D}$ is the in-out degree matrix of all the program block nodes; $W^{l}$ is the weight matrix of layer l; $\sigma$ is an activation function, with the function expression:
[Equation image 016: expression of the activation function $\sigma$]
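One propagation step of step 6) can be sketched with plain Python lists. The sketch assumes the symmetric normalization commonly used for graph convolution networks, treats the calling relation as undirected, takes the degrees from the row sums of the adjacency matrix with self-connections, and uses RELU as the activation because the source's activation expression is not available in the text. The function names and the toy graph are hypothetical.

```python
import math


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def gcn_layer(A, H, W):
    """One step: RELU(D~^-1/2 (A + E_N) D~^-1/2 H W)."""
    n = len(A)
    # Add self-connections: A~ = A + E_N.
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in A_tilde]  # diagonal of D~
    # Symmetric normalization of A~ by the node degrees.
    A_hat = [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]  # RELU (assumed)


# Toy calling relation graph with three program blocks: 0-1 and 1-2.
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
H0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in for the fusion matrix I
W0 = [[1.0, 0.0], [0.0, 1.0]]              # identity weights for readability
H1 = gcn_layer(A, H0, W0)
print(H1[0])
```

Stacking several calls to `gcn_layer`, each with its own weight matrix, gives the recursive multi-layer computation described above; each layer mixes a node's features with those of the program blocks it is connected to.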
7) sorting and screening nodes
The nodes are sorted according to the inner product values of their embedding vectors, and the first 50 nodes with similar values are selected;
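The text does not state which vector each node embedding is multiplied with in step 7). The sketch below therefore makes an explicit assumption: each node is scored by the inner product of its embedding with the mean embedding of the graph, and the k highest-scoring nodes are kept. The function name `select_top_nodes` and the toy matrix are hypothetical.

```python
def select_top_nodes(H, k=50):
    """Return the indices of the k nodes with the largest inner-product score.

    Assumption: the score of a node is the inner product of its embedding
    with the mean embedding of all nodes; ties are broken by node index.
    """
    n, d = len(H), len(H[0])
    mean = [sum(row[c] for row in H) / n for c in range(d)]
    scores = [sum(h * m for h, m in zip(row, mean)) for row in H]
    order = sorted(range(n), key=lambda i: (-scores[i], i))
    return order[:k]


H = [[1.0, 0.0], [2.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
print(select_top_nodes(H, k=2))  # -> [3, 1]
```

When a graph has fewer than k nodes the function returns all of them; padding the result to a fixed size for the later convolution is left out of the sketch.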
8) carrying out family classification on the malicious software by the convolutional neural network;
a convolutional neural network model is established; the embedding matrix of the 50 screened nodes undergoes convolution calculation and is then passed to a fully connected layer to classify the malicious application.
Advantageous effects
Applications in the same malicious family always have many common features and behave similarly, with the vast majority of malware being derivative variants of known malicious families. According to the method, the program similarity of the malicious family is considered, the program block of the malicious code is extracted, the graph is constructed by the calling relation and the program block characteristics, the graph neural network model is established by fully utilizing the graph characteristics to identify the malicious application and classify the malicious application, and the identification accuracy is effectively improved.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
Example one
Referring to fig. 1, the graph convolution-based android malicious application classification method provided by the invention comprises the following steps: s01 decompiling APK files; s02 extracting the calling relation of the malicious code program block and the instruction distribution characteristics of the malicious code program block; s03 selecting a key program block according to the importance degree associated with the malicious code program block; s04, performing dimensionality reduction and nonlinear transformation on the instruction distribution characteristics of the malicious code program block; s05 the embedded characteristic of the calling relation graph of the malicious code program block is fused with the instruction distribution characteristic of the transformed malicious code program block; s06, establishing a graph convolution neural network model; s07 sorting the screening nodes; s08, carrying out family classification on the malicious software by the convolutional neural network;
the graph convolution-based android malicious application classification method comprises the following specific implementation steps:
s01) decompiling APK files
Decompiling the android application program by using the apktool tool to obtain Smali intermediate code;
s02) extracting the calling relation of the malicious code program block and the instruction distribution characteristics of the malicious code program block
The Smali file stores Dalvik byte codes, and a known malicious code library is utilized to compare Smali intermediate codes to find out malicious codes; using a symbol substitution method to execute a Dalvik byte code, thereby extracting a program calling relationship, determining the calling relationship between a malicious code program block and a non-malicious code program block, marking and numbering the malicious code program block, and marking and numbering the non-malicious code program block which has a direct calling relationship with the malicious code; marking and numbering non-malicious code program blocks with indirect calling relation with malicious codes; 244 different instructions exist in the Dalvik byte code, and the instruction distribution characteristics of the malicious code program block are determined according to the instructions in the malicious code program block;
s03) selecting a key program block according to the importance degree associated with the malicious code program block;
the importance of each program block to the malicious code family is calculated:
$$TF_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
$$IDF_{i} = \log \frac{|D|}{|\{d \in D : i \in d\}|}$$
TF-IDF=TF*IDF;
wherein $n_{i,j}$ is the number of occurrences of program block i in malicious family j, $\sum_{k} n_{k,j}$ is the sum of the number of occurrences of all program blocks in the files of family j, $|D|$ is the total number of malicious applications, and $|\{d \in D : i \in d\}|$ is the number of malicious applications containing program block i;
after the importance of each program block to the malicious code family is calculated, the program blocks are ranked by that importance, with the ranking used as the threshold: program blocks ranked outside the first three quarters are eliminated, and the remaining program blocks form the malicious code program block calling relation graph;
S04) performing dimensionality reduction and nonlinear transformation on instruction distribution characteristics of malicious code program blocks
The Dalvik bytecode has 244 different instructions, so the obtained instruction distribution characteristic of a malicious code program block is a 244-dimensional vector $e_i$.
Dimensionality reduction and nonlinear transformation are performed on the instruction distribution characteristics of the malicious code program block:
$$E_i = g(W e_i)$$
wherein $E_i$ is the instruction distribution characteristic of the transformed malicious code program block, with vector dimension 100; W is a transformation weight parameter matrix obtained through model training; g is an activation function whose role is to apply a nonlinear transformation to the characteristic and thereby enhance the expression capability of the model. The RELU activation function is taken, with the function expression: f (x) = max (0, x);
s05) the embedded characteristic of the calling relation graph of the malicious code program block is fused with the instruction distribution characteristic of the transformed malicious code program block
Each node in the calling relation graph of the malicious code program block is embedded as a multidimensional vector $G_i$, the graph-embedded feature, whose vector dimension is the same as that of the instruction distribution transformation feature $E_i$; the fused feature is denoted $I_i$:
[Equation image 008: fusion of $G_i$ and $E_i$ into $I_i$ using two weight parameters]
wherein the two weight parameters [equation images 009 and 010] are obtained through model training; the vectors $I_i$ form the feature fusion matrix I, whose dimension is N x d, where N is the number of program block nodes and d is the dimension of node embedding;
s06) establishing a graph convolution neural network model
Let A be the adjacency matrix of the calling relation graph of the malicious code program block: $A_{ij}$ = 1 indicates that a calling relationship exists between program block i and program block j, and $A_{ij}$ = 0 indicates that no calling relationship exists. The degree of the nodes of the calling relation graph is defined as D, where the degree of a node is the number of connections associated with that node;
for feature extraction of the graph, a multi-layer neural network structure is used, and each layer is calculated with the mapping function
$$H^{l+1} = f(H^{l}, A)$$
wherein $H^{l}$ is the graph-embedded expression of layer l, and $H^{0}$ is the initialization node expression matrix of layer 0, namely the feature fusion matrix I; f is a non-linear function. The graph convolution network can be regarded as a recursive computation of a plurality of graph convolutions, and the nonlinear function of each layer can be expressed as:
$$f(H^{l}, A) = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{l} W^{l}\right)$$
wherein $\tilde{A} = A + E_N$, $E_N$ is the N-dimensional identity matrix, which is added to the adjacency matrix to add self-connections; $\tilde{D}$ is the in-out degree matrix of all the program block nodes; $W^{l}$ is the weight matrix of layer l; $\sigma$ is an activation function, with the function expression:
[Equation image 016: expression of the activation function $\sigma$]
S07) sorting and screening nodes
The nodes are sorted according to the inner product values of their embedding vectors, and the first 50 nodes with similar values are selected;
s08) carrying out family classification on the malicious software by the convolutional neural network;
a convolutional neural network model is established; the embedding matrix of the 50 screened nodes undergoes convolution calculation and is then passed to a fully connected layer to classify the malicious application.

Claims (1)

1. The graph convolution-based android malicious application classification method is characterized by comprising the following steps: 1) decompiling the APK file; 2) extracting the calling relation of the malicious code program block and the instruction distribution characteristics of the malicious code program block; 3) selecting a key program block according to the importance degree associated with the malicious code program block; 4) performing dimensionality reduction and nonlinear transformation on the instruction distribution characteristics of the malicious code program block; 5) the embedded characteristic of the calling relation graph of the malicious code program block is fused with the instruction distribution characteristic of the transformed malicious code program block; 6) establishing a graph convolution neural network model; 7) sorting and screening nodes; 8) carrying out family classification on the malicious software by the convolutional neural network;
the graph convolution-based android malicious application classification method comprises the following specific implementation steps:
1) decompiling APK files
Decompiling the android application program by using an apktool tool to obtain Smali intermediate code;
2) extracting calling relation of malicious code program block and instruction distribution characteristic of malicious code program block
The Smali file stores the Dalvik byte code, and a known malicious code library is used for comparing the Smali intermediate code to find a malicious code; executing a Dalvik byte code by using a symbol substitution method, thereby extracting a program calling relationship, defining the calling relationship between a malicious code program block and a non-malicious code program block, marking and numbering the malicious code program block, and marking and numbering the non-malicious code program block which has a direct calling relationship with a malicious code; marking and numbering non-malicious code program blocks with indirect calling relation with malicious codes; 244 different instructions exist in the Dalvik byte code, and the instruction distribution characteristics of the malicious code program block are determined according to the instructions in the malicious code program block;
3) selecting a key program block according to the importance degree associated with the malicious code program block;
the importance of each program block to the malicious code family is calculated:
$$TF_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
$$IDF_{i} = \log \frac{|D|}{|\{d \in D : i \in d\}|}$$
TF-IDF=TF*IDF;
wherein $n_{i,j}$ is the number of occurrences of program block i in malicious family j, $\sum_{k} n_{k,j}$ is the sum of the number of occurrences of all program blocks in the files of family j, $|D|$ is the total number of malicious applications, and $|\{d \in D : i \in d\}|$ is the number of malicious applications containing program block i;
after the importance of each program block to the malicious code family is calculated, the program blocks are ranked by that importance, with the ranking used as the threshold: program blocks ranked outside the first three quarters are eliminated, and the remaining program blocks form the malicious code program block calling relation graph;
4) dimensionality reduction and non-linear transformation of malicious code program block instruction distribution characteristics
The Dalvik bytecode has 244 different instructions, so the obtained instruction distribution characteristic of a malicious code program block is a 244-dimensional vector $e_i$.
Dimensionality reduction and nonlinear transformation are performed on the instruction distribution characteristics of the malicious code program block:
$$E_i = g(W e_i)$$
wherein $E_i$ is the instruction distribution characteristic of the transformed malicious code program block, with vector dimension 100; W is a transformation weight parameter matrix obtained through model training; g is an activation function whose role is to apply a nonlinear transformation to the characteristic and thereby enhance the expression capability of the model. The RELU activation function is taken, with the function expression: f (x) = max (0, x);
5) fusion of embedded characteristics of calling relation graph of malicious code program block and instruction distribution characteristics of transformed malicious code program block
Each node in the calling relation graph of the malicious code program block is embedded as a multidimensional vector $G_i$, the graph-embedded feature, whose vector dimension is the same as that of the instruction distribution transformation feature $E_i$; the fused feature is denoted $I_i$:
[Equation image 008: fusion of $G_i$ and $E_i$ into $I_i$ using two weight parameters]
wherein the two weight parameters [equation images 009 and 010] are obtained through model training; the vectors $I_i$ form the feature fusion matrix I, whose dimension is N x d, where N is the number of program block nodes and d is the dimension of node embedding;
6) Establishing a graph convolutional neural network model
Let the adjacency matrix of the call relationship graph of the malicious code program blocks be A, where A_ij = 1 if a call relationship exists between program block i and program block j, and A_ij = 0 otherwise; let D denote the degree of the nodes of the call relationship graph, where the degree of a node is the number of connections associated with that node;
for feature extraction on the graph, a multi-layer neural network structure is used, in which each layer is computed with the mapping function
Figure 314747DEST_PATH_IMAGE011
where H^(l) is the graph embedding representation of layer l, and H^(0) is the initial node representation matrix of layer 0, namely the feature fusion matrix I; f is a nonlinear function; the graph convolutional network can be regarded as a recursive computation of multiple graph convolutions, and the nonlinear function of each layer can be expressed as:
Figure 278779DEST_PATH_IMAGE012
wherein
Figure DEST_PATH_IMAGE013
, E_N is the N-dimensional identity matrix, which is added to the adjacency matrix to introduce self-connections;
Figure DEST_PATH_IMAGE014
is the in/out-degree matrix of all program block nodes, and
Figure DEST_PATH_IMAGE015
is an activation function, whose function expression is:
Figure DEST_PATH_IMAGE016
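The layer-wise computation can be illustrated with the widely used symmetric-normalization form of graph convolution, H^(l+1) = sigma(D~^(-1/2) (A + E_N) D~^(-1/2) H^(l) W^(l)). The exact formulas are only available as images in the source, so this form, the per-layer weight matrices `weights`, the choice of ReLU for the activation, and a symmetric adjacency matrix are all assumptions of this sketch.

```python
import math

def matmul(X, Y):
    """Plain-Python matrix product of X (n x m) and Y (m x p)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def normalized_adjacency(A):
    """Add self-connections (A + E_N) and scale by D~^(-1/2) on both sides."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]  # degree of each node including the self-loop
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(A_hat, H, W):
    """One graph convolution: ReLU(A_hat H W). ReLU is an assumed activation."""
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]

def gcn_forward(A, I, weights):
    """Apply the layers recursively, starting from H^(0) = I."""
    A_hat = normalized_adjacency(A)
    H = I
    for W in weights:
        H = gcn_layer(A_hat, H, W)
    return H

# Toy graph: two program blocks that call each other.
A = [[0.0, 1.0], [1.0, 0.0]]
I = [[1.0, 0.0], [0.0, 1.0]]            # stand-in feature fusion matrix
weights = [[[1.0, -1.0], [1.0, 1.0]]]   # one layer, stand-in for trained weights

H_out = gcn_forward(A, I, weights)
print(H_out)  # [[1.0, 0.0], [1.0, 0.0]]
```

After one layer both nodes share the same representation, because each node averages over itself and its single neighbor.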
7) Sorting and screening nodes
The node embeddings are sorted according to the inner product values of the embedding vectors, and the first 50 nodes with similar values are selected;
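The screening step can be sketched as a sort followed by truncation. The text does not say which vectors enter the inner product, so this sketch assumes each node is scored by the inner product of its embedding with itself. The zero-padding for graphs with fewer than 50 nodes and the helper name `select_top_nodes` are likewise assumptions.

```python
def select_top_nodes(H, k=50):
    """Rank node embeddings by an inner-product score and keep exactly k rows."""
    # Assumed score: inner product of each embedding with itself.
    scores = [sum(v * v for v in row) for row in H]
    order = sorted(range(len(H)), key=lambda i: scores[i], reverse=True)
    picked = [H[i] for i in order[:k]]
    # Zero-pad small graphs so the output always has k rows (assumption).
    dim = len(H[0]) if H else 0
    picked += [[0.0] * dim for _ in range(k - len(picked))]
    return picked

H = [[1.0, 0.0], [3.0, 4.0], [0.0, 2.0]]   # toy embeddings of three nodes
top2 = select_top_nodes(H, k=2)
print(top2)                        # [[3.0, 4.0], [0.0, 2.0]]
print(len(select_top_nodes(H)))    # 50
```

A fixed number of rows matters because the convolutional classifier in the next step expects an input matrix of constant size.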
8) Classifying the malware into families with a convolutional neural network
A convolutional neural network model is established; the embeddings of the 50 screened nodes are assembled into a matrix, on which convolution is performed, and a fully connected layer is then attached to classify the malicious applications.
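A minimal forward pass for this classifier is sketched below. The patent does not specify kernel sizes, pooling, or the output layer, so the kernels spanning the full embedding width, the global max pooling, the softmax output, and all names and toy weights are assumptions made for illustration.

```python
import math

def conv_over_nodes(X, kernel):
    """Slide an h x d kernel down the k x d node matrix and apply ReLU."""
    h, d = len(kernel), len(kernel[0])
    out = []
    for start in range(len(X) - h + 1):
        s = sum(kernel[r][c] * X[start + r][c] for r in range(h) for c in range(d))
        out.append(max(0.0, s))
    return out

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def classify(X, kernels, W_fc, b_fc):
    """Convolution, global max pooling, then a fully connected softmax layer."""
    pooled = [max(conv_over_nodes(X, k)) for k in kernels]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b for row, b in zip(W_fc, b_fc)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs

# Toy input: 4 screened nodes with 2-dimensional embeddings (50 nodes in the text).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]  # stand-in filters
W_fc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # 3 hypothetical malware families
b_fc = [0.0, 0.0, 0.0]

family, probs = classify(X, kernels, W_fc, b_fc)
print(family)  # 0
```

In practice the filters and fully connected weights would be trained end to end on labeled malware samples.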
CN202210158644.8A 2022-02-22 2022-02-22 Graph convolution-based android malicious application classification method Pending CN114595451A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210158644.8A CN114595451A (en) 2022-02-22 2022-02-22 Graph convolution-based android malicious application classification method

Publications (1)

Publication Number Publication Date
CN114595451A true CN114595451A (en) 2022-06-07

Family

ID=81804883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210158644.8A Pending CN114595451A (en) 2022-02-22 2022-02-22 Graph convolution-based android malicious application classification method

Country Status (1)

Country Link
CN (1) CN114595451A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089951A (en) * 2023-02-24 2023-05-09 山东云天安全技术有限公司 Malicious code detection method, readable storage medium and electronic equipment
CN116089951B (en) * 2023-02-24 2023-07-14 山东云天安全技术有限公司 Malicious code detection method, readable storage medium and electronic equipment
CN116186628A (en) * 2023-04-23 2023-05-30 广州钛动科技股份有限公司 APP automatic marking method and system
CN116186628B (en) * 2023-04-23 2023-07-07 广州钛动科技股份有限公司 APP automatic marking method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination