CN117688565B - Malicious application detection method and system - Google Patents



Publication number
CN117688565B
Authority
CN
China
Prior art keywords
feature
static
dynamic
characteristic
semantic association
Prior art date
Legal status
Active
Application number
CN202410157834.7A
Other languages
Chinese (zh)
Other versions
CN117688565A (en
Inventor
蔡晖
Current Assignee
Beijing Zhongke Network Core Technology Co ltd
Original Assignee
Beijing Zhongke Network Core Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zhongke Network Core Technology Co ltd
Priority to CN202410157834.7A
Publication of CN117688565A
Application granted
Publication of CN117688565B
Legal status: Active


Abstract

The application discloses a malicious application detection method and system. The method acquires a static feature vector of an application to be detected; extracts a dynamic feature vector from simulated running data of the application to be detected; extracts semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map; passes the dynamic feature-static feature semantic association feature map through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map; and determines, based on the dynamic feature-static feature semantic association adaptive enhancement feature map, whether the application to be detected is a malicious application. In this way, whether the application to be detected is malicious can be judged intelligently while features from multiple aspects of the application are considered together, which improves the accuracy and robustness of malicious application detection.

Description

Malicious application detection method and system
Technical Field
The application relates to the technical field of intelligent malicious application detection, in particular to a malicious application detection method and a malicious application detection system.
Background
With the development of the mobile internet, Android smartphones have rapidly become a popular computing platform, and users often install a wide variety of applications. In the process, users may unintentionally install malicious applications that steal user information or covertly perform certain operations, which poses a serious security risk to Android users.
Therefore, a malicious application detection method and a system thereof are desired.
Disclosure of Invention
The application provides a malicious application detection method and system. The method acquires a static feature vector of an application to be detected; extracts a dynamic feature vector from simulated running data of the application to be detected; extracts semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map; passes the dynamic feature-static feature semantic association feature map through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map; and determines, based on the dynamic feature-static feature semantic association adaptive enhancement feature map, whether the application to be detected is a malicious application. In this way, whether the application to be detected is malicious can be judged intelligently while features from multiple aspects of the application are considered together, which improves the accuracy and robustness of malicious application detection.
The application also provides a malicious application detection method, which comprises the following steps:
acquiring a static feature vector of an application to be detected;
extracting dynamic feature vectors from the simulation operation data of the application to be detected;
extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map;
passing the dynamic feature-static feature semantic association feature map through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map;
and determining whether the application to be detected is a malicious application based on the dynamic feature-static feature semantic association adaptive enhancement feature map.
In the above malicious application detection method, extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map includes: calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature correlation matrix; and performing feature extraction on the dynamic feature-static feature correlation matrix using a deep learning network model to obtain the dynamic feature-static feature semantic association feature map.
In the above malicious application detection method, calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature correlation matrix includes: calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector with the following sample covariance formula to obtain the dynamic feature-static feature correlation matrix; the sample covariance formula is as follows:
Wherein the terms of the formula denote, in order, the dynamic feature vector, the static feature vector, and the dynamic feature-static feature correlation matrix.
In the malicious application detection method, the deep learning network model is a static feature-dynamic feature semantic association feature extractor based on a convolutional neural network model.
In the above malicious application detection method, performing feature extraction on the dynamic feature-static feature correlation matrix using a deep learning network model to obtain the dynamic feature-static feature semantic association feature map includes: passing the dynamic feature-static feature correlation matrix through the static feature-dynamic feature semantic association feature extractor based on the convolutional neural network model to obtain the dynamic feature-static feature semantic association feature map.
In the above malicious application detection method, the dynamic feature-static feature semantic association feature map is passed through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map, which includes: processing the dynamic feature-static feature semantic association feature map with the following adaptive attention formula to obtain the dynamic feature-static feature semantic association adaptive enhancement feature map; wherein, the self-adaptive attention formula is:
Wherein the terms of the formula denote, in order: the dynamic feature-static feature semantic association feature map; the pooling operation; the pooled vector; a weight matrix; a bias vector; the activation operation; the initial meta-weight feature vector; an individual feature value of the initial meta-weight feature vector; the corrected meta-weight feature vector; the dynamic feature-static feature semantic association adaptive enhancement feature map; and the operation of multiplying each feature value of the corrected meta-weight feature vector by the corresponding feature matrix of the dynamic feature-static feature semantic association feature map along the channel dimension.
In the above malicious application detection method, determining whether the application to be detected is a malicious application based on the dynamic feature-static feature semantic association adaptive enhancement feature map includes: performing feature distribution optimization on the dynamic feature-static feature semantic association adaptive enhancement feature map to obtain an optimized dynamic feature-static feature semantic association adaptive enhancement feature map; and passing the optimized dynamic feature-static feature semantic association adaptive enhancement feature map through a classifier to obtain a classification result, wherein the classification result indicates whether the application to be detected is a malicious application.
In the above malicious application detection method, passing the optimized dynamic feature-static feature semantic association adaptive enhancement feature map through a classifier to obtain a classification result indicating whether the application to be detected is a malicious application includes: unfolding the optimized dynamic feature-static feature semantic association adaptive enhancement feature map into a classification feature vector by row vectors or column vectors; performing fully connected encoding on the classification feature vector using a plurality of fully connected layers of the classifier to obtain an encoded classification feature vector; and passing the encoded classification feature vector through the Softmax classification function of the classifier to obtain the classification result.
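The three classifier steps above (unfold, fully connected encoding, Softmax) can be illustrated with a minimal pure-Python sketch. The layer weights, the ReLU between fully connected layers, and the convention that class index 1 means "malicious" are assumptions made for illustration only; the text specifies none of them, and in practice the weights would be learned.

```python
import math

def flatten(feature_map):
    """Unfold a [channel][row][col] feature map into one vector, row by row."""
    return [v for channel in feature_map for row in channel for v in row]

def dense(vec, weight, bias):
    """One fully connected layer: weight @ vec + bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]

def softmax(logits):
    """Numerically stable Softmax over the class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature_map, layers):
    """layers is a list of (weight, bias) pairs; the last pair yields class logits."""
    vec = flatten(feature_map)
    for weight, bias in layers[:-1]:
        # ReLU between layers is an assumption, not stated in the text.
        vec = [max(0.0, v) for v in dense(vec, weight, bias)]
    weight, bias = layers[-1]
    probs = softmax(dense(vec, weight, bias))
    # Assumed convention: index 0 = benign, index 1 = malicious.
    return ("malicious" if probs[1] > probs[0] else "benign"), probs

# Toy input with hand-picked (not learned) weights.
toy_map = [[[1.0, 2.0], [3.0, 4.0]]]
toy_layers = [
    ([[1, 0, 0, 0], [0, 0, 0, 1]], [0.0, 0.0]),  # hidden layer: 4 -> 2
    ([[1, 0], [0, 1]], [0.0, 0.0]),              # output layer: 2 -> 2
]
label, probs = classify(toy_map, toy_layers)
```

With these toy weights the logits are `[1, 4]`, so the second class receives the larger probability.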
The application also provides a malicious application detection system, which comprises:
The static feature vector acquisition module is used for acquiring a static feature vector of an application to be detected;
the dynamic feature vector extraction module is used for extracting dynamic feature vectors from the simulation operation data of the application to be detected;
the semantic association feature extraction module is used for extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map;
the adaptive attention module is used for processing the dynamic feature-static feature semantic association feature map to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map;
and the to-be-detected application determining module is used for determining whether the application to be detected is a malicious application based on the dynamic feature-static feature semantic association adaptive enhancement feature map.
In the above malicious application detection system, the semantic association feature extraction module includes: a sample covariance correlation matrix calculation unit, configured to calculate a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature correlation matrix; and the feature extraction unit is used for carrying out feature extraction on the dynamic feature-static feature association matrix by using a deep learning network model so as to obtain the dynamic feature-static feature semantic association feature map.
Compared with the prior art, the malicious application detection method and system provided by the application acquire a static feature vector of an application to be detected; extract a dynamic feature vector from simulated running data of the application to be detected; extract semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map; pass the dynamic feature-static feature semantic association feature map through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map; and determine, based on the dynamic feature-static feature semantic association adaptive enhancement feature map, whether the application to be detected is a malicious application. In this way, whether the application to be detected is malicious can be judged intelligently while features from multiple aspects of the application are considered together, which improves the accuracy and robustness of malicious application detection.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. In the drawings:
Fig. 1 is a flowchart of a malicious application detection method provided in an embodiment of the present application.
Fig. 2 is a schematic diagram of a system architecture of a malicious application detection method according to an embodiment of the present application.
Fig. 3 is a block diagram of a malicious application detection system according to an embodiment of the present application.
Fig. 4 is an application scenario diagram of a malicious application detection method provided in an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings. The exemplary embodiments of the present application and their descriptions herein are for the purpose of explaining the present application, but are not to be construed as limiting the application.
Unless defined otherwise, all technical and scientific terms used in the embodiments of the application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application.
In describing embodiments of the present application, unless otherwise indicated and limited thereto, the term "connected" should be construed broadly, for example, it may be an electrical connection, or may be a communication between two elements, or may be a direct connection, or may be an indirect connection via an intermediate medium, and it will be understood by those skilled in the art that the specific meaning of the term may be interpreted according to circumstances.
It should be noted that the terms "first/second/third" in the embodiments of the present application merely distinguish similar objects and do not imply a specific order among them. Where permitted, the objects distinguished by "first/second/third" may be interchanged in order or sequence, so that the embodiments of the application described herein can be practiced in sequences other than those illustrated or described herein.
With the popularity of the mobile internet, Android smartphones have become an indispensable tool in daily life, and users typically download and install a variety of applications to meet their needs. However, this carries potential risks. Some applications appear to run normally but in fact behave maliciously, for example by stealing the user's personal information or performing dangerous operations without the user's knowledge. These security hazards are a serious problem for Android users and deserve close attention.
The presence of malicious applications poses a serious threat to users and their personal information, and some malicious applications may silently collect private data of users, including but not limited to address books, text messages, call records, and geographic locations, which may be used for illegal purposes, such as personal information disclosure, identity theft, or other forms of fraud. In addition, some malicious applications may perform dangerous operations, such as sending a short message, making a call, or running malicious code in the background, without the user's knowledge, thereby causing economic loss or other adverse effects to the user.
In order to protect the security and privacy of the user, it is critical to take some measures against malicious applications. First, the user should carefully select and download the application program, avoiding downloading the application from an unofficial channel as much as possible, to reduce the risk of malicious applications. Second, the user should pay attention to the application's permission request, scrutinize whether the permissions required by the application match its functionality, and avoid granting too many unnecessary permissions. In addition, periodic updates to the handset system and applications are also important, as new updates typically contain fixes to known security vulnerabilities.
In addition to users' own awareness and precautions, the Android system continues to strengthen its security protections. For example, the Google Play store may conduct security reviews of uploaded applications to reduce the spread of malicious applications. The Android system also provides security functions such as application permission control and a sandbox mechanism to help users guard against malicious applications at the system level.
In one embodiment of the present application, Fig. 1 is a flowchart of a malicious application detection method provided in the embodiment of the present application. Fig. 2 is a schematic diagram of a system architecture of a malicious application detection method according to an embodiment of the present application. As shown in Fig. 1 and Fig. 2, a malicious application detection method according to an embodiment of the present application includes: 110, acquiring a static feature vector of an application to be detected; 120, extracting a dynamic feature vector from the simulated running data of the application to be detected; 130, extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map; 140, passing the dynamic feature-static feature semantic association feature map through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map; 150, determining whether the application to be detected is a malicious application based on the dynamic feature-static feature semantic association adaptive enhancement feature map.
In step 110, the static feature vector reflects information such as the code structure and execution paths of the application, which supports structural analysis and feature extraction. Acquiring the static feature vector provides important feature information for the subsequent malicious application detection and helps identify potential malicious behavior.
In step 120, the dynamic feature vector provides information about how the application behaves when it actually runs, which helps capture its actual behavioral features and suspicious behavior. Acquiring the dynamic feature vector makes it easier to identify behaviors such as privacy invasion, information leakage, and malicious propagation, and provides an important basis for the subsequent malicious application detection.
In step 130, when the semantic association features are extracted, a suitable feature extraction method and association measure are selected with the association between the static features and the dynamic features in mind. Extracting the semantic association features captures the semantic association between the static and dynamic features and supports a combined analysis of the application's static structure and dynamic behavior. The resulting dynamic feature-static feature semantic association feature map provides richer feature information for subsequent analysis and processing.
In step 140, the adaptive attention module is designed with the degree of association between the dynamic and static features in mind, and a suitable attention mechanism and weight calculation method are selected. The adaptive attention module mines the association information between the dynamic and static features and strengthens the features in the feature map that have the greatest influence. The resulting dynamic feature-static feature semantic association adaptive enhancement feature map emphasizes the important features, which improves the representational capacity and discriminability of the feature map.
In step 150, when determining whether the application is malicious, the information in the dynamic feature-static feature semantic association adaptive enhancement feature map is considered as a whole and a suitable classification or decision model is selected. Based on this feature map, the feature information of the application can be analyzed more comprehensively and the presence of malicious behavior judged more accurately. Considering the semantic association information of the dynamic and static features together improves the accuracy and robustness of malicious application identification.
To address the above technical problems, the technical concept of the application is to detect malicious applications intelligently by analyzing and associating the static features and dynamic features of the application to be detected, in combination with a deep learning model and an adaptive attention mechanism.
Based on this, in the technical scheme of the application, a static feature vector of the application to be detected is obtained first. Here, the static feature vector is the vector corresponding to the static features, and the static features are the features that the application to be detected has in a non-running state. This feature information reflects the code logic and execution paths of the application to be detected, including function call relationships, code block structures, and the like. More specifically, in an embodiment of the present application, the encoding process for acquiring the static feature vector of the application to be detected includes: decompiling the application to be detected to obtain source code information; extracting Dalvik bytecode features from the source code information; converting the Dalvik bytecode features into Opcode sequence features according to a Dalvik conversion table; dividing the Opcode sequence features into a set of Opcode N-gram features using the N-gram technique; and vectorizing the Opcode N-gram features to obtain the static feature vector of the application to be detected.
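The last three steps of this encoding process (conversion table, N-gram splitting, vectorization) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the `DALVIK_TABLE` excerpt, its single-letter opcode categories, the choice of N = 2, and count-based vectorization are all assumptions, and decompilation is assumed to have already produced the list of instruction mnemonics.

```python
from collections import Counter

# Hypothetical excerpt of a Dalvik conversion table: mnemonic -> opcode category.
DALVIK_TABLE = {
    "move": "M", "move-object": "M",
    "invoke-virtual": "V", "invoke-static": "V",
    "if-eq": "I", "if-ne": "I",
    "goto": "G",
    "return": "R", "return-void": "R",
}

def to_opcode_sequence(instructions):
    """Map Dalvik mnemonics to opcode symbols, skipping mnemonics not in the table."""
    return [DALVIK_TABLE[ins] for ins in instructions if ins in DALVIK_TABLE]

def ngrams(sequence, n):
    """Slide a window of length n over the opcode sequence."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

def vectorize(grams, vocabulary):
    """Count how often each vocabulary N-gram occurs; the counts form the vector."""
    counts = Counter(grams)
    return [counts[g] for g in vocabulary]

# Toy instruction stream standing in for decompiled bytecode.
instructions = ["move", "invoke-virtual", "if-eq", "invoke-static", "if-ne", "return-void"]
opcodes = to_opcode_sequence(instructions)
grams = ngrams(opcodes, 2)
vocabulary = sorted(set(grams))  # in practice, fixed across the whole training corpus
static_vector = vectorize(grams, vocabulary)
```

In a real pipeline the vocabulary would be built once from all training applications so that every application maps to a vector of the same length.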
Then, a dynamic feature vector is extracted from the simulated running data of the application to be detected. Here, the dynamic features are the features that the application to be detected has while it is running. It should be appreciated that static features provide only the structural and code information of an application and cannot directly reflect its behavior at run time. Simulated running data captures the behavior of the application in different scenarios, such as network communication, file access, and permission use. The dynamic features therefore provide information about the actual behavior of the application, which helps in analyzing its suspicious behavior and malicious characteristics. In particular, the simulated running data may record interaction patterns between the application and the user, other applications, or the system, such as the application's network communication behavior, the data it receives and transmits, and its interactions with other applications. These interaction patterns can reveal behaviors such as privacy infringement, information disclosure, and malicious propagation, so extracting dynamic features makes malicious behavior easier to identify.
In an embodiment of the present application, the encoding process for extracting a dynamic feature vector from the simulated running data of the application to be detected includes: collecting simulated running data of the application to be detected; extracting dynamic features of preset categories from the simulated running data, wherein the dynamic features of the preset categories include at least one of a permission leakage feature, a network data receiving feature, a service start feature, a short message sending operation feature, an encryption and decryption operation feature, a network data sending feature, a file access operation feature, a file access specific operation feature, a data leakage feature, a network opening operation feature, a broadcast receiving feature, a dynamic loading feature, a network closing operation feature, and a telephone dialing feature; and vectorizing the dynamic features of the preset categories to obtain the dynamic feature vector of the application to be detected.
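One simple way to vectorize the preset categories is to count how often each category occurs in the simulated running data. The sketch below assumes that the running data has already been reduced to a list of event labels; the label names in `DYNAMIC_CATEGORIES` are hypothetical identifiers for the fourteen categories listed above, and count-based encoding is an assumption, since the text does not say how the vectorization is done.

```python
from collections import Counter

# Hypothetical labels for the fourteen preset dynamic feature categories.
DYNAMIC_CATEGORIES = [
    "permission_leak", "net_recv", "service_start", "sms_send", "crypto",
    "net_send", "file_access", "file_access_specific", "data_leak",
    "net_open", "broadcast_recv", "dynamic_load", "net_close", "phone_call",
]

def dynamic_vector(events):
    """Count events per preset category; events outside the categories are ignored."""
    known = set(DYNAMIC_CATEGORIES)
    counts = Counter(e for e in events if e in known)
    return [counts[c] for c in DYNAMIC_CATEGORIES]

# Toy event trace standing in for data collected from a simulated run.
trace = ["net_open", "net_send", "net_send", "sms_send", "screen_on"]
dyn_vector = dynamic_vector(trace)
```

A binary present/absent encoding would work the same way by replacing each count with `min(count, 1)`.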
In one embodiment of the present application, extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map includes: calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature correlation matrix; and extracting the characteristics of the dynamic characteristic-static characteristic association matrix by using a deep learning network model to obtain the dynamic characteristic-static characteristic semantic association characteristic map.
Because the dynamic and static features of the application to be detected provide information from different perspectives, they can complement and reinforce each other. Therefore, in the technical scheme of the application, the correlation between the dynamic features and the static features is captured by calculating the sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector, so as to obtain a dynamic feature-static feature correlation matrix. The dynamic feature-static feature correlation matrix characterizes the correlation and mutual influence between the dynamic features and the static features, so that its information expression is more complete. In addition, calculating the dynamic feature-static feature correlation matrix allows the features to be reduced in dimension and screened, which improves feature processing efficiency.
In a specific embodiment of the present application, calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature correlation matrix includes: calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector with the following sample covariance formula to obtain the dynamic feature-static feature correlation matrix; the sample covariance formula is as follows:
Wherein the terms of the formula denote, in order, the dynamic feature vector, the static feature vector, and the dynamic feature-static feature correlation matrix.
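The formula itself is not preserved in this text, so the following is only one plausible reading, offered as a hedged sketch: each vector is centered on its own mean, and the matrix entry at row i, column j is the product of the i-th centered dynamic value and the j-th centered static value (an outer product). The function name `correlation_matrix` and the absence of any normalization factor are assumptions.

```python
def correlation_matrix(dynamic, static):
    """Outer product of the mean-centered dynamic and static vectors.

    Assumed reading of "sample covariance correlation matrix"; the original
    formula is not available, so scaling factors may differ.
    """
    mu_d = sum(dynamic) / len(dynamic)
    mu_s = sum(static) / len(static)
    return [[(d - mu_d) * (s - mu_s) for s in static] for d in dynamic]

# Toy vectors: 2 dynamic values, 3 static values -> a 2 x 3 matrix.
matrix = correlation_matrix([1.0, 3.0], [0.0, 2.0, 4.0])
```

The matrix has one row per dynamic feature and one column per static feature, so it can be handed to a two-dimensional convolutional model regardless of whether the two vectors have the same length.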
It should be understood that the dynamic feature-static feature correlation matrix captures the correlation between the dynamic features and the static features, but it is merely a numerical matrix and is difficult to interpret directly. Therefore, in the technical solution of the present application, the dynamic feature-static feature correlation matrix is passed through a static feature-dynamic feature semantic association feature extractor based on a convolutional neural network model. The extractor draws abstract semantic association information out of the matrix and yields a dynamic feature-static feature semantic association feature map whose features are more discriminative.
The deep learning network model is the static feature-dynamic feature semantic association feature extractor based on a convolutional neural network model.
In a specific embodiment of the present application, performing feature extraction on the dynamic feature-static feature correlation matrix by using a deep learning network model to obtain the dynamic feature-static feature semantic association feature map includes: passing the dynamic feature-static feature correlation matrix through the static feature-dynamic feature semantic association feature extractor based on the convolutional neural network model to obtain the dynamic feature-static feature semantic association feature map.
The convolutional neural network model can effectively extract features from the dynamic feature-static feature correlation matrix and capture the spatial associations among them. It can therefore model and extract the semantic association information between the static features and the dynamic features, capturing the complex relationships between the two.
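As an illustration of how a convolutional extractor turns the correlation matrix into a multi-channel feature map, the sketch below applies a valid 2-D convolution followed by ReLU once per kernel, producing one feature matrix per channel. The kernels, their number, the single-layer depth, and the ReLU activation are hypothetical; the application does not specify the network architecture, and a trained model would learn its kernels.

```python
def conv2d_relu(matrix, kernel):
    """Valid (no padding) 2-D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = sum(matrix[i + a][j + b] * kernel[a][b]
                      for a in range(kh) for b in range(kw))
            row.append(max(0.0, acc))  # ReLU keeps only positive responses
        out.append(row)
    return out


def extract_feature_map(matrix, kernels):
    """One output channel (feature matrix) per kernel."""
    return [conv2d_relu(matrix, k) for k in kernels]


# Hypothetical 3x3 correlation matrix and two hand-written 2x2 kernels.
assoc = [[1.0, -1.0, 0.0],
         [0.0, 2.0, 1.0],
         [-1.0, 1.0, 3.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]],
           [[0.0, 1.0], [-1.0, 0.0]]]
feature_map = extract_feature_map(assoc, kernels)
```

Each kernel responds to a different local pattern of the matrix, which is the sense in which each channel of the feature map may represent a different feature.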
Once the correlation matrix has been converted into the dynamic feature-static feature semantic association feature map, the semantic association information of the static features and the dynamic features is fused into a single, more comprehensive feature map. This feature map represents both the static structure and the dynamic behavior of the application and gives subsequent malicious application detection a more complete and accurate feature representation. Because it better captures the association between the static and dynamic features, it helps improve the accuracy of the detection model, strengthens the system's ability to identify malicious applications, and reduces the likelihood of false positives and missed detections.
Next, the dynamic feature-static feature semantic association feature map is passed through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map. The adaptive attention module automatically learns and adjusts weights from the input feature map so that more attention is focused on key features. In the dynamic feature-static feature semantic association feature map each channel may represent a different feature, which means that some feature regions are more important than others; the adaptive attention module strengthens the expression of these key features and thereby improves the accuracy of malicious application detection.
It should be understood that, because the adaptive attention module learns and adjusts its weights, it concentrates attention on the key features in the dynamic feature-static feature semantic association feature map, which helps the system capture and express the key features of malicious applications more accurately and improves recognition accuracy. Because the weights are learned from the input feature map itself, the features are strengthened adaptively, which helps the system adapt to different types of malicious applications and to their changes, improving its generalization capability and robustness.
Since each channel represents a different feature, the adaptive attention module can strengthen the expression of the important feature regions in the dynamic feature-static feature semantic association feature map. Strengthening the key features makes malicious applications easier to distinguish from normal ones, so that whether an application exhibits malicious behavior can be judged more accurately and with greater robustness.
Strengthening the dynamic feature-static feature semantic association feature map with the adaptive attention module to obtain the dynamic feature-static feature semantic association adaptive enhancement feature map therefore captures and expresses key features better, improves the performance of the malicious application detection system, and helps protect the security and privacy of users.
In a specific embodiment of the present application, the dynamic feature-static feature semantic association feature map is passed through an adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map, which includes: processing the dynamic feature-static feature semantic association feature map with the following adaptive attention formula to obtain the dynamic feature-static feature semantic association adaptive enhancement feature map; wherein, the self-adaptive attention formula is:
Wherein the symbols in the formula denote, in order: the dynamic feature-static feature semantic association feature map; the pooling operation; the pooled vector; the weight matrix; the bias vector; the activation operation; the initial meta-weight feature vector; the feature value at a given position of the initial meta-weight feature vector; the corrected meta-weight feature vector; the dynamic feature-static feature semantic association adaptive enhancement feature map; and the operation of multiplying each feature value of the corrected meta-weight feature vector by the corresponding feature matrix of the dynamic feature-static feature semantic association feature map along the channel dimension.
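A minimal sketch of the attention steps named above: per-channel pooling, a weighted and biased projection, an activation that yields the initial meta-weight vector, a correction of that vector, and channel-wise multiplication. The formula is not reproduced in this text, so global average pooling, the sigmoid activation, and the softmax-style correction used here are assumptions, as are all function and parameter names.

```python
import math


def global_average_pool(feature_map):
    """Mean of every channel (feature matrix), giving one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_attention(feature_map, weight, bias):
    pooled = global_average_pool(feature_map)
    # Initial meta-weights: activation of (weight matrix x pooled vector + bias).
    initial = [sigmoid(sum(w * p for w, p in zip(row, pooled)) + b)
               for row, b in zip(weight, bias)]
    # Assumed correction step: softmax normalisation across channels.
    exps = [math.exp(v) for v in initial]
    total = sum(exps)
    corrected = [e / total for e in exps]
    # Scale every channel by its corrected meta-weight.
    enhanced = [[[c * x for x in row] for row in ch]
                for c, ch in zip(corrected, feature_map)]
    return corrected, enhanced


# Hypothetical two-channel feature map and projection parameters.
feature_map = [[[2.0, 4.0], [6.0, 8.0]],
               [[1.0, 1.0], [1.0, 1.0]]]
weight = [[0.5, 0.0], [0.0, 0.5]]
bias = [0.0, 0.0]
weights, enhanced = adaptive_attention(feature_map, weight, bias)
```

In this toy input the first channel has the larger pooled value, so it receives the larger corrected weight and is attenuated less than the second channel.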
In one embodiment of the present application, determining whether the application to be detected is a malicious application based on the dynamic feature-static feature semantic association adaptive enhancement feature map includes: performing feature distribution optimization on the dynamic feature-static feature semantic association self-adaptive enhancement feature map to obtain an optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map; and the optimized dynamic characteristic-static characteristic semantic association self-adaptive enhanced characteristic diagram is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the application to be detected is a malicious application or not.
In the technical solution of the present application, when the sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector is calculated, the resulting dynamic feature-static feature correlation matrix expresses the full source-domain spatial association between the static features and the dynamic features of the application to be detected. After this matrix passes through the static feature-dynamic feature semantic association feature extractor based on the convolutional neural network model, each feature matrix of the resulting dynamic feature-static feature semantic association feature map expresses a high-order, feature-domain, spatially local association between the static and dynamic features, and the feature matrices follow the channel distribution of the convolutional neural network model. However, when the feature map then passes through the adaptive attention module, the high-order local association features expressed by its feature matrices are adaptively strengthened, which increases the sparsity of the channel distribution among the feature matrices of the dynamic feature-static feature semantic association adaptive enhancement feature map. As a result, when this feature map is classified by the classifier, its probability density representation in the class probability density domain becomes sparse, which degrades the convergence of the classification regression.
Based on this, the applicant of the present application optimizes the dynamic feature-static feature semantic association adaptive enhancement feature map as follows: feature distribution optimization is performed on the dynamic feature-static feature semantic association adaptive enhancement feature map using the following optimization formula to obtain an optimized dynamic feature-static feature semantic association adaptive enhancement feature map; wherein the optimization formula is:
Wherein the symbols in the formula denote: the position-wise square of the dynamic feature-static feature semantic association adaptive enhancement feature map; an intermediate weight map with trainable parameters, in which, for example, based on the channel distribution sparsity of the dynamic feature-static feature semantic association adaptive enhancement feature map, the feature values of each feature matrix are initially set to the global feature value mean of that feature map; a unit map whose feature values are all 1; the dynamic feature-static feature semantic association adaptive enhancement feature map; the optimized dynamic feature-static feature semantic association adaptive enhancement feature map; position-wise addition; and position-wise multiplication.
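The optimization formula is not reproduced in this text. The sketch below shows only one way the listed ingredients could combine into a Cauchy-like reweighting, namely F' = (M * F) / (1 + F * F) evaluated position by position. This particular form is an assumption and should not be read as the formula of the application. Initializing the weight map to the global mean follows the description above; the function names are hypothetical.

```python
def global_mean(feature_map):
    """Mean of all feature values across every channel."""
    values = [x for ch in feature_map for row in ch for x in row]
    return sum(values) / len(values)


def init_weight_map(feature_map):
    """Weight map with the same shape as the input, filled with its global mean."""
    mean = global_mean(feature_map)
    return [[[mean for _ in row] for row in ch] for ch in feature_map]


def cauchy_like_reweight(feature_map, weight_map):
    """Assumed form: position-wise m * x / (1 + x^2)."""
    return [[[m * x / (1.0 + x * x) for x, m in zip(row, mrow)]
             for row, mrow in zip(ch, mch)]
            for ch, mch in zip(feature_map, weight_map)]


# Hypothetical enhanced feature map with two single-row channels.
enhanced = [[[1.0, 3.0]], [[0.0, 4.0]]]
weight_map = init_weight_map(enhanced)  # would be updated during training
optimized = cauchy_like_reweight(enhanced, weight_map)
```

The 1 / (1 + x^2) factor has the heavy-tailed shape of the standard Cauchy density, so large feature values are damped rather than left to dominate a sparse distribution.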
Here, to optimize the uniformity and consistency of the sparse probability density distribution of the dynamic feature-static feature semantic association adaptive enhancement feature map over the whole probability space, a tail-distribution strengthening mechanism of a type similar to the standard Cauchy distribution is used to perform distance-distribution optimization, based on spatial angle inclination, on the distance-type spatial distribution of the feature map in the high-dimensional feature space. This achieves a feature-distribution spatial resonance in which the distances between the local feature distributions of the feature map are weakly correlated. It thereby improves the uniformity and consistency of the overall probability density distribution of the feature map with respect to regression probability convergence and improves the classification convergence effect, that is, the speed of classification convergence and the accuracy of the classification result.
Further, the optimized dynamic characteristic-static characteristic semantic association self-adaptive enhanced characteristic diagram is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the application to be detected is a malicious application or not.
In a specific embodiment of the present application, the optimized dynamic feature-static feature semantic association adaptive enhancement feature map is passed through a classifier to obtain a classification result, where the classification result is used to indicate whether the application to be detected is a malicious application, and the method includes: expanding the optimized dynamic characteristic-static characteristic semantic association self-adaptive reinforcement feature map into classification feature vectors according to row vectors or column vectors; performing full-connection coding on the classification feature vectors by using a plurality of full-connection layers of the classifier to obtain coded classification feature vectors; and passing the coding classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
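The three classifier steps above (flattening, fully connected encoding, Softmax) can be sketched as follows. The layer sizes, the hand-written weights, the ReLU between layers, and the convention that index 1 means "malicious" are assumptions for illustration; a real classifier would use trained parameters.

```python
import math


def flatten(feature_map):
    """Unroll the feature map channel by channel, row by row."""
    return [x for ch in feature_map for row in ch for x in row]


def dense(vec, weight, bias):
    """Fully connected layer: weight @ vec + bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_map, layers):
    vec = flatten(feature_map)
    for idx, (weight, bias) in enumerate(layers):
        vec = dense(vec, weight, bias)
        if idx < len(layers) - 1:
            vec = [max(0.0, v) for v in vec]  # assumed ReLU between layers
    return softmax(vec)


# Hypothetical optimized feature map and two fully connected layers.
optimized_map = [[[1.0, 2.0]], [[3.0, 4.0]]]
layers = [
    ([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]], [0.0, 0.0]),  # 4 -> 2
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),                      # 2 -> 2
]
probs = classify(optimized_map, layers)
is_malicious = probs[1] > probs[0]  # assumed label order: [benign, malicious]
```

The Softmax output is a probability for each class, and the classification result is the class with the larger probability.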
Feeding the optimized dynamic feature-static feature semantic association adaptive enhancement feature map into the classifier makes full use of the semantic association information between the static features and the dynamic features and improves the expressive capacity and discriminability of the features. This helps the classifier understand the feature patterns and behavioral characteristics of the application more comprehensively and so improves the accuracy with which malicious applications are identified.
The optimized dynamic feature-static feature semantic association adaptive enhancement feature map provides a finer-grained and richer feature representation that contains the semantic association information of the static and dynamic features. It describes the feature patterns and behavioral characteristics of an application more accurately, helps the classifier distinguish malicious applications from normal ones, and improves classification accuracy and robustness. It also improves the classifier's ability to generalize across different types of malicious applications, so that the system remains robust when facing novel malicious applications, adapts better to their change and evolution, and maintains high detection accuracy and reliability.
By jointly considering the semantic association information of the dynamic and static features, the classifier can judge more accurately whether an application exhibits malicious behavior. The combined, fine-grained feature representation improves classifier accuracy, reduces the likelihood of false positives and missed detections, and improves the performance of the malicious application detection system.
In summary, the malicious application detection method according to the embodiments of the present application has been described. It achieves intelligent detection of malicious applications by analyzing and correlating the static and dynamic features of the application to be detected, combined with a deep learning model and an adaptive attention mechanism.
Fig. 3 is a block diagram of a malicious application detection system according to an embodiment of the present application. As shown in fig. 3, the malicious application detection system 200 includes: a static feature vector obtaining module 210, configured to obtain a static feature vector of an application to be detected; a dynamic feature vector extraction module 220, configured to extract a dynamic feature vector from the simulated operation data of the application to be detected; a semantic association feature extraction module 230, configured to extract semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map; the adaptive attention module 240 is configured to pass the dynamic feature-static feature semantic association feature map through the adaptive attention module to obtain a dynamic feature-static feature semantic association adaptive enhancement feature map; the to-be-detected application determining module 250 is configured to determine whether the to-be-detected application is a malicious application based on the dynamic feature-static feature semantic association adaptive enhancement feature map.
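The module structure of system 200 can be pictured as five stages wired in sequence. In the sketch below the class name, the field names, and the toy stand-in functions are all hypothetical; they only show how the outputs of modules 210 to 250 feed one another and do not implement the actual feature extraction or classification.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

Vector = List[float]
FeatureMap = List[List[List[float]]]


@dataclass
class MaliciousAppDetectionSystem:
    acquire_static: Callable[[Any], Vector]                      # module 210
    extract_dynamic: Callable[[Any], Vector]                     # module 220
    extract_association: Callable[[Vector, Vector], FeatureMap]  # module 230
    attend: Callable[[FeatureMap], FeatureMap]                   # module 240
    decide: Callable[[FeatureMap], bool]                         # module 250

    def detect(self, app: Any) -> bool:
        static_vec = self.acquire_static(app)
        dynamic_vec = self.extract_dynamic(app)
        feature_map = self.extract_association(static_vec, dynamic_vec)
        enhanced = self.attend(feature_map)
        return self.decide(enhanced)


# Toy stand-ins: counts replace real features, a threshold replaces the classifier.
system = MaliciousAppDetectionSystem(
    acquire_static=lambda app: [float(len(app["permissions"]))],
    extract_dynamic=lambda app: [float(len(app["syscalls"]))],
    extract_association=lambda s, d: [[[x * y for x in s] for y in d]],
    attend=lambda fm: fm,
    decide=lambda fm: sum(x for ch in fm for row in ch for x in row) > 4.0,
)

suspicious_app = {"permissions": ["SEND_SMS", "READ_CONTACTS"],
                  "syscalls": ["connect", "sendto", "open"]}
benign_app = {"permissions": ["INTERNET"], "syscalls": ["open"]}
```

Keeping each module behind a plain callable means any stage, for example the attention module, can be replaced without touching the others.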
The malicious application detection system has advantages in several respects: across the end-to-end process from feature acquisition to the final classification decision, it makes comprehensive use of both static and dynamic feature information, which improves its robustness and generalization capability. First, the static feature vector acquisition module and the dynamic feature vector extraction module together acquire the static and dynamic features of the application, covering both its structural information and its runtime behavior. Next, the semantic association feature extraction module extracts the semantic association information of the static feature vector and the dynamic feature vector, providing a comprehensive basis for subsequent malicious application detection. Then, the adaptive attention module strengthens the dynamic feature-static feature semantic association feature map: it automatically learns and adjusts weights from the input feature map, focuses more attention on key features, and strengthens their expression, which helps the system capture and express the key features of malicious applications more accurately and improves recognition accuracy.
In addition, because the weights are learned and adjusted automatically from the input dynamic feature-static feature semantic association feature map, the features are strengthened adaptively, which improves the generalization capability and robustness of the system. With the expression of key features strengthened, the system distinguishes malicious applications from normal ones more reliably, improving detection accuracy and robustness. By acquiring and analyzing both the static and dynamic features of an application and integrating their semantic association information, the malicious application detection system can identify and detect malicious applications efficiently, helping protect the security and privacy of users.
It will be appreciated by those skilled in the art that the specific operation of the respective steps in the above-described malicious application detection system has been described in detail in the above description of the malicious application detection method with reference to fig. 1 to 2, and thus, repetitive description thereof will be omitted.
As described above, the malicious application detection system 200 according to the embodiment of the present application can be implemented in various terminal devices, such as a server or the like for malicious application detection. In one example, the malicious application detection system 200 according to embodiments of the present application can be integrated into a terminal device as a software module and/or hardware module. For example, the malicious application detection system 200 may be a software module in the operating system of the terminal device, or may be an application developed for the terminal device; of course, the malicious application detection system 200 could equally be one of many hardware modules of the terminal device.
Alternatively, in another example, the malicious application detection system 200 and the terminal device may be separate devices, and the malicious application detection system 200 may be connected to the terminal device through a wired and/or wireless network and transmit the interaction information in an agreed data format.
Fig. 4 is an application scenario diagram of the malicious application detection method provided in an embodiment of the present application. As shown in Fig. 4, in this application scenario, a static feature vector of the application to be detected is first acquired (e.g., C1 as illustrated in Fig. 4), and a dynamic feature vector is extracted from the simulated running data of the application to be detected (e.g., C2 as illustrated in Fig. 4). The static feature vector and the dynamic feature vector are then input into a server (e.g., S as illustrated in Fig. 4) on which a malicious application detection algorithm is deployed, and the server processes the two vectors with that algorithm to determine whether the application to be detected is a malicious application.
The foregoing description of the embodiments is provided to illustrate the general principles of the application and is not intended to limit the scope of the application to the particular embodiments described; any modifications, equivalents, improvements, and the like that fall within the spirit and principles of the application are intended to be included within its scope.

Claims (8)

1. A malicious application detection method, comprising:
acquiring a static feature vector of an application to be detected;
extracting dynamic feature vectors from the simulation operation data of the application to be detected;
extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map;
the dynamic characteristic-static characteristic semantic association characteristic map is passed through a self-adaptive attention module to obtain a dynamic characteristic-static characteristic semantic association self-adaptive reinforcement characteristic map;
Determining whether the application to be detected is a malicious application or not based on the dynamic characteristic-static characteristic semantic association self-adaptive enhanced characteristic diagram;
The dynamic characteristic-static characteristic semantic association characteristic diagram is passed through an adaptive attention module to obtain a dynamic characteristic-static characteristic semantic association adaptive reinforcement characteristic diagram, which comprises the following steps:
Processing the dynamic feature-static feature semantic association feature map with the following adaptive attention formula to obtain the dynamic feature-static feature semantic association adaptive enhancement feature map; wherein, the self-adaptive attention formula is:
Wherein the symbols in the formula denote, in order: the dynamic feature-static feature semantic association feature map; the pooling operation; the pooled vector; the weight matrix; the bias vector; the activation operation; the initial meta-weight feature vector; the feature value at a given position of the initial meta-weight feature vector; the corrected meta-weight feature vector; the dynamic feature-static feature semantic association adaptive enhancement feature map; and the operation of multiplying each feature value of the corrected meta-weight feature vector by the corresponding feature matrix of the dynamic feature-static feature semantic association feature map along the channel dimension;
Wherein determining whether the application to be detected is a malicious application based on the dynamic feature-static feature semantic association adaptive reinforcement feature map comprises:
performing feature distribution optimization on the dynamic feature-static feature semantic association self-adaptive enhancement feature map to obtain an optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map;
The optimized dynamic characteristic-static characteristic semantic association self-adaptive enhanced characteristic diagram is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the application to be detected is a malicious application or not;
The feature distribution optimization is performed on the dynamic feature-static feature semantic association self-adaptive enhancement feature map to obtain an optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map, which comprises the following steps: carrying out feature distribution optimization on the dynamic feature-static feature semantic association self-adaptive enhancement feature map by using the following optimization formula to obtain an optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map; wherein, the optimization formula is:
Wherein the symbols in the formula denote: the position-wise square of the dynamic feature-static feature semantic association adaptive enhancement feature map; an intermediate weight map with trainable parameters; a unit map whose feature values are all 1; the dynamic feature-static feature semantic association adaptive enhancement feature map; the optimized dynamic feature-static feature semantic association adaptive enhancement feature map; position-wise addition; and position-wise multiplication.
2. The malicious application detection method according to claim 1, wherein extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map comprises:
Calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature correlation matrix;
and extracting the characteristics of the dynamic characteristic-static characteristic association matrix by using a deep learning network model to obtain the dynamic characteristic-static characteristic semantic association characteristic map.
3. The malicious application detection method of claim 2, wherein calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature correlation matrix comprises:
Calculating a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector with the following sample covariance formula to obtain the dynamic feature-static feature correlation matrix; the sample covariance formula is as follows:
Wherein the symbols in the formula denote, in order: the dynamic feature vector, the static feature vector, and the dynamic feature-static feature correlation matrix.
4. The malicious application detection method of claim 3, wherein the deep learning network model is a static feature-dynamic feature semantic association feature extractor based on a convolutional neural network model.
5. The malicious application detection method of claim 4, wherein performing feature extraction on the dynamic feature-static feature correlation matrix by using a deep learning network model to obtain the dynamic feature-static feature semantic correlation feature map comprises:
and the dynamic characteristic-static characteristic association matrix passes through the static characteristic-dynamic characteristic semantic association characteristic extractor based on the convolutional neural network model to obtain the dynamic characteristic-static characteristic semantic association characteristic graph.
6. The malicious application detection method according to claim 5, wherein the optimized dynamic feature-static feature semantic association adaptive enhancement feature map is passed through a classifier to obtain a classification result, where the classification result is used to indicate whether the application to be detected is a malicious application, and the method includes:
expanding the optimized dynamic characteristic-static characteristic semantic association self-adaptive reinforcement feature map into classification feature vectors according to row vectors or column vectors;
Performing full-connection coding on the classification feature vectors by using a plurality of full-connection layers of the classifier to obtain coded classification feature vectors; and
And the coding classification feature vector is passed through a Softmax classification function of the classifier to obtain the classification result.
7. A malicious application detection system, comprising:
The static feature vector acquisition module is used for acquiring a static feature vector of an application to be detected;
the dynamic feature vector extraction module is used for extracting dynamic feature vectors from the simulation operation data of the application to be detected;
the semantic association feature extraction module is used for extracting semantic association features of the static feature vector and the dynamic feature vector to obtain a dynamic feature-static feature semantic association feature map;
The self-adaptive attention module is used for enabling the dynamic characteristic-static characteristic semantic association characteristic diagram to pass through the self-adaptive attention module to obtain a dynamic characteristic-static characteristic semantic association self-adaptive reinforcement characteristic diagram;
the to-be-detected application determining module is used for determining whether the application to be detected is a malicious application based on the dynamic feature-static feature semantic association adaptive enhancement feature map;
Wherein the adaptive attention module comprises:
Processing the dynamic feature-static feature semantic association feature map with the following adaptive attention formula to obtain the dynamic feature-static feature semantic association adaptive enhancement feature map; wherein, the self-adaptive attention formula is:
Wherein the symbols in the formula denote, in order: the dynamic feature-static feature semantic association feature map; the pooling operation; the pooled vector; the weight matrix; the bias vector; the activation operation; the initial meta-weight feature vector; the feature value at a given position of the initial meta-weight feature vector; the corrected meta-weight feature vector; the dynamic feature-static feature semantic association adaptive enhancement feature map; and the operation of multiplying each feature value of the corrected meta-weight feature vector by the corresponding feature matrix of the dynamic feature-static feature semantic association feature map along the channel dimension;
the determining module for determining whether the application to be detected is a malicious application includes:
performing feature distribution optimization on the dynamic feature-static feature semantic association self-adaptive enhancement feature map to obtain an optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map;
The optimized dynamic characteristic-static characteristic semantic association self-adaptive enhanced characteristic diagram is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the application to be detected is a malicious application or not;
The feature distribution optimization is performed on the dynamic feature-static feature semantic association self-adaptive enhancement feature map to obtain an optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map, which comprises the following steps: carrying out feature distribution optimization on the dynamic feature-static feature semantic association self-adaptive enhancement feature map by using the following optimization formula to obtain an optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map; wherein, the optimization formula is:
wherein the symbols of the optimization formula denote, respectively: the position-wise square of the dynamic feature-static feature semantic association self-adaptive enhancement feature map; an intermediate weight map with trainable parameters; an all-ones map in which every feature value is 1; the dynamic feature-static feature semantic association self-adaptive enhancement feature map; the optimized dynamic feature-static feature semantic association self-adaptive enhancement feature map; position-wise addition; and position-wise multiplication.
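The adaptive attention step recited in claim 7 (pooling, weight matrix and bias, activation, correction of the meta-weights, channel-wise multiplication) can be illustrated with a minimal Python sketch. The claim text here does not fix the pooling type, the activation, or the correction rule, so global average pooling, a sigmoid, and a softmax are used purely as assumptions; the function and parameter names are hypothetical and are not taken from the patent.

```python
import math


def adaptive_channel_attention(feature_map, weight, bias):
    """Sketch of a channel-attention step.

    feature_map: list of C channels, each an H x W list of lists of floats.
    weight: C x C list of lists; bias: list of C floats.
    Returns (enhanced_feature_map, corrected_meta_weights).
    """
    # Pooling (assumed: global average) reduces each channel to one value.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Weight matrix and bias, then activation (assumed: sigmoid),
    # giving the initial meta-weight feature vector.
    initial = [
        1.0 / (1.0 + math.exp(-(sum(w * p for w, p in zip(w_row, pooled)) + b)))
        for w_row, b in zip(weight, bias)
    ]
    # Correction of the meta-weights (assumed stand-in: softmax over the
    # feature values; the patent's own correction rule may differ).
    peak = max(initial)
    exps = [math.exp(a - peak) for a in initial]
    total = sum(exps)
    corrected = [e / total for e in exps]
    # Multiply each corrected weight with its channel's feature matrix.
    enhanced = [
        [[c * x for x in row] for row in channel]
        for c, channel in zip(corrected, feature_map)
    ]
    return enhanced, corrected
```

With two identical channels, an identity weight matrix, and a zero bias, both channels receive the same corrected weight, so each channel is simply scaled by one half.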
8. The malicious application detection system of claim 7, wherein the semantic association feature extraction module comprises:
a sample covariance correlation matrix calculation unit, configured to calculate a sample covariance correlation matrix of the dynamic feature vector relative to the static feature vector to obtain a dynamic feature-static feature association matrix; and
a feature extraction unit, configured to perform feature extraction on the dynamic feature-static feature association matrix using a deep learning network model to obtain the dynamic feature-static feature semantic association feature map.
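One way to picture the association matrix of claim 8 is sketched below in Python. The claim does not spell out how the sample covariance correlation matrix is computed, so this sketch assumes one plausible reading: each vector is centered by its own mean and the outer product of the centered vectors is taken. The function name is hypothetical, and the subsequent deep learning feature extraction is not shown.

```python
def association_matrix(dynamic_vec, static_vec):
    """Outer product of the mean-centered dynamic and static feature vectors.

    Assumed reading of the 'sample covariance correlation matrix'; entry
    [i][j] is positive when dynamic feature i and static feature j deviate
    from their respective means in the same direction.
    """
    mean_d = sum(dynamic_vec) / len(dynamic_vec)
    mean_s = sum(static_vec) / len(static_vec)
    return [
        [(d - mean_d) * (s - mean_s) for s in static_vec]
        for d in dynamic_vec
    ]
```

The result has one row per dynamic feature and one column per static feature, so the two vectors may have different lengths; a two-dimensional matrix of this kind is the sort of input a convolutional feature extractor could consume.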
CN202410157834.7A 2024-02-04 2024-02-04 Malicious application detection method and system Active CN117688565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410157834.7A CN117688565B (en) 2024-02-04 2024-02-04 Malicious application detection method and system

Publications (2)

Publication Number Publication Date
CN117688565A CN117688565A (en) 2024-03-12
CN117688565B true CN117688565B (en) 2024-05-03

Family

ID=90137609

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410157834.7A Active CN117688565B (en) 2024-02-04 2024-02-04 Malicious application detection method and system

Country Status (1)

Country Link
CN (1) CN117688565B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101880686B1 (en) * 2018-02-28 2018-07-20 에스지에이솔루션즈 주식회사 A malware code detecting system based on AI(Artificial Intelligence) deep learning
CN110704841A (en) * 2019-09-24 2020-01-17 北京电子科技学院 Convolutional neural network-based large-scale android malicious application detection system and method
CN111027070A (en) * 2019-12-02 2020-04-17 厦门大学 Malicious application detection method, medium, device and apparatus
CN113420293A (en) * 2021-06-22 2021-09-21 北京计算机技术及应用研究所 Android malicious application detection method and system based on deep learning
CN117197438A (en) * 2023-09-18 2023-12-08 山东神戎电子股份有限公司 Target detection method based on visual saliency

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11481492B2 (en) * 2017-07-25 2022-10-25 Trend Micro Incorporated Method and system for static behavior-predictive malware detection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Malware Detection Using Machine Learning";Prabhat Singh et al.;《2021 International Conference on Technological Advancements and Innovations (ICTAI)》;20220114;全文 *
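Android malware detection model combining DBN and GRU; 欧阳立; 芦天亮; 暴雨轩; 李默; Journal of People's Public Security University of China (Science and Technology); 20200215(01); full text *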

Also Published As

Publication number Publication date
CN117688565A (en) 2024-03-12

Similar Documents

Publication Publication Date Title
Lu et al. Android malware detection based on a hybrid deep learning model
CN107180192B (en) Android malicious application detection method and system based on multi-feature fusion
Feng et al. A two-layer deep learning method for android malware detection using network traffic
CN106845240A (en) A kind of Android malware static detection method based on random forest
John et al. Graph convolutional networks for android malware detection with system call graphs
Li et al. Opcode sequence analysis of Android malware by a convolutional neural network
Bibi et al. A dynamic DL-driven architecture to combat sophisticated Android malware
Li et al. An Android malware detection method based on AndroidManifest file
Zhu et al. Android malware detection based on multi-head squeeze-and-excitation residual network
Song et al. Permission Sensitivity-Based Malicious Application Detection for Android
CN113360912A (en) Malicious software detection method, device, equipment and storage medium
Ding et al. Automaticlly learning featurs of android apps using cnn
Wang et al. A deep learning method for android application classification using semantic features
CN113468524B (en) RASP-based machine learning model security detection method
Kumar et al. Optimal Unification of Static and Dynamic Features for Smartphone Security Analysis.
CN117688565B (en) Malicious application detection method and system
Congyi et al. Method for detecting Android malware based on ensemble learning
Wang et al. Malware detection using cnn via word embedding in cloud computing infrastructure
Gao et al. Quorum chain-based malware detection in android smart devices
Liu et al. Learning-based detection for malicious android application using code vectorization
Sharma et al. Deep learning applications in cyber security: a comprehensive review, challenges and prospects
Amrutha et al. Multimodal deep learning method for detection of malware in android using static and dynamic features
CN109784047B (en) Program detection method based on multiple features
CN113420293A (en) Android malicious application detection method and system based on deep learning
Lee et al. An android malware detection system using a knowledge-based permission counting method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant