CN115018046A - Deep learning method for detecting malicious traffic of mobile APP - Google Patents

Info

Publication number
CN115018046A
Authority
CN
China
Prior art keywords
matrix
value
deep learning
mobile app
cidm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210533158.XA
Other languages
Chinese (zh)
Other versions
CN115018046B (en)
Inventor
陆凯
胡香利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HAINAN VOCATIONAL COLLEGE OF POLITICAL SCIENCE AND LAW
Original Assignee
HAINAN VOCATIONAL COLLEGE OF POLITICAL SCIENCE AND LAW
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HAINAN VOCATIONAL COLLEGE OF POLITICAL SCIENCE AND LAW filed Critical HAINAN VOCATIONAL COLLEGE OF POLITICAL SCIENCE AND LAW
Priority to CN202210533158.XA priority Critical patent/CN115018046B/en
Publication of CN115018046A publication Critical patent/CN115018046A/en
Application granted granted Critical
Publication of CN115018046B publication Critical patent/CN115018046B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Virology (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a deep learning method for detecting malicious traffic of a mobile APP, comprising the following steps. First, network traffic features are selected using a correlation information decision matrix (CIDM): the CIDM is constructed, and attribute scoring is then carried out. Second, detection is performed with a malicious traffic detection model based on a capsule neural network. Compared with other state-of-the-art malware detection techniques, the method of the invention improves accuracy and recall by 9.71% and 20.18%, respectively.

Description

Deep learning method for detecting malicious traffic of mobile APP
Technical Field
The invention relates to a deep learning method for detecting malicious traffic of a mobile APP.
Background
With the popularity of the internet and mobile devices, malware has become a major threat to the growing mobile ecosystem. A statistical report from Kaspersky showed that by the end of 2021 the number of new malicious files detected each day had reached 381,600, an increase of 6.1% over the previous year. Although mobile antivirus scanners provide security protection mechanisms for Android devices, increasingly advanced mobile malware can still penetrate mobile systems by bypassing these mechanisms. As mobile devices carry more and more private user information, an efficient malware detection scheme is urgently needed.
Malware detection techniques can be divided into three types: static analysis, dynamic analysis, and network traffic analysis. The essential difference between the three is that they use different features of the malware. Static analysis uses the application code and its binary structure as features. However, to avoid detection by antivirus scanners, malware authors use techniques such as repackaging and code obfuscation to generate malware variants. Dynamic analysis uses the calling relationships between functions while the application is running as features. It must be carried out in a dedicated sandbox and requires enough execution to cover the behavior of the application. When a malware author repackages malware or obfuscates its code, the features used by these two methods change significantly, which degrades the performance of the detection model. At runtime, however, these malware variants exhibit similar malicious behavior; in other words, the malicious traffic they trigger is similar. Network traffic analysis takes the network traffic triggered by the application as its object of study and extracts statistical features (such as packet size and packet interval) or HTTP header semantic features (such as host and method) from it for analysis. Network traffic analysis therefore overcomes the disadvantages of static and dynamic analysis, because some traffic features remain similar even when the malicious code changes significantly.
Machine learning provides a number of methods for malware detection. If the only goal is to detect malware accurately, deep learning is often the better option. Research shows that deep learning outperforms other machine learning techniques in a range of application fields, and it has also been studied in the field of malware detection, where it has achieved high performance. However, the deep learning algorithms used for malware detection modeling are almost always based on convolutional neural networks. Pooling makes the analysis easier for a convolutional neural network, but it discards some local information, which reduces robustness.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a deep learning method for detecting malicious traffic of a mobile APP.
The technical solution adopted by the invention to solve the technical problem is a deep learning method for detecting malicious traffic of a mobile APP, comprising the following steps.
First, a correlation information decision matrix (CIDM) is used to select network traffic features.
The correlation information decision matrix CIDM is constructed first:
the correlation coefficient between each pair of features is calculated using formula (1), giving the correlation coefficient matrix C, where Var(A_i) is the variance of the values of feature A_i calculated by formula (2), M is the number of features, and μ_i is the mean of feature A_i; Cov(A_i, A_j), given by formula (3), is the covariance between features A_i and A_j, with 1 ≤ i, j ≤ M. If the value of an element of matrix C is less than 0, it is replaced by its opposite number, so that all elements of the matrix are non-negative;
an initial correlation decision matrix O is established for the matrix C; each element of each row of O is the sequence number of its column, for example, the value in the i-th column from the left is i. The elements of each row of O are then arranged in ascending order of the values in the corresponding row of C, giving O' of the correlation decision matrix (CDM); finally, an iterative statistical analysis is carried out over a local matrix of width columns to determine which features are reduced;
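Formula (1), correlation coefficient between features A_i and A_j: [formula image not reproduced]
Formula (2), variance Var(A_i): [formula image not reproduced]
Formula (3), covariance Cov(A_i, A_j): [formula image not reproduced]
Formula (4), adjustment of the elements of matrix C: [formula image not reproduced]
Formula (5), adjustment of the elements of matrix C: [formula image not reproduced]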
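For orientation, the quantities named for formulas (1) to (3) have the following standard textbook forms. This is a hedged reconstruction and not a copy of the patent's formula images: the sample index k, the sample count N and the 1/N normalization are assumptions (the normalization cancels in the correlation coefficient in any case).

```latex
\begin{align}
C_{ij} &= \frac{\mathrm{Cov}(A_i, A_j)}{\sqrt{\mathrm{Var}(A_i)\,\mathrm{Var}(A_j)}},
  \qquad 1 \le i, j \le M \\
\mathrm{Var}(A_i) &= \frac{1}{N}\sum_{k=1}^{N} \left(a_{ik} - \mu_i\right)^2 \\
\mathrm{Cov}(A_i, A_j) &= \frac{1}{N}\sum_{k=1}^{N} \left(a_{ik} - \mu_i\right)\left(a_{jk} - \mu_j\right)
\end{align}
```

Here a_{ik} denotes the value of feature A_i in sample k, a symbol introduced only for this sketch.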
Attribute scoring is then carried out: the frequency of each feature present in the local matrix is calculated and combined with the mean and variance of the correlation coefficients of all features to give a score; the score is used as the basis for deciding which features to reduce. The scoring equation is formula (6), where ave(C_i) and var(C_i) are the mean and variance of row i of the matrix C, and S_score(A_i) denotes the statistical frequency of attribute A_i in the local matrix in the current iteration, that is, its score.
Formula (6), score of attribute A_i: [formula image not reproduced]
Second, detection is performed with a malicious traffic detection model based on a capsule neural network.
The capsule neural network is processed as follows: the outputs of three capsules, v_1, v_2 and v_3, are used as input for the next capsule; v_1, v_2 and v_3 are multiplied by the weight matrices w_1, w_2 and w_3 respectively to obtain u_1, u_2 and u_3; a weighted sum of u_1, u_2 and u_3 then gives s; v is obtained by squashing s; the weight matrices are obtained by back-propagation learning; c_1, c_2 and c_3 are the coupling coefficients. Further, c_1, c_2 and c_3 are determined by a dynamic routing algorithm.
Further, the capsule neural network uses the margin loss function of formula (10).
Formula (10), margin loss: [formula image not reproduced]
where E_k indicates whether class k is present (1 if present, 0 if absent); m+ = 0.9 penalizes false negatives, where class k is present but not predicted; m− = 0.1 penalizes false positives, where class k is absent but predicted.
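The capsule steps and the margin loss described above are commonly written as follows in the capsule-network literature. This is a sketch under stated assumptions rather than the patent's own formula images: the squashing expression and the down-weighting factor λ come from the standard formulation and are not given in the text, and ‖v_k‖ denotes the length of the output vector of the capsule for class k.

```latex
\begin{align}
u_i &= w_i\, v_i \\
s &= \sum_i c_i\, u_i \\
v &= \frac{\lVert s \rVert^2}{1 + \lVert s \rVert^2}\,\frac{s}{\lVert s \rVert} \\
L_k &= E_k \max\!\left(0,\; m^{+} - \lVert v_k \rVert\right)^2
      + \lambda\,(1 - E_k)\,\max\!\left(0,\; \lVert v_k \rVert - m^{-}\right)^2
\end{align}
```

With m+ = 0.9 and m− = 0.1, the first term is non-zero only when class k is present and its capsule output is shorter than 0.9, and the second only when class k is absent and the output is longer than 0.1.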
The beneficial effects of the invention are as follows.
The method has the following advantages in detecting malicious traffic of a mobile APP:
1. A feature selection method based on the correlation information decision matrix (CIDM) is provided. It uses soft dimension reduction to reduce the dimensionality of high-dimensional data, which overcomes the tendency of the hard dimension reduction in common feature selection methods to lose feature information, effectively preserves the correlation information of the traffic features, and improves subsequent detection performance.
2. A new malware detection method is proposed that combines feature selection with a malware detection model. The malware detection model is a capsule network, a deep learning algorithm, which overcomes the poor robustness caused by the pooling operation of convolutional neural networks. To our knowledge, this is the first application of the capsule network to malware detection, and the capsule network improves the robustness of the detection model.
3. The effectiveness of the method of the invention was evaluated in detailed experiments, and the method was compared with state-of-the-art malware detection techniques. The experiments show that, compared with these techniques, the method improves accuracy and recall by 9.71% and 20.18%, respectively.
Drawings
FIG. 1 shows the CIDM construction process;
FIG. 2 shows the steps of dimension reduction algorithm 1;
FIG. 3 shows the steps of dimension reduction algorithm 2;
FIG. 4 shows the processing of a capsule network;
FIG. 5 shows the steps of the dynamic routing algorithm;
FIG. 6 shows the processing flow of the capsule network dynamic routing algorithm.
Detailed Description
For a better understanding of the present invention, embodiments of the invention are explained in detail below with reference to FIGS. 1 to 6. The invention provides a feature selection method based on a correlation information decision matrix (CIDM), covering the process of optimizing the CIDM from an initial correlation decision matrix (CDM), a feature attribute scoring method that serves as the basis for feature reduction, and a feature selection dimension reduction algorithm built on the CIDM and the scoring method. The invention also applies a capsule network (CapsNet), a deep learning algorithm, to the detection of malicious traffic of a mobile APP for the first time, covering the operating mechanism of the capsule network, the core routing algorithm that dynamically decides the coupling coefficients during operation, and the margin loss function for malicious traffic detection.
First, the method of the invention uses a correlation information decision matrix (CIDM) to select network traffic features. The specific implementation steps are as follows:
(1) Constructing the correlation information decision matrix (CIDM):
Reference information for determining which features are redundant is obtained from the CIDM, and the CIDM is optimized from a correlation decision matrix (CDM). The generation process of the CDM is shown in FIG. 1. First, we calculate the correlation coefficient between each pair of features using formula (1) and obtain the correlation coefficient matrix C, where Var(A_i) is the variance of the values of feature A_i calculated by formula (2), M is the number of features, and μ_i is the mean of feature A_i. Furthermore, Cov(A_i, A_j), given by formula (3), is the covariance between features A_i and A_j. In these formulas, 1 ≤ i, j ≤ M. If the value of an element of matrix C is less than 0, it is replaced by its opposite number, so that all elements of the matrix are non-negative. Second, an initial correlation decision matrix O is established for matrix C. Each element of each row of O is the sequence number of its column; for example, the value in the i-th column from the left is i. The elements of each row of O are then arranged in ascending order of the values in the corresponding row of C to obtain O' of the CDM. Finally, we perform an iterative statistical analysis over a local matrix of width columns (the boxed position in matrix O' in FIG. 1) to determine which features are reduced. When constructing the CIDM, either ascending or descending order may be used to arrange the elements of each row of O and obtain O' of the CDM.
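Formula (1), correlation coefficient between features A_i and A_j: [formula image not reproduced]
Formula (2), variance Var(A_i): [formula image not reproduced]
Formula (3), covariance Cov(A_i, A_j): [formula image not reproduced]
Formula (4), adjustment of the elements of matrix C: [formula image not reproduced]
Formula (5), adjustment of the elements of matrix C: [formula image not reproduced]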
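The construction of O' from C described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the 3-feature matrix is invented, the function names are hypothetical, and the reading that each row of O is reordered by the values in the corresponding row of C is an interpretation of the translated text. The `descending` flag reflects the remark that either sort order may be used.

```python
def abs_matrix(c):
    """Replace negative correlation coefficients by their opposite numbers."""
    return [[abs(x) for x in row] for row in c]


def ordered_index_matrix(c, descending=False):
    """Build each row of O (1-based column numbers) and reorder it by C."""
    o_prime = []
    for row in c:
        o_row = list(range(1, len(row) + 1))  # row of the initial matrix O
        # Sort the column numbers by the correlation value they point to.
        o_row.sort(key=lambda col: row[col - 1], reverse=descending)
        o_prime.append(o_row)
    return o_prime


# Hypothetical 3-feature correlation matrix, for illustration only.
C = abs_matrix([
    [1.0, -0.2, 0.7],
    [-0.2, 1.0, 0.4],
    [0.7, 0.4, 1.0],
])
O_prime = ordered_index_matrix(C)
print(O_prime)
```

The local matrix used in the later scoring step would then be a block of width columns taken from O_prime.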
Furthermore, if only correlation is considered when reducing dimensions, dominant features will be reduced in some extreme cases. For example, take three vectors: a = [1,0,0,0,0,0], b = [0,1,0,0,0,0], c = [1,1,1,0,1,0]. Let corrcoef(x, y) be the correlation coefficient function of two vectors x and y. Then, after negative values are converted to their opposite numbers, corrcoef(a, b) = 0.2 and corrcoef(a, c) = corrcoef(b, c) = 0.32. By the rule that the higher the correlation between features, the less information they carry, the feature c, which has a high information content, would be reduced. This is clearly wrong: it is as if the two low-information features a and b outvoted c, which differs from them. To avoid this situation, which does in fact occur in the data used in the later experiments, it is important to take the amount of information into account. The amount of information aoi(i) is represented by the number of non-zero elements of feature i. All elements in the matrix C are adjusted by formula (4) and formula (5),
where [symbol image not reproduced] is a weighting factor that brings correlation and information to the same order of magnitude in formula (4). ave_C is the average of all elements in the matrix C, and ave_aoi_r is the average of aoi_r. γ controls the proportion of correlation and information in the assignment, and its value range is [1, ∞). As formula (4) shows, when γ equals 1 only the correlation is considered; as γ tends to infinity, only the amount of information is considered. The parameter a controls the strength of low-information feature selection and is set to 0.9 here. The matrix CIDM can now be obtained following the flow of FIG. 1.
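The three-vector example can be checked numerically. The sketch below computes the Pearson correlation coefficient and the amount of information (number of non-zero elements) for a, b and c; the function names `pearson` and `aoi` are chosen here for illustration. It does not implement formulas (4) and (5), whose exact form is not recoverable from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / math.sqrt(var_x * var_y)


def aoi(feature):
    """Amount of information: the number of non-zero elements."""
    return sum(1 for value in feature if value != 0)


a = [1, 0, 0, 0, 0, 0]
b = [0, 1, 0, 0, 0, 0]
c = [1, 1, 1, 0, 1, 0]

# Absolute values, matching the non-negative matrix C.
print(round(abs(pearson(a, b)), 2))  # 0.2
print(round(abs(pearson(a, c)), 2))  # 0.32
print(aoi(a), aoi(b), aoi(c))        # 1 1 4
```

The raw coefficient between a and b is negative (about -0.2), which is why the conversion to non-negative values matters; c carries four non-zero elements against one each for a and b.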
(2) Attribute scoring: the local matrix consists of width columns and M rows. In this matrix, the frequency of each feature present is calculated and combined with the mean and variance of the correlation coefficients of all features to give a score. The score is used as the basis for deciding which features to reduce. The scores are sorted, and each iteration takes the width features with the largest scores as the objects to reduce. The occurrence frequency alone is not used as the criterion; the mean and variance of the correlation coefficients are added so that features with the same occurrence frequency do not cause ambiguity when the reduction objects are selected. Specifically, the scoring equation is formula (6), where ave(C_i) and var(C_i) are the mean and variance of row i of matrix C, and S_score(A_i) denotes the statistical frequency of attribute A_i in the local matrix in the current iteration, that is, its score.
Formula (6), score of attribute A_i: [formula image not reproduced]
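The ingredients of the score can be sketched as below: the frequency of each feature inside the local matrix, and the mean and variance of each row of C. How formula (6) combines them is not recoverable from the text, so the sketch stops at computing the ingredients. The choice of the first width columns as the local matrix, the use of population variance, and the sample matrices are all assumptions.

```python
from collections import Counter
from statistics import mean, pvariance


def local_frequencies(o_prime, width):
    """Count how often each feature number occurs in the local matrix."""
    counts = Counter()
    for row in o_prime:
        counts.update(row[:width])  # assumed position of the local matrix
    return counts


def row_stats(c):
    """Mean and (population) variance of every row of C."""
    return [(mean(row), pvariance(row)) for row in c]


# Hypothetical inputs, for illustration only.
C = [
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.4],
    [0.7, 0.4, 1.0],
]
O_prime = [[2, 3, 1], [1, 3, 2], [2, 1, 3]]

freq = local_frequencies(O_prime, width=1)
stats = row_stats(C)
print(freq[1], freq[2], freq[3])
print(stats[0])
```

Features with equal frequency can then be separated by their row mean and variance, which is the stated purpose of adding those terms to the score.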
(3) Dimension reduction algorithm: based on the CIDM and the feature scoring method, a CIDM-based feature selection method is provided. The pseudo code for the specific implementation of the dimension reduction algorithm is shown in FIGS. 2 and 3.
The pseudo code makes the whole dimension reduction process clear. In Algorithm 1 of FIG. 2, the parameter goal_dim is the dimension to which the data set X is to be reduced. Its purpose is to match the input interface of the subsequent classification model. E is an identity matrix of M rows and M columns. iteration is the number of iterations in the whole reduction process. remainder is the remainder of the difference between the original data dimension M and goal_dim when divided by width. The parameter width is determined by experiment. If remainder is not equal to 0, then once the iterative reduction process is complete, a further reduction of the remaining attributes, fewer than in a regular iteration, is required. It should be noted that scoring features with a local matrix in each iteration is a way of deciding between features that are highly correlated.
For example, in an extreme case there is a high correlation between two features, and only one of them needs to be reduced. Which of the two should be chosen? Statistical scoring based on the local matrix solves this problem well.
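The relationship between M, goal_dim, width, iteration and remainder can be sketched as a small planning function. The integer-division reading of the text is an assumption, and the values of goal_dim and width below are hypothetical; only M = 1708 comes from the data set described later.

```python
def reduction_plan(m, goal_dim, width):
    """Return (iteration, remainder) for reducing m features to goal_dim."""
    if width <= 0:
        raise ValueError("width must be positive")
    if not 0 < goal_dim <= m:
        raise ValueError("goal_dim must be between 1 and m")
    # Each full iteration removes `width` features; the leftover features
    # are removed in one final, smaller step.
    iteration, remainder = divmod(m - goal_dim, width)
    return iteration, remainder


# M = 1708 features per sample; goal_dim and width are illustrative values.
iteration, remainder = reduction_plan(1708, goal_dim=1024, width=50)
print(iteration, remainder)  # 13 34
```

Here 13 iterations remove 650 features and a final step removes the remaining 34, leaving exactly goal_dim features.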
Next, the method uses a malicious traffic detection model based on a capsule network (CapsNet). The specific implementation steps are as follows:
To introduce the capsule-network-based detection model more clearly, we describe it from three aspects: the operating mechanism of a single capsule, the core algorithm (dynamic routing), and the loss function of the capsule network.
(1) Capsule processing: the processing of a capsule is shown in FIG. 4, where the outputs of three capsules, v_1, v_2 and v_3, serve as the input for the next capsule. v_1, v_2 and v_3 are multiplied by the weight matrices w_1, w_2 and w_3 respectively to obtain u_1, u_2 and u_3. A weighted sum of u_1, u_2 and u_3 then gives s. v is obtained by squashing s, which changes only its length and not its direction. The weight matrices are obtained by back-propagation learning. c_1, c_2 and c_3 are called the coupling coefficients. They are decided dynamically by the capsules at test time. This decision process is called dynamic routing, and the details are given in the next subsection. The values of u, s and v are calculated from formulas (7), (8) and (9).
Formula (7), calculation of u: [formula image not reproduced]
Formula (8), calculation of s: [formula image not reproduced]
Formula (9), calculation of v: [formula image not reproduced]
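The three steps above (multiply, weighted sum, squash) can be sketched in plain Python. The squashing expression used here is the one commonly given in the capsule-network literature and is assumed to match formula (9); the input vectors, identity weight matrices and coupling coefficients are invented for illustration.

```python
import math


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def weighted_sum(cs, us):
    """s = sum_i c_i * u_i, computed component by component."""
    return [sum(c * u[k] for c, u in zip(cs, us)) for k in range(len(us[0]))]


def squash(s):
    """Shrink the length of s into [0, 1) without changing its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def capsule_forward(vs, ws, cs):
    """One capsule step: u_i = w_i v_i, s = sum c_i u_i, v = squash(s)."""
    us = [matvec(w, v) for w, v in zip(ws, vs)]
    return squash(weighted_sum(cs, us))


# Hypothetical inputs: three 2-d capsule outputs and identity weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
v_out = capsule_forward(
    vs=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    ws=[identity, identity, identity],
    cs=[0.5, 0.25, 0.25],
)
print(v_out)
```

With these inputs s is [0.75, 0.5], and the output keeps that direction while its length falls below 1, so the length can be read as a confidence value.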
(2) Dynamic routing: the values of c_1, c_2 and c_3 are determined by a dynamic routing algorithm, whose pseudo code is shown in FIG. 5. First, there must be a set of parameters B whose initial values are all zero, where the elements of B [symbol images not reproduced] correspond one-to-one to {c_1, c_2, c_3, ..., c_i}. Assuming that T iterations are run, T being a predetermined hyper-parameter, the processing flow is shown in FIG. 6. It should be noted that the initial values of the parameters of the core routing algorithm, which dynamically decides the coupling coefficients during operation of the capsule network, can be replaced by other values; for example, the initial value 0 of the parameter B may be replaced by another value.
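A routing loop of this kind is usually written as routing by agreement: the parameters B start at zero, the coupling coefficients are the softmax of B, and each b_i is increased by the dot product of u_i with the current output v. The sketch below follows that commonly published scheme and is assumed, not confirmed, to correspond to the pseudo code of FIG. 5; the prediction vectors are invented.

```python
import math


def softmax(b):
    """Turn the routing parameters B into coupling coefficients c."""
    peak = max(b)
    exps = [math.exp(x - peak) for x in b]
    total = sum(exps)
    return [e / total for e in exps]


def squash(s):
    """Shrink the length of s into [0, 1) without changing its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def dynamic_routing(us, iterations):
    """Run T routing iterations over prediction vectors u_1..u_n."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    b = [0.0] * len(us)  # parameters B, all zero initially
    for _ in range(iterations):
        c = softmax(b)
        s = [sum(ci * u[k] for ci, u in zip(c, us)) for k in range(len(us[0]))]
        v = squash(s)
        # Agreement update: inputs that point the same way as v gain weight.
        b = [bi + sum(uk * vk for uk, vk in zip(u, v)) for bi, u in zip(b, us)]
    return c, v


# Hypothetical prediction vectors: two agree, one points the opposite way.
us = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
c, v = dynamic_routing(us, iterations=3)
print(c, v)
```

After three iterations the dissenting third input receives a smaller coupling coefficient than the two agreeing inputs, while the coefficients still sum to 1.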
(3) Loss function: the capsule neural network provides two loss functions. One is the margin loss, used for classification tasks; the other is the reconstruction loss, used for sample reconstruction. As our task is to detect malicious traffic, the margin loss is used. The margin loss function is given by formula (10).
Formula (10), margin loss: [formula image not reproduced]
where E_k indicates whether class k is present (1 if present, 0 if absent). m+ = 0.9 penalizes false negatives, where class k is present but not predicted. m− = 0.1 penalizes false positives, where class k is absent but predicted.
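A margin loss with these two thresholds can be sketched as follows. The squared-hinge form and the down-weighting factor `lam` (0.5 here) are taken from the standard capsule-network formulation and are assumptions with respect to formula (10); only E_k, m+ = 0.9 and m− = 0.1 come from the text.

```python
def margin_loss(lengths, present, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Sum the per-class margin loss.

    lengths: output vector length ||v_k|| of each class capsule.
    present: E_k for each class (1 if the class is present, else 0).
    lam: assumed down-weighting of the absent-class term.
    """
    total = 0.0
    for length, e_k in zip(lengths, present):
        # Present class whose capsule output is too short.
        total += e_k * max(0.0, m_plus - length) ** 2
        # Absent class whose capsule output is too long.
        total += lam * (1 - e_k) * max(0.0, length - m_minus) ** 2
    return total


# Two classes (for example malicious, benign); the first one is present.
print(margin_loss([0.95, 0.05], [1, 0]))  # confident and correct: 0.0
print(margin_loss([0.50, 0.30], [1, 0]))  # both terms contribute
```

In the second call the present class contributes (0.9 - 0.5)^2 and the absent class contributes 0.5 * (0.3 - 0.1)^2, for a total of about 0.18.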
The specific experimental design for the capsule-network-based method for detecting malicious traffic of a mobile APP is as follows:
(1) Data set
The invention uses the data set published in the paper "Lexical Mining of Malicious URLs for Classifying Android Malware". To collect network traffic, that paper uses the Android tool Monkey to send random events to the device, triggering network traffic during the execution of each application. To avoid mixing the network traffic of different applications, only one application is executed at a time. The data set provides the method, host, page, and name fields of the URL. Each sample is represented by 1708 features. The numbers of benign and malicious traffic samples are shown in Table 1. Our feature selection work is based on the 1708 features of each sample.
Label        Number
Benign       25,276
Malicious    11,251
TABLE 1 Attribute information of the data set
(2) Experimental setup
1. Parameter setting analysis: width and γ are the key parameters for dimension reduction. Different parameter values affect the efficiency of the dimension reduction. To obtain parameter values suited to the data set and to ensure efficient and stable dimension reduction, experiments on the settings of width and γ are designed.
2. Feature analysis: the purpose of the feature analysis is to verify the validity of the proposed dimension reduction method. To this end, we run classification algorithms on the data with and without dimension reduction. We use four of the most popular algorithms: Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and K-nearest neighbor (KNN).
3. Model analysis: model analysis focuses on evaluating the capsule network and shows the performance difference between the capsule network and other deep learning networks. We chose convolutional neural networks (CNNs) for comparison. To ensure fairness and rationality, all methods use the same training and test sets in the evaluation experiments.
4. Comprehensive analysis: to further validate the effectiveness of the approach, we compare it with other state-of-the-art malware detection techniques in the comprehensive analysis.
(3) Evaluation index
The evaluation metrics we use are accuracy, precision, recall, and F-value, which are calculated from the confusion matrix. The confusion matrix is shown in Table 2. TP is a true positive: the true label of the sample is positive and the model also predicts positive. TN is a true negative: the true label of the sample is negative and the model predicts negative. FP is a false positive: the true label of the sample is negative, but the model predicts positive. FN is a false negative: the true label of the sample is positive, but the model predicts negative. The indices are defined by formulas (11) to (15).
                   Predicted positive    Predicted negative
Actual positive    TP                    FN
Actual negative    FP                    TN
TABLE 2 Confusion matrix
Formulas (11) to (15), definitions of the evaluation indices: [formula images not reproduced]
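The four named indices have standard definitions in terms of TP, TN, FP and FN, sketched below. The F-value is taken to be the harmonic mean of precision and recall (F1), which is an assumption about the source formulas; the confusion-matrix counts are invented for illustration and are not experimental results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_value = 2 * precision * recall / (precision + recall)
    else:
        f_value = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_value": f_value,
    }


# Hypothetical counts, for illustration only.
scores = classification_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

For these counts the accuracy is 0.85, the precision 80/90, the recall 0.80 and the F-value 160/190.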
It is to be noted that, in this context, unexplained terms are generic names in the art, and method steps not described in detail are also common knowledge of the person skilled in the art. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, or apparatus.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A deep learning method for detecting malicious traffic of a mobile APP, characterized by comprising the following steps.
First, a correlation information decision matrix (CIDM) is used to select network traffic features.
The correlation information decision matrix CIDM is constructed first:
the correlation coefficient between each pair of features is calculated using formula (1), giving the correlation coefficient matrix C, where Var(A_i) is the variance of the values of feature A_i calculated by formula (2), M is the number of features, and μ_i is the mean of feature A_i; Cov(A_i, A_j), given by formula (3), is the covariance between features A_i and A_j, with 1 ≤ i, j ≤ M. If the value of an element of matrix C is less than 0, it is replaced by its opposite number, so that all elements of the matrix are non-negative;
an initial correlation decision matrix O is established for the matrix C; each element of each row of O is the sequence number of its column, for example, the value in the i-th column from the left is i. The elements of each row of O are then arranged in ascending order of the values in the corresponding row of C, giving O' of the correlation decision matrix (CDM); finally, an iterative statistical analysis is carried out over a local matrix of width columns to determine which features are reduced;
Formula (1), correlation coefficient between features A_i and A_j: [formula image not reproduced]
Formula (2), variance Var(A_i): [formula image not reproduced]
Formula (3), covariance Cov(A_i, A_j): [formula image not reproduced]
Formula (4), adjustment of the elements of matrix C: [formula image not reproduced]
Formula (5), adjustment of the elements of matrix C: [formula image not reproduced]
Attribute scoring is then carried out: the frequency of each feature present in the local matrix is calculated and combined with the mean and variance of the correlation coefficients of all features to give a score; the score is used as the basis for deciding which features to reduce. The scoring equation is formula (6), where ave(C_i) and var(C_i) are the mean and variance of row i of the matrix C, and S_score(A_i) denotes the statistical frequency of attribute A_i in the local matrix in the current iteration, that is, its score.
Formula (6), score of attribute A_i: [formula image not reproduced]
Second, detection is performed with a malicious traffic detection model based on a capsule neural network.
2. The deep learning method for detecting malicious traffic of a mobile APP as claimed in claim 1, wherein the capsule neural network is processed as follows: the output vectors of three capsules, v_1, v_2 and v_3, are used as input for the next capsule; v_1, v_2 and v_3 are multiplied by the weight matrices w_1, w_2 and w_3 respectively to obtain u_1, u_2 and u_3; a weighted sum of u_1, u_2 and u_3 then gives s; v is obtained by squashing s; the weight matrices are obtained by back-propagation learning; c_1, c_2 and c_3 are the coupling coefficients.
3. The deep learning method for detecting malicious traffic of a mobile APP as claimed in claim 2, wherein c_1, c_2 and c_3 are determined by a dynamic routing algorithm.
4. The deep learning method for detecting malicious traffic of a mobile APP as claimed in claim 2, wherein the capsule neural network uses the margin loss function of formula (10).
Formula (10), margin loss: [formula image not reproduced]
where E_k indicates whether class k is present (1 if present, 0 if absent); m+ = 0.9 penalizes false negatives, where class k is present but not predicted; m− = 0.1 penalizes false positives, where class k is absent but predicted.
CN202210533158.XA 2022-05-17 2022-05-17 Deep learning method for detecting malicious flow of mobile APP Active CN115018046B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210533158.XA CN115018046B (en) 2022-05-17 2022-05-17 Deep learning method for detecting malicious flow of mobile APP

Publications (2)

Publication Number Publication Date
CN115018046A true CN115018046A (en) 2022-09-06
CN115018046B CN115018046B (en) 2023-09-15

Family

ID=83068709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210533158.XA Active CN115018046B (en) 2022-05-17 2022-05-17 Deep learning method for detecting malicious flow of mobile APP

Country Status (1)

Country Link
CN (1) CN115018046B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190130247A1 (en) * 2017-10-31 2019-05-02 General Electric Company Multi-task feature selection neural networks
CN112733727A (en) * 2021-01-12 2021-04-30 燕山大学 Electroencephalogram consciousness dynamic classification method based on linear analysis and feature decision fusion
US20210150412A1 (en) * 2019-11-20 2021-05-20 The Regents Of The University Of California Systems and methods for automated machine learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
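王澍玮 et al.: "Android malware identification based on network traffic", Radio Engineering *
范雪莉 et al.: "Feature selection algorithm for principal component analysis based on mutual information", Control and Decision *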

Also Published As

Publication number Publication date
CN115018046B (en) 2023-09-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant