CN104866763A - Permission-based Android malicious software hybrid detection method - Google Patents

Info

Publication number
CN104866763A
Authority
CN
China
Prior art keywords
application
authority
application program
vector
good
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510282507.5A
Other languages
Chinese (zh)
Other versions
CN104866763B (en)
Inventor
李晓红
赵仁
焦浩峰
胡静
许光全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201510282507.5A
Publication of CN104866763A
Application granted
Publication of CN104866763B
Expired - Fee Related
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562: Static detection
    • G06F21/563: Static detection by source code analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00: Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03: Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033: Test or assess software

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a permission-based hybrid detection method for Android malware. The method comprises the following steps: step one, decompiling an Android application to obtain the permissions it requests; step two, checking the requested permissions against the system-defined permissions and, according to the permissions requested, dividing all applications to be detected into a benign application set, a malicious application set and a suspicious application set; step three, dynamically acquiring the runtime behavior of the applications in the suspicious application set, collecting the interface calls related to sensitive permissions, giving a vector space representation, and vectorizing the applications; step four, performing security detection to obtain the detection result of the benign applications that meet the security detection standard. Compared with the prior art, the method combines two factors, Euclidean distance and cosine similarity, so that the detection result is more comprehensive and more accurate.

Description

Permission-based hybrid detection method for Android malware
Technical Field
The present invention relates to software security detection in computer networks and computer security, to mobile terminal security and related fields, and in particular to the verification of the fairness and non-repudiation of secure exchange protocols.
Background Art
With the rapid development of mobile communication technology and mobile hardware, people rely more and more on smartphones in daily life and work, and the market share of Android has grown rapidly. As a mainstream intelligent operating system for mobile terminals, Android allows users to download and install third-party applications to meet their needs. However, because third-party markets lack supervision and management, the number of malware samples and variants on the Android platform keeps increasing, which poses a huge threat to the security of the platform.
The rising market share of Android and the rapid growth in the amount of Android malware make research on the analysis and detection of Android malware significant. The shortcomings in the design of the Android system itself further show the need for this research.
In recent years, the analysis and detection of mobile malware has become an important part of security research, and researchers have done a great deal of work on it. At present, security research on Android applications falls mainly into two categories: permission-based and behavior-based. Permission-based work manages and checks the permissions that applications request. Behavior-based security detection relies on the behavioral features that an application exhibits at run time and, combined with other data analysis methods, judges the security of the application. Permission-based detection is fast and efficient in some cases, but performs poorly on applications whose features are not obvious. Behavior-based detection collects a large amount of information and its analysis is thorough and accurate, but it may produce false positives when the collected information is incomplete, and for applications with obvious features it is slow and inefficient.
In view of this situation, the present invention proposes a permission-based hybrid detection method. First, applications are preliminarily checked according to the security level of the permissions they request, which identifies benign applications and malicious applications. Second, the runtime behavior of suspicious applications is traced, the interface calls related to sensitive permissions are collected and represented as space vectors, the feature vector of each application is computed with the TF-IDF algorithm, and suspicious applications are detected by computing Euclidean distance and cosine similarity. A comparison of the experimental results with other work shows that the method of this patent improves the accuracy of Android malware detection.
Summary of the Invention
To overcome the problems of the prior art described above, the present invention proposes a permission-based hybrid detection method for Android malware. With the aim of checking the security of applications on Android mobile terminals, it proposes a hybrid detection method that can analyze and verify malware, and implements the corresponding detection tool.
The present invention proposes a permission-based hybrid detection method for Android malware, comprising the following steps:
Step one: decompile the Android application to obtain the permissions that the application requests.
Step two: check the requested permissions against the system-defined permissions; according to the permissions requested, divide all applications to be detected into a benign application set, a malicious application set and a suspicious application set.
Step three: dynamically acquire the behavior of the applications in the suspicious application set, collect the interface calls related to sensitive permissions, give a vector space representation, and vectorize the applications.
Step four: perform security detection and obtain the detection result of the "benign applications" that meet the security detection standard.
Compared with the prior art, the present invention combines permission-based and behavior-based security research on Android applications so that, to a certain extent, each compensates for the weaknesses of the other: applications with obvious features are detected quickly, while the thoroughness and accuracy of behavior-based analysis are retained. The goal of the present invention is to detect whether an Android application is malware.
Brief Description of the Drawings
Fig. 1 is the overall flow chart of the permission-based hybrid detection method for Android malware;
Fig. 2 is a statistical chart of common sensitive permissions in the samples;
Fig. 3 compares the detection results of the present invention with those of the MDBC method.
Detailed Description
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings. Any exemplary content in these embodiments should not be construed as limiting the invention.
The present invention proposes a permission-based hybrid detection framework: 1) A preliminary check is first carried out according to the permissions an application requests, which identifies benign and malicious applications; the behavior of suspicious applications is then traced, the interface calls related to sensitive permissions are collected and examined, and the type of the application is determined. 2) A vector space model is introduced. When a suspicious application is examined, the application is expressed algebraically from the sensitive information collected, and the vector space model is used to represent it. 3) Euclidean distance and cosine similarity are used. Euclidean distance and cosine similarity are computed on the vector space obtained above and compared, and suspicious applications are finally classified as benign or malicious.
According to their security, Android applications are divided into three classes: benign, malicious and suspicious. A benign application is one that does not perform malicious actions against the phone or private data during use. A malicious application is one that performs malicious actions against the phone or private data during use. A suspicious application is one whose security type is currently unclear: during use it may or may not perform malicious actions against the phone or private data.
On the basis of this three-class division of Android applications, the overall flow of the permission-based hybrid detection method of the present invention is designed as shown in Fig. 1. The detection process consists of four steps: permission detection, dynamic behavior acquisition, application vectorization, and security detection. Each step performs a different function, and the four steps work together to complete the security detection of an application. The flow is as follows. The Android application is decompiled to obtain the permissions it requests (a list is generated). The requested permissions are filtered against the system-defined permissions (that is, a preliminary check is made according to the permissions the application requests). According to the permissions requested, all applications to be detected are divided into a benign application set, a malicious application set and a suspicious application set. At this point, detection is complete for the applications in the benign and malicious sets, and they do not need the subsequent dynamic detection. The behavior of the applications in the suspicious application set is acquired dynamically, the interface calls related to sensitive permissions are collected, a vector space representation is given, and the applications are vectorized (their feature vectors are computed). Finally, security detection produces the "benign application" result for applications that meet the security detection standard.
The following subsections describe the four detection steps in detail.
1. Permission detection
In the Android system, an application must first request and obtain the corresponding permission before it can perform a given behavior. Otherwise, the application cannot call the API that corresponds to that permission and therefore cannot complete the behavior. For the preliminary check of applications, detection rules based on permissions can therefore be designed, and these rules can identify applications with obvious features (benign applications, and applications that request system-level permissions). An application's permission information is declared in the AndroidManifest.xml file in its package. With a decompilation tool, the AndroidManifest.xml file of the application can be obtained, and from it the application's permission information.
The Android system itself provides hundreds of permissions at four security levels: Normal, Dangerous, Signature and SignatureOrSystem. These permissions are divided into 12 categories, such as access to location information, access to the network and access to personal information. For the purposes of the experiment, the permissions of the Normal and SignatureOrSystem groups are stored separately, and the permissions of the Dangerous and Signature groups are stored by category.
Define the set $AppPer = \{per_1, per_2, \ldots, per_n\}$ as the set of permissions requested by an application, where $per_i$ ($1 \le i \le n$) is one permission requested by the application.
Define the set of all Android permissions as $AndPer = \{perSet_1, perSet_2, \ldots, perSet_{14}\}$, where $perSet_i$ ($1 \le i \le 12$) is the set of sensitive permissions in category $i$, $perSet_{13}$ is the set of Normal-group permissions, and $perSet_{14}$ is the set of SignatureOrSystem-group permissions.
The rules of static permission detection can be expressed as:
1. If $AppPer \cap perSet_{13} = AppPer$, the application is judged to be benign;
2. If $AppPer \cap perSet_{14} \ne \varnothing$, the application is judged to be malicious;
3. If $AppPer \cap perSet_i \ne \varnothing$ for some $1 \le i \le 12$, the application has requested sensitive permissions of category $i$ and may pose a threat of that category.
These filtering rules give a preliminary classification of applications: they filter out the applications whose requested permissions all belong to the Normal group and the applications that request SignatureOrSystem-group permissions. For applications that request sensitive permissions, the categories of sensitive permissions requested can also be determined, so that statistics on sensitive permission requests can be compiled and a preliminary understanding of the samples' features obtained.
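The three static rules above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `classify_by_permissions`, the short permission names, and the contents of the example sets are assumptions chosen for the example, and applications matching neither rule 1 nor rule 2 are treated here as suspicious.

```python
from typing import Dict, List, Set, Tuple


def classify_by_permissions(
    app_per: Set[str],
    normal_set: Set[str],            # plays the role of perSet_13
    sig_or_sys_set: Set[str],        # plays the role of perSet_14
    sensitive_sets: Dict[str, Set[str]],  # plays the role of perSet_1..perSet_12
) -> Tuple[str, List[str]]:
    """Return (label, sensitive categories requested) for one application."""
    # Rule 2: any SignatureOrSystem permission marks the app as malicious.
    if app_per & sig_or_sys_set:
        return "malicious", []
    # Rule 1: every requested permission is in the Normal group.
    if app_per & normal_set == app_per:
        return "benign", []
    # Rule 3: record which sensitive categories the app touches.
    categories = sorted(
        name for name, perms in sensitive_sets.items() if app_per & perms
    )
    return "suspicious", categories


# Illustrative (hypothetical) permission sets, not a complete Android list.
NORMAL = {"VIBRATE", "SET_WALLPAPER"}
SIG_OR_SYS = {"INSTALL_PACKAGES"}
SENSITIVE = {"contacts": {"READ_CONTACTS"}, "messages": {"SEND_SMS"}}

print(classify_by_permissions({"VIBRATE", "SEND_SMS"}, NORMAL, SIG_OR_SYS, SENSITIVE))
```

Rules 1 and 2 cannot both hold when the Normal and SignatureOrSystem sets are disjoint, so the order in which they are tested does not change the result in that case.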
2. Dynamic behavior acquisition
Because not every Android application that requests sensitive permissions is malicious, the suspicious samples obtained from permission detection need further behavior detection. Based on the behavioral features that an application exhibits at run time, a behavioral feature detection method can produce the final classification of suspicious applications. The task of dynamic behavior acquisition is to collect the behavioral features of an application while it runs, providing data for the subsequent behavior detection.
The accuracy of behavior detection depends to a large extent on how complete the acquired behavior sequence features are. An Android application contains various components, and a component can trigger a series of interface calls. To collect as many behavioral features of a suspicious application as possible, monkeyrunner is used to run every component of the application once while the application is installed and running in the emulator. This testing tool can send a stream of events to the application, so that the behavioral features the application shows on receiving various events are obtained.
The behavior of a running application is reflected at every layer of the Android system. Considering the readability and operability of behavior information, the present invention mainly obtains the method call information of the Framework layer and the native layer. The behavior information of an application in these two layers can be obtained through the Log mechanism of the Android system itself and through DroidBox, and this information reflects the behavioral features of the application fairly accurately. Because DroidBox can capture method calls in the native layer, some malicious behaviors that bypass the Framework layer can also be captured, which helps make the behavior information more complete.
After the runtime interface information of the application has been obtained, shell commands can be used to display the acquired behavior information directly or to save it to a text file for further analysis. Next, the acquired interface information needs to be filtered so that only the interface information corresponding to sensitive permissions is kept. Once the correspondence between function interfaces and permissions is determined, interfaces can be matched to permission information, and the security-related interface calls made while the application runs can be counted according to this correspondence.
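As a rough sketch of the filtering and counting step, the snippet below keeps only the calls whose interface maps to a sensitive permission and counts them. The interface names and the `api_to_permission` mapping are hypothetical placeholders; the patent does not list the actual mapping, and parsing of the real log format is not shown.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def count_sensitive_calls(
    called_apis: Iterable[str],
    api_to_permission: Dict[str, str],
    sensitive_permissions: Set[str],
) -> Counter:
    """Count the calls whose interface requires a sensitive permission."""
    counts: Counter = Counter()
    for api in called_apis:
        permission = api_to_permission.get(api)  # None if the API is unmapped
        if permission in sensitive_permissions:
            counts[api] += 1
    return counts


# Hypothetical mapping and call trace, for illustration only.
API_TO_PERMISSION = {
    "sendTextMessage": "SEND_SMS",
    "getDeviceId": "READ_PHONE_STATE",
    "vibrate": "VIBRATE",
}
SENSITIVE_PERMISSIONS = {"SEND_SMS", "READ_PHONE_STATE"}
trace = ["getDeviceId", "vibrate", "sendTextMessage", "getDeviceId", "unknownCall"]

sensitive_counts = count_sensitive_calls(trace, API_TO_PERMISSION, SENSITIVE_PERMISSIONS)
print(dict(sensitive_counts))
```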
3. Application vectorization
From the interface call information obtained, the method set of an application is built: $\varepsilon = \{f_1, f_2, f_3, \ldots, f_i, \ldots, f_n\}$, where $f_i$ ($1 \le i \le n$) is the $i$-th interface called by the application and $n$ is the total number of interfaces in the collected logs that correspond to Dangerous and Signature permissions. Each application can be represented by an $\varepsilon$; $C$ denotes the set made up of these $\varepsilon$, that is, the set of suspicious applications.
Define $w_{i,j}$ as the number of times method $f_i$ occurs in application $\varepsilon_j$. If $f_i$ does not occur, then $w_{i,j} = 0$. In this way, $\varepsilon_j$ can be written in vector form: $\varepsilon_j = (w_{1,j}, w_{2,j}, w_{3,j}, \ldots, w_{n,j})$.
To represent this information, the vector space model (VSM) is introduced, so that applications can be expressed algebraically; every component of a vector is non-negative. The coordinate $(i, j)$ in the VSM represents the information about method $f_i$ in application $\varepsilon_j$.
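To make the vector form concrete, the following sketch turns per-application call counts into the count vector $(w_{1,j}, \ldots, w_{n,j})$ over a fixed, ordered method list. The method names and the helper name `to_count_vector` are assumptions for illustration.

```python
from typing import Dict, List


def to_count_vector(methods: List[str], call_counts: Dict[str, int]) -> List[int]:
    """Build (w_1j, ..., w_nj): occurrences of each method f_i, 0 if absent."""
    return [call_counts.get(f, 0) for f in methods]


# Hypothetical ordered method list f_1..f_n shared by all applications.
METHODS = ["getDeviceId", "sendTextMessage", "getLastKnownLocation"]
app_counts = {"getDeviceId": 2, "sendTextMessage": 1}

app_vector = to_count_vector(METHODS, app_counts)
print(app_vector)  # [2, 1, 0]
```

Using one shared method order for every application is what makes the vectors comparable component by component.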
To obtain the feature vector of each application, a weight must be computed for each method in the application. The TF-IDF algorithm is used here. In this algorithm, for method $f_i$ in application $\varepsilon_j$, the weight $weight(i, j)$ is computed as:
$weight(i, j) = tf_{i,j} \cdot idf_i$    Formula (5-1)
where $tf_{i,j}$ is the frequency with which method $f_i$ occurs in application $\varepsilon_j$, and $idf_i$ is the inverse document frequency of method $f_i$. The feature vector algorithm is described below:
Input: the vector representation $\varepsilon_j$ of the application to be detected and the set $C$ of vector representations of a group of applications;
Output: the feature vector of the application to be detected.
Begin:
Variable declarations:
sum: the sum of the components of $\varepsilon_j$;
numberOfApps: the number of elements in the set $C$;
count: the number of applications in $C$ that contain a given method, initially 0;
$sum = w_{1,j} + w_{2,j} + \cdots + w_{n,j}$;
$numberOfApps \leftarrow |C|$;
For each method $f_i$ in $\varepsilon_j$, compute $weight(i, j)$ as follows:
$tf_{i,j} = \frac{w_{i,j}}{sum}$
Compute the inverse document frequency $idf_i$ of method $f_i$ over the vector representations $\varepsilon_k$ in $C$: count the vectors in $C$ in which $f_i$ occurs to obtain count, divide the total number of vectors numberOfApps by count, and take the logarithm of the quotient. In code form, for each $\varepsilon_k$ in $C$:
if ($w_{i,k}$ != 0)
count++;
and then:
$idf_i = \log\left(\frac{numberOfApps}{count}\right)$;
$weight(i, j) = tf_{i,j} \cdot idf_i$
Finally, the feature vector of $\varepsilon_j$ is obtained.
Output the feature vector of the application to be detected.
End.
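A minimal Python sketch of the feature vector computation described above follows. Two points are assumptions rather than statements from the patent: the natural logarithm is used (the text only says "take the logarithm"), and a weight of 0 is returned when the application has no calls or when no vector in $C$ contains the method, to avoid division by zero.

```python
import math
from typing import List


def tfidf_vector(app: List[int], corpus: List[List[int]]) -> List[float]:
    """TF-IDF feature vector of one count vector against the set C (corpus)."""
    total = sum(app)                 # "sum" in the description
    number_of_apps = len(corpus)     # |C|
    features = []
    for i, w in enumerate(app):
        # count = number of vectors in C in which method f_i occurs
        count = sum(1 for other in corpus if other[i] != 0)
        if total == 0 or count == 0:
            features.append(0.0)     # assumed guard, not specified in the source
            continue
        tf = w / total
        idf = math.log(number_of_apps / count)  # natural log assumed
        features.append(tf * idf)
    return features


corpus = [[2, 0], [1, 1], [0, 3]]
feature = tfidf_vector([2, 0], corpus)
print(feature)
```

A method that occurs in every vector of $C$ gets $idf_i = \log 1 = 0$, so it contributes nothing to the feature vector; this is the usual TF-IDF behavior for features that do not discriminate between applications.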
4. Security detection method
For applications whose security type is known, the representation of their feature vectors in the vector space can be determined, and these vectors serve as the basis for computation and classification. When an application of unknown type is examined, the distances from its feature vector to the two classes, benign and malicious, must be computed, and the application is classified according to the result. Two methods of computing distance are introduced:
Euclidean distance: this index represents the length of the line segment between two points. It is computed as:
$d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$    Formula (5-2)
Here, $x$ is the first point, $y$ is the second point, and $x_i$ and $y_i$ are the values of the $i$-th coordinate of the first and second point respectively.
Cosine similarity: this index evaluates how similar two vectors are by computing the cosine of the angle between them. The experiment still uses the term "distance", and uses 1 - cosSimilarity as the distance:
$d(x, y) = 1 - \cos\theta = 1 - \frac{u \cdot v}{|u| \cdot |v|}$    Formula (5-3)
Here, $u$ and $v$ are the vectors representing $x$ and $y$ respectively, $\theta$ is the angle between the two vectors, $u \cdot v$ is their inner product, and $|u|$ and $|v|$ are their lengths. Because all vector components are non-negative, $d(x, y)$ ranges from 0 to 1: 1 means that the two vectors are completely dissimilar, and 0 means that they are highly similar.
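Formulas (5-2) and (5-3) can be written directly in Python, as in the sketch below. Returning a distance of 1 when either vector has zero length is an assumption made here to avoid division by zero; the patent does not discuss that case.

```python
import math
from typing import Sequence


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Formula (5-2): length of the segment between points x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Formula (5-3): 1 minus the cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 1.0  # assumed convention for zero vectors
    return 1.0 - dot / (norm_x * norm_y)


print(euclidean_distance([0, 0], [3, 4]))  # 5.0
print(cosine_distance([1, 0], [0, 1]))     # 1.0
```

The two measures can disagree: vectors (1, 2) and (2, 4) point in the same direction, so their cosine distance is essentially 0, while their Euclidean distance is not.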
To evaluate an application comprehensively, three ways of computing the final distance are combined: minimum distance, average distance and maximum distance. These three measures give an overall assessment of the distance from an application to the benign applications and to the malicious applications, and the class of the application is then judged from the size of the distances.
After the three distance indices of the application under test have been computed, the class with the smaller value on the same index is chosen. For example, when MaxLen[0] < MaxLen[1], the application to be detected is judged to be benign; otherwise, it is judged to be malicious.
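The decision rule on a single index can be sketched as follows, assuming that element 0 of each pair holds the distance to the benign control group and element 1 the distance to the malicious control group, as in the MaxLen example. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def classify_by_index(index_pair: Sequence[float]) -> str:
    """index_pair[0]: distance to benign group; index_pair[1]: to malicious group."""
    return "benign" if index_pair[0] < index_pair[1] else "malicious"


# Hypothetical distances for one application under test.
max_len = [0.42, 0.77]
avg_len = [0.30, 0.35]
min_len = [0.12, 0.08]

for name, pair in (("max", max_len), ("avg", avg_len), ("min", min_len)):
    print(name, classify_by_index(pair))
```

As the sample values show, the three indices do not have to agree, which is why the choice of index matters when results are compared.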
The distance detection algorithm is described below:
Input: the vector $\delta$ of the application to be detected and the feature vector set of the control group $Set = \{\delta_1, \delta_2, \ldots, \delta_m\}$;
Output: the three distances from the application to be detected to the benign control group and to the malicious control group respectively;
Preprocessing: the feature vector set of the control group is split into benign and malicious applications: $Set_{ben} = \{\delta_{b1}, \delta_{b2}, \ldots, \delta_{bj}\}$ and $Set_{mal} = \{\delta_{m1}, \delta_{m2}, \ldots, \delta_{mk}\}$, where $b$ denotes a benign application, $m$ denotes a malicious application, $j$ is the number of benign applications, and $k$ is the number of malicious applications.
Begin:
Variable declarations: distType: the type of distance: 1 means Euclidean distance, 2 means cosine similarity; $n$ is the number of elements in a feature vector; $i$ is the position of a feature vector in its set; DisToBen[i] is the distance from the vector under test to the $i$-th benign feature vector in the set; DisToMal[i] is the distance from the vector under test to the $i$-th malicious feature vector in the set;
switch (distType):
case 1 (compute Euclidean distance):
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{bi} = (y_1, y_2, \ldots, y_n) \in Set_{ben}$, compute:
$DisToBen[i] = \sqrt{\sum_{t=1}^{n} (x_t - y_t)^2}$;
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{mi} = (y_1, y_2, \ldots, y_n) \in Set_{mal}$, compute:
$DisToMal[i] = \sqrt{\sum_{t=1}^{n} (x_t - y_t)^2}$;
break;
case 2 (compute cosine similarity):
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{bi} = (y_1, y_2, \ldots, y_n) \in Set_{ben}$, compute:
$DisToBen[i] = 1 - \frac{\sum_{t=1}^{n} x_t y_t}{\sqrt{\sum_{t=1}^{n} x_t^2} \cdot \sqrt{\sum_{t=1}^{n} y_t^2}}$;
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{mi} = (y_1, y_2, \ldots, y_n) \in Set_{mal}$, compute:
$DisToMal[i] = 1 - \frac{\sum_{t=1}^{n} x_t y_t}{\sqrt{\sum_{t=1}^{n} x_t^2} \cdot \sqrt{\sum_{t=1}^{n} y_t^2}}$;
break;
MaxLen[0] = maxDist(DisToBen[], j), AvgLen[0] = avgDist(DisToBen[], j),
MinLen[0] = minDist(DisToBen[], j);
MaxLen[1] = maxDist(DisToMal[], k), AvgLen[1] = avgDist(DisToMal[], k),
MinLen[1] = minDist(DisToMal[], k);
Output the computed distances;
End.
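The aggregation step of the algorithm (maximum, average and minimum distance to each control group) might look like this in Python. The sketch uses Euclidean distance only and hypothetical two-dimensional feature vectors; `distance_summary` and the dictionary keys are names introduced for the example, standing in for maxDist, avgDist and minDist.

```python
import math
from typing import Dict, List, Sequence


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distance_summary(
    sample: Sequence[float], control: List[Sequence[float]]
) -> Dict[str, float]:
    """Max, average and min distance from the sample to one control group."""
    distances = [euclidean(sample, c) for c in control]
    return {
        "max": max(distances),
        "avg": sum(distances) / len(distances),
        "min": min(distances),
    }


# Hypothetical control groups (feature vectors of known applications).
set_ben = [[3.0, 4.0], [0.0, 1.0]]
set_mal = [[6.0, 8.0]]
sample = [0.0, 0.0]

to_ben = distance_summary(sample, set_ben)  # plays the role of MaxLen[0], AvgLen[0], MinLen[0]
to_mal = distance_summary(sample, set_mal)  # plays the role of MaxLen[1], AvgLen[1], MinLen[1]
print(to_ben, to_mal)
```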
5. Evaluation measures
To evaluate the accuracy of detection, researchers have proposed a variety of evaluation measures. The scheme introduced below is used here to evaluate the experimental method.
First, the following definitions are introduced:
$n_{ben \to ben}$: the number of benign applications judged to be benign; $n_{ben \to mal}$: the number of benign applications judged to be malicious; $n_{mal \to ben}$: the number of malicious applications judged to be benign; $n_{mal \to mal}$: the number of malicious applications judged to be malicious. The accuracy and the error rate are then defined as follows:
$Acc = \frac{n_{ben \to ben} + n_{mal \to mal}}{n_{ben \to ben} + n_{mal \to mal} + n_{ben \to mal} + n_{mal \to ben}}$    Formula (5-4)
$Err = \frac{n_{ben \to mal} + n_{mal \to ben}}{n_{ben \to ben} + n_{mal \to mal} + n_{ben \to mal} + n_{mal \to ben}}$    Formula (5-5)
Similarly, FPR (false positive rate) and TPR (true positive rate) are defined as follows:
$FPR = \frac{n_{ben \to mal}}{n_{ben \to ben} + n_{ben \to mal}}$    Formula (5-6)
$TPR = \frac{n_{mal \to mal}}{n_{mal \to ben} + n_{mal \to mal}}$    Formula (5-7)
This method is used here to evaluate the accuracy of the detection results. It is simple and effective, is easy to apply in practice, and is a commonly used experimental evaluation method.
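Formulas (5-4) to (5-7) translate directly into a small helper, sketched below. The counts in the example are made-up numbers used only to exercise the formulas; they are not results from the patent.

```python
from typing import Dict


def evaluate(n_ben_ben: int, n_ben_mal: int, n_mal_ben: int, n_mal_mal: int) -> Dict[str, float]:
    """Accuracy, error rate, FPR and TPR from the four judgement counts."""
    total = n_ben_ben + n_mal_mal + n_ben_mal + n_mal_ben
    return {
        "acc": (n_ben_ben + n_mal_mal) / total,      # Formula (5-4)
        "err": (n_ben_mal + n_mal_ben) / total,      # Formula (5-5)
        "fpr": n_ben_mal / (n_ben_ben + n_ben_mal),  # Formula (5-6)
        "tpr": n_mal_mal / (n_mal_ben + n_mal_mal),  # Formula (5-7)
    }


# Made-up counts: 100 benign apps (10 misjudged), 50 malicious apps (5 missed).
metrics = evaluate(n_ben_ben=90, n_ben_mal=10, n_mal_ben=5, n_mal_mal=45)
print(metrics)
```

Note that $Acc + Err = 1$ by construction, so the two values carry the same information.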
Experimental environment
The experiments in this patent were carried out mainly in the following environment: Ubuntu 13.04 operating system, Android 2.3 emulator, DroidBox 2.3, Python 2.7 and Java 1.7. Automated testing tools such as monkeyrunner and monkey were also used during the experiments.
6. Data set
To verify the effectiveness of the detection method, the experiment uses 982 samples from the Google Play application market, third-party application markets and the Android Malware Genome Project as the data set. To determine the number of benign and malicious samples in the data set, the samples were checked before the experiment with F-Secure, Avast, LBE and Kingsoft. If more than two of these security tools report a sample as malicious, the sample is judged to be malicious; otherwise it is judged to be benign. This makes the probability of a malicious sample being misjudged as benign extremely low. After this security check, the sources of the experimental data and their security statistics are as shown in Table 1.
Experimental verification and results
(1) Permission filtering results
The samples are first checked according to the permissions they request. If all the permissions an application requests belong to the Normal group, or if it requests a permission of the SignatureOrSystem group, the application is directly judged to be benign or malicious respectively and does not need the next detection step. All other applications are classified as suspicious and are the objects of further detection. After permission detection, the statistics for these three classes of applications in the data set are as shown in Table 2.
Some malicious behaviors are very common in malicious applications, such as accessing the network, contacts and messages, and these behaviors need the corresponding sensitive permissions in order to complete. Therefore, to understand how the samples in the data set request these categories of sensitive permissions, the requests for these categories were detected and counted; the results are shown in Table 3.
To present the statistics more intuitively, Fig. 2 gives a histogram of the requests for common sensitive permissions in the data set. The figure shows how these six categories of sensitive permissions are requested: the most requested category is access to phone state, and the least requested is access to the SD card.
(2) Analysis of the permission filtering results
After filtering, the security class of 20.17% of the samples in the experimental data set can be determined: 15.89% are benign applications and 4.28% are malicious applications. The security class of the other applications cannot be determined, and they are classified as suspicious. This result is consistent with practice, because the security of many samples cannot be judged from static features. For example, a normal application may also request network permissions without sending private information or performing malicious downloads.
Statistics on the requests for common sensitive permissions show that many samples in the data set request permissions for access to phone state, the network, contacts and messages. Some samples request sensitive permissions of two or more categories. These suspicious samples need further behavior detection. As applications become more diverse, more and more of them request sensitive permissions of two or more categories, and the security of these samples can be finally determined only through behavior detection.
(3) Behavior detection results
The experiment uses cross-validation to detect suspicious applications. The suspicious applications remaining after permission detection are divided evenly into 3 groups. During detection, each group is selected in turn as the control group, and the samples in that group are divided into benign software and malware by security software. The control group samples are installed and run on an emulator, and function call information is collected at run time. The samples are then vectorized and characterized using the method of Chapter 3, and each sample is represented as a vector in the coordinate space. In this way, the feature vector set of the benign samples and the feature vector set of the malicious samples are obtained. These feature vector sets serve as the basis for classification: the security of the samples in the other groups is detected by calculating their distances to the two classes of samples (benign and malicious) in the control group.
According to the feature vectors of the control group, the 10, 15, and 20 most representative behavior features are selected in turn as the classification criteria. Euclidean distance and cosine similarity are used in turn as the basis for classification. After the test with one group as the control group is complete, the TPR and FPR of that test are obtained; finally, the TPR and FPR of the three tests are averaged to give the experimental result. The results are shown in Table 4 and Table 5.
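The averaging step described above can be sketched as follows. This is a minimal illustration, not the experiment's actual code: the `FoldCounts` type, the function names, and the confusion-matrix counts are all hypothetical, and only the definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN) are standard.

```python
from dataclasses import dataclass


@dataclass
class FoldCounts:
    """Confusion-matrix counts for one test (hypothetical container)."""
    tp: int  # malware correctly flagged
    fp: int  # benign samples wrongly flagged
    tn: int  # benign samples correctly passed
    fn: int  # malware missed


def tpr(c: FoldCounts) -> float:
    # True positive rate: share of malicious samples that were detected.
    return c.tp / (c.tp + c.fn)


def fpr(c: FoldCounts) -> float:
    # False positive rate: share of benign samples that were flagged.
    return c.fp / (c.fp + c.tn)


def average_rates(folds):
    """Average TPR and FPR over all tests (three in the experiment)."""
    n = len(folds)
    return (sum(tpr(c) for c in folds) / n,
            sum(fpr(c) for c in folds) / n)


# Illustrative counts only; they are not taken from Tables 4 or 5.
folds = [FoldCounts(9, 1, 9, 1), FoldCounts(8, 0, 10, 2), FoldCounts(10, 2, 8, 0)]
mean_tpr, mean_fpr = average_rates(folds)
```

Averaging the per-test rates, rather than pooling the counts, matches the description that the TPR and FPR of the three tests are averaged.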
(4) Comparison of behavior detection results
The above experimental results show that, on the whole, the detection results obtained with cosine similarity are better and more accurate than those obtained with Euclidean distance. However, this is not absolute: the tables show that the results obtained with Euclidean distance and average distance as the metric can be better than those obtained with cosine similarity and maximum distance. This indicates that, in application detection, the choice of evaluation criterion can sometimes matter more than the detection method.
The experimental results show that cosine similarity as the distance calculation method gives better results than Euclidean distance, and that average distance as the metric gives better results than maximum or minimum distance. At its best, the method reaches a TPR of 91.2% while keeping the FPR at 2.1%, with an overall accuracy of 95.8%, which is a satisfactory detection result. To evaluate the experimental results, they are compared with the work of Suleiman Y. Yerima et al., as shown in Table 6 (the method of this document is called HDPA, and the method of Suleiman Y. Yerima is called MDBC).
To show the comparison more intuitively, Fig. 2 compares the detection results of the two methods. The figure shows that the proposed detection method surpasses MDBC once more than 15 features are selected, and that its FPR is also significantly lower than that of the compared method, which demonstrates the effectiveness of the proposed method.
(5) Analysis of behavior detection results
The comparison of experimental results shows that cosine similarity combined with average distance gives the best results as the classification basis. The method keeps the FPR at a very low level while ensuring a high TPR, achieving the experiment's design goal of high accuracy with a low false alarm rate.
Analysis shows that cosine similarity as the distance criterion better reflects the similarity between samples of the same type. Malicious samples of the same type behave similarly, but their method call counts are not necessarily similar; this makes the spatial (Euclidean) distance between samples of the same type large and causes misjudgment.
Comparing the results of the two detection approaches, Euclidean distance and cosine similarity, shows that Euclidean distance as the distance criterion does not necessarily perform worse than cosine similarity, because the type of distance used as the metric (maximum, average, or minimum) also affects detection accuracy to some extent. The optimal result of the present invention takes both factors into account.
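A small worked example may make the call-count argument concrete. The numbers are illustrative and not taken from the experiment: assume two samples of the same type whose raw call-count vectors are x = (1, 2, 0) and y = (10, 20, 0), i.e. the same behavior pattern repeated ten times as often.

```latex
\begin{aligned}
d_{\mathrm{Euclid}}(x, y) &= \sqrt{(1-10)^2 + (2-20)^2 + (0-0)^2} = \sqrt{405} \approx 20.12,\\
\cos(x, y) &= \frac{1\cdot 10 + 2\cdot 20 + 0\cdot 0}{\sqrt{1^2 + 2^2 + 0^2}\,\sqrt{10^2 + 20^2 + 0^2}}
            = \frac{50}{\sqrt{5}\,\sqrt{500}} = 1,\\
d_{\cos}(x, y) &= 1 - \cos(x, y) = 0.
\end{aligned}
```

Under these assumptions the Euclidean distance is large even though the two samples call the same methods in the same proportions, while the cosine-based distance is zero.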
Table 1. Statistics of the experimental samples
Table 2. Statistics of the permission detection results
Table 3. Statistics of requests for common sensitive permissions
Table 4. Experimental results with Euclidean distance as the calculation method
Table 5. Experimental results with cosine similarity as the calculation method
Table 6. Comparison of the results of this experiment with those of MDBC

Claims (5)

1. A permission-based hybrid detection method for Android malware, characterized in that the method comprises the following steps:
Step 1: decompile the Android application to obtain the permissions requested by the application;
Step 2: perform permission detection on the permissions requested by the application in combination with the system-defined permissions; according to each application's permission situation, divide all applications to be detected into a benign application set, a malicious application set, and a suspicious application set;
Step 3: dynamically acquire the behavior of the applications in the suspicious application set and perform dynamic detection; collect the interface calls related to sensitive permissions, give a vector space representation, and vectorize the applications;
Step 4: perform security detection to obtain the detection result of the "benign applications" that meet the security detection criteria.
2. The permission-based hybrid detection method for Android malware as claimed in claim 1, characterized in that the rules for permission detection in step 2 are expressed as:
If $AppPer \cap perSet_{13} = AppPer$, the application is judged to be a benign application;
If $AppPer \cap perSet_{14} \neq \emptyset$, the application is judged to be a malicious application;
If $AppPer \cap perSet_i \neq \emptyset$, $1 \le i \le 12$, the application has requested sensitive permissions of the i-th class and may pose a threat belonging to that class.
Here, $AppPer = \{per_1, per_2, \ldots, per_n\}$ denotes the set of permissions requested by the application, where $per_i$ ($1 \le i \le n$) is a permission requested by the application; $Andper = \{perSet_1, perSet_2, \ldots, perSet_{14}\}$ denotes the set of all Android permissions, where $perSet_i$ ($1 \le i \le 12$) is the set of sensitive permissions in each class, $perSet_{13}$ is the set of permissions in the Normal group, and $perSet_{14}$ is the set of permissions in the SignatureOrSystem group.
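The three rules above can be sketched with plain set operations. This is a minimal sketch under stated assumptions: the function name, the `PER_SETS` grouping, and the treatment of every application not settled by the first two rules as suspicious are illustrative choices, not part of the claim.

```python
def classify_by_permissions(app_per, per_sets):
    """Apply the three permission rules to one application.

    app_per  : set of permission names requested by the application (AppPer).
    per_sets : dict mapping class index 1..14 to a set of permission names
               (perSet_1 .. perSet_14).
    Returns (label, sensitive_classes).
    """
    # Rule 1: AppPer ∩ perSet_13 = AppPer, i.e. every requested
    # permission belongs to the Normal group.
    if app_per <= per_sets[13]:
        return "benign", []
    # Rule 2: AppPer ∩ perSet_14 is non-empty, i.e. at least one
    # SignatureOrSystem permission is requested.
    if app_per & per_sets[14]:
        return "malicious", []
    # Rule 3: record which sensitive classes (1..12) are touched.
    hit = [i for i in range(1, 13) if app_per & per_sets[i]]
    # Assumption: anything not settled by rules 1 and 2 is suspicious.
    return "suspicious", hit


# Hypothetical grouping, for illustration only.
PER_SETS = {i: set() for i in range(1, 15)}
PER_SETS[1] = {"READ_CONTACTS"}
PER_SETS[2] = {"SEND_SMS"}
PER_SETS[13] = {"VIBRATE", "WAKE_LOCK"}
PER_SETS[14] = {"INSTALL_PACKAGES"}
```

For instance, `classify_by_permissions({"VIBRATE", "SEND_SMS"}, PER_SETS)` yields the suspicious label together with the list of sensitive classes that were hit, which is the information the later behavior detection would need.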
3. The permission-based hybrid detection method for Android malware as claimed in claim 1, characterized in that step 3 specifically comprises the following process:
Install and run the application in an emulator, and use monkeyrunner to run all components of the application once; this testing tool sends a stream of events to the application, and the behavior features of the application when it receives the various events are obtained.
After the runtime interface information of the application is obtained, use shell commands to display the acquired behavior information or save it to a text file for further analysis; next, filter the acquired interface information and retain the interface information corresponding to sensitive permissions; match the interfaces to the permission information, and according to this correspondence count the security-related interface call information during the application's run.
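The filtering and counting step can be sketched as below. This is only an illustration under assumptions: the log is assumed to be a text file with one interface name per line (the claim does not specify the format), and the function names and the interface-to-permission mapping are hypothetical.

```python
from collections import Counter


def count_sensitive_calls(log_lines, interface_to_permission):
    """Keep only interfaces tied to a sensitive permission and count them.

    log_lines               : iterable of strings, one interface name per line
                              (assumed log format).
    interface_to_permission : dict mapping an interface name to the sensitive
                              permission it corresponds to (hypothetical).
    """
    counts = Counter()
    for line in log_lines:
        name = line.strip()
        # Filter: drop every call that has no sensitive permission attached.
        if name in interface_to_permission:
            counts[name] += 1
    return counts


def to_vector(counts, methods):
    """Turn the counts into a vector with one component per tracked method."""
    return [counts[m] for m in methods]


# Illustrative mapping and log only.
MAPPING = {"getDeviceId": "READ_PHONE_STATE", "sendTextMessage": "SEND_SMS"}
LOG = ["getDeviceId", "openConnection", "sendTextMessage", "getDeviceId "]
counts = count_sensitive_calls(LOG, MAPPING)
vector = to_vector(counts, ["getDeviceId", "sendTextMessage"])
```

The resulting call-count vector is the kind of input that the weighting procedure in the next claim operates on.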
4. The permission-based hybrid detection method for Android malware as claimed in claim 1, characterized in that step 3 specifically comprises the following process:
Input: the vector representation $\varepsilon_j$ of the application to be detected and the set $C$ of vector representations of a group of applications;
Start:
Variable declarations:
sum: the sum of the values in $\varepsilon_j$;
numberOfApps: the total number of elements in the set $C$;
count: the number of applications in $C$ that contain a given method, initial value 0;
$sum = w_{1,j} + w_{2,j} + \cdots + w_{n,j}$
$numberOfApps \leftarrow |C|$;
For each method $f_i$ in $\varepsilon_j$, calculate weight(i, j) as follows:
$tf_{i,j} = \dfrac{w_{i,j}}{sum}$ (computed as a double)
Over the vector representations in $C$, calculate the inverse document frequency $idf_i$ of method $f_i$ as follows: compute the number numberOfApps of vectors in $C$, and count the number of vectors in $C$ in which $f_i$ occurs to obtain count; divide numberOfApps by count and take the logarithm of the quotient, that is, $idf_i = \log\dfrac{numberOfApps}{count}$;
$weight(i,j) = tf_{i,j} \cdot idf_i$
Finally, the feature vector of $\varepsilon_j$ is obtained.
Output: the feature vector of the application to be detected.
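The weighting procedure of this claim can be sketched as follows. It is a minimal sketch, assuming each vector is a list of call counts indexed by method, that a method "occurs" in a vector when its count is greater than zero, and that the natural logarithm is used; the claim does not state the logarithm base or what to do when count is 0, so both are assumptions here, and the function name is hypothetical.

```python
import math


def tfidf_vector(eps_j, corpus):
    """Compute weight(i, j) = tf_{i,j} * idf_i for every method f_i.

    eps_j  : list of call counts w_{1,j} .. w_{n,j} for the application
             to be detected.
    corpus : list of such vectors (the set C), all of the same length.
    """
    total = sum(eps_j)              # sum = w_1j + w_2j + ... + w_nj
    number_of_apps = len(corpus)    # numberOfApps <- |C|
    weights = []
    for i, w in enumerate(eps_j):
        # Term frequency of method f_i in this application.
        tf = w / total if total else 0.0
        # count: number of vectors in C in which f_i occurs.
        count = sum(1 for v in corpus if v[i] > 0)
        # Assumption: idf is 0 when no vector in C contains f_i.
        idf = math.log(number_of_apps / count) if count else 0.0
        weights.append(tf * idf)
    return weights


# Illustrative data only.
C = [[2, 0, 2], [1, 1, 0], [3, 0, 0], [0, 0, 0]]
features = tfidf_vector([2, 0, 2], C)
```

A method that every application calls gets an idf of 0 and so contributes nothing to the feature vector, while a method called by few applications is weighted up.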
5. The permission-based hybrid detection method for Android malware as claimed in claim 1, characterized in that the security detection in step 4 is realized by calculating the distances between the feature vector of the application's behavior and the two classes of applications, benign and malicious, and specifically comprises the following algorithm:
Input: the vector $\delta$ of the application to be detected and the feature vector set of the control group $Set = \{\delta_1, \delta_2, \ldots, \delta_m\}$;
Preprocessing: divide the feature vector set of the control group into benign applications and malicious applications: $Set_{ben} = \{\delta_{b1}, \delta_{b2}, \ldots, \delta_{bj}\}$ and $Set_{mal} = \{\delta_{m1}, \delta_{m2}, \ldots, \delta_{mk}\}$, where b denotes a benign application, m denotes a malicious application, j is the number of benign applications, and k is the number of malicious applications;
Variable declaration: distType denotes the type of distance used for detection: 1 denotes Euclidean distance and 2 denotes cosine similarity;
Case 1, calculating the Euclidean distance, where n is the number of elements in a feature vector, i is the position of a feature vector in its set, j is the number of benign applications, k is the number of malicious applications, DisToBen[i] is the distance between the feature vector under detection and the i-th benign feature vector in the set, and DisToMal[i] is the distance between the feature vector under detection and the i-th malicious feature vector in the set:
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{bi} = (y_1, y_2, \ldots, y_n) \in Set_{ben}$, calculate:
$$DisToBen[i] = \sqrt{\sum_{t=1}^{n} (x_t - y_t)^2};$$
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{mi} = (y_1, y_2, \ldots, y_n) \in Set_{mal}$, calculate:
$$DisToMal[i] = \sqrt{\sum_{t=1}^{n} (x_t - y_t)^2};$$
Case 2, calculating the cosine similarity:
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{bi} = (y_1, y_2, \ldots, y_n) \in Set_{ben}$, calculate:
$$DisToBen[i] = 1 - \frac{\sum_{t=1}^{n} x_t y_t}{\sqrt{\sum_{t=1}^{n} x_t^2}\,\sqrt{\sum_{t=1}^{n} y_t^2}};$$
For $\delta = (x_1, x_2, \ldots, x_n)$ and $\delta_{mi} = (y_1, y_2, \ldots, y_n) \in Set_{mal}$, calculate:
$$DisToMal[i] = 1 - \frac{\sum_{t=1}^{n} x_t y_t}{\sqrt{\sum_{t=1}^{n} x_t^2}\,\sqrt{\sum_{t=1}^{n} y_t^2}};$$
MaxLen[0] = maxDist(DisToBen[], j), AvgLen[0] = avgDist(DisToBen[], j), MinLen[0] = minDist(DisToBen[], j);
MaxLen[1] = maxDist(DisToMal[], k), AvgLen[1] = avgDist(DisToMal[], k), MinLen[1] = minDist(DisToMal[], k);
Output: the three distances (maximum, average, and minimum) from the application to be detected to the benign control group and to the malicious control group, respectively.
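The distance algorithm of this claim can be sketched as below. It is a sketch under assumptions: the cosine case uses the standard cosine similarity (dot product divided by the product of the vector norms), a zero vector is given the distance 1, and the final `closer_group` decision that compares the average distances is an illustrative choice, since the claim itself only outputs the distances.

```python
import math


def euclidean(x, y):
    # Case 1: square root of the summed squared component differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_distance(x, y):
    # Case 2: one minus the cosine similarity of the two vectors.
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    # Assumption: a zero vector is treated as maximally dissimilar.
    return 1.0 - dot / norm if norm else 1.0


def distances_to_group(delta, group, dist_type):
    """Return (max, avg, min) distance from delta to one control group.

    dist_type follows distType in the claim: 1 = Euclidean, 2 = cosine.
    """
    dist = euclidean if dist_type == 1 else cosine_distance
    d = [dist(delta, v) for v in group]
    return max(d), sum(d) / len(d), min(d)


def closer_group(delta, set_ben, set_mal, dist_type=2):
    """Illustrative decision rule (an assumption): compare average distances."""
    _, avg_ben, _ = distances_to_group(delta, set_ben, dist_type)
    _, avg_mal, _ = distances_to_group(delta, set_mal, dist_type)
    return "benign" if avg_ben <= avg_mal else "malicious"


# Illustrative vectors only.
SET_BEN = [[1.0, 0.0], [2.0, 0.0]]
SET_MAL = [[0.0, 1.0], [0.0, 3.0]]
delta = [1.0, 0.0]
```

Returning all three statistics keeps the choice of metric (maximum, average, or minimum) separate from the choice of distance function, which mirrors how the two factors are varied independently in the experiment.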
CN201510282507.5A 2015-05-28 2015-05-28 Android malware mixing detection method based on permission Expired - Fee Related CN104866763B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510282507.5A CN104866763B (en) 2015-05-28 2015-05-28 Android malware mixing detection method based on permission

Publications (2)

Publication Number Publication Date
CN104866763A true CN104866763A (en) 2015-08-26
CN104866763B CN104866763B (en) 2019-02-26

Family

ID=53912585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510282507.5A Expired - Fee Related CN104866763B (en) 2015-05-28 2015-05-28 Android malware mixing detection method based on permission

Country Status (1)

Country Link
CN (1) CN104866763B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426762A (en) * 2015-12-28 2016-03-23 重庆邮电大学 Static detection method for malice of android application programs
CN106548073A (en) * 2016-11-01 2017-03-29 北京大学 Screening method based on malice APK of convolutional neural networks
CN106557695A (en) * 2015-09-25 2017-04-05 卓望数码技术(深圳)有限公司 A kind of malicious application detection method and system
WO2017084451A1 (en) * 2015-11-18 2017-05-26 腾讯科技(深圳)有限公司 Method and apparatus for identifying malicious software
CN106874761A (en) * 2016-12-30 2017-06-20 北京邮电大学 A kind of Android system malicious application detection method and system
CN107169350A (en) * 2017-05-10 2017-09-15 国网江苏省电力公司电力科学研究院 A kind of detection and blocking-up method for Mobile solution using abnormal authority
CN107330325A (en) * 2017-06-30 2017-11-07 北京金山安全管理系统技术有限公司 The authentication method and device of application file
CN107944274A (en) * 2017-12-18 2018-04-20 华中科技大学 A kind of Android platform malicious application off-line checking method based on width study
CN108073803A (en) * 2016-11-18 2018-05-25 北京京东尚科信息技术有限公司 For detecting the method and device of malicious application
CN108491722A (en) * 2018-03-30 2018-09-04 广州汇智通信技术有限公司 A kind of malware detection method and system
CN108509796A (en) * 2017-02-24 2018-09-07 中国移动通信集团公司 A kind of detection method and server of risk
CN108763958A (en) * 2018-06-01 2018-11-06 中国科学院软件研究所 Intelligent mobile terminal sensitive data authority checking defect inspection method based on deep learning
CN109639884A (en) * 2018-11-21 2019-04-16 惠州Tcl移动通信有限公司 A kind of method, storage medium and terminal device based on Android monitoring sensitive permission
CN110390198A (en) * 2019-07-31 2019-10-29 阿里巴巴集团控股有限公司 Risk method for inspecting, device and the electronic equipment of a kind of pair of small routine
CN111353146A (en) * 2020-05-25 2020-06-30 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for detecting sensitive permission of application program
WO2021027831A1 (en) * 2019-08-15 2021-02-18 中兴通讯股份有限公司 Malicious file detection method and apparatus, electronic device and storage medium
CN114356788A (en) * 2022-03-21 2022-04-15 大鲲智联(成都)科技有限公司 Application program detection method, device, equipment and medium based on user information

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0921587D0 (en) * 2008-12-11 2010-01-27 Scansafe Ltd Malware detection
CN103500307A (en) * 2013-09-26 2014-01-08 北京邮电大学 Mobile internet malignant application software detection method based on behavior model
CN103617393A (en) * 2013-11-28 2014-03-05 北京邮电大学 Method for mobile internet malicious application software detection based on support vector machines
CN104376262A (en) * 2014-12-08 2015-02-25 中国科学院深圳先进技术研究院 Android malware detecting method based on Dalvik command and authority combination
CN104598825A (en) * 2015-01-30 2015-05-06 南京邮电大学 Android malware detection method based on improved Bayesian algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
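Lu Wenqing (卢文清): "Static detection of Android malware based on hybrid features" (基于混合特征的Android 恶意软件静态检测), Radio Communications Technology (《无线电通信技术》) *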

Also Published As

Publication number Publication date
CN104866763B (en) 2019-02-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190226