CN109753800A - Android malicious application detection method and system fusing frequent itemsets and the random forest algorithm - Google Patents
- Publication number
- CN109753800A (application CN201910002795.2A)
- Authority
- CN
- China
- Prior art keywords
- feature
- sample
- frequent
- permission
- item collection
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an Android malicious application detection method that fuses the frequent itemset (Apriori) algorithm with the random forest algorithm, and relates to the technical field of information processing. Android application samples are decompiled, and static permission and function-call features are extracted from each decompiled file to obtain the association relationships among permissions in the sample set. The frequent 3-itemsets of malicious samples and normal samples are mined with the Apriori algorithm and then combined with sensitive application programming interface (API) function calls to generate features. A random forest classifier learns and classifies the features, thereby detecting Android malicious applications. Malware detection for Android applications carried out with the invention consumes few system resources and achieves high detection accuracy.
Description
Technical field
The present invention relates to the fields of network security and information security detection, and in particular to an Android malicious application detection method.
Background art
Android, currently the most popular smart terminal system in the world, is widely used worldwide because its platform is open and free. As a result, many malicious code developers target the Android platform. As technology advances, the cost of producing Android malware keeps falling, so the amount of Android malware grows daily. According to data released by the 360 Internet Security Center, 7.573 million new malware samples were intercepted on the Android platform in 2017, an average of 31,000 new samples per day. Malware frequently launches attacks using new techniques such as cryptocurrency-mining trojans and botnets, including stealing users' personal information and malicious fee deduction, causing heavy losses to users. In the face of such large-scale malicious attacks, detecting Android malicious applications effectively has become the primary problem of Android platform security.
Malware detection for Android applications is currently divided mainly into static detection and dynamic detection. Static detection does not run the application. Instead, it uses reverse-engineering means such as decompilation to analyze the source program and extract features such as signatures and permissions, and analyzes characteristic behavior directly. Static detection techniques mainly extract features from the application description file (AndroidManifest.xml) and the smali code files. Guo et al. parse the information tags in AndroidManifest.xml and the smali code files to extract an application's classes, permissions, components, signatures, processed data, and startup information. Rashidi B et al. use permissions and application programming interface (API) function calls as the feature set and detect malicious applications with the support vector machine (SVM) and K-nearest neighbor (KNN) algorithms, but many misjudgments remain. Machine learning can automate the detection of Android applications and improve analysis efficiency, but it depends on the application features that are extracted.
Dynamic detection of Android malicious applications obtains an application's features while the application is running, through techniques such as injection and hooking (HOOK). Its drawback is that the software must be run, so system resource consumption is excessive. In dynamic detection research, Mahindru et al. use the tracer strace to collect application behavior data and send it to an analysis server, train a classifier on these behavior samples, and finally use the K-nearest neighbor algorithm to judge whether an application contains malicious behavior. Singh L et al. use API hooking to hook sensitive APIs on the Android platform: once the system or an application calls a specific API, the call is intercepted and redirected to a proxy function that records the details, which yields the behavior information.
Summary of the invention
The technical problem to be solved by the invention, in view of the above shortcomings of the prior art, is to learn from and detect Android applications by calling machine learning algorithms, reducing the complexity of Android malicious application detection and saving system resources. In addressing high-dimensional features and automated classification and detection, the invention further improves the detection accuracy for malware.
The technical solution by which the invention solves the above problem is an Android malicious application detection method fusing the frequent itemset (Apriori) algorithm and the random forest algorithm, comprising the following steps: batch-decompile Android applications and obtain the static features of application permissions and sensitive API functions; mine the frequent itemsets of the permission features to reduce the dimensionality of the permission features, obtaining the frequent 3-itemsets of permissions and thereby the association relationships among permissions in the sample set; mine the frequent 3-itemsets of malicious samples and normal samples and use them together with the sensitive API functions as features to build a feature set; screen and score the feature attributes in the feature set with the information gain algorithm, extract the important features, and construct the corresponding vector space; use the random forest algorithm to learn from and classify the vector space, labeling the vector spaces of normal samples and malicious samples with a normal or malicious attribute.
The invention further comprises decompiling the application with a static analysis tool before feature extraction, obtaining files that include the resource files (res), the third-party software development kit .so files (lib), the smali files, and AndroidManifest.xml. These files contain the application's resource files, source code, and other static code features.
The invention further comprises extracting features with Python scripts: Extensible Markup Language files such as AndroidManifest.xml are parsed to extract all permissions requested by the application, which yields the permission features; Python's os.walk() function traverses all smali files, and each sample's sensitive API functions are extracted by regular-expression matching.
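The permission-extraction step can be sketched as follows. This is a minimal illustration, not the patent's script: it assumes the manifest has already been decoded to plain XML by the decompiler, and the function name `extract_permissions` and the sample manifest are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the android: attribute prefix in a decoded manifest.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def extract_permissions(manifest_xml):
    """Return the sorted permission names declared in <uses-permission> tags."""
    root = ET.fromstring(manifest_xml)
    name_attr = "{" + ANDROID_NS + "}name"
    names = {el.get(name_attr) for el in root.iter("uses-permission")}
    return sorted(n for n in names if n)


# Hypothetical decoded manifest, for illustration only.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.demo">
    <uses-permission android:name="android.permission.READ_SMS"/>
    <uses-permission android:name="android.permission.READ_PHONE_STATE"/>
    <application android:label="demo"/>
</manifest>"""

print(extract_permissions(SAMPLE_MANIFEST))
```

Each sample's permission list produced this way would form one transaction for the frequent itemset mining described next.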
The invention further comprises mining the frequent 3-itemsets of the permission features, which specifically includes: extracting permissions from the malicious samples and the normal samples, respectively, to build permission sets; mining the frequent 1-itemsets of a permission set by computing the support S of each permission in the set and pruning the 1-itemsets that do not meet the minimum support min_s, which gives the candidate set $L_1$, and then joining the elements of $L_1$; taking the joined candidate set as the new sample set and mining the frequent 2-itemsets by pruning the 2-itemsets that do not meet the minimum support min_s, which forms the new candidate set $L_2$; and repeating until the frequent 3-itemsets are obtained.
The invention further comprises using the information gain (IG) algorithm, which specifically includes: computing the difference between the entropy and the conditional entropy given a feature to obtain the feature's IG value, where a larger IG value indicates greater relevance; retaining the important features according to relevance; and matching the important features against each application in the system to construct the corresponding vector space for each. Constructing the vector space specifically includes: building the feature set X containing the applications' distinct features $(x_1, x_2, \ldots, x_n)$, and constructing the vector space ν from the features in X according to the mapping $\nu: s \rightarrow \{0,1\}^{|X|}$, where s denotes an application and each dimension of ν corresponds to one feature in X. If s contains a feature, the corresponding value in ν is 1; otherwise it is 0.
The invention also proposes an Android malicious application detection system fusing the Apriori algorithm and the random forest algorithm, comprising a feature extraction module, a feature processing module, and a random forest classification module. The feature extraction module extracts features from the batch-decompiled Android applications, obtaining the static features of application permissions and sensitive API functions. The feature processing module mines the frequent itemsets of the permission features to reduce their dimensionality, obtaining the frequent 3-itemsets of permissions and thereby the association relationships among permissions in the sample set; it mines the frequent 3-itemsets of malicious samples and normal samples, uses them together with the sensitive API functions as features to build a feature set, screens and scores the feature attributes in the feature set with the information gain algorithm, extracts the important features, and constructs the corresponding vector space. The random forest classification module learns from and classifies the vector space, labeling the vector spaces of normal samples and malicious samples with a normal or malicious attribute.
The invention extracts application data features by static detection, then applies the Apriori algorithm to those features to mine the frequent 3-itemsets of permissions in normal software and malware, merges them with the sensitive API function calls, and builds a random forest classifier that learns from and classifies them. Further, the IG algorithm obtains each feature's IG value as the difference between the entropy and the conditional entropy given the feature, the important features are retained, and a matching algorithm constructs the corresponding vector space for each application in the system. Because the high-dimensional permission features are reduced to their frequent 3-itemsets, the invention consumes fewer system resources.
Brief description of the drawings
Fig. 1 shows the Android malicious application detection model fusing the Apriori algorithm and the random forest algorithm.
Detailed description
The specific implementation of the invention is described in detail below with reference to the drawing.
Fig. 1 is a schematic diagram of the detection system model used by the invention. To detect malicious applications on Android, the invention combines Apriori mining of frequent 3-itemsets with random forest classification and proposes an Android malicious application detection system comprising a feature extraction module, a feature processing module, and a random forest classification module.
First, the collected sample set of normal software and malware is decompiled in batches. From the decompiled application description file AndroidManifest.xml and the smali files, the application's permissions (Android permissions) and sensitive API function calls are extracted. The frequent 3-itemsets of permissions are then mined from the permission features to find the combination relationships among permissions in normal samples and malicious samples, and are combined with the sensitive API functions as learning features. The IG algorithm is used for feature selection. The retained important features are embedded in feature vectors to form the vector space, which is finally trained on and classified with the random forest algorithm to detect Android malicious applications.
Each part is described below.
(1) Feature extraction module. Python scripts decompile the sample set in batches and extract features from the decompiled AndroidManifest.xml and smali files. The extracted features mainly comprise permission features and sensitive API functions.
For permission feature extraction, the permission features are taken from the permissions an application requests, for example by parsing AndroidManifest.xml to extract all permissions of the application. When a user uses certain system functions or accesses certain sensitive data, the application must request the corresponding permission. For example, the permission android.permission.READ_PHONE_STATE requested in AndroidManifest.xml indicates that the application requests permission to read the phone state. For sensitive API functions, smali is the bytecode representation for the Android virtual machine (Dalvik): each smali file represents one Java class and contains the system API functions that the application calls. Python's os.walk() function traverses all smali files. Within each file, the lines that begin with the call instruction (invoke) are scanned, the API functions that occur are collected by string matching, and each sample's sensitive API functions are extracted from them. The sensitive API functions extracted from a sample's smali files indicate the application's potentially malicious behavior, because malware must call the corresponding API functions to carry out that behavior. Therefore, all sensitive API functions called in the sample set are used as learning features for the random forest algorithm, which after training detects malicious applications.
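The smali traversal described above might look like the sketch below. It is illustrative only: the regular expression, the two entries in `SENSITIVE_APIS`, and the function name `extract_sensitive_apis` are assumptions, since the patent does not list its sensitive API set or its exact pattern.

```python
import os
import re
import tempfile

# Matches a smali call line such as
#   invoke-virtual {v0}, Landroid/telephony/TelephonyManager;->getDeviceId()Ljava/lang/String;
# and captures "Lclass;->method" (assumed pattern, not taken from the patent).
INVOKE_RE = re.compile(r"^\s*invoke-[\w/-]+\s+\{[^}]*\},\s*(L[^;]+;->[^(\s]+)\(")

# Hypothetical sensitive API list, for illustration only.
SENSITIVE_APIS = {
    "Landroid/telephony/TelephonyManager;->getDeviceId",
    "Landroid/telephony/SmsManager;->sendTextMessage",
}


def extract_sensitive_apis(decompiled_dir, sensitive=SENSITIVE_APIS):
    """Walk every .smali file under decompiled_dir and collect sensitive calls."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(decompiled_dir):
        for name in filenames:
            if not name.endswith(".smali"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    match = INVOKE_RE.match(line)
                    if match and match.group(1) in sensitive:
                        found.add(match.group(1))
    return found


# Demonstration on a throwaway directory holding one hand-written smali file.
with tempfile.TemporaryDirectory() as tmp:
    class_dir = os.path.join(tmp, "smali", "com", "demo")
    os.makedirs(class_dir)
    with open(os.path.join(class_dir, "Main.smali"), "w", encoding="utf-8") as fh:
        fh.write(
            ".class public Lcom/demo/Main;\n"
            "    invoke-virtual {v0}, Landroid/telephony/TelephonyManager;->getDeviceId()Ljava/lang/String;\n"
            "    invoke-static {}, Ljava/lang/System;->currentTimeMillis()J\n"
        )
    demo_result = extract_sensitive_apis(tmp)

print(demo_result)
```

Matching on the `Lclass;->method` prefix ignores argument and return types, which keeps the sensitive list short at the cost of not distinguishing overloads.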
Before features are extracted from each sample, the sample set must be decompiled. The decompilation tool Apktool can be used to decompile files with the .apk suffix, yielding files that include the resource files (res), third-party SDK .so files (lib), the smali files, and AndroidManifest.xml. These files contain the resource files, source code, and other static features.
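Batch decompilation can be driven from Python along the following lines. This is a sketch under assumptions: it presumes the `apktool` executable is on the PATH, and the helper names and directory layout are hypothetical. The `d` (decode) command and the `-o` (output directory) and `-f` (overwrite) options are standard Apktool options.

```python
import os
import subprocess


def build_apktool_commands(apk_paths, out_dir):
    """Build one 'apktool d' command per .apk file; other files are skipped."""
    commands = []
    for path in apk_paths:
        stem, ext = os.path.splitext(os.path.basename(path))
        if ext.lower() != ".apk":
            continue
        target = os.path.join(out_dir, stem)
        commands.append(["apktool", "d", path, "-o", target, "-f"])
    return commands


def decompile_all(apk_paths, out_dir):
    """Run the commands one by one; raises CalledProcessError if apktool fails."""
    for command in build_apktool_commands(apk_paths, out_dir):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes the batch plan easy to inspect before any sample is actually decompiled.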
In terms of permissions, malware usually requests certain combinations of dangerous permissions before producing malicious behavior, and the permissions in a combination depend on one another to produce that behavior. Malware therefore requests not just a single dangerous permission but combinations of them. For example, malicious samples commonly request the combination READ_SMS (read SMS messages), READ_PHONE_STATE (read phone state), and WRITE_SMS (edit SMS messages), which together allow malicious operations such as reading the user's private data and sending it elsewhere, whereas this combination is rare in normal software. Because the cooperation of different dangerous permissions in a combination indicates potentially malicious behavior, the combination can be used to judge whether an application is malware.
Apriori is the algorithm proposed by Agrawal et al. for mining the frequent itemsets of Boolean association rules. Here it is used to mine the frequent 3-itemsets of permissions. Feature extraction yields a large number of permission features and sensitive API functions. However, the permission features obtained usually have very high dimensionality and high computational complexity, so the frequent itemsets of the permission features are mined with the Apriori algorithm to reduce the dimensionality, which yields the frequent 3-itemsets of permissions. Mining the frequent 3-itemsets of the permission features reveals the association relationships among permissions in the sample set. The specific steps are as follows.
The frequent 3-itemsets of permissions are mined with the Apriori algorithm. Specifically, the normal-sample permission set P and the malicious-sample permission set M requested by the applications are extracted from all samples, where $P = \{p_1, p_2, \ldots, p_n\}$ is the permission set of the normal samples, i.e. the n permissions requested across all normal samples, and $M = \{m_1, m_2, \ldots, m_x\}$ is the permission set of the malicious samples, i.e. the x permissions requested across all malicious samples. Frequent 3-itemsets are mined separately from the permission sets of the normal samples and the malicious samples, as follows:
Mine the frequent 1-itemsets of the sample permission set: compute the support S of each permission in the sample permission set, i.e. the probability that the permission appears across all samples; prune the 1-itemsets that do not meet the minimum support min_s, and take the remaining set as the candidate set $L_1$; then join the elements of $L_1$. The joined candidate set, which now contains all 2-itemsets, becomes the new sample set, from which the frequent 2-itemsets are mined: the 2-itemsets that do not meet the minimum support min_s are pruned to form the new candidate set $L_2$. These steps are repeated until the frequent 3-itemsets of the sample permission set are obtained.
Join: within a set of frequent n-itemsets, start from an itemset (say the i-th) and search downward for an itemset (say the j-th) whose first n-1 items are the same; all elements of the i-th itemset and the n-th element of the j-th itemset are then joined into an (n+1)-itemset.
For the normal-sample permission set P, the frequency with which each of $p_1, p_2, \ldots, p_n$ appears is computed as that element's support S. The minimum support is a threshold between 0 and 1 on the frequency with which an element of P must at least appear. After the frequent 1-itemsets are mined, pruning and joining are carried out according to the minimum support, finally yielding the frequent 3-itemsets of the normal samples.
For the malicious-sample permission set M, the frequency with which each of $m_1, m_2, \ldots, m_x$ appears is computed as that element's support S. After the frequent 1-itemsets are mined, pruning and joining are carried out according to the minimum support, finally yielding the frequent 3-itemsets of the malicious samples.
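The prune-and-join loop above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: each sample is assumed to be represented by the set of permissions it requests, the function names are hypothetical, and the toy data and the threshold min_s = 0.75 are invented for the example.

```python
def support(itemset, transactions):
    """Fraction of samples whose permission set contains every item of itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def join(frequent_k):
    """Join k-itemsets (sorted tuples) that share their first k-1 items."""
    items = sorted(frequent_k)
    joined = set()
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i][:-1] == items[j][:-1]:
                joined.add(items[i] + (items[j][-1],))
    return joined


def frequent_itemsets(transactions, min_s, k_max=3):
    """Return the frequent k_max-itemsets, pruning by min_s at every level."""
    transactions = [frozenset(t) for t in transactions]
    singles = {(p,) for t in transactions for p in t}
    level = {c for c in singles if support(frozenset(c), transactions) >= min_s}
    for _ in range(k_max - 1):
        candidates = join(level)
        level = {c for c in candidates if support(frozenset(c), transactions) >= min_s}
    return sorted(level)


# Toy malicious-sample permission sets (invented for illustration).
MALICIOUS = [
    {"READ_SMS", "READ_PHONE_STATE", "WRITE_SMS", "INTERNET"},
    {"READ_SMS", "READ_PHONE_STATE", "WRITE_SMS"},
    {"READ_SMS", "READ_PHONE_STATE", "WRITE_SMS", "SEND_SMS"},
    {"INTERNET", "READ_PHONE_STATE"},
]

print(frequent_itemsets(MALICIOUS, min_s=0.75))
```

On this toy data only READ_PHONE_STATE, READ_SMS, and WRITE_SMS survive the first pruning, so the single frequent 3-itemset is the combination of those three permissions. The same function would be run once on the normal samples and once on the malicious samples.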
(2) Feature processing module
After the frequent 3-itemsets of the malicious samples and the normal samples are mined with the Apriori algorithm, they are used together with the sensitive API functions as features, and the feature attributes are screened and scored with the information gain (IG) algorithm. The IG algorithm obtains a feature's IG value as the difference between the information entropy and the conditional entropy given that feature; the larger the value, the greater the relevance. Entropy: from the probabilities $P(C_i)$ with which normal software and malware occur in the sample set, the information entropy of the sample set is $H(C) = -\sum_i P(C_i) \log P(C_i)$. Conditional entropy: the conditional entropy of the i-th feature is $H(Y \mid X_i) = \sum_v P(X_i = v)\, H(Y \mid X_i = v)$, where v ranges over the values of the feature. The IG value of the i-th feature is then $IG_i = H(C) - H(Y \mid X_i)$. The aim is to select, from the many features, those that help classify software as normal or malicious, i.e. those that reduce the uncertainty the most. Features whose IG value is 0 are therefore rejected, and the remaining features, whose IG value is not 0, are retained as important features.
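The screening step can be illustrated with a short Python sketch. The assumptions are labeled here and in the comments: entropy is taken in base 2, features are 0/1 columns, a small tolerance stands in for the "IG value is 0" test to absorb floating-point error, and the function names and toy data are hypothetical.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (base 2, an assumption) of a list of class labels."""
    n = len(labels)
    return -sum(
        (labels.count(c) / n) * log2(labels.count(c) / n) for c in set(labels)
    )


def information_gain(feature_values, labels):
    """IG = H(C) - H(Y | X_i) for one feature column."""
    n = len(labels)
    conditional = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def select_features(columns, labels, eps=1e-12):
    """Keep the features whose IG is not (numerically) zero."""
    return [name for name, col in columns.items() if information_gain(col, labels) > eps]


# Toy data: one informative permission combination and one uninformative API.
LABELS = ["malware", "malware", "normal", "normal"]
COLUMNS = {
    "sms_permission_combo": [1, 1, 0, 0],   # separates the classes perfectly
    "uninformative_api": [1, 0, 1, 0],      # independent of the class
}

print(select_features(COLUMNS, LABELS))
```

Here the permission combination has an IG of 1 bit and is kept, while the uninformative column has an IG of 0 and is rejected.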
Define the set X as the feature set retained for the applications, containing the distinct features $(x_1, x_2, \ldots, x_n)$, where n is the number of important features. The vector space ν is constructed from the features in X according to the mapping $\nu: s \rightarrow \{0,1\}^{|X|}$, where s denotes an application and each dimension of ν corresponds to one feature in X. If s contains the feature, the corresponding value in ν is 1; otherwise it is 0. The value indicates whether the application contains that feature.
Following this method, a matching algorithm constructs the corresponding vector space ν for each application in the system. After feature screening, a feature set containing n features is built, each sample generates its own vector ν, and the vectors are stored in a MySQL database as the input to the random forest classification module.
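The mapping from an application to its 0/1 vector can be sketched as below. The representation is an assumption made for the example: a retained feature is either a tuple of permissions (a frequent 3-itemset, present when the sample requests all of them) or a string naming a sensitive API function. The names `to_vector` and `FEATURES` are hypothetical, and database storage is left out.

```python
def to_vector(permissions, api_calls, feature_set):
    """Map one application to a 0/1 vector with one dimension per feature."""
    vector = []
    for feature in feature_set:
        if isinstance(feature, tuple):
            # Frequent 3-itemset: present only if every permission is requested.
            present = set(feature) <= set(permissions)
        else:
            # Sensitive API function: present if the sample calls it.
            present = feature in api_calls
        vector.append(1 if present else 0)
    return vector


# Hypothetical retained feature set X, in a fixed order.
FEATURES = [
    ("READ_PHONE_STATE", "READ_SMS", "WRITE_SMS"),
    "Landroid/telephony/SmsManager;->sendTextMessage",
    "Landroid/telephony/TelephonyManager;->getDeviceId",
]

sample_vector = to_vector(
    {"READ_SMS", "READ_PHONE_STATE", "WRITE_SMS", "INTERNET"},
    {"Landroid/telephony/TelephonyManager;->getDeviceId"},
    FEATURES,
)
print(sample_vector)
```

The order of `FEATURES` must stay fixed across all samples so that each dimension always refers to the same feature.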
(3) Random forest classification
Once the feature vectors are obtained, detection becomes a classification problem. Because the detection result has two classes, normal and malicious, detection is a binary classification problem, for which the random forest algorithm is well suited. The vector space ν obtained above is classified with the random forest classification algorithm.
The following method can be used. Supervised classification: for the collected sample set of known normal and malicious applications, each application is labeled according to whether it is normal software or malware, with the normal or malicious attribute appended after the vector that corresponds to that application. In the labeling formula, V(S) denotes the set of all applications, normal indicates that the application is normal software, and malware indicates that the application is malware.
After the vector space of the training sample set is obtained, it is used to train the random forest classifier. The software under test is passed through feature extraction and feature processing to obtain its vector ν, which at this point carries no normal or malware label; the label is left blank or replaced by '?'. The trained random forest classifier then classifies the vector of the software under test, and the result, the string normal or malware, indicates whether the software under test is normal software or malware, which achieves the detection of malware.
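The train-then-classify flow can be illustrated with a deliberately small random forest written in plain Python. This is a teaching sketch, not the classifier the patent used: it assumes 0/1 feature vectors, bootstrap sampling, a random subset of features at each split, Gini impurity as the split criterion, and majority voting. All names and the toy data are hypothetical. In practice an established library implementation, such as scikit-learn's RandomForestClassifier, would normally be used instead.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, rng, n_try, depth=0, max_depth=5):
    """Grow one tree on 0/1 features, trying n_try random features per node."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority                                   # leaf
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        left = [y for r, y in zip(rows, labels) if r[f] == 0]
        right = [y for r, y in zip(rows, labels) if r[f] == 1]
        if not left or not right:
            continue                                      # feature does not split
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return majority
    f = best[1]
    children = {}
    for v in (0, 1):
        part = [(r, y) for r, y in zip(rows, labels) if r[f] == v]
        children[v] = build_tree([r for r, _ in part], [y for _, y in part],
                                 rng, n_try, depth + 1, max_depth)
    return (f, children)                                  # internal node


def predict_tree(tree, row):
    while isinstance(tree, tuple):
        f, children = tree
        tree = children[row[f]]
    return tree


def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]    # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_try))
    return forest


def predict(forest, row):
    """Majority vote over all trees; returns 'normal' or 'malware'."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Toy labeled vectors (invented): every feature is present only in malware.
TRAIN_X = [[1, 1, 1, 1]] * 4 + [[0, 0, 0, 0]] * 4
TRAIN_Y = ["malware"] * 4 + ["normal"] * 4
forest = train_forest(TRAIN_X, TRAIN_Y)
print(predict(forest, [1, 1, 1, 1]), predict(forest, [0, 0, 0, 0]))
```

The unlabeled vector of the software under test is passed to `predict`, which returns the string label, mirroring the normal or malware output described above.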
The invention uses decompilation to batch-decompile the application sample set and extracts the permissions and API functions from the resulting files. The high-dimensional permission features are reduced with the Apriori algorithm to obtain the frequent 3-itemsets of permissions, which are combined with the sensitive API functions and screened by information gain to obtain the important features. The important features are mapped to the vector space, represented as 0 or 1, and normal and malicious applications are labeled, which finally yields a labeled vector space. The sample set is then learned from and classified with the random forest algorithm.
Claims (12)
1. An Android malicious application detection method fusing the frequent itemset algorithm and the random forest algorithm, characterized by comprising the following steps: batch-decompiling Android applications to obtain a sample set, and obtaining the static features of application permissions and sensitive application programming interface (API) functions; mining the frequent itemsets of the permission features to reduce the dimensionality of the permission features, obtaining the frequent 3-itemsets of permissions and thereby the association relationships among permissions in the sample set; mining the frequent 3-itemsets of malicious samples and of normal samples, and using the frequent 3-itemsets of the malicious samples and of the normal samples, respectively, together with the sensitive API functions as features to build a feature set; screening and scoring the feature attributes in the feature set with the information gain algorithm, extracting the important features, and constructing the corresponding vector space; and learning from and classifying the vector space with a random forest classifier, labeling the vector spaces of normal samples and malicious samples with a normal or malicious attribute.
2. The method according to claim 1, characterized in that before feature extraction the application is decompiled with a static analysis tool, obtaining files that include the resource files res, the third-party software development kit .so files lib, the smali files, and the application description file AndroidManifest.xml, which contain the application's resource files, source code, and other static code features.
3. The method according to claim 1, characterized in that features are extracted with Python scripts: the AndroidManifest.xml file is parsed to extract all permissions of the application, which yields the permission features; Python's os.walk() function traverses all smali files, and the sensitive API functions of all samples in the sample set are extracted by regular-expression matching.
4. The method according to claim 1, characterized in that mining the frequent 3-itemsets of the permission features specifically includes: extracting permissions from the malicious samples or the normal samples, respectively, to build a permission set; mining the frequent 1-itemsets of the permission set by computing the support S of each permission in the set and pruning the 1-itemsets that do not meet the minimum support min_s, which gives the candidate set $L_1$, and then joining the elements of $L_1$; taking the joined candidate set as the new 2-itemsets and mining the frequent 2-itemsets by pruning the 2-itemsets that do not meet the minimum support min_s, which forms the new candidate set $L_2$; and repeating until the frequent 3-itemsets are obtained.
5. The method according to claim 1, characterized in that using the information gain (IG) algorithm specifically includes: from the probabilities $P(C_i)$ with which normal software and malware occur in the sample set, computing the information entropy of the sample set according to the formula $H(C) = -\sum_i P(C_i) \log P(C_i)$; computing the conditional entropy of the i-th feature according to the formula $H(Y \mid X_i) = \sum_v P(X_i = v)\, H(Y \mid X_i = v)$; and computing the IG value of the i-th feature according to the formula $IG_i = H(C) - H(Y \mid X_i)$, where a larger IG value indicates that the frequent 3-itemset is more relevant to the malicious samples and the normal samples; retaining the important features according to relevance; and matching the important features against each application in the system to construct the corresponding vector space for each.
6. The method according to claim 5, characterized in that constructing the vector space specifically includes: rejecting the features whose IG value is 0 and retaining the remaining features, whose IG value is not 0, as important features; building the feature set X containing the application samples' distinct features $(x_1, x_2, \ldots, x_n)$; and constructing the vector space ν from the features in X according to the mapping $\nu: s \rightarrow \{0,1\}^{|X|}$, where s denotes an application and each dimension of ν corresponds to one feature in X; if s contains a feature, the corresponding value in ν is 1, otherwise it is 0.
7. An Android malicious application detection system fusing the frequent itemset algorithm and the random forest algorithm, comprising a feature extraction module, a feature processing module, and a random forest classification module, characterized in that: the feature extraction module extracts features from the batch-decompiled Android applications, obtaining the static features of application permissions and sensitive API functions; the feature processing module mines the frequent itemsets of the permission features to reduce the dimensionality of the permission features, obtaining the frequent 3-itemsets of permissions and thereby the association relationships among permissions in the sample set, mines the frequent 3-itemsets of malicious samples and normal samples, uses them together with the sensitive API functions as features to build a feature set, screens and scores the feature attributes in the feature set with the information gain algorithm, extracts the important features, and constructs the corresponding vector space; and the random forest classification module learns from and classifies the vector space, using a random forest classifier to label the vector spaces of normal samples and malicious samples with a normal or malicious attribute.
8. The detection system according to claim 7, characterized in that the application is decompiled with a static analysis tool, obtaining files that include res, lib, smali, and AndroidManifest.xml, which contain the application's resource files, source code, and other static code features.
9. The detection system according to claim 7, characterized in that features are extracted with Python scripts: the AndroidManifest.xml file is parsed to extract all permissions of the application, which yields the permission features; the os.walk() function traverses all smali files, and each sample's sensitive API functions are extracted by regular-expression matching.
10. The detection system according to claim 7, characterized in that mining the frequent 3-item sets of the
permission features specifically comprises: extracting permissions from the malicious samples and from the normal
samples separately to build permission sets; mining the frequent 1-item sets of a permission set: calculating the
support S of each permission in the permission set and pruning the 1-item sets that do not meet the minimum
support min_s to obtain the candidate set L1, then joining the elements of L1; taking the joined candidate set as
the new sample set and mining the frequent 2-item sets: pruning the 2-item sets that do not meet the minimum
support min_s to form the new candidate set L2; and repeating until the frequent 3-item sets are obtained.
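The level-wise mining in this claim follows the usual Apriori pattern of counting support, pruning, and joining. The sketch below is a textbook-style version, not the patent's own code; the function names and the representation of each sample as a set of permission strings are assumptions.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions (permission sets) that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def frequent_k_itemsets(transactions, min_s, k=3):
    """Return {itemset: support} for all frequent k-item sets (k=3 in the claim)."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    # Frequent 1-item sets: keep permissions whose support reaches min_s (L1).
    level = {frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_s}
    for size in range(2, k + 1):
        # Join step: union pairs from the previous level into candidates of the next size.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, size - 1))}
        # Keep only candidates that meet the minimum support (L2, then L3).
        level = {c for c in candidates if support(c, transactions) >= min_s}
    return {c: support(c, transactions) for c in level}
```

Running the function once on the malicious permission sets and once on the normal ones gives the two collections of frequent 3-item sets that the later steps use as features.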
11. The detection system according to claim 7, characterized in that using the IG algorithm specifically
comprises: calculating the difference between the entropy and the conditional entropy of a feature to obtain the
IG value of that feature; according to the probabilities P(C_i) with which normal software and malware
respectively occur in the sample set, calculating the information entropy H(C) of the sample set according to
the formula: [formula not reproduced]; calculating the conditional entropy H(Y | X_i) of the i-th feature
according to the formula: [formula not reproduced]; and calculating the IG value of the i-th feature according to
the formula IG_i = H(C) - H(Y | X_i); the larger the IG value, the greater the degree of correlation between the
frequent 3-item set and the malicious and normal samples; important features are retained according to the degree
of correlation and matched against each application in the system, and a corresponding vector space is
constructed for each.
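The two formulas referred to in this claim did not survive in the text. A minimal sketch of the scoring step follows, assuming the standard Shannon definitions H(C) = -sum_i P(C_i) log2 P(C_i) and H(Y | X_i) = sum_v P(X_i = v) H(Y | X_i = v); these definitions and the function names are assumptions, not a reconstruction of the patent's exact formulas.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels, e.g. 'normal' / 'malicious'."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(feature_values, labels):
    """Entropy of the labels after splitting the samples by one feature's values."""
    n = len(labels)
    total = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        total += len(subset) / n * entropy(subset)
    return total

def information_gain(feature_values, labels):
    """IG_i = H(C) - H(Y | X_i) for one feature column."""
    return entropy(labels) - conditional_entropy(feature_values, labels)
```

With two equally likely classes, a feature that is present in exactly the malicious samples scores an IG of 1 bit, while a feature that is spread evenly over both classes scores 0 and would be dropped.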
12. The detection system according to claim 11, characterized in that a larger IG value indicates a greater
degree of correlation; important features are retained according to the degree of correlation and matched
against each application in the system, and a corresponding vector space is constructed for each; constructing
the vector space specifically comprises: rejecting the features whose IG value is 0 and retaining the remaining
features, whose IG value is not 0, as important features; constructing a feature set X that comprises the
distinct feature vectors (x_1, x_2, …, x_n) of the application samples; and applying the formula
ν: s → {0,1}^|X| to construct the vector space ν from the feature vectors in the set X, where s denotes an
application and each dimension of ν corresponds to one feature in X: if s contains a given feature, the value
of the corresponding dimension in ν is 1, and otherwise it is 0.
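The mapping ν: s → {0,1}^|X| can be sketched as follows. The data layout is an assumption made for this example: a feature is either a frozenset of three permissions (a frequent 3-item set) or a string naming a sensitive API, and an application is a dict with "permissions" and "apis" sets.

```python
def has_feature(app, feature):
    """True if the application contains the feature.

    A frozenset feature is a permission 3-item set and requires all of its
    permissions; a str feature is a sensitive API name.
    """
    if isinstance(feature, frozenset):
        return feature <= app["permissions"]
    return feature in app["apis"]

def select_features(features, ig_values, eps=1e-12):
    """Drop features whose IG value is 0 and keep the rest, preserving order."""
    return [f for f, ig in zip(features, ig_values) if ig > eps]

def to_vector(app, feature_set):
    """Map one application s to its 0/1 vector over the ordered feature set X."""
    return [1 if has_feature(app, f) else 0 for f in feature_set]
```

The order of the feature set fixes the meaning of each dimension, so the same ordered list has to be used for every sample, both when training and when classifying.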
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910002795.2A CN109753800B (en) | 2019-01-02 | 2019-01-02 | Android malicious application detection method and system fusing frequent item set and random forest algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109753800A (en) | 2019-05-14 |
CN109753800B (en) | 2023-04-07 |
Family
ID=66405239
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910002795.2A Active CN109753800B (en) | 2019-01-02 | 2019-01-02 | Android malicious application detection method and system fusing frequent item set and random forest algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109753800B (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110851834A (en) * | 2019-11-18 | 2020-02-28 | 北京工业大学 | Android malicious application detection method integrating multi-feature classification |
CN111324893A (en) * | 2020-02-17 | 2020-06-23 | 电子科技大学 | Detection method and background system for android malicious software based on sensitive mode |
CN111460452A (en) * | 2020-03-30 | 2020-07-28 | 中国人民解放军国防科技大学 | Android malicious software detection method based on frequency fingerprint extraction |
CN111723371A (en) * | 2020-06-22 | 2020-09-29 | 上海斗象信息科技有限公司 | Method for constructing detection model of malicious file and method for detecting malicious file |
WO2020233322A1 (en) * | 2019-05-21 | 2020-11-26 | 暨南大学 | Description-entropy-based intelligent detection method for big data mobile software similarity |
CN112000954A (en) * | 2020-08-25 | 2020-11-27 | 莫毓昌 | Malicious software detection method based on feature sequence mining and simplification |
CN112035836A (en) * | 2019-06-04 | 2020-12-04 | 四川大学 | Malicious code family API sequence mining method |
CN112100621A (en) * | 2020-09-11 | 2020-12-18 | 哈尔滨工程大学 | Android malicious application detection method based on sensitive permission and API |
CN112287345A (en) * | 2020-10-29 | 2021-01-29 | 中南大学 | Credible edge computing system based on intelligent risk detection |
CN112446026A (en) * | 2019-09-03 | 2021-03-05 | 中移(苏州)软件技术有限公司 | Malicious software detection method and device and storage medium |
CN112464232A (en) * | 2020-11-21 | 2021-03-09 | 西北工业大学 | Android system malicious software detection method based on mixed feature combination classification |
CN112632539A (en) * | 2020-12-28 | 2021-04-09 | 西北工业大学 | Dynamic and static mixed feature extraction method in Android system malicious software detection |
CN112651024A (en) * | 2020-12-29 | 2021-04-13 | 重庆大学 | Method, device and equipment for malicious code detection |
CN113378171A (en) * | 2021-07-12 | 2021-09-10 | 东北大学秦皇岛分校 | Android ransomware detection method based on convolutional neural network |
CN113378167A (en) * | 2021-06-30 | 2021-09-10 | 哈尔滨理工大学 | Malicious software detection method based on a hybrid of improved naive Bayes algorithm and gated recurrent unit |
CN113592103A (en) * | 2021-07-26 | 2021-11-02 | 东方红卫星移动通信有限公司 | Software malicious behavior identification method based on ensemble learning and dynamic analysis |
CN113949514A (en) * | 2020-07-16 | 2022-01-18 | 中国电信股份有限公司 | Application override detection method, device and storage medium |
CN115249048A (en) * | 2022-09-16 | 2022-10-28 | 西南民族大学 | Adversarial sample generation method |
CN115878421A (en) * | 2022-12-09 | 2023-03-31 | 国网湖北省电力有限公司信息通信公司 | Data center equipment-level fault prediction method, system and medium based on log time sequence correlation characteristic mining |
CN117708813A (en) * | 2023-11-30 | 2024-03-15 | 四川大学 | Security detection method and system for software development environment |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138916A (en) * | 2015-08-21 | 2015-12-09 | 中国人民解放军信息工程大学 | Multi-track malicious program feature detecting method based on data mining |
CN105530265A (en) * | 2016-01-28 | 2016-04-27 | 李青山 | Mobile Internet malicious application detection method based on frequent itemset description |
CN105550583A (en) * | 2015-12-22 | 2016-05-04 | 电子科技大学 | Random forest classification method based detection method for malicious application in Android platform |
CN105740712A (en) * | 2016-03-09 | 2016-07-06 | 哈尔滨工程大学 | Android malicious act detection method based on Bayesian network |
CN106845220A (en) * | 2015-12-07 | 2017-06-13 | 深圳先进技术研究院 | Android malware detection system and method |
CN106845240A (en) * | 2017-03-10 | 2017-06-13 | 西京学院 | Android malware static detection method based on random forest |
CN107169355A (en) * | 2017-04-28 | 2017-09-15 | 北京理工大学 | Worm homology analysis method and apparatus |
CN107180192A (en) * | 2017-05-09 | 2017-09-19 | 北京理工大学 | Android malicious application detection method and system based on multi-feature fusion |
US20180046796A1 (en) * | 2016-08-12 | 2018-02-15 | Duo Security, Inc. | Methods for identifying compromised credentials and controlling account access |
CN108108616A (en) * | 2017-12-19 | 2018-06-01 | 努比亚技术有限公司 | Malicious act detection method, mobile terminal and storage medium |
US20180322287A1 (en) * | 2016-05-05 | 2018-11-08 | Cylance Inc. | Machine learning model for malware dynamic analysis |
CN108958215A (en) * | 2018-06-01 | 2018-12-07 | 天泽信息产业股份有限公司 | Engineering vehicle failure prediction system and prediction method based on data mining |
Non-Patent Citations (3)
Title |
---|
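ALI IDRI et al.: "A data mining-based approach for cardiovascular dysautonomias diagnosis and treatment", 《2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY》 * |
YANG Hongyu et al.: "Android malware detection based on an improved random forest algorithm", 《Journal on Communications》 * |
ZHAO Yi: "Research on static detection methods for malicious applications on the Android platform", 《China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology》 * |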
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020233322A1 (en) * | 2019-05-21 | 2020-11-26 | 暨南大学 | Description-entropy-based intelligent detection method for big data mobile software similarity |
CN112035836A (en) * | 2019-06-04 | 2020-12-04 | 四川大学 | Malicious code family API sequence mining method |
CN112446026A (en) * | 2019-09-03 | 2021-03-05 | 中移(苏州)软件技术有限公司 | Malicious software detection method and device and storage medium |
CN110851834B (en) * | 2019-11-18 | 2024-02-27 | 北京工业大学 | Android malicious application detection method integrating multi-feature classification |
CN110851834A (en) * | 2019-11-18 | 2020-02-28 | 北京工业大学 | Android malicious application detection method integrating multi-feature classification |
CN111324893B (en) * | 2020-02-17 | 2022-05-10 | 电子科技大学 | Detection method and background system for android malicious software based on sensitive mode |
CN111324893A (en) * | 2020-02-17 | 2020-06-23 | 电子科技大学 | Detection method and background system for android malicious software based on sensitive mode |
CN111460452A (en) * | 2020-03-30 | 2020-07-28 | 中国人民解放军国防科技大学 | Android malicious software detection method based on frequency fingerprint extraction |
CN111460452B (en) * | 2020-03-30 | 2022-09-09 | 中国人民解放军国防科技大学 | Android malicious software detection method based on frequency fingerprint extraction |
CN111723371A (en) * | 2020-06-22 | 2020-09-29 | 上海斗象信息科技有限公司 | Method for constructing detection model of malicious file and method for detecting malicious file |
CN111723371B (en) * | 2020-06-22 | 2024-02-20 | 上海斗象信息科技有限公司 | Method for constructing malicious file detection model and detecting malicious file |
CN113949514B (en) * | 2020-07-16 | 2024-01-26 | 中国电信股份有限公司 | Application override detection method, device and storage medium |
CN113949514A (en) * | 2020-07-16 | 2022-01-18 | 中国电信股份有限公司 | Application override detection method, device and storage medium |
CN112000954A (en) * | 2020-08-25 | 2020-11-27 | 莫毓昌 | Malicious software detection method based on feature sequence mining and simplification |
CN112000954B (en) * | 2020-08-25 | 2024-01-30 | 华侨大学 | Malicious software detection method based on feature sequence mining and simplification |
CN112100621A (en) * | 2020-09-11 | 2020-12-18 | 哈尔滨工程大学 | Android malicious application detection method based on sensitive permission and API |
CN112100621B (en) * | 2020-09-11 | 2022-05-20 | 哈尔滨工程大学 | Android malicious application detection method based on sensitive permission and API |
CN112287345A (en) * | 2020-10-29 | 2021-01-29 | 中南大学 | Credible edge computing system based on intelligent risk detection |
CN112287345B (en) * | 2020-10-29 | 2024-04-16 | 中南大学 | Trusted edge computing system based on intelligent risk detection |
CN112464232B (en) * | 2020-11-21 | 2024-04-09 | 西北工业大学 | Android system malicious software detection method based on mixed feature combination classification |
CN112464232A (en) * | 2020-11-21 | 2021-03-09 | 西北工业大学 | Android system malicious software detection method based on mixed feature combination classification |
CN112632539A (en) * | 2020-12-28 | 2021-04-09 | 西北工业大学 | Dynamic and static mixed feature extraction method in Android system malicious software detection |
CN112632539B (en) * | 2020-12-28 | 2024-04-09 | 西北工业大学 | Dynamic and static hybrid feature extraction method in Android system malicious software detection |
CN112651024A (en) * | 2020-12-29 | 2021-04-13 | 重庆大学 | Method, device and equipment for malicious code detection |
CN113378167A (en) * | 2021-06-30 | 2021-09-10 | 哈尔滨理工大学 | Malicious software detection method based on a hybrid of improved naive Bayes algorithm and gated recurrent unit |
CN113378171B (en) * | 2021-07-12 | 2022-06-21 | 东北大学秦皇岛分校 | Android ransomware detection method based on convolutional neural network |
CN113378171A (en) * | 2021-07-12 | 2021-09-10 | 东北大学秦皇岛分校 | Android ransomware detection method based on convolutional neural network |
CN113592103A (en) * | 2021-07-26 | 2021-11-02 | 东方红卫星移动通信有限公司 | Software malicious behavior identification method based on ensemble learning and dynamic analysis |
CN115249048A (en) * | 2022-09-16 | 2022-10-28 | 西南民族大学 | Adversarial sample generation method |
CN115878421A (en) * | 2022-12-09 | 2023-03-31 | 国网湖北省电力有限公司信息通信公司 | Data center equipment-level fault prediction method, system and medium based on log time sequence correlation characteristic mining |
CN115878421B (en) * | 2022-12-09 | 2023-11-14 | 国网湖北省电力有限公司信息通信公司 | Data center equipment level fault prediction method, system and medium |
CN117708813A (en) * | 2023-11-30 | 2024-03-15 | 四川大学 | Security detection method and system for software development environment |
CN117708813B (en) * | 2023-11-30 | 2024-06-21 | 四川大学 | Security detection method and system for software development environment |
Also Published As
Publication number | Publication date |
---|---|
CN109753800B (en) | 2023-04-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||