CN107194251A - Android platform malicious application detection method and device - Google Patents

Android platform malicious application detection method and device

Info

Publication number
CN107194251A
Authority
CN
China
Prior art keywords
android
data flow
measured
applications
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710214419.0A
Other languages
Chinese (zh)
Other versions
CN107194251B (en
Inventor
朱大立
金昊
杨莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201710214419.0A priority Critical patent/CN107194251B/en
Publication of CN107194251A publication Critical patent/CN107194251A/en
Application granted granted Critical
Publication of CN107194251B publication Critical patent/CN107194251B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection

Abstract

The present invention provides a method and device for detecting malicious applications on the Android platform. The method includes: calling the FlowDroid tool to extract the static data-flow features of the Android application under test; processing those static data-flow features with the SUSI technique to generate a feature vector of the application's data flows; and inputting the generated feature vector into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application. The invention can accurately detect malicious applications on the Android platform. It avoids the path-coverage problem of dynamic taint tracking, overcomes the two major challenges of static data-flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), achieves accurate and comprehensive extraction of sensitive data flows in Android applications, and overcomes the limitations of traditional shallow machine learning algorithms in building detection models.

Description

Android platform malicious application detection method and device
Technical Field
The present invention relates to the technical fields of mobile security and machine learning, and in particular to a method and device for detecting malicious applications on the Android platform.
Background
In the field of mobile smart terminals there is a large amount of malware. Such malware tends to covertly obtain the private data that users store on their devices, without the users noticing, and send it to the attacker's mailbox or server, which poses a serious threat to users' financial security and privacy. As Android smart terminals become widespread, privacy-theft attacks on these terminals and malicious application detection techniques are receiving increasing attention.
At present, the data-flow analysis techniques used to detect malicious applications on the Android platform fall mainly into two kinds: dynamic taint tracking and static data-flow analysis. Dynamic taint tracking marks sensitive data as tainted and tracks the tainted data while the application runs, in order to judge whether a malicious leak occurs. Static data-flow analysis builds the function call graph of the application and analyzes the reachable functions one by one, propagating sensitive source information to monitor how that information flows through the whole system.
However, dynamic taint tracking faces the challenge of covering all code paths in the program. Moreover, some malicious applications can detect the presence of a dynamic monitor and hide their malicious behavior, which causes false negatives in the detection results. Static data-flow analysis needs to model the application's execution flow accurately. In addition, Android applications make heavy use of inter-component communication: one component can send an intent to invoke another component, and data may be placed in the intent. Accurately identifying the target component of such communication is another difficulty for static data-flow analysis.
In addition, with the wide use of machine learning and data mining in recent years, traditional machine learning algorithms are also used to detect malicious applications on the Android platform. Such methods first extract features such as permissions, API calls, and function calls with static methods, or features such as application behavior, system parameter changes, and system calls with dynamic methods. They then choose a machine learning algorithm, such as a decision tree, naive Bayes, or a support vector machine, train it on these features to build a malicious application detection model, and finally use the model to judge whether an application is safe.
However, for the same type of application behavior feature, different machine learning algorithms produce different detection results. Choosing a machine learning algorithm that suits the type of behavior feature being processed is crucial to the final detection result. At the same time, traditional machine learning algorithms have shallow model structures, which limits the final detection result.
In view of this, the technical problem to be solved is how to provide a method and device for detecting malicious applications on the Android platform that avoid the path-coverage problem of dynamic taint tracking, overcome the two major challenges of static data-flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), achieve accurate and comprehensive extraction of sensitive data flows in Android applications, overcome the limitations of traditional shallow machine learning algorithms in building detection models, and thereby detect malicious Android applications accurately.
Summary of the Invention
To solve the above technical problem, the present invention provides a method and device for detecting malicious applications on the Android platform. The invention avoids the path-coverage problem of dynamic taint tracking, overcomes the two major challenges of static data-flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), achieves accurate and comprehensive extraction of sensitive data flows in Android applications, overcomes the limitations of traditional shallow machine learning algorithms in building detection models, and thereby detects malicious Android applications accurately.
In a first aspect, the present invention provides a method for detecting malicious applications on the Android platform, including:
calling the FlowDroid tool to extract the static data-flow features of the Android application under test;
processing the static data-flow features of the Android application under test with the SUSI technique to generate a feature vector of the data flows of the Android application under test;
inputting the generated feature vector of the data flows of the Android application under test into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
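The three steps of the first aspect can be read as a simple pipeline. The following is only a minimal sketch under stated assumptions: the function name `detect`, the three callables, and the 0.5 decision threshold are hypothetical and do not come from the patent text; the extraction, vectorization, and model are passed in rather than implemented.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation of one extracted flow: (source function, sink function).
Flow = Tuple[str, str]


def detect(apk_path: str,
           extract_flows: Callable[[str], Iterable[Flow]],
           vectorize: Callable[[Iterable[Flow]], List[int]],
           model: Callable[[List[int]], float],
           threshold: float = 0.5) -> bool:
    """Return True when the model scores the application as malicious.

    All three callables are placeholders (assumptions): extract_flows stands in
    for the FlowDroid step, vectorize for the SUSI-based step, and model for the
    pre-trained deep belief network detection model.
    """
    flows = extract_flows(apk_path)       # step 1: static data-flow extraction
    features = vectorize(flows)           # step 2: feature vector of the data flows
    return model(features) >= threshold   # step 3: pre-trained detection model
```

Passing the three stages in as parameters keeps the sketch independent of any particular tool invocation, which the text does not specify.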
Optionally, before the generated feature vector of the data flows of the Android application under test is input into the pre-trained deep belief network detection model, the method further includes:
obtaining Android application samples, the Android application samples including safe Android application samples and malicious Android application samples;
calling the FlowDroid tool to extract the static data-flow features of the Android application samples;
processing the static data-flow features of the Android application samples with the SUSI technique to generate feature vectors of the data flows of the Android application samples;
training on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model.
Optionally, training on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model includes:
using the feature vectors of the data flows of the unlabeled safe Android application samples and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM), and pre-training multiple RBM layers one by one from the bottom up with an unsupervised learning method to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
adding a classification layer after the last hidden layer of the DBN;
inputting the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer, and fine-tuning the parameters of every layer of the whole network layer by layer from the top down with a supervised learning method, until convergence.
Optionally, inputting the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer and fine-tuning the parameters of every layer of the whole network layer by layer from the top down with a supervised learning method, until convergence, includes:
inputting the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer, and fine-tuning the parameters of every layer of the whole network in a supervised manner with the back-propagation (BP) algorithm, until convergence.
Optionally, after the detection result indicating whether the Android application under test is a malicious application is obtained, the method further includes:
if the Android application under test is a malicious application, prompting the user that the Android application under test is a malicious application, and showing the user an analysis report explaining why it is a malicious application;
if the Android application under test is not a malicious application, prompting the user that the Android application under test is not a malicious application.
In a second aspect, the present invention provides a device for detecting malicious applications on the Android platform, including:
a second extraction module, configured to call the FlowDroid tool and extract the static data-flow features of the Android application under test;
a second processing module, configured to process the static data-flow features of the Android application under test with the SUSI technique and generate a feature vector of the data flows of the Android application under test;
a detection module, configured to input the generated feature vector of the data flows of the Android application under test into a pre-trained deep belief network detection model and obtain a detection result indicating whether the Android application under test is a malicious application.
Optionally, the device further includes:
an acquisition module, configured to obtain Android application samples, the Android application samples including safe Android application samples and malicious Android application samples;
a first extraction module, configured to call the FlowDroid tool and extract the static data-flow features of the Android application samples;
a first processing module, configured to process the static data-flow features of the Android application samples with the SUSI technique and generate feature vectors of the data flows of the Android application samples;
a building module, configured to train on the feature vectors of the data flows of the Android application samples and build the deep belief network detection model.
Optionally, the building module includes:
a pre-training unit, configured to use the feature vectors of the data flows of the unlabeled safe Android application samples and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM), and to pre-train multiple RBM layers one by one from the bottom up with an unsupervised learning method to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
an adding unit, configured to add a classification layer after the last hidden layer of the DBN;
a fine-tuning unit, configured to input the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer, and to fine-tune the parameters of every layer of the whole network layer by layer from the top down with a supervised learning method, until convergence.
Optionally, the fine-tuning unit is specifically configured to:
input the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer, and fine-tune the parameters of every layer of the whole network in a supervised manner with the back-propagation (BP) algorithm, until convergence.
Optionally, the device further includes:
a first prompting module, configured to, if the Android application under test is a malicious application, prompt the user that the Android application under test is a malicious application and show the user an analysis report explaining why it is a malicious application;
a second prompting module, configured to, if the Android application under test is not a malicious application, prompt the user that the Android application under test is not a malicious application.
As can be seen from the above technical solution, the method and device of the present invention call the FlowDroid tool to extract the static data-flow features of the Android application under test, process those features with the SUSI technique to generate a feature vector of the application's data flows, and input the generated feature vector into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application. The invention thereby avoids the path-coverage problem of dynamic taint tracking, overcomes the two major challenges of static data-flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), achieves accurate and comprehensive extraction of sensitive data flows in Android applications, overcomes the limitations of traditional shallow machine learning algorithms in building detection models, and detects malicious Android applications accurately.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of a method for detecting malicious applications on the Android platform provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the data flow from source to sink in an example application, as extracted by calling the FlowDroid tool in step 101 of Fig. 1;
Fig. 3 is a schematic structural diagram of a device for detecting malicious applications on the Android platform provided by an embodiment of the present invention.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a schematic flowchart of a method for detecting malicious applications on the Android platform provided by an embodiment of the present invention. As shown in Fig. 1, the method of this embodiment is as follows.
101. Call the FlowDroid tool to extract the static data-flow features of the Android application under test.
It can be understood that an Android application contains multiple components, such as Activity, Service, Content Provider, and Broadcast Receiver, of which the Activity is the main entry point for analysis. Unlike a traditional Java program, an Android application has no main function, so the analysis cannot build a control-flow graph simply by locating the program's entry and exit through a main function. However, each component of an Android application has functions that reflect the component's life cycle, and the control-flow graph of the application can be built from these life cycles. To generate the control-flow graph of an Android application, this embodiment calls the open-source tool FlowDroid, proposed by Arzt, S. et al., which uses a dummy main function to simulate the control-flow call relationships among all callback functions in the Activity life cycle. The implementation principle of FlowDroid is explained below using a piece of code from an Android application as an example.
The application reads the user's location information, including longitude, latitude, and detailed address, by calling the API provided by Baidu Maps, and then sends the location information by SMS to the phone number "+1 11". This simple application reflects a common privacy attack pattern: sensitive data flows from LocationClient (the source) to the SMS API (the sink), which results in a privacy leak.
The data-flow analysis of the FlowDroid tool is sensitive to execution order and can be divided into two parts: a forward taint analysis, which finds where tainted variables are passed, and an on-demand backward alias analysis, which finds all aliases of the same tainted heap location that exist before the source. Referring to Fig. 2, FlowDroid extracts the data flow from source to sink in the above application with the following steps:
(1) longitude, latitude, and detail obtained from the source are treated as tainted variables and are passed forward to the function loc(detail, latitude, longitude);
(2) when the function loc(detail, latitude, longitude) is called, longitude, latitude, and detail are found to be passed in as parameters. A backward analysis is therefore performed, which finds that the arguments a, b, and c of the calling function are tainted by the parameters longitude, latitude, and detail of the called function;
(3) a, b, and c continue to be tracked, and l becomes tainted;
(4) l is found to be the return value of the called function; a backward analysis is performed, and location becomes tainted;
(5) a forward analysis is performed on location, which determines that it is passed to the sink.
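The forward half of this analysis can be illustrated with a toy propagation over straight-line code. This is only a sketch under stated assumptions, not FlowDroid's algorithm: it has no alias analysis, no context sensitivity, and treats every call as passing taint from its arguments to its return value. The statement encoding and the function names in the example program (`getLongitude`, `getLatitude`, `getAddrStr`, `sendTextMessage`) are illustrative choices, since the original example code is not reproduced in the text.

```python
def forward_taint(statements, sources, sinks):
    """Propagate taint through statements of the form (target, operation, arguments).

    Returns a list of (sink operation, tainted arguments) for every sink call
    that receives at least one tainted variable.
    """
    tainted = set()
    leaks = []
    for target, op, args in statements:
        tainted_args = [a for a in args if a in tainted]
        if op in sinks:
            if tainted_args:
                leaks.append((op, tainted_args))
            continue
        if op in sources or tainted_args:
            if target is not None:
                tainted.add(target)       # value comes from a source or tainted data
        elif target is not None:
            tainted.discard(target)       # overwritten with untainted data


    return leaks


# Toy program mirroring the location-to-SMS example (all names are assumptions).
program = [
    ("longitude", "getLongitude", []),
    ("latitude", "getLatitude", []),
    ("detail", "getAddrStr", []),
    ("location", "loc", ["detail", "latitude", "longitude"]),
    ("greeting", "const", []),
    (None, "sendTextMessage", ["location"]),
]
sources = {"getLongitude", "getLatitude", "getAddrStr"}
sinks = {"sendTextMessage"}
leaks = forward_taint(program, sources, sinks)
```

In this toy run the only reported leak is the tainted `location` value reaching the SMS sink, which corresponds to step (5) above.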
It can be understood that because the analysis of the FlowDroid tool is context-sensitive, it can effectively distinguish different calls to the function loc() that are made with different arguments. In addition, FlowDroid handles calls to library functions with hand-written summaries. FlowDroid also applies a large number of optimizations to scale the analysis while reducing noise, so that it can accurately and comprehensively extract the sensitive data-flow features inside an Android application.
102. Using the SUSI technique, process the static data-flow features of the Android application under test to generate the (normalized) feature vector of the data flows of the Android application under test.
It can be understood that the raw static data-flow features extracted by the FlowDroid tool contain the complete source and sink function names. However, the Android libraries contain thousands of source and sink functions, of which only a very small number are called in any one application. If the data flows between these functions were used directly as features, the feature vector would be sparse and would need processing to improve training. Therefore, this embodiment uses the SUSI technique proposed by Rasthofer, S. et al. to process the static data-flow features extracted by the FlowDroid tool and generate the (normalized) feature vector of the data flows. SUSI is based on a machine learning algorithm and can judge from a function's code whether it is a source function or a sink function. SUSI also divides the currently existing source functions and sink functions into 17 source classes and 19 sink classes, respectively, wherein:
17 source classes include:
(1)UNIQUE_IDENTIFIER;
(2)LOCATION_INFORMATION;
(3)NETWORK_INFORMATION;
(4)ACCOUNT_INFORMATION;
(5)FILE_INFORMATION;
(6)BLUETOOTH_INFORMATION;
(7)DATABASE_INFORMATION;
(8)EMAIL;
(9)SYNCHRONIZATION_DATA;
(10)SMS_MMS;
(11)CONTACT_INFORMATION;
(12)CALENDAR_INFORMATION;
(13)SYSTEM_SETTING;
(14)IMAGE;
(15)BROWSER_INFORMATION;
(16)NFC;
(17)NO_CATEGORY。
19 sink classes include:
(1)LOCATION_INFORMATION;
(2)PHONE_CONNECTION;
(3)VOIP;
(4)PHONE_STATE;
(5)EMAIL;
(6)BLUETOOTH;
(7)ACCOUNT_SETTING;
(8)AUDIO;
(9)SYNCHRONIZATION_DATA;
(10)NETWORK;
(11)FILE;
(12)LOG;
(13)SMS_MMS;
(14)CONTACT_INFORMATION;
(15)CALENDAR_INFORMATION;
(16)SYSTEM_SETTING;
(17)NFC;
(18)BROWSER_INFORMATION;
(19)NO_CATEGORY。
The SUSI classification makes it clearer which kinds of information a malicious application leaks and along which paths the information leaks.
By combining the FlowDroid tool with the SUSI technique, this embodiment can extract 323 data-flow features (17 source classes × 19 sink classes) from each Android application. The feature vector can be expressed as:
\[ \mathrm{Features}(app) = (\mathrm{src\_category}_{1} \to \mathrm{sink\_category}_{1},\ \mathrm{src\_category}_{1} \to \mathrm{sink\_category}_{2},\ \ldots,\ \mathrm{src\_category}_{17} \to \mathrm{sink\_category}_{18},\ \mathrm{src\_category}_{17} \to \mathrm{sink\_category}_{19}) \]
When a data flow exists between a source class $\mathrm{src\_category}_{i}$ ($i = 1, 2, \ldots, 17$) and a sink class $\mathrm{sink\_category}_{j}$ ($j = 1, 2, \ldots, 19$), the corresponding element $\mathrm{src\_category}_{i} \to \mathrm{sink\_category}_{j}$ of the feature vector is 1; otherwise it is 0.
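The construction of this 323-dimensional binary vector can be sketched as follows. The class names are the 17 source classes and 19 sink classes listed above; the row-major ordering of the elements and the helper names `feature_index` and `feature_vector` are assumptions made for illustration, since the text fixes only the set of class pairs.

```python
SOURCE_CLASSES = [
    "UNIQUE_IDENTIFIER", "LOCATION_INFORMATION", "NETWORK_INFORMATION",
    "ACCOUNT_INFORMATION", "FILE_INFORMATION", "BLUETOOTH_INFORMATION",
    "DATABASE_INFORMATION", "EMAIL", "SYNCHRONIZATION_DATA", "SMS_MMS",
    "CONTACT_INFORMATION", "CALENDAR_INFORMATION", "SYSTEM_SETTING",
    "IMAGE", "BROWSER_INFORMATION", "NFC", "NO_CATEGORY",
]
SINK_CLASSES = [
    "LOCATION_INFORMATION", "PHONE_CONNECTION", "VOIP", "PHONE_STATE",
    "EMAIL", "BLUETOOTH", "ACCOUNT_SETTING", "AUDIO", "SYNCHRONIZATION_DATA",
    "NETWORK", "FILE", "LOG", "SMS_MMS", "CONTACT_INFORMATION",
    "CALENDAR_INFORMATION", "SYSTEM_SETTING", "NFC", "BROWSER_INFORMATION",
    "NO_CATEGORY",
]


def feature_index(source_class, sink_class):
    """Position of the (source class, sink class) pair, in row-major order (assumed)."""
    i = SOURCE_CLASSES.index(source_class)
    j = SINK_CLASSES.index(sink_class)
    return i * len(SINK_CLASSES) + j


def feature_vector(class_flows):
    """Build the binary vector from an iterable of (source class, sink class) pairs."""
    vector = [0] * (len(SOURCE_CLASSES) * len(SINK_CLASSES))
    for source_class, sink_class in class_flows:
        vector[feature_index(source_class, sink_class)] = 1
    return vector


# The location-to-SMS leak from the earlier example sets exactly one element.
example = feature_vector([("LOCATION_INFORMATION", "SMS_MMS")])
```

Because several function-level flows can fall into the same class pair, the vector records only whether a pair occurs, not how often.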
It can be understood that step 102 addresses the fact that the raw data-flow features extracted by FlowDroid are ill-suited to training: the open-source sensitive-API classification tool SUSI is used to process the data-flow features, and the resulting feature vector better reflects the flow of sensitive data inside each application.
103. Input the generated feature vector of the data flows of the Android application under test into the pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
In a specific application, before step 103 the method further includes steps S1 to S4, which are not shown in the figure:
S1. Obtain Android application samples, the Android application samples including safe Android application samples and malicious Android application samples.
Specifically, step S1 may use crawler technology to continuously fetch the latest malicious Android applications from the network as the malicious Android application samples, and likewise to continuously fetch the latest safe Android applications from the network as the safe Android application samples.
In a specific application, step S1 may divide the Android application samples into two parts: one part consists of unlabeled safe and malicious Android application samples, and the other part consists of labeled safe and malicious Android application samples.
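The division into an unlabeled part and a labeled part can be sketched as below. This is a minimal illustration under stated assumptions: the text does not give the proportion of the two parts or how samples are assigned, so the `unlabeled_fraction` parameter, the random shuffle, and the label encoding (1 for malicious, 0 for safe) are all hypothetical.

```python
import random


def split_samples(samples, unlabeled_fraction=0.5, seed=0):
    """Split (feature_vector, label) pairs into an unlabeled and a labeled part.

    The unlabeled part keeps only the feature vectors (for pre-training);
    the labeled part keeps the pairs (for supervised fine-tuning).
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * unlabeled_fraction)
    unlabeled = [features for features, _label in shuffled[:cut]]
    labeled = shuffled[cut:]
    return unlabeled, labeled
```

Both parts contain safe and malicious samples in expectation because the split is made after shuffling the whole sample set.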
S2. Call the FlowDroid tool to extract the static data-flow features of the Android application samples.
Specifically, for the principle of the FlowDroid tool in this step, refer to the description of step 101 above; it is not repeated here.
S3. Using the SUSI technique, process the static data-flow features of the Android application samples to generate feature vectors of the data flows of the Android application samples.
Specifically, for the SUSI technique in this step, refer to the description of step 102 above; it is not repeated here.
S4. Train on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model.
It can be understood that a deep belief network detection model based on deep learning has a deeper structure and stronger feature-description ability than a traditional shallow model. It can therefore mine, at a deeper level, the application security reflected by the data-flow features, and achieves higher detection accuracy.
Specifically, step S4 may include steps S41 to S43, which are not shown in the figure:
S41. Use the feature vectors of the data flows of the unlabeled safe Android application samples and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM), and pre-train multiple RBM layers one by one from the bottom up with an unsupervised learning method to generate a deep belief network (DBN), until the DBN reaches an equilibrium state.
It should be noted that a DBN is composed of multiple RBM layers. Each RBM layer includes a visible layer that receives input data and a hidden layer that outputs data; connections exist between the two layers, but not between units within the same layer.
For example, step S41 may use the feature vectors of the data flows of the unlabeled safe Android application samples as the input of the bottom RBM and, with a greedy layer-wise algorithm, pre-train multiple RBM layers one by one from the bottom up without supervision to generate the DBN, until the DBN reaches an equilibrium state.
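Greedy layer-wise pre-training can be sketched in a few lines of dependency-free Python. The following is a minimal sketch under stated assumptions: the text does not name the RBM update rule, so one-step contrastive divergence (CD-1) is swapped in here as a common choice; the layer sizes, learning rate, and fixed epoch count (in place of the equilibrium criterion) are also hypothetical.

```python
import math
import random


def sigmoid(x):
    # Clamp the argument so math.exp cannot overflow on extreme values.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Binary RBM: visible and hidden units connect across layers, never within one."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def cd1_update(self, v0, lr):
        """One CD-1 step: positive phase on the data, negative phase on a reconstruction."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(self.b)):
            for j in range(len(self.c)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(self.c)):
            self.c[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=10, lr=0.1, seed=0):
    """Train one RBM per entry of layer_sizes, bottom-up; each feeds the next."""
    rng = random.Random(seed)
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        rbms.append(rbm)
        # The hidden activations of this layer become the next layer's visible data.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return rbms
```

For the 323-element feature vectors of this embodiment, a call might look like `pretrain_dbn(unlabeled_vectors, [128, 64])`, where both the variable name and the layer sizes are illustrative only.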
S42. Add a classification layer after the last hidden layer of the DBN.
S43. Input the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer, and fine-tune the parameters of every layer of the whole network layer by layer from the top down with a supervised learning method, until convergence.
For example, step S43 may input the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer and fine-tune the parameters of every layer of the whole network in a supervised manner with the back-propagation (BP) algorithm, until convergence.
It should be noted that because each RBM layer is trained independently, training an RBM layer only ensures that its own hidden layer is the best representation of its visible layer; the DBN as a whole is not necessarily the best representation of the input data. For this reason, a classification layer (a supervised learning network), such as a BP neural network, is added after the last hidden layer of the DBN. The whole DBN can then be regarded as a multi-layer BP neural network and fine-tuned from the top down, and the pre-training can be regarded as the initialization of the weight parameters of a deep BP network. This training procedure effectively mitigates the problems of local optima and long training time. Depending on the specific application, different classifiers can be chosen for the topmost supervised learning layer (the classification layer) of the DBN. At this point, the training of the deep belief network detection model for detecting malicious Android applications, based on a deep learning algorithm, is complete.
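Steps S42 and S43 can be sketched as adding one sigmoid output unit on top of the pre-trained layers and running back-propagation through the whole stack. This is only a sketch under stated assumptions: the single sigmoid output, the cross-entropy loss, the per-sample gradient updates, and the function names are illustrative choices, and a fixed number of epochs stands in for the convergence test.

```python
import math


def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-x))


def add_classification_layer(layers, rng):
    """S42: append a one-unit output layer after the last hidden layer.

    layers is a list of (W, bias) pairs, W[i][j] linking input i to unit j.
    """
    n_top = len(layers[-1][1])
    W = [[rng.gauss(0, 0.1)] for _ in range(n_top)]
    return layers + [(W, [0.0])]


def forward(layers, x):
    """Return the activations of every layer, with the input first."""
    acts = [x]
    for W, b in layers:
        prev = acts[-1]
        acts.append([sigmoid(b[j] + sum(prev[i] * W[i][j] for i in range(len(prev))))
                     for j in range(len(b))])
    return acts


def predict(layers, x):
    return forward(layers, x)[-1][0]


def finetune(layers, samples, epochs=200, lr=0.5):
    """S43: supervised fine-tuning of all layers by back-propagation (in place)."""
    for _ in range(epochs):
        for x, y in samples:
            acts = forward(layers, x)
            # Sigmoid output with cross-entropy loss: error signal is output - target.
            delta = [acts[-1][0] - y]
            for k in range(len(layers) - 1, -1, -1):  # top layer first
                W, b = layers[k]
                prev = acts[k]
                # Error for the layer below, computed before W is modified.
                below = [sum(W[i][j] * delta[j] for j in range(len(delta)))
                         * prev[i] * (1.0 - prev[i]) for i in range(len(prev))]
                for i in range(len(prev)):
                    for j in range(len(delta)):
                        W[i][j] -= lr * prev[i] * delta[j]
                for j in range(len(delta)):
                    b[j] -= lr * delta[j]
                delta = below
    return layers
```

In a full pipeline the `(W, bias)` pairs below the classification layer would be taken from the pre-trained RBMs (their weight matrices and hidden biases), so that back-propagation starts from the pre-trained weights rather than from random values.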
In a specific application, after step 103 the method of this embodiment may further include:
if the Android application under test is a malicious application, prompting the user that the Android application under test is a malicious application, and showing the user a detailed analysis report explaining why it is a malicious application;
if the Android application under test is not a malicious application, prompting the user that the Android application under test is not a malicious application.
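The prompting step can be sketched as a small formatting function. This is an illustration under stated assumptions: the text does not specify the content of the analysis report, so listing the detected source-class to sink-class flows is a hypothetical choice, as are the function name `build_prompt` and the message wording.

```python
def build_prompt(app_name, is_malicious, leaking_flows=()):
    """Return the message shown to the user for one detection result.

    leaking_flows is an iterable of (source class, sink class) pairs; using
    them as the body of the report is an assumption, not taken from the text.
    """
    if not is_malicious:
        return f"{app_name} is not a malicious application."
    lines = [f"{app_name} is a malicious application.", "Analysis report:"]
    lines += [f"  {source} -> {sink}" for source, sink in leaking_flows]
    return "\n".join(lines)
```

For example, `build_prompt("demo.apk", True, [("LOCATION_INFORMATION", "SMS_MMS")])` produces a three-line message whose last line names the leaking flow.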
Due to realizing that privacy steals the Android malicious applications of attack in the mode of the internal sensitive data of processing application, There is very big difference in one side, on the other hand there is necessarily identical with other malicious applications again with Android safety applications Part.Therefore, the present embodiment analyzes this dissimilarity and similitude using machine learning algorithm, focuses mainly on Android and puts down The state of development of the data stream analysis techniques of platform intelligent terminal and malicious application detection technique based on machine learning algorithm, is based on Sensitive data stream information inside Android applications carries out depth analysis, probes into malicious application and safety applications in sensitive data Dissimilarity in processing, so as to be detected to malicious application.
The Android platform malicious application detection method of this embodiment calls the FlowDroid tool to extract the static data flow features of the Android application under test, processes those features with the SUSI technique to generate the feature vector of the application's data flow, and inputs the generated feature vector into the pre-trained deep belief network detection model to obtain a detection result indicating whether the application under test is malicious. The method can thereby detect malicious Android applications with relatively high accuracy. It avoids the path coverage problem of dynamic taint tracking, and it overcomes two major challenges of static data flow analysis, namely the need to model the application's runtime process accurately and the need to obtain the target components of inter-component communication accurately, thereby achieving accurate and comprehensive extraction of sensitive data flows in Android applications. It also overcomes the limitations of traditional shallow machine learning algorithms in building detection models and can considerably improve the detection rate for unknown malicious applications.
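One plausible way to turn extracted source-to-sink flows into a fixed-length feature vector is sketched below. Everything in it is an assumption made for illustration: the category names are a small subset modelled on SuSi-style source and sink categories, the `CATEGORY_OF` lookup is a hypothetical stand-in for the SUSI classification, the flow list stands in for FlowDroid output, and a binary "flow present" encoding is assumed rather than taken from this embodiment.

```python
from itertools import product

# Illustrative category subsets (assumption); SuSi defines many more.
SOURCE_CATEGORIES = ["UNIQUE_IDENTIFIER", "LOCATION_INFORMATION", "CONTACT_INFORMATION"]
SINK_CATEGORIES = ["NETWORK", "SMS_MMS", "LOG"]

# One vector position per (source category, sink category) pair.
FEATURE_INDEX = {pair: i for i, pair in
                 enumerate(product(SOURCE_CATEGORIES, SINK_CATEGORIES))}

# Hypothetical method-to-category lookup standing in for the SUSI lists.
CATEGORY_OF = {
    "TelephonyManager.getDeviceId": "UNIQUE_IDENTIFIER",
    "LocationManager.getLastKnownLocation": "LOCATION_INFORMATION",
    "SmsManager.sendTextMessage": "SMS_MMS",
    "URL.openConnection": "NETWORK",
    "Log.d": "LOG",
}

def flows_to_vector(flows):
    """Map (source method, sink method) flows to a binary feature vector."""
    vector = [0] * len(FEATURE_INDEX)
    for source, sink in flows:
        pair = (CATEGORY_OF.get(source), CATEGORY_OF.get(sink))
        if pair in FEATURE_INDEX:
            vector[FEATURE_INDEX[pair]] = 1
    return vector

example_flows = [
    ("TelephonyManager.getDeviceId", "SmsManager.sendTextMessage"),
    ("LocationManager.getLastKnownLocation", "URL.openConnection"),
    ("Unknown.method", "Log.d"),  # uncategorized source, so it is ignored
]
vector = flows_to_vector(example_flows)
```

Grouping individual API methods into categories keeps the vector length fixed and small regardless of how many distinct methods appear in an application, which is what allows vectors from different applications to be fed to the same detection model.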
Fig. 3 shows a schematic structural diagram of an Android platform malicious application detection device provided by an embodiment of the present invention. As shown in Fig. 3, the Android platform malicious application detection device of this embodiment includes: a second extraction module 31, a second processing module 32, and a detection module 33; wherein:
The second extraction module 31 is configured to call the FlowDroid tool and extract the static data flow features of the Android application under test;
The second processing module 32 is configured to process the static data flow features of the Android application under test using the SUSI technique and generate the feature vector of the data flow of the Android application under test;
The detection module 33 is configured to input the generated feature vector of the data flow of the Android application under test into the pre-trained deep belief network detection model and obtain a detection result indicating whether the Android application under test is a malicious application.
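The composition of the three modules can be sketched as a small class whose fields are interchangeable callables. The class name, the decision threshold, and the three stub functions are hypothetical; the stubs merely stand in for FlowDroid extraction, SUSI processing, and the trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Flow = Tuple[str, str]  # (source, sink)

@dataclass
class DetectionDevice:
    extract: Callable[[str], List[Flow]]                 # second extraction module 31
    vectorize: Callable[[List[Flow]], Sequence[float]]   # second processing module 32
    score: Callable[[Sequence[float]], float]            # detection module 33
    threshold: float = 0.5                               # assumed decision threshold

    def detect(self, apk_path: str) -> bool:
        """Return True when the application is judged malicious."""
        flows = self.extract(apk_path)
        features = self.vectorize(flows)
        return self.score(features) >= self.threshold

# Stub modules used only to exercise the wiring.
def fake_extract(apk_path):
    return [("device_id", "sms")] if "bad" in apk_path else []

def fake_vectorize(flows):
    return [float(len(flows))]

def fake_score(features):
    return 1.0 if features[0] > 0 else 0.0

device = DetectionDevice(fake_extract, fake_vectorize, fake_score)
bad_result = device.detect("bad.apk")
good_result = device.detect("good.apk")
```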
In a specific application, the device may further include the following modules, not shown in the figure:
An acquisition module, configured to obtain Android application samples, the Android application samples including safe Android application samples and malicious Android application samples;
A first extraction module, configured to call the FlowDroid tool and extract the static data flow features of the Android application samples;
A first processing module, configured to process the static data flow features of the Android application samples using the SUSI technique and generate the feature vectors of the data flows of the Android application samples;
A building module, configured to train on the feature vectors of the data flows of the Android application samples and build the deep belief network detection model.
In a specific application, the acquisition module may use crawler technology to continuously fetch the latest malicious Android applications from the network as the malicious Android application samples, and likewise use crawler technology to continuously fetch the latest safe Android applications from the network as the safe Android application samples.
In a specific application, the acquisition module may divide the Android application samples into two parts: one part consists of unlabeled safe and malicious Android application samples, and the other part consists of labeled safe and malicious Android application samples.
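The division of the sample set into an unlabeled part (for pre-training) and a labeled part (for fine-tuning) could look like the sketch below. The function name, the 30% labeled fraction, and the toy samples are assumptions; the source does not state a split ratio.

```python
import random

def split_samples(samples, labelled_fraction=0.3, seed=0):
    """Split (feature_vector, label) pairs into unlabelled vectors and labelled pairs."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * labelled_fraction)
    labelled = shuffled[:cut]
    # Labels of the remaining samples are dropped for unsupervised pre-training.
    unlabelled = [vector for vector, _ in shuffled[cut:]]
    return unlabelled, labelled

# Toy samples: label 1 = malicious, label 0 = safe.
samples = [([i % 2, 1], i % 2) for i in range(10)]
unlabelled, labelled = split_samples(samples)
```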
In a specific application, the building module may include the following units, not shown in the figure:
A pre-training unit, configured to take the feature vectors of the data flows of the unlabeled safe and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-train multiple RBM layers one by one from bottom to top to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
An adding unit, configured to add a classification layer after the last hidden layer of the DBN;
A fine-tuning unit, configured to input the feature vectors of the data flows of the labeled safe and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tune the parameters of every layer of the whole network layer by layer from top to bottom until convergence.
Specifically, for example, the fine-tuning unit may input the feature vectors of the data flows of the labeled safe and malicious Android application samples into the classification layer and use the back-propagation (BP) algorithm to fine-tune, under supervision, the parameters of every layer of the whole network until convergence.
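The work of the pre-training unit can be illustrated with a greedy layer-wise sketch in NumPy. It is an illustration under assumptions rather than the embodiment's procedure: one-step contrastive divergence (CD-1) is assumed as the RBM learning rule, a fixed number of epochs replaces the "until equilibrium" stopping condition, and the layer sizes, learning rate, and random toy data are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, rng, lr=0.1, epochs=50):
    """Train one binary RBM with CD-1; return (weights, hidden bias)."""
    n_visible = data.shape[1]
    w = rng.normal(0, 0.01, (n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    n = len(data)
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data.
        h_prob = sigmoid(data @ w + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v_recon = sigmoid(h_sample @ w.T + b_vis)
        h_recon = sigmoid(v_recon @ w + b_hid)
        w += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return w, b_hid

def pretrain_dbn(data, hidden_sizes, seed=0):
    """Greedy bottom-up pre-training: each RBM is trained on the layer below."""
    rng = np.random.default_rng(seed)
    layers, layer_input = [], data
    for n_hidden in hidden_sizes:
        w, b_hid = train_rbm(layer_input, n_hidden, rng)
        layers.append((w, b_hid))
        # Hidden activations of this RBM become the input of the next one.
        layer_input = sigmoid(layer_input @ w + b_hid)
    return layers

# Toy unlabelled feature vectors (30 samples, 9 binary features).
toy_rng = np.random.default_rng(1)
unlabelled = toy_rng.integers(0, 2, (30, 9)).astype(float)
layers = pretrain_dbn(unlabelled, hidden_sizes=[6, 4])
```

The returned weight and bias pairs are what the adding unit and fine-tuning unit would then extend with a classification layer and adjust with labeled samples.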
In a specific application, the device of this embodiment may further include the following modules, not shown in the figure:
A first prompting module, configured to, if the Android application under test is a malicious application, prompt the user that the Android application under test is malicious and show the user an analysis report explaining why it is malicious;
A second prompting module, configured to, if the Android application under test is not a malicious application, prompt the user that the Android application under test is not malicious.
Android malicious applications that carry out privacy-theft attacks differ greatly from safe Android applications in the way they handle sensitive data inside the application, while at the same time sharing certain similarities with other malicious applications. Therefore, this embodiment uses a machine learning algorithm to analyze these differences and similarities. Building on the state of the art in data flow analysis for Android smart terminals and in machine-learning-based malicious application detection, it performs an in-depth analysis of the sensitive data flow information inside Android applications and examines how malicious and safe applications differ in handling sensitive data, so as to detect malicious applications.
It should be noted that, because the device/system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant details, refer to the corresponding description of the method embodiment.
The Android platform malicious application detection device of this embodiment can detect malicious Android applications with relatively high accuracy. It avoids the path coverage problem of dynamic taint tracking, and it overcomes two major challenges of static data flow analysis, namely the need to model the application's runtime process accurately and the need to obtain the target components of inter-component communication accurately, thereby achieving accurate and comprehensive extraction of sensitive data flows in Android applications. It also overcomes the limitations of traditional shallow machine learning algorithms in building detection models and can considerably improve the detection rate for unknown malicious applications.
The Android platform malicious application detection device of this embodiment can be used to carry out the technical solution of the foregoing method embodiment; its implementation principle and technical effect are similar and are not repeated here.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device comprising that element. Orientations or positional relationships indicated by terms such as "upper" and "lower" are based on those shown in the drawings; they are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore should not be construed as limiting the present invention. Unless otherwise expressly specified and limited, the terms "mounted", "connected", and "connection" should be understood broadly: for example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal connection between two elements. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific circumstances.
Numerous specific details are set forth in the specification of the present invention. It is understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description. Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the present invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention. It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments may be combined with one another. The present invention is not limited to any single aspect or any single embodiment, nor to any combination and/or permutation of these aspects and/or embodiments. Moreover, each aspect and/or embodiment of the present invention may be used alone or in combination with one or more other aspects and/or embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention, and they shall all be covered by the scope of the claims and specification of the present invention.

Claims (10)

1. An Android platform malicious application detection method, characterized by comprising:
calling the FlowDroid tool to extract static data flow features of an Android application under test;
processing the static data flow features of the Android application under test using the SUSI technique to generate a feature vector of the data flow of the Android application under test;
inputting the generated feature vector of the data flow of the Android application under test into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
2. The method according to claim 1, characterized in that, before the generated feature vector of the data flow of the Android application under test is input into the pre-trained deep belief network detection model, the method further comprises:
obtaining Android application samples, the Android application samples including safe Android application samples and malicious Android application samples;
calling the FlowDroid tool to extract static data flow features of the Android application samples;
processing the static data flow features of the Android application samples using the SUSI technique to generate feature vectors of the data flows of the Android application samples;
training on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model.
3. The method according to claim 2, characterized in that training on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model comprises:
taking the feature vectors of the data flows of unlabeled safe Android application samples and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-training multiple RBM layers one by one from bottom to top to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
adding a classification layer after the last hidden layer of the DBN;
inputting the feature vectors of the data flows of labeled safe Android application samples and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tuning the parameters of every layer of the whole network layer by layer from top to bottom until convergence.
4. The method according to claim 3, characterized in that inputting the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tuning the parameters of every layer of the whole network layer by layer from top to bottom until convergence comprises:
inputting the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer, and using the back-propagation (BP) algorithm to fine-tune, under supervision, the parameters of every layer of the whole network until convergence.
5. The method according to any one of claims 1-4, characterized in that, after the detection result indicating whether the Android application under test is a malicious application is obtained, the method further comprises:
if the Android application under test is a malicious application, prompting the user that the Android application under test is malicious, and showing the user an analysis report explaining why it is malicious;
if the Android application under test is not a malicious application, prompting the user that the Android application under test is not malicious.
6. An Android platform malicious application detection device, characterized by comprising:
a second extraction module, configured to call the FlowDroid tool and extract static data flow features of an Android application under test;
a second processing module, configured to process the static data flow features of the Android application under test using the SUSI technique and generate a feature vector of the data flow of the Android application under test;
a detection module, configured to input the generated feature vector of the data flow of the Android application under test into a pre-trained deep belief network detection model and obtain a detection result indicating whether the Android application under test is a malicious application.
7. The device according to claim 6, characterized in that the device further comprises:
an acquisition module, configured to obtain Android application samples, the Android application samples including safe Android application samples and malicious Android application samples;
a first extraction module, configured to call the FlowDroid tool and extract static data flow features of the Android application samples;
a first processing module, configured to process the static data flow features of the Android application samples using the SUSI technique and generate feature vectors of the data flows of the Android application samples;
a building module, configured to train on the feature vectors of the data flows of the Android application samples and build the deep belief network detection model.
8. The device according to claim 7, characterized in that the building module comprises:
a pre-training unit, configured to take the feature vectors of the data flows of unlabeled safe Android application samples and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-train multiple RBM layers one by one from bottom to top to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
an adding unit, configured to add a classification layer after the last hidden layer of the DBN;
a fine-tuning unit, configured to input the feature vectors of the data flows of labeled safe Android application samples and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tune the parameters of every layer of the whole network layer by layer from top to bottom until convergence.
9. The device according to claim 8, characterized in that the fine-tuning unit is specifically configured to:
input the feature vectors of the data flows of the labeled safe Android application samples and malicious Android application samples into the classification layer, and use the back-propagation (BP) algorithm to fine-tune, under supervision, the parameters of every layer of the whole network until convergence.
10. The device according to any one of claims 6-9, characterized in that the device further comprises:
a first prompting module, configured to, if the Android application under test is a malicious application, prompt the user that the Android application under test is malicious and show the user an analysis report explaining why it is malicious;
a second prompting module, configured to, if the Android application under test is not a malicious application, prompt the user that the Android application under test is not malicious.
CN201710214419.0A 2017-04-01 2017-04-01 Malicious application detection method and device for Android platform Active CN107194251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710214419.0A CN107194251B (en) 2017-04-01 2017-04-01 Malicious application detection method and device for Android platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710214419.0A CN107194251B (en) 2017-04-01 2017-04-01 Malicious application detection method and device for Android platform

Publications (2)

Publication Number Publication Date
CN107194251A true CN107194251A (en) 2017-09-22
CN107194251B CN107194251B (en) 2020-02-14

Family

ID=59871820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710214419.0A Active CN107194251B (en) 2017-04-01 2017-04-01 Malicious application detection method and device for Android platform

Country Status (1)

Country Link
CN (1) CN107194251B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7526804B2 (en) * 2004-02-02 2009-04-28 Microsoft Corporation Hardware assist for pattern matches
CN101266550A (en) * 2007-12-21 2008-09-17 北京大学 Malicious code detection method
CN103793650A (en) * 2013-12-02 2014-05-14 北京邮电大学 Static analysis method and static analysis device for Android application program
CN104392174A (en) * 2014-10-23 2015-03-04 腾讯科技(深圳)有限公司 Generation method and device for characteristic vectors of dynamic behaviors of application program
CN105320887A (en) * 2015-10-12 2016-02-10 湖南大学 Static characteristic extraction and selection based detection method for Android malicious application
CN106096415A (en) * 2016-06-24 2016-11-09 康佳集团股份有限公司 A kind of malicious code detecting method based on degree of depth study and system
CN106228068A (en) * 2016-07-21 2016-12-14 江西师范大学 Android malicious code detecting method based on composite character

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
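DALI ZHU et al.: "Application of Modified BLP Model on Mobile Web Operating System", 《2016 IEEE TRUSTCOM/BIGDATASE/ISPA》 *
徐林溪 (Xu Linxi): "基于混合特征的恶意安卓程序检测方法研究与实现" (Research and Implementation of a Malicious Android Program Detection Method Based on Hybrid Features), 《中国优秀硕士学位论文全文数据库信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *
汤伟 (Tang Wei): "基于数据流特征向量识别的P2P僵尸网络检测方法研究" (Research on a P2P Botnet Detection Method Based on Data Flow Feature Vector Recognition), 《中国优秀硕士学位论文全文数据库信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *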

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108718310A (en) * 2018-05-18 2018-10-30 安徽继远软件有限公司 Multi-level attack signatures generation based on deep learning and malicious act recognition methods
CN108718310B (en) * 2018-05-18 2021-02-26 安徽继远软件有限公司 Deep learning-based multilevel attack feature extraction and malicious behavior identification method
CN110532773B (en) * 2018-05-25 2023-04-07 阿里巴巴集团控股有限公司 Malicious access behavior identification method, data processing method, device and equipment
CN110532773A (en) * 2018-05-25 2019-12-03 阿里巴巴集团控股有限公司 Malicious access Activity recognition method, data processing method, device and equipment
CN110555305A (en) * 2018-05-31 2019-12-10 武汉安天信息技术有限责任公司 Malicious application tracing method based on deep learning and related device
CN110858247A (en) * 2018-08-23 2020-03-03 北京京东尚科信息技术有限公司 Android malicious application detection method, system, device and storage medium
CN109508545B (en) * 2018-11-09 2021-06-04 北京大学 Android Malware classification method based on sparse representation and model fusion
CN109508545A (en) * 2018-11-09 2019-03-22 北京大学 A kind of Android Malware classification method based on rarefaction representation and Model Fusion
CN110472415A (en) * 2018-12-13 2019-11-19 成都亚信网络安全产业技术研究院有限公司 A kind of determination method and device of rogue program
CN110472415B (en) * 2018-12-13 2021-08-10 成都亚信网络安全产业技术研究院有限公司 Malicious program determination method and device
CN110096265A (en) * 2019-05-09 2019-08-06 趋新科技(北京)有限公司 A kind of software design approach based on data flow and element, software design tool and software running platform
CN110096265B (en) * 2019-05-09 2023-06-20 趋新科技(北京)有限公司 Software design method, software design tool and software operation platform based on data stream and element
CN113110986A (en) * 2020-01-13 2021-07-13 深信服科技股份有限公司 WebShell script file detection method and system
CN113111346A (en) * 2020-01-13 2021-07-13 深信服科技股份有限公司 Multi-engine WebShell script file detection method and system
CN112287341A (en) * 2020-09-22 2021-01-29 哈尔滨安天科技集团股份有限公司 Android malicious application detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN107194251B (en) 2020-02-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant