CN107194251A - Android platform malicious application detection method and device - Google Patents
- Publication number
- CN107194251A (application number CN201710214419.0A)
- Authority
- CN
- China
- Prior art keywords
- android
- data flow
- measured
- applications
- application
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
Abstract
The present invention provides a method and device for detecting malicious applications on the Android platform. The method includes: calling the FlowDroid tool to extract the static data flow features of the Android application under test; processing those static data flow features with the SUSI technique to generate a feature vector of the application's data flows; and inputting the generated feature vector into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is malicious. The invention can accurately detect malicious applications on the Android platform. It avoids the path coverage problem of dynamic taint tracking, overcomes the two major challenges of static data flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), achieves accurate and comprehensive extraction of the sensitive data flows of Android applications, and overcomes the limitations of traditional shallow machine learning algorithms in building detection models.
Description
Technical field
The present invention relates to the fields of mobile security and machine learning, and in particular to a method and device for detecting malicious applications on the Android platform.
Background
In the field of mobile smart terminals there is a large amount of malware. It typically obtains private data stored on the device covertly, without the user noticing, and sends it to the attacker's mailbox or server, which seriously threatens the user's financial security and privacy. As Android smart terminals have become widespread, privacy-stealing attacks on them, and the techniques for detecting malicious applications, have received growing attention.
At present, the data flow analysis techniques used to detect malicious Android applications fall into two main kinds: dynamic taint tracking and static data flow analysis. Dynamic taint tracking marks sensitive data as tainted and tracks the tainted data at run time to determine whether it is leaked maliciously. Static data flow analysis builds the function call graph of the application, analyzes each reachable function in turn, and propagates sensitive source information in order to monitor how that information flows through the whole system.
However, dynamic taint tracking faces the challenge of covering all code paths in a program. In addition, some malicious applications can detect the presence of a dynamic monitor and hide their malicious behavior, which leads to false negatives in the detection results. Static data flow analysis needs an accurate model of the application's execution flow. Moreover, Android applications make heavy use of inter-component communication: one component can send an intent to invoke another component, and the intent may carry data. Accurately identifying the target component of such communication is another difficulty of static data flow analysis.
In addition, with the wide use of machine learning and data mining in recent years, traditional machine learning algorithms are also used to detect malicious Android applications. Such methods first extract features such as permissions, API calls and function calls with static methods, or features such as application behavior, system parameter changes and system calls with dynamic methods. They then choose a machine learning algorithm, such as a decision tree, naive Bayes or a support vector machine, train it on these features to build a malicious application detection model, and finally use the model to judge whether an application is safe.
However, for the same type of application behavior feature, different machine learning algorithms give different detection results. Choosing a machine learning algorithm that suits the type of behavior feature being processed is therefore critical to the final detection result. Traditional machine learning algorithms also have shallow model structures, which affects the final detection result.
In view of this, the technical problem to be solved is how to provide a method and device for detecting malicious applications on the Android platform that avoids the path coverage problem of dynamic taint tracking, overcomes the two major challenges of static data flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), extracts the sensitive data flows of Android applications accurately and comprehensively, overcomes the limitations of traditional shallow machine learning algorithms in building detection models, and thereby detects malicious Android applications accurately.
Summary of the invention
To solve the above technical problem, the present invention provides a method and device for detecting malicious applications on the Android platform. The invention avoids the path coverage problem of dynamic taint tracking, overcomes the two major challenges of static data flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), extracts the sensitive data flows of Android applications accurately and comprehensively, overcomes the limitations of traditional shallow machine learning algorithms in building detection models, and detects malicious Android applications accurately.
In a first aspect, the present invention provides a method for detecting malicious applications on the Android platform, including:
calling the FlowDroid tool to extract the static data flow features of the Android application under test;
processing the static data flow features of the Android application under test with the SUSI technique to generate a feature vector of the data flows of the Android application under test;
inputting the generated feature vector of the data flows of the Android application under test into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
Optionally, before the generated feature vector of the data flows of the Android application under test is input into the pre-trained deep belief network detection model, the method further includes:
obtaining Android application samples, the samples including benign Android application samples and malicious Android application samples;
calling the FlowDroid tool to extract the static data flow features of the Android application samples;
processing the static data flow features of the Android application samples with the SUSI technique to generate feature vectors of the data flows of the Android application samples;
training on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model.
Optionally, training on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model includes:
taking the feature vectors of the data flows of the unlabeled benign and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-training multiple RBM layers one by one from bottom to top to generate a deep belief network (DBN), until the DBN reaches equilibrium;
adding a classification layer after the last hidden layer of the DBN;
inputting the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tuning the parameters of every layer of the whole network layer by layer from top to bottom until convergence.
Optionally, inputting the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tuning the parameters of every layer of the whole network layer by layer from top to bottom until convergence includes:
inputting the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using the backpropagation (BP) algorithm, fine-tuning the parameters of every layer of the whole network in a supervised manner until convergence.
Optionally, after the detection result indicating whether the Android application under test is a malicious application is obtained, the method further includes:
if the Android application under test is a malicious application, notifying the user that the Android application under test is a malicious application and showing the user an analysis report explaining why it is malicious;
if the Android application under test is not a malicious application, notifying the user that the Android application under test is not a malicious application.
In a second aspect, the present invention provides a device for detecting malicious applications on the Android platform, including:
a second extraction module, configured to call the FlowDroid tool and extract the static data flow features of the Android application under test;
a second processing module, configured to process the static data flow features of the Android application under test with the SUSI technique and generate a feature vector of the data flows of the Android application under test;
a detection module, configured to input the generated feature vector of the data flows of the Android application under test into a pre-trained deep belief network detection model and obtain a detection result indicating whether the Android application under test is a malicious application.
Optionally, the device further includes:
an acquisition module, configured to obtain Android application samples, the samples including benign Android application samples and malicious Android application samples;
a first extraction module, configured to call the FlowDroid tool and extract the static data flow features of the Android application samples;
a first processing module, configured to process the static data flow features of the Android application samples with the SUSI technique and generate feature vectors of the data flows of the Android application samples;
a building module, configured to train on the feature vectors of the data flows of the Android application samples and build the deep belief network detection model.
Optionally, the building module includes:
a pre-training unit, configured to take the feature vectors of the data flows of the unlabeled benign and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-train multiple RBM layers one by one from bottom to top to generate a deep belief network (DBN), until the DBN reaches equilibrium;
an adding unit, configured to add a classification layer after the last hidden layer of the DBN;
a fine-tuning unit, configured to input the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tune the parameters of every layer of the whole network layer by layer from top to bottom until convergence.
Optionally, the fine-tuning unit is specifically configured to:
input the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using the backpropagation (BP) algorithm, fine-tune the parameters of every layer of the whole network in a supervised manner until convergence.
Optionally, the device further includes:
a first prompting module, configured to, if the Android application under test is a malicious application, notify the user that the Android application under test is a malicious application and show the user an analysis report explaining why it is malicious;
a second prompting module, configured to, if the Android application under test is not a malicious application, notify the user that the Android application under test is not a malicious application.
As the above technical solution shows, the method and device of the present invention call the FlowDroid tool to extract the static data flow features of the Android application under test, process those features with the SUSI technique to generate a feature vector of the application's data flows, and input the generated feature vector into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application. The invention thereby avoids the path coverage problem of dynamic taint tracking, overcomes the two major challenges of static data flow analysis (accurately modeling the application's execution flow and accurately identifying the target components of inter-component communication), extracts the sensitive data flows of Android applications accurately and comprehensively, overcomes the limitations of traditional shallow machine learning algorithms in building detection models, and detects malicious Android applications accurately.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a method for detecting malicious applications on the Android platform provided by one embodiment of the present invention;
Fig. 2 is a schematic diagram of the data flow from source to sink that the FlowDroid tool, called in step 101 of Fig. 1, extracts from an example application;
Fig. 3 is a schematic structural diagram of a device for detecting malicious applications on the Android platform provided by one embodiment of the present invention.
Detailed description
To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 shows a schematic flowchart of a method for detecting malicious applications on the Android platform provided by one embodiment of the present invention. As shown in Fig. 1, the method of this embodiment is as follows.
101. Call the FlowDroid tool to extract the static data flow features of the Android application under test.
It can be understood that an Android application contains multiple components, such as Activity, Service, Content Provider and Broadcast Receiver, of which Activity is the main entry point of the analysis. Unlike a traditional Java program, an Android application has no main function, so the analysis cannot build the control flow graph simply by locating the program's entry and exit through a main function. However, each component of an Android application has functions that reflect the component's life cycle, and the control flow graph of the application can be built from these life cycles. To generate the control flow graph of the Android application, this embodiment calls the open-source tool FlowDroid proposed by Arzt, S. et al., which uses a dummy main function to simulate the control flow relations among all callback functions in the Activity life cycle. The working principle of FlowDroid is explained below using a code fragment from an Android application as an example.
The application reads the user's location, including longitude, latitude and detailed address, by calling the API provided by Baidu Maps, and then sends the location by SMS to the phone number "+1 11". This simple application reflects a common privacy attack pattern: sensitive data flows from LocationClient (the source) to the SMS API (the sink), which results in a privacy leak.
The data flow analysis performed by FlowDroid is sensitive to execution order (flow-sensitive) and consists of two parts: a forward taint analysis, which finds where tainted variables are passed to, and an on-demand backward alias analysis, which finds all aliases that pointed to the same tainted heap location before the source. Referring to Fig. 2, FlowDroid extracts the data flow from source to sink in the above application through the following steps:
(1) longitude, latitude and detail, which are obtained from the source, are tainted variables and are passed forward to the function loc(detail, latitude, longitude);
(2) when the function loc(detail, latitude, longitude) is called, longitude, latitude and detail are found to be passed in as arguments. A backward analysis is therefore performed, which finds that the function's parameters a, b and c are tainted by the arguments longitude, latitude and detail;
(3) a, b and c continue to be tracked, and l becomes tainted;
(4) l is found to be the return value of the called function; a backward analysis is performed, and location becomes tainted;
(5) a forward analysis is performed on location, which determines that it is passed to the sink.
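The forward part of these steps can be illustrated with a toy sketch. It is a deliberate simplification under stated assumptions, not FlowDroid's actual algorithm: statements are straight-line tuples of the form (target, function, arguments), there is no alias or inter-procedural analysis, and the API names in `program`, `SOURCES` and `SINKS` are hypothetical stand-ins for the location and SMS calls of the example application.

```python
# Toy forward taint propagation over straight-line statements.
# Each statement is (target_variable_or_None, function_name, argument_names).

def find_leaks(statements, sources, sinks):
    """Return (sink_function, tainted_arguments) pairs found in order."""
    tainted = set()
    leaks = []
    for target, func, args in statements:
        tainted_args = [a for a in args if a in tainted]
        # A leak is a sink call that receives at least one tainted argument.
        if func in sinks and tainted_args:
            leaks.append((func, tainted_args))
        if target is not None:
            if func in sources or tainted_args:
                tainted.add(target)      # result of a source or of tainted inputs
            else:
                tainted.discard(target)  # overwritten with an untainted value
    return leaks


# Hypothetical rendering of the example: read location, combine it, send by SMS.
program = [
    ("longitude", "getLongitude", []),
    ("latitude", "getLatitude", []),
    ("detail", "getAddrStr", []),
    ("location", "loc", ["detail", "latitude", "longitude"]),
    (None, "sendTextMessage", ["phone_number", "location"]),
]
SOURCES = {"getLongitude", "getLatitude", "getAddrStr"}
SINKS = {"sendTextMessage"}

leaks = find_leaks(program, SOURCES, SINKS)
```

In this sketch the call to `loc` is treated as a black box whose result is tainted whenever any argument is tainted; FlowDroid instead analyzes the body of the function, which is what steps (2) to (4) describe.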
It can be understood that, because the analysis performed by FlowDroid is context-sensitive, it can effectively distinguish different calls to the function loc() made with different arguments. FlowDroid also handles calls to library functions through dedicated hand-written summaries. In addition, FlowDroid applies a large number of optimizations to scale the analysis and reduce noise, so that the sensitive data flow features inside an Android application can be extracted accurately and comprehensively.
102. Using the SUSI technique, process the static data flow features of the Android application under test and generate a (standardized) feature vector of the data flows of the Android application under test.
It can be understood that the raw static data flow features extracted by FlowDroid contain the complete source and sink function names. However, the Android library contains thousands of source and sink functions, and only a very small number of them are called in any one application. If the data flows between these functions were used directly as features, the feature vector would be sparse and would need further processing to improve the training result. This embodiment therefore uses the SUSI technique proposed by Rasthofer, S. et al. to process the static data flow features extracted by FlowDroid and generate a (standardized) feature vector of the data flows. SUSI is based on a machine learning algorithm and can decide from a function's code whether it is a source function or a sink function. SUSI also divides the currently existing source and sink functions into 17 source categories and 19 sink categories, as follows:
The 17 source categories are:
(1)UNIQUE_IDENTIFIER;
(2)LOCATION_INFORMATION;
(3)NETWORK_INFORMATION;
(4)ACCOUNT_INFORMATION;
(5)FILE_INFORMATION;
(6)BLUETOOTH_INFORMATION;
(7)DATABASE_INFORMATION;
(8)EMAIL;
(9)SYNCHRONIZATION_DATA;
(10)SMS_MMS;
(11)CONTACT_INFORMATION;
(12)CALENDAR_INFORMATION;
(13)SYSTEM_SETTING;
(14)IMAGE;
(15)BROWSER_INFORMATION;
(16)NFC;
(17)NO_CATEGORY。
The 19 sink categories are:
(1)LOCATION_INFORMATION;
(2)PHONE_CONNECTION;
(3)VOIP;
(4)PHONE_STATE;
(5)EMAIL;
(6)BLUETOOTH;
(7)ACCOUNT_SETTING;
(8)AUDIO;
(9)SYNCHRONIZATION_DATA;
(10)NETWORK;
(11)FILE;
(12)LOG;
(13)SMS_MMS;
(14)CONTACT_INFORMATION;
(15)CALENDAR_INFORMATION;
(16)SYSTEM_SETTING;
(17)NFC;
(18)BROWSER_INFORMATION;
(19)NO_CATEGORY。
The SUSI categories make it clearer which kinds of information a malicious application leaks and along which paths the information is leaked.
By combining the FlowDroid tool and the SUSI technique, this embodiment can extract 323 (= 17 × 19) data flow features from each Android application. The feature vector can be expressed as:
Features(app) = (src_category_1→sink_category_1, src_category_1→sink_category_2, …, src_category_17→sink_category_18, src_category_17→sink_category_19)
When a data flow exists between a source category src_category_i (i = 1, 2, …, 17) and a sink category sink_category_j (j = 1, 2, …, 19), the corresponding element src_category_i→sink_category_j of the feature vector is 1; otherwise, it is 0.
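The encoding can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `flows_to_vector` and the input format (a list of source category and sink category pairs) are assumptions, while the category names and their order are taken from the two lists above.

```python
# SUSI source and sink categories, in the order listed in this document.
SOURCE_CATEGORIES = [
    "UNIQUE_IDENTIFIER", "LOCATION_INFORMATION", "NETWORK_INFORMATION",
    "ACCOUNT_INFORMATION", "FILE_INFORMATION", "BLUETOOTH_INFORMATION",
    "DATABASE_INFORMATION", "EMAIL", "SYNCHRONIZATION_DATA", "SMS_MMS",
    "CONTACT_INFORMATION", "CALENDAR_INFORMATION", "SYSTEM_SETTING",
    "IMAGE", "BROWSER_INFORMATION", "NFC", "NO_CATEGORY",
]
SINK_CATEGORIES = [
    "LOCATION_INFORMATION", "PHONE_CONNECTION", "VOIP", "PHONE_STATE",
    "EMAIL", "BLUETOOTH", "ACCOUNT_SETTING", "AUDIO",
    "SYNCHRONIZATION_DATA", "NETWORK", "FILE", "LOG", "SMS_MMS",
    "CONTACT_INFORMATION", "CALENDAR_INFORMATION", "SYSTEM_SETTING",
    "NFC", "BROWSER_INFORMATION", "NO_CATEGORY",
]


def flows_to_vector(flows):
    """Encode (source_category, sink_category) pairs as a 0/1 vector.

    Element i * 19 + j is 1 when a flow exists from source category i
    to sink category j (both 0-based), matching the row-major order of
    Features(app).
    """
    vector = [0] * (len(SOURCE_CATEGORIES) * len(SINK_CATEGORIES))
    for src, sink in flows:
        i = SOURCE_CATEGORIES.index(src)
        j = SINK_CATEGORIES.index(sink)
        vector[i * len(SINK_CATEGORIES) + j] = 1
    return vector


# The location-to-SMS leak of the example application sets exactly one element.
example = flows_to_vector([("LOCATION_INFORMATION", "SMS_MMS")])
```

Because every application maps to the same 323 positions, vectors from different applications are directly comparable, which is what makes them usable as training input.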
It can be understood that step 102 addresses the fact that the data flow features extracted by FlowDroid are poorly suited to training: processing them with the open-source sensitive-API classification tool SUSI yields a feature vector that better reflects the flow of sensitive data inside each application.
103. Input the generated feature vector of the data flows of the Android application under test into the pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
In a specific application, before step 103, the method further includes steps S1 to S4, which are not shown in the figure:
S1. Obtain Android application samples, the samples including benign Android application samples and malicious Android application samples.
Specifically, step S1 may use a crawler to continuously collect the latest malicious Android applications from the network as the malicious Android application samples, and use a crawler to continuously collect the latest benign Android applications from the network as the benign Android application samples.
In a specific application, step S1 may divide the Android application samples into two parts: one part consists of unlabeled benign and malicious Android application samples, and the other part consists of labeled benign and malicious Android application samples.
S2. Call the FlowDroid tool to extract the static data flow features of the Android application samples.
Specifically, for the principle of the FlowDroid tool in this step, refer to the description of step 101 above; it is not repeated here.
S3. Using the SUSI technique, process the static data flow features of the Android application samples and generate feature vectors of the data flows of the Android application samples.
Specifically, for the SUSI technique in this step, refer to the description of step 102 above; it is not repeated here.
S4. Train on the feature vectors of the data flows of the Android application samples to build the deep belief network detection model.
It can be understood that a deep belief network detection model based on deep learning has a deeper structure and stronger feature representation ability than a traditional shallow model. It can therefore mine more deeply the application security information reflected by the data flow features and achieve higher detection accuracy.
Specifically, step S4 may include steps S41 to S43, which are not shown in the figure:
S41. Take the feature vectors of the data flows of the unlabeled benign and malicious Android application samples as the input of the bottom restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-train multiple RBM layers one by one from bottom to top to generate a deep belief network (DBN), until the DBN reaches equilibrium.
It should be noted that a DBN is composed of multiple restricted Boltzmann machine (RBM) layers. Each RBM layer comprises a visible layer that receives input data and a hidden layer that outputs data. There are connections between the two layers, but no connections between units within the same layer.
For example, step S41 may take the feature vectors of the data flows of the unlabeled benign Android application samples as the input of the bottom RBM and use a greedy layer-wise algorithm to pre-train multiple RBM layers one by one from bottom to top without supervision, generating the DBN, until the DBN reaches equilibrium.
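A greedy layer-wise pre-training loop of this kind can be sketched in plain Python. The text does not say which update rule trains each RBM, so the sketch assumes one-step contrastive divergence (CD-1), a common choice; it also runs a fixed number of epochs instead of testing for equilibrium. The names `RBM`, `cd1_step` and `pretrain_dbn`, the layer sizes and the learning rate are all illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Restricted Boltzmann machine with binary units (no intra-layer links)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def cd1_step(self, v0, lr):
        """One CD-1 update: positive phase, one Gibbs step, negative phase."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.c[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=5, lr=0.1, seed=0):
    """Train one RBM per layer, feeding each layer's output to the next."""
    rng = random.Random(seed)
    rbms = []
    layer_input = data
    n_in = len(data[0])
    for n_hidden in layer_sizes:
        rbm = RBM(n_in, n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_step(v, lr)
        rbms.append(rbm)
        # The hidden activations become the visible data of the next RBM.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
        n_in = n_hidden
    return rbms
```

With the 323-element vectors of this embodiment, `data` would be the list of unlabeled feature vectors, and `layer_sizes` could be something like `[128, 64]`; those sizes are not given in the text and would have to be chosen experimentally.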
S42. Add a classification layer after the last hidden layer of the DBN.
S43. Input the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tune the parameters of every layer of the whole network layer by layer from top to bottom until convergence.
For example, step S43 may input the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and use the backpropagation (BP) algorithm to fine-tune the parameters of every layer of the whole network in a supervised manner until convergence.
It should be noted that, because each RBM layer is trained independently, training a layer only yields the hidden-layer representation that is best for that layer's visible layer; the DBN as a whole is not yet the best representation of the input data. A classification layer (a supervised learning network), such as a BP neural network, is therefore added after the last hidden layer of the DBN. The whole DBN can then be regarded as a multi-layer BP neural network and fine-tuned from top to bottom, and the pre-training can be regarded as the initialization of the weight parameters of a deep BP network. This training procedure effectively mitigates the problems of local optima and long training time. Depending on the specific application, different classifiers can be chosen for the topmost supervised learning layer (classification layer) of the DBN. This completes the training of the deep-learning-based deep belief network detection model for detecting malicious Android applications.
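The supervised fine-tuning stage can be sketched as ordinary backpropagation over the stacked layers plus a classification layer. The sketch makes several assumptions that the text does not fix: the classification layer is a single sigmoid unit, the loss is cross-entropy, training uses plain stochastic gradient descent for a fixed number of epochs, and the layers are randomly initialized here as a stand-in for weights that would come from RBM pre-training. The names `forward`, `finetune` and `random_layer` are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(layers, x):
    """Return the activations of every layer, starting with the input."""
    acts = [x]
    for W, bias in layers:
        prev = acts[-1]
        acts.append([sigmoid(bias[j] + sum(prev[i] * W[i][j] for i in range(len(prev))))
                     for j in range(len(bias))])
    return acts


def finetune(layers, samples, labels, epochs=300, lr=0.5):
    """Fine-tune all layers by backpropagation (SGD, cross-entropy loss)."""
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            acts = forward(layers, x)
            # Error signal at the single sigmoid output unit.
            delta = [acts[-1][0] - y]
            for k in range(len(layers) - 1, -1, -1):
                W, bias = layers[k]
                prev = acts[k]
                below = None
                if k > 0:
                    # Propagate the error downward before W is changed.
                    below = [prev[i] * (1.0 - prev[i]) *
                             sum(W[i][j] * delta[j] for j in range(len(delta)))
                             for i in range(len(prev))]
                for i in range(len(prev)):
                    for j in range(len(delta)):
                        W[i][j] -= lr * prev[i] * delta[j]
                for j in range(len(delta)):
                    bias[j] -= lr * delta[j]
                delta = below
    return layers


def random_layer(n_in, n_out, rng):
    """Stand-in initializer; a real DBN would reuse pre-trained RBM weights."""
    return ([[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)],
            [0.0] * n_out)


rng = random.Random(0)
# One hidden layer followed by the added classification layer (one unit).
layers = [random_layer(4, 3, rng), random_layer(3, 1, rng)]
samples = [[1, 0, 1, 0], [0, 1, 0, 1]]  # toy labeled feature vectors
labels = [1, 0]                          # 1 = malicious, 0 = benign
finetune(layers, samples, labels)
p_first = forward(layers, samples[0])[-1][0]
p_second = forward(layers, samples[1])[-1][0]
```

After fine-tuning, the output of the last layer can be read as the model's estimate that an application is malicious; any other classifier could replace the single sigmoid unit, as the text notes.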
In a specific application, after step 103, the method of this embodiment may further include:
if the Android application under test is a malicious application, notifying the user that the Android application under test is a malicious application and showing the user a detailed analysis report explaining why it is malicious;
if the Android application under test is not a malicious application, notifying the user that the Android application under test is not a malicious application.
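The prompting step can be sketched as a small function that turns the model output into a message and, for a malicious verdict, a report of the sensitive data flows found. The text does not specify the report's content or how the model output is thresholded, so the 0.5 threshold, the function name `build_report` and the report layout are assumptions made for illustration.

```python
def build_report(p_malicious, flows, threshold=0.5):
    """Return (is_malicious, lines_to_show_the_user).

    p_malicious: model output, read as a probability of being malicious.
    flows: (source_category, sink_category) pairs found in the application.
    """
    if p_malicious < threshold:
        return False, ["The application under test is not a malicious application."]
    lines = ["The application under test is a malicious application.",
             "Sensitive data flows found:"]
    lines += ["  %s -> %s" % (src, sink) for src, sink in flows]
    return True, lines


verdict, report = build_report(0.93, [("LOCATION_INFORMATION", "SMS_MMS")])
```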
Malicious Android applications that carry out privacy-stealing attacks differ markedly from benign Android applications in the way they handle sensitive data inside the application, while sharing certain similarities with other malicious applications. This embodiment therefore uses a machine learning algorithm to analyze these differences and similarities. Building on the state of development of data flow analysis for Android smart terminals and of machine-learning-based malicious application detection, it performs an in-depth analysis of the sensitive data flow information inside Android applications and examines how malicious and benign applications differ in handling sensitive data, so as to detect malicious applications.
The Android platform malicious application detection method of this embodiment calls the FlowDroid tool to extract the static data flow features of the Android application under test, processes those features using the SUSI technique to generate a feature vector of the data flow of the Android application under test, and inputs the generated feature vector into a pre-trained deep belief network (DBN) detection model to obtain a detection result indicating whether the Android application under test is a malicious application. The method can thereby detect Android platform malicious applications with relatively high accuracy. It avoids the path coverage problem of dynamic taint tracking, and it overcomes the two major challenges of static data flow analysis, namely the need to accurately model the application's execution flow and the need to accurately identify the target components of inter-component communication. It thus achieves accurate and comprehensive extraction of the sensitive data flows of Android applications, overcomes the limitations of traditional shallow machine learning algorithms in building detection models, and can substantially improve the detection rate for unknown malicious applications.
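As a non-limiting illustration, the feature-vector step can be sketched as follows. The sketch assumes that each extracted data flow has been reduced to a (source, sink) pair of method names, that a SUSI-style lookup table maps each method to a category, and that the vector counts flows per (source category, sink category) pair. The category subset, the lookup tables, and the function `build_feature_vector` are hypothetical and are not taken from the specification; a real system might encode the flows differently.

```python
from itertools import product

# Hypothetical, illustrative subset of SUSI-style categories.
SOURCE_CATEGORIES = ["UNIQUE_IDENTIFIER", "LOCATION_INFORMATION", "CONTACT_INFORMATION"]
SINK_CATEGORIES = ["NETWORK", "SMS_MMS", "LOG"]

# Hypothetical lookup tables from method name to category.
SOURCE_LOOKUP = {
    "TelephonyManager.getDeviceId": "UNIQUE_IDENTIFIER",
    "LocationManager.getLastKnownLocation": "LOCATION_INFORMATION",
}
SINK_LOOKUP = {
    "SmsManager.sendTextMessage": "SMS_MMS",
    "Log.d": "LOG",
}

# A fixed ordering of (source category, sink category) pairs defines the vector layout.
FEATURE_INDEX = {
    pair: i for i, pair in enumerate(product(SOURCE_CATEGORIES, SINK_CATEGORIES))
}


def build_feature_vector(flows):
    """Count data flows per (source category, sink category) pair."""
    vector = [0] * len(FEATURE_INDEX)
    for source, sink in flows:
        source_category = SOURCE_LOOKUP.get(source)
        sink_category = SINK_LOOKUP.get(sink)
        if source_category is None or sink_category is None:
            continue  # skip flows whose endpoints have no known category
        vector[FEATURE_INDEX[(source_category, sink_category)]] += 1
    return vector


# Example input: two device-ID-to-SMS flows and one location-to-log flow.
flows = [
    ("TelephonyManager.getDeviceId", "SmsManager.sendTextMessage"),
    ("TelephonyManager.getDeviceId", "SmsManager.sendTextMessage"),
    ("LocationManager.getLastKnownLocation", "Log.d"),
]
vector = build_feature_vector(flows)
```

A count-based layout keeps the vector length fixed regardless of how many flows an application contains, which is convenient for a model with a fixed-size input layer.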
Fig. 3 shows a schematic structural diagram of an Android platform malicious application detection device provided by an embodiment of the present invention. As shown in Fig. 3, the Android platform malicious application detection device of this embodiment includes a second extraction module 31, a second processing module 32, and a detection module 33, wherein:
the second extraction module 31 is configured to call the FlowDroid tool to extract the static data flow features of the Android application under test;
the second processing module 32 is configured to process the static data flow features of the Android application under test using the SUSI technique to generate a feature vector of the data flow of the Android application under test;
the detection module 33 is configured to input the generated feature vector of the data flow of the Android application under test into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
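As a non-limiting illustration, the cooperation of the three modules can be sketched as a small pipeline. The class `DetectionDevice`, the `Flow` representation, the decision threshold, and the stub functions are all hypothetical; the stubs merely stand in for FlowDroid extraction, SUSI-based processing, and the trained model so that the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Flow = Tuple[str, str]  # hypothetical (source, sink) representation of one data flow


@dataclass
class DetectionDevice:
    """Wires the extraction, processing, and detection steps together."""
    extract: Callable[[str], List[Flow]]            # stands in for extraction module 31
    vectorize: Callable[[List[Flow]], List[float]]  # stands in for processing module 32
    score: Callable[[Sequence[float]], float]       # stands in for the model in module 33
    threshold: float = 0.5                          # assumed decision threshold

    def detect(self, apk_path: str) -> bool:
        """Return True if the application is judged malicious."""
        flows = self.extract(apk_path)
        features = self.vectorize(flows)
        return self.score(features) >= self.threshold


# Stub modules so the sketch runs without FlowDroid, SUSI, or a trained model.
def stub_extract(apk_path: str) -> List[Flow]:
    return [("device_id", "sms")] if "suspicious" in apk_path else []


def stub_vectorize(flows: List[Flow]) -> List[float]:
    return [float(len(flows))]


def stub_score(features: Sequence[float]) -> float:
    return 0.9 if features[0] > 0 else 0.1


device = DetectionDevice(stub_extract, stub_vectorize, stub_score)
```

Passing the three steps in as callables keeps each module replaceable, which mirrors the modular division shown in Fig. 3.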
In a specific application, the device of this embodiment may further include the following modules, which are not shown in the figure:
an acquisition module, configured to obtain Android application samples, the Android application samples including benign Android application samples and malicious Android application samples;
a first extraction module, configured to call the FlowDroid tool to extract the static data flow features of the Android application samples;
a first processing module, configured to process the static data flow features of the Android application samples using the SUSI technique to generate feature vectors of the data flows of the Android application samples;
a building module, configured to perform training according to the feature vectors of the data flows of the Android application samples and build the deep belief network detection model.
In a specific application, the acquisition module may use crawler technology to continuously fetch the latest Android malicious applications from the network as the malicious Android application samples, and likewise use crawler technology to continuously fetch the latest benign Android applications from the network as the benign Android application samples.
In a specific application, the acquisition module may divide the Android application samples into two parts: one part consists of unlabeled benign and malicious Android application samples, and the other part consists of labeled benign and malicious Android application samples.
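As a non-limiting illustration, the division into an unlabeled part and a labeled part can be sketched as follows. The function `split_samples`, the 50% labeled fraction, and the toy data are hypothetical; the specification does not state the proportion of the two parts.

```python
import random


def split_samples(samples, labeled_fraction=0.5, seed=0):
    """Split (feature_vector, label) pairs into an unlabeled part and a labeled part."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)  # mix benign and malicious samples before splitting
    cut = int(len(shuffled) * labeled_fraction)
    labeled = shuffled[:cut]
    # Labels are withheld for the part used in unsupervised pre-training.
    unlabeled = [vector for vector, _label in shuffled[cut:]]
    return unlabeled, labeled


# Toy samples: label 1 marks a malicious sample, label 0 a benign one.
samples = [([i, 10 - i], i % 2) for i in range(10)]
unlabeled, labeled = split_samples(samples)
```

The unlabeled part would feed the pre-training stage and the labeled part the supervised fine-tuning stage.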
In a specific application, the building module may include the following units, which are not shown in the figure:
a pre-training unit, configured to take the feature vectors of the data flows of the unlabeled benign and malicious Android application samples as the input of the bottom-layer restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-train multiple RBM layers layer by layer from bottom to top to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
an adding unit, configured to add a classification layer after the last hidden layer of the DBN;
a fine-tuning unit, configured to input the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tune the parameters of each layer of the whole network layer by layer from top to bottom until convergence.
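As a non-limiting illustration, greedy layer-wise pre-training can be sketched in plain Python using one-step contrastive divergence (CD-1), a common way to train RBMs. The specification does not name the update rule, so CD-1, the layer sizes, the learning rate, and the fixed number of epochs (used here in place of an equilibrium check) are all assumptions, as are the names `RBM` and `pretrain_dbn`.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Bernoulli restricted Boltzmann machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step: positive phase, one Gibbs step, negative phase."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=20, seed=0):
    """Train RBMs bottom-up; each layer's hidden activations feed the next layer."""
    rng = random.Random(seed)
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v)
        rbms.append(rbm)
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return rbms


# Toy binary feature vectors standing in for unlabeled data flow features.
unlabeled = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
dbn = pretrain_dbn(unlabeled, layer_sizes=[3, 2])
```

In this sketch the stack of trained RBMs plays the role of the DBN; a classification layer would then be placed on top of the last hidden layer before supervised fine-tuning.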
Specifically, for example, the fine-tuning unit may input the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and use the back-propagation (BP) algorithm to fine-tune the parameters of each layer of the whole network in a supervised manner until convergence.
In a specific application, the device of this embodiment may further include the following modules, which are not shown in the figure:
a first prompting module, configured to, if the Android application under test is a malicious application, notify the user that the Android application under test is a malicious application and show the user an analysis report explaining why the Android application under test is a malicious application;
a second prompting module, configured to, if the Android application under test is not a malicious application, notify the user that the Android application under test is not a malicious application.
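As a non-limiting illustration, the behaviour of the two prompting modules can be sketched as a single function that builds the message shown to the user. The function name `build_prompt`, the message wording, and the plain-text report format are hypothetical.

```python
def build_prompt(app_name, is_malicious, analysis_report=None):
    """Build the user-facing message for a detection result."""
    if is_malicious:
        message = f"{app_name} is a malicious application."
        if analysis_report:
            # The analysis report is shown only for malicious applications.
            message += f"\nAnalysis report: {analysis_report}"
        return message
    return f"{app_name} is not a malicious application."


malicious_prompt = build_prompt("example.apk", True, "device ID flows to SMS")
benign_prompt = build_prompt("example.apk", False)
```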
Android malicious applications that carry out privacy-theft attacks differ greatly from benign Android applications in how they handle sensitive data inside the application, while also sharing certain similarities with other malicious applications. This embodiment therefore uses a machine learning algorithm to analyze these differences and similarities. Building on the state of development of data flow analysis techniques for Android smart terminals and of malicious application detection techniques based on machine learning, it performs an in-depth analysis of the sensitive data flow information inside Android applications and examines how malicious and benign applications differ in their handling of sensitive data, so as to detect malicious applications.
It should be noted that, because the device/system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the corresponding description of the method embodiment.
The Android platform malicious application detection device of this embodiment can detect Android platform malicious applications with relatively high accuracy. It avoids the path coverage problem of dynamic taint tracking, and it overcomes the two major challenges of static data flow analysis, namely the need to accurately model the application's execution flow and the need to accurately identify the target components of inter-component communication. It thus achieves accurate and comprehensive extraction of the sensitive data flows of Android applications, overcomes the limitations of traditional shallow machine learning algorithms in building detection models, and can substantially improve the detection rate for unknown malicious applications.
The Android platform malicious application detection device of this embodiment can be used to carry out the technical solution of the foregoing method embodiment. Its implementation principle and technical effect are similar and are not repeated here.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element. Terms such as "upper" and "lower" indicate orientations or positional relationships based on those shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore should not be construed as limiting the present invention. Unless otherwise expressly specified and limited, the terms "mounted", "connected", and "coupled" should be understood broadly: for example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; or it may be an internal connection between two elements. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific circumstances.
Numerous specific details are set forth in the specification of the present invention. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description. Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This manner of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another. The invention is not limited to any single aspect or any single embodiment, nor to any combination and/or permutation of these aspects and/or embodiments. Moreover, each aspect and/or embodiment of the invention may be used alone or in combination with one or more other aspects and/or embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention, and they shall all fall within the scope of the claims and specification of the present invention.
Claims (10)
1. An Android platform malicious application detection method, characterized by comprising:
calling the FlowDroid tool to extract static data flow features of an Android application under test;
processing the static data flow features of the Android application under test using the SUSI technique to generate a feature vector of the data flow of the Android application under test;
inputting the generated feature vector of the data flow of the Android application under test into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
2. The method according to claim 1, characterized in that, before inputting the generated feature vector of the data flow of the Android application under test into the pre-trained deep belief network detection model, the method further comprises:
obtaining Android application samples, the Android application samples including benign Android application samples and malicious Android application samples;
calling the FlowDroid tool to extract static data flow features of the Android application samples;
processing the static data flow features of the Android application samples using the SUSI technique to generate feature vectors of the data flows of the Android application samples;
performing training according to the feature vectors of the data flows of the Android application samples to build the deep belief network detection model.
3. The method according to claim 2, characterized in that performing training according to the feature vectors of the data flows of the Android application samples to build the deep belief network detection model comprises:
taking the feature vectors of the data flows of unlabeled benign and malicious Android application samples as the input of the bottom-layer restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-training multiple RBM layers layer by layer from bottom to top to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
adding a classification layer after the last hidden layer of the DBN;
inputting the feature vectors of the data flows of labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tuning the parameters of each layer of the whole network layer by layer from top to bottom until convergence.
4. The method according to claim 3, characterized in that inputting the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tuning the parameters of each layer of the whole network layer by layer from top to bottom until convergence comprises:
inputting the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using the back-propagation (BP) algorithm, fine-tuning the parameters of each layer of the whole network in a supervised manner until convergence.
5. The method according to any one of claims 1-4, characterized in that, after obtaining the detection result indicating whether the Android application under test is a malicious application, the method further comprises:
if the Android application under test is a malicious application, notifying the user that the Android application under test is a malicious application, and showing the user an analysis report explaining why the Android application under test is a malicious application;
if the Android application under test is not a malicious application, notifying the user that the Android application under test is not a malicious application.
6. An Android platform malicious application detection device, characterized by comprising:
a second extraction module, configured to call the FlowDroid tool to extract static data flow features of an Android application under test;
a second processing module, configured to process the static data flow features of the Android application under test using the SUSI technique to generate a feature vector of the data flow of the Android application under test;
a detection module, configured to input the generated feature vector of the data flow of the Android application under test into a pre-trained deep belief network detection model to obtain a detection result indicating whether the Android application under test is a malicious application.
7. The device according to claim 6, characterized in that the device further comprises:
an acquisition module, configured to obtain Android application samples, the Android application samples including benign Android application samples and malicious Android application samples;
a first extraction module, configured to call the FlowDroid tool to extract static data flow features of the Android application samples;
a first processing module, configured to process the static data flow features of the Android application samples using the SUSI technique to generate feature vectors of the data flows of the Android application samples;
a building module, configured to perform training according to the feature vectors of the data flows of the Android application samples to build the deep belief network detection model.
8. The device according to claim 7, characterized in that the building module comprises:
a pre-training unit, configured to take the feature vectors of the data flows of unlabeled benign and malicious Android application samples as the input of the bottom-layer restricted Boltzmann machine (RBM) and, using an unsupervised learning method, pre-train multiple RBM layers layer by layer from bottom to top to generate a deep belief network (DBN), until the DBN reaches an equilibrium state;
an adding unit, configured to add a classification layer after the last hidden layer of the DBN;
a fine-tuning unit, configured to input the feature vectors of the data flows of labeled benign and malicious Android application samples into the classification layer and, using a supervised learning method, fine-tune the parameters of each layer of the whole network layer by layer from top to bottom until convergence.
9. The device according to claim 8, characterized in that the fine-tuning unit is specifically configured to:
input the feature vectors of the data flows of the labeled benign and malicious Android application samples into the classification layer and, using the back-propagation (BP) algorithm, fine-tune the parameters of each layer of the whole network in a supervised manner until convergence.
10. The device according to any one of claims 6-9, characterized in that the device further comprises:
a first prompting module, configured to, if the Android application under test is a malicious application, notify the user that the Android application under test is a malicious application and show the user an analysis report explaining why the Android application under test is a malicious application;
a second prompting module, configured to, if the Android application under test is not a malicious application, notify the user that the Android application under test is not a malicious application.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710214419.0A CN107194251B (en) | 2017-04-01 | 2017-04-01 | Malicious application detection method and device for Android platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107194251A (en) | 2017-09-22 |
CN107194251B (en) | 2020-02-14 |
Family
ID=59871820
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710214419.0A Active CN107194251B (en) | 2017-04-01 | 2017-04-01 | Malicious application detection method and device for Android platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107194251B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN107194251B (en) | 2020-02-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||