CN106203117A - Machine-learning-based method for identifying malicious mobile applications - Google Patents
Machine-learning-based method for identifying malicious mobile applications
- Publication number
- CN106203117A (application CN201610547624.4A)
- Authority
- CN
- China
- Prior art keywords
- application program
- mobile applications
- malice
- machine learning
- method based
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/566—Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Virology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a machine-learning-based method for identifying malicious mobile applications. The method automatically learns and judges whether an application's network communication behavior is malicious, and from that determines whether the application itself is malicious. The method relates to the field of mobile application detection. It can be used to develop applications with this kind of detection function that are installed and used on a smartphone on their own, and it can also support third-party testing organizations in developing dedicated tools for detecting malicious applications.
Description
Technical field
The invention belongs to the technical field of mobile Internet security, and specifically relates to a machine-learning-based method for identifying malicious mobile applications.
Background
As smartphones become more widespread, mobile applications of every kind keep appearing, and malicious applications of every kind appear along with them. Many malicious programs stay resident in the background, steal private user data such as contact lists and text messages, and upload it to a remote server. Others infect the device so that it becomes a zombie or Trojan-controlled node and, without the user's knowledge, launches DDoS attacks at scheduled times. Both cause serious harm to personal privacy and network security.
Identifying and detecting which applications are malicious has therefore become a difficult problem. Many current detection methods make their decision from simple behavioral features: a baseline policy is defined, and an application whose behavioral features exceed the baseline is judged malicious. Such methods generally analyze a single application in isolation. They lack global association analysis and make poor use of global knowledge bases and similar resources.
Summary of the invention
In view of this, the object of the invention is to provide a machine-learning-based method for identifying malicious mobile applications that can determine whether a mobile application on a smartphone is a malicious program.
A machine-learning-based method for identifying malicious mobile applications comprises the following steps:
S1. Collect a number of normal mobile applications and malicious mobile applications.
S2. Connect a smartphone to a network. Install and start each application obtained in S1 on the smartphone in turn, and trigger each application by manual operation. Monitor the network continuously to obtain all network communication content of the application, and extract all data of the requests that the application sends to remote servers.
S3. For each application, concatenate all captured outgoing requests end to end, in order, into one long character string, and record the class of the application that the long string belongs to, i.e. normal mobile application or malicious mobile application. For each long string, let N denote its length. Starting from the 1st, 2nd, ..., Nth character in turn, extract the character unit of length M that begins at that character, find the character units that repeat, and record the number of repetitions. M is much smaller than the length N of the long string.
S4. Take all distinct character units obtained in S3 for each application as the feature space and the repetition count of each character unit as the feature value, forming one sample. Label the sample with its class according to the record from S3. The samples of all applications form the training sample set, on which a machine learning method is trained for binary classification to obtain a classifier.
S5. For a mobile application to be judged, first use the method of S2 to obtain all data of the requests that the application sends, then use the method of S3 to obtain its character units and their repetition counts, and finally use the classifier obtained in S4 to determine whether the application is a malicious mobile application.
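The character-unit features of S3 and S4 can be written compactly. The notation below (s, u_i, c_s(u), U, x_s) is introduced here for illustration only and does not appear in the original text. It also assumes that only full-length units are kept, so the start position runs from 1 to N - M + 1 rather than to N.

```latex
% s = s_1 s_2 \dots s_N : long string of one application, of length N
% u_i : character unit of length M starting at position i
u_i = s_i \, s_{i+1} \cdots s_{i+M-1}, \qquad 1 \le i \le N - M + 1

% c_s(u) : number of times unit u occurs in s (the feature value)
c_s(u) = \bigl|\{\, i : u_i = u \,\}\bigr|

% x_s : feature vector of s over the set U of distinct character units
x_s = \bigl( c_s(u) \bigr)_{u \in U}
```

For example, with M = 3 the string "abcabc" yields the units abc, bca, cab, abc, so c(abc) = 2 and c(bca) = c(cab) = 1.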
Preferably, the applications in S1 are obtained through public channels such as the application blacklists and whitelists shared by industry organizations like the Anti Network-Virus Alliance of China.
Preferably, the machine learning uses the support vector machine (SVM) learning method.
Preferably, the kernel function used in SVM learning is a Gaussian kernel.
Preferably, M is 4 or 5.
The invention has the following beneficial effects:
The invention discloses a method for determining whether a mobile application on a smartphone is a malicious program. It automatically learns and judges whether the application's network communication behavior is malicious, and from that determines whether the application is malicious. The method relates to the field of mobile application detection. It can be used to develop applications with this kind of detection function that are installed and used on a smartphone on their own, and it can also support third-party testing organizations in developing dedicated tools for detecting malicious applications.
Detailed description of the invention
Most malicious programs exhibit accompanying network behavior: they actively send request messages to a remote server, generally over HTTP or over proprietary protocols, and this protocol data usually contains private user data or information related to Trojan control. The invention provides a method based on machine learning theory that makes good use of the information these malicious programs send. It uses the support vector machine learning algorithm to learn a corresponding model, automatically learns and judges the network communication behavior of an unknown mobile program, and thereby determines whether the application is a malicious program.
To solve the above problem, the invention provides a machine-learning-based method that judges whether an application's network behavior is malicious and from that determines whether the application is malicious. The method comprises the following steps:
S1. Collect a number of normal mobile programs and malicious mobile programs. Both classes of application can be obtained through public channels such as the application blacklists and whitelists shared by industry organizations like the Anti Network-Virus Alliance of China (anva.org.cn).
S2. Connect a smartphone to a network. Install and start these applications on the smartphone in turn, and trigger each application by manual operation. Monitor the network continuously, either on the local network or at the smartphone's network port, capture packets to obtain all network communication content of the application, and extract all data of the requests that the application sends to remote servers.
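A minimal sketch of the extraction at the end of S2, assuming the capture has already been decoded into simple records. The `CapturedPacket` type, its fields, the `phone_ip` parameter, and the sample addresses are hypothetical and introduced only for illustration; the text does not name a capture tool or a record format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CapturedPacket:
    """One decoded packet from the capture (hypothetical record format)."""
    src_ip: str      # sender address
    dst_ip: str      # receiver address
    payload: bytes   # application-layer data, empty for pure control packets


def extract_outgoing_requests(packets: List[CapturedPacket], phone_ip: str) -> List[bytes]:
    """Keep, in capture order, the non-empty payloads the phone sent to other hosts."""
    return [
        p.payload
        for p in packets
        if p.src_ip == phone_ip and p.dst_ip != phone_ip and p.payload
    ]


# Made-up capture: two requests from the phone, one server reply, one empty packet.
capture = [
    CapturedPacket("192.168.1.20", "203.0.113.5", b"GET /a?id=1 HTTP/1.1\r\n\r\n"),
    CapturedPacket("203.0.113.5", "192.168.1.20", b"HTTP/1.1 200 OK\r\n\r\n"),
    CapturedPacket("192.168.1.20", "203.0.113.5", b""),
    CapturedPacket("192.168.1.20", "203.0.113.5", b"POST /up HTTP/1.1\r\n\r\n"),
]
outgoing = extract_outgoing_requests(capture, phone_ip="192.168.1.20")
print(len(outgoing))
```

Filtering by source address keeps only what the application sent, which is the data the later steps operate on; server replies are discarded.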
S3. For each application, concatenate all captured outgoing requests, such as HTTP request content or the content of other API calls, end to end in order, and then process the result as follows:
S3.1. If the sample is a malicious program, label the sample with class 1; otherwise label it 0.
S3.2. For the request data sent by each application sample, let N be the length of its long string. Starting from the 1st, 2nd, ..., Nth character in turn, extract the string unit of length M that begins at that character, find the string units that repeat, and record the number of repetitions. M is much smaller than the long string length N and is generally chosen empirically as 4 or 5.
S3.3. Take the M-character units (M-grams) found across all samples as the feature space. The value of each feature for a given sample is the number of times the corresponding M-character unit occurs in that sample.
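The concatenation and counting in S3 can be sketched as follows. The function names are illustrative, decoding the bytes as latin-1 is an assumption made so that every byte maps to exactly one character, and only full-length units are counted (window starts cover the first N - M + 1 positions).

```python
from collections import Counter
from typing import List


def build_long_string(requests: List[bytes]) -> str:
    """Join the captured requests end to end, in capture order (step S3)."""
    # latin-1 maps every byte to exactly one character, so no data is lost.
    return b"".join(requests).decode("latin-1")


def count_units(long_string: str, m: int = 4) -> Counter:
    """Count every length-m character unit in the long string (step S3.2).

    Assumption: only full-length units are counted, so the window start
    runs over the first N - m + 1 positions of the string.
    """
    return Counter(long_string[i:i + m] for i in range(len(long_string) - m + 1))


sample = build_long_string([b"GET /a", b"GET /b"])
units = count_units(sample, m=4)
print(units["GET "])
```

A `Counter` returns 0 for units that never occur, which matches the feature value of an absent unit in S3.3.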
S4. Take all distinct M-character units obtained in S3.3 for each application sample as the feature space and the number of times each unit repeats in the sample as the feature value. Label the sample as normal or malicious according to the record from S3, forming one sample. The samples of all applications collected in S1 form the training sample set, on which the support vector machine (SVM) learning method is trained for binary classification to obtain a classifier.
The kernel function used in SVM learning is a Gaussian kernel.
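A sketch of how the training set in S4 could be assembled, together with the Gaussian kernel an SVM would evaluate on it. The function names, the sorted vocabulary order, the sample counts, and the value of `gamma` are all assumptions for illustration; the text fixes none of them.

```python
import math
from collections import Counter
from typing import List, Tuple


def build_training_set(
    unit_counts: List[Counter], labels: List[int]
) -> Tuple[List[str], List[List[int]], List[int]]:
    """Turn per-application unit counts into fixed-length feature vectors.

    The feature space is the sorted list of every distinct unit seen in any
    sample; a sample's value for a unit is its count (0 when absent).
    """
    vocabulary = sorted(set().union(*unit_counts))
    matrix = [[counts[u] for u in vocabulary] for counts in unit_counts]
    return vocabulary, matrix, labels


def gaussian_kernel(x: List[int], y: List[int], gamma: float = 0.1) -> float:
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2); gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Made-up counts for one normal (label 0) and one malicious (label 1) sample.
normal = Counter({"GET ": 2, "ET /": 2})
malicious = Counter({"GET ": 1, "imei": 3})
vocabulary, matrix, labels = build_training_set([normal, malicious], [0, 1])
similarity = gaussian_kernel(matrix[0], matrix[1])
print(vocabulary, matrix, labels)
```

The resulting `matrix` and `labels` could then be passed to an SVM implementation with a Gaussian kernel, for example scikit-learn's `SVC(kernel="rbf")`; the text itself does not name a library.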
S5. For a mobile application to be judged, first use the method of S2 to obtain all data of the requests that the application sends, then use the method of S3 to obtain its string units and their repetition counts, and finally use the classifier obtained in S4 to determine whether the application is a malicious mobile application.
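For S5, a kernel SVM classifies a new feature vector by the sign of a weighted sum of kernel values against its support vectors. The sketch below shows that decision rule for a Gaussian kernel. The support vectors, coefficients, bias, and `gamma` are made-up values standing in for a trained classifier, dropping units that were never seen in training is an assumption, and mapping a positive score to label 1 (malicious) is an assumption chosen to match the labels in S3.1.

```python
import math
from collections import Counter
from typing import List, Sequence


def to_feature_vector(counts: Counter, vocabulary: Sequence[str]) -> List[int]:
    """Project an unknown application's unit counts onto the training feature space.

    Units never seen during training have no feature and are dropped (assumption).
    """
    return [counts[u] for u in vocabulary]


def gaussian_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def svm_predict(x, support_vectors, signed_coefs, bias, gamma) -> int:
    """Return 1 (malicious) if bias + sum_i coef_i * K(sv_i, x) > 0, else 0 (normal)."""
    score = bias + sum(
        c * gaussian_kernel(sv, x, gamma)
        for sv, c in zip(support_vectors, signed_coefs)
    )
    return 1 if score > 0 else 0


# Hypothetical stand-in for a trained classifier.
vocabulary = ["ET /", "GET ", "imei"]
support_vectors = [[2, 2, 0], [0, 1, 3]]   # one normal, one malicious sample
signed_coefs = [-1.0, 1.0]                  # negative pulls toward normal
bias = 0.0

unknown = Counter({"GET ": 1, "imei": 2, "zzzz": 5})
x = to_feature_vector(unknown, vocabulary)
verdict = svm_predict(x, support_vectors, signed_coefs, bias, gamma=0.1)
print(x, verdict)
```

In practice the support vectors, coefficients, and bias would come from the training in S4 rather than being set by hand.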
In summary, the above are only preferred embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.
Claims (5)
1. A machine-learning-based method for identifying malicious mobile applications, characterized by comprising the following steps:
S1. Collect a number of normal mobile applications and malicious mobile applications.
S2. Connect a smartphone to a network. Install and start each application obtained in S1 on the smartphone in turn, and trigger each application by manual operation. Monitor the network continuously to obtain all network communication content of the application, and extract all data of the requests that the application sends to remote servers.
S3. For each application, concatenate all captured outgoing requests end to end, in order, into one long character string, and record the class of the application that the long string belongs to, i.e. normal mobile application or malicious mobile application. For each long string, let N denote its length. Starting from the 1st, 2nd, ..., Nth character in turn, extract the character unit of length M that begins at that character, find the character units that repeat, and record the number of repetitions. M is much smaller than the length N of the long string.
S4. Take all distinct character units obtained in S3 for each application as the feature space and the repetition count of each character unit as the feature value, forming one sample. Label the sample with its class according to the record from S3. The samples of all applications form the training sample set, on which a machine learning method is trained for binary classification to obtain a classifier.
S5. For a mobile application to be judged, first use the method of S2 to obtain all data of the requests that the application sends, then use the method of S3 to obtain its string units and their repetition counts, and finally use the classifier obtained in S4 to determine whether the application is a malicious mobile application.
2. The machine-learning-based method for identifying malicious mobile applications according to claim 1, characterized in that the applications in S1 are obtained through public channels such as the application blacklists and whitelists shared by industry organizations like the Anti Network-Virus Alliance of China.
3. The machine-learning-based method for identifying malicious mobile applications according to claim 1, characterized in that the machine learning uses the support vector machine (SVM) learning method.
4. The machine-learning-based method for identifying malicious mobile applications according to claim 3, characterized in that the kernel function used in SVM learning is a Gaussian kernel.
5. The machine-learning-based method for identifying malicious mobile applications according to claim 1, characterized in that M is 4 or 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610547624.4A CN106203117A (en) | 2016-07-12 | 2016-07-12 | Machine-learning-based method for identifying malicious mobile applications |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610547624.4A CN106203117A (en) | 2016-07-12 | 2016-07-12 | Machine-learning-based method for identifying malicious mobile applications |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106203117A (en) | 2016-12-07 |
Family
ID=57477159
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610547624.4A Pending CN106203117A (en) | Machine-learning-based method for identifying malicious mobile applications | 2016-07-12 | 2016-07-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106203117A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107577943A (en) * | 2017-09-08 | 2018-01-12 | 北京奇虎科技有限公司 | Sample predictions method, apparatus and server based on machine learning |
CN110147671A (en) * | 2019-05-29 | 2019-08-20 | 北京奇安信科技有限公司 | Text string extracting method and device in a kind of program |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009142668A1 (en) * | 2007-12-20 | 2009-11-26 | Bank Of America Corporation | Detection and prevention of malicious code execution using risk scoring |
CN102360408A (en) * | 2011-09-28 | 2012-02-22 | 国家计算机网络与信息安全管理中心 | Detecting method and system for malicious codes |
CN103679030A (en) * | 2013-12-12 | 2014-03-26 | 中国科学院信息工程研究所 | Malicious code analysis and detection method based on dynamic semantic features |
CN103853979A (en) * | 2010-12-31 | 2014-06-11 | 北京奇虎科技有限公司 | Program identification method and device based on machine learning |
CN104751053A (en) * | 2013-12-30 | 2015-07-01 | 南京理工大学常熟研究院有限公司 | Static behavior analysis method of mobile smart terminal software |
-
2016
- 2016-07-12 CN CN201610547624.4A patent/CN106203117A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11399288B2 (en) | Method for HTTP-based access point fingerprint and classification using machine learning | |
CN107852410B (en) | Dissect rogue access point | |
US20230224232A1 (en) | System and method for extracting identifiers from traffic of an unknown protocol | |
Cunche et al. | Linking wireless devices using information contained in Wi-Fi probe requests | |
US10033757B2 (en) | Identifying malicious identifiers | |
Cunche et al. | I know who you will meet this evening! linking wireless devices using wi-fi probe requests | |
US8065722B2 (en) | Semantically-aware network intrusion signature generator | |
CN107483488A (en) | A kind of malice Http detection methods and system | |
Li et al. | Demographic information inference through meta-data analysis of Wi-Fi traffic | |
Bartos et al. | Network entity characterization and attack prediction | |
CN107749859A (en) | A kind of malice Mobile solution detection method of network-oriented encryption flow | |
CN105323247A (en) | Intrusion detection system for mobile terminal | |
CN104462973B (en) | The dynamic malicious act detecting system and method for application program in mobile terminal | |
CN105530265B (en) | A kind of mobile Internet malicious application detection method based on frequent item set description | |
Grill et al. | Malware detection using http user-agent discrepancy identification | |
US20150113651A1 (en) | Spammer group extraction apparatus and method | |
CN109151880A (en) | Mobile application flow identification method based on multilayer classifier | |
CN110493235A (en) | A kind of mobile terminal from malicious software synchronization detection method based on network flow characteristic | |
Wang et al. | TextDroid: Semantics-based detection of mobile malware using network flows | |
CN111147490A (en) | Directional fishing attack event discovery method and device | |
CN107666464A (en) | A kind of information processing method and server | |
Kumar et al. | Light weighted CNN model to detect DDoS attack over distributed scenario | |
CN110472410B (en) | Method and device for identifying data and data processing method | |
Yin et al. | Identifying iot devices based on spatial and temporal features from network traffic | |
CN106203117A (en) | A kind of malice mobile applications decision method based on machine learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20161207 |