CN106203117A - A method for identifying malicious mobile applications based on machine learning - Google Patents


Info

Publication number
CN106203117A
Authority
CN
China
Prior art keywords
application program
mobile applications
malicious
machine learning
method based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610547624.4A
Other languages
Chinese (zh)
Inventor
何清林
马秀娟
张家琦
王子厚
王大伟
朱佳伟
刘培朋
李海灵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center
Priority to CN201610547624.4A
Publication of CN106203117A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566: Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a machine-learning-based method for identifying malicious mobile applications. The method automatically learns and judges whether an application's network communication behavior is malicious, and from that determines whether the application itself is malicious. The method applies to fields such as mobile application detection. It can be used to develop applications with similar detection functions that are installed and used standalone on a smartphone, and it can also support third-party testing organizations in developing dedicated toolkits for detecting malicious applications.

Description

A method for identifying malicious mobile applications based on machine learning
Technical field
The invention belongs to the field of mobile Internet security technology, and specifically relates to a machine-learning-based method for identifying malicious mobile applications.
Background
As smartphones become increasingly widespread, mobile applications of every kind keep emerging, and malicious applications of every kind have appeared along with them. Many malicious programs stay resident in the background, steal private user data such as contact records and text messages, and upload it to a remote server. Others infect the device and turn it into a zombie or Trojan-controlled node that launches scheduled DDoS attacks without the user's knowledge. Both cause serious harm to personal privacy and network security.
Identifying and detecting which applications are malicious has therefore become a difficult problem. Many current detection methods make decisions from simple behavioral features: a baseline policy is defined, and an application is judged malicious if its behavioral features exceed the baseline. Such methods generally analyze a single application in isolation; they lack global correlation analysis and make poor use of global knowledge bases.
Summary of the invention
In view of this, the object of the invention is to provide a machine-learning-based method for identifying malicious mobile applications that can determine whether a mobile application on a smartphone is a malicious program.
A method for identifying malicious mobile applications based on machine learning comprises the following steps:
S1. Collect a certain number of normal mobile applications and malicious mobile applications.
S2. Connect a smartphone to the network, then install and launch each application obtained in S1 on the smartphone in turn, triggering each application by manual operation. Monitor the network continuously, capture all network communication of the application, and extract all data of the requests that the application sends to remote servers.
S3. For each application, join all captured outgoing requests end to end, in order, into one long string, and record the class of the application that the string belongs to, i.e. normal mobile application or malicious mobile application. For each long string of length N, starting from the 1st, 2nd, ..., N-th character in turn, take the substring of length M that begins at that character (a character unit), find the character units that repeat, and record the number of repetitions. M is much smaller than the string length N.
S4. Take all distinct character units obtained in S3 for each application as the feature space and the repetition count of each character unit as the feature value, forming one sample per application, and label each sample with its class as recorded in S3. The samples of all applications form the training set, on which a machine learning method is trained for binary classification to obtain a classifier.
S5. For a mobile application to be judged, first obtain all data of the requests it sends using the method of S2, then obtain its character units and repetition counts using the method of S3, and finally use the classifier obtained in S4 to determine whether it is a malicious mobile application.
Preferably, the applications in S1 are obtained through public channels such as the application blacklists and whitelists shared by industry organizations like the Anti Network-Virus Alliance of China.
Preferably, the machine learning method is support vector machine (SVM) learning.
Preferably, the kernel function used in SVM learning is the Gaussian kernel.
Preferably, M is 4 or 5.
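For reference, the Gaussian kernel named in the preferred embodiment is commonly written as below. The patent gives no formula, so the notation is an assumption: x and z stand for the character-unit count vectors of two applications, and γ is a positive width parameter whose value the patent does not specify.

```latex
% x, z  : feature vectors of character-unit counts (one entry per distinct unit)
% \gamma : kernel width parameter, \gamma > 0 (symbol chosen here, not taken from the patent)
K(\mathbf{x}, \mathbf{z}) = \exp\!\left(-\gamma \,\lVert \mathbf{x} - \mathbf{z} \rVert^{2}\right)
```

Two applications whose requests contain the same character units with the same counts give K = 1, and the value decays toward 0 as their count vectors move apart.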
The invention has the following advantages:
The invention discloses a method for determining on a smartphone whether a mobile application is a malicious program. It automatically learns and judges whether the application's network communication behavior is malicious, and from that determines whether the application itself is malicious. The method applies to fields such as mobile application detection. It can be used to develop applications with similar detection functions that are installed and used standalone on a smartphone, and it can also support third-party testing organizations in developing dedicated toolkits for detecting malicious applications.
Detailed description of the invention
Most malicious programs exhibit accompanying network behavior: they actively send request messages to a remote server, usually over HTTP or another proprietary protocol, and this protocol data generally contains private user data or related information such as Trojan control messages. The invention provides a method based on machine learning theory that makes good use of the information these malicious programs send. A support vector machine learning algorithm learns a corresponding model, which then automatically judges the network communication behavior of an unknown mobile program and thereby determines whether the application is a malicious program.
To solve the above problem, the invention provides a method that uses machine learning to judge whether an application's network behavior is malicious and thereby determines whether the application is malicious. The method comprises the following steps:
S1. Collect a certain number of normal mobile programs and malicious mobile programs. Both classes of application can be obtained through public channels such as the application blacklists and whitelists shared by industry organizations like the Anti Network-Virus Alliance of China (anva.org.cn).
S2. Connect a smartphone to the network, then install and launch these applications on the smartphone in turn, triggering each application by manual operation. Perform continuous network monitoring on the local network or on the smartphone's network port, obtain all network communication of the application by packet capture, and extract all data of the requests that the application sends to remote servers.
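The extraction at the end of S2 can be sketched as a filter over captured packets. This is a minimal illustration rather than the patent's implementation: the `CapturedPacket` record, its fields, and the idea of identifying outgoing traffic by the phone's IP address are assumptions made here, and the capture itself (with whatever packet-capture tool is used) is outside the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CapturedPacket:
    """One captured packet, reduced to the fields this sketch needs (hypothetical)."""
    src_ip: str
    dst_ip: str
    payload: bytes


def outbound_requests(packets: Iterable[CapturedPacket], phone_ip: str) -> List[str]:
    """Return the payloads the phone sent to remote hosts, in capture order."""
    requests = []
    for pkt in packets:
        # Keep only non-empty traffic that originates from the phone.
        if pkt.src_ip == phone_ip and pkt.payload:
            # latin-1 maps every byte to one character, so proprietary
            # (non-text) protocols decode without errors.
            requests.append(pkt.payload.decode("latin-1"))
    return requests
```

Decoding with latin-1 is a design choice of the sketch: it keeps a one-to-one mapping between bytes and characters, so later substring counts behave the same for text and binary protocols.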
S3. For each application, join all captured outgoing requests, such as HTTP request content or other API content, end to end in order, and then process them as follows:
S3.1 If the sample is a malicious program, set its class label to 1; otherwise set it to 0.
S3.2 Let N be the length of the long string formed from the request data that the application sample sent. Starting from the 1st, 2nd, ..., N-th character in turn, take the substring of length M that begins at that character, find the substrings that repeat, and record the number of repetitions. M is much smaller than N and is usually chosen empirically as 4 or 5.
S3.3 Take the set of M-character units found in all samples as the feature space. The value of each feature for a given sample is the number of times the corresponding M-character unit occurs in that sample.
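Steps S3 and S3.2 amount to counting overlapping length-M substrings of the concatenated request string. The sketch below shows one way to do this in Python. The function names are illustrative, and skipping start positions with fewer than M characters remaining is an assumption, since the patent does not say how the end of the string is handled.

```python
from collections import Counter
from typing import Iterable


def join_requests(requests: Iterable[str]) -> str:
    """Concatenate captured requests end to end, in capture order (step S3)."""
    return "".join(requests)


def mgram_counts(text: str, m: int = 4) -> Counter:
    """Count every length-m substring of text (step S3.2).

    A window starts at each character position; positions with fewer than
    m characters remaining are skipped (an assumption of this sketch).
    """
    if m <= 0:
        raise ValueError("m must be positive")
    return Counter(text[i:i + m] for i in range(len(text) - m + 1))
```

For example, `mgram_counts("abcabc", 3)` counts "abc" twice and "bca" and "cab" once each; these counts are the feature values described in S3.3.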
S4. Using the distinct M-character units from S3.3 as the feature space and the number of times each unit occurs in a sample as the feature value, form one sample per application and label it as normal or malicious according to the record from S3. The samples for all applications collected in S1 form the training set, on which a support vector machine (SVM) is trained for binary classification to obtain a classifier.
The kernel function used in SVM learning is the Gaussian kernel.
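To turn the per-application counts into the training set of S4, each distinct M-character unit needs a fixed column. The following sketch builds that mapping and the count vectors. The function names are hypothetical, and ignoring units that never appeared in training is an assumption the patent does not address.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def build_vocabulary(samples: Sequence[Counter]) -> Dict[str, int]:
    """Map every distinct M-character unit seen in any sample to a column index."""
    grams = sorted(set().union(*samples))
    return {gram: idx for idx, gram in enumerate(grams)}


def vectorize(counts: Counter, vocab: Dict[str, int]) -> List[int]:
    """Turn one application's counts into a fixed-length feature vector."""
    vec = [0] * len(vocab)
    for gram, n in counts.items():
        idx = vocab.get(gram)
        if idx is not None:  # units unseen in training are ignored (assumption)
            vec[idx] = n
    return vec


def training_set(labelled: Sequence[Tuple[Counter, int]]):
    """Build (X, y, vocab) from (counts, label) pairs; label 1 = malicious, 0 = normal."""
    vocab = build_vocabulary([counts for counts, _ in labelled])
    X = [vectorize(counts, vocab) for counts, _ in labelled]
    y = [label for _, label in labelled]
    return X, y, vocab
```

The resulting `X` and `y` can be handed to any SVM implementation that supports a Gaussian kernel; scikit-learn's `SVC(kernel="rbf")` is one option, though the patent names no particular library.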
S5. For a mobile application to be judged, first obtain all data of the requests it sends using the method of S2, then obtain its character units and repetition counts using the method of S3, and finally use the classifier obtained in S4 to determine whether it is a malicious mobile application.
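The final judgment in S5 can be illustrated with the standard decision function of a trained kernel SVM. The sketch assumes training has already produced support vectors, signed dual coefficients, and a bias; those names, the `gamma` parameter, and the convention that a positive score means malicious are assumptions of this example, not details from the patent.

```python
import math
from typing import Sequence


def gaussian_kernel(x: Sequence[float], z: Sequence[float], gamma: float) -> float:
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svm_decision(x: Sequence[float],
                 support_vectors: Sequence[Sequence[float]],
                 dual_coefs: Sequence[float],
                 bias: float,
                 gamma: float) -> float:
    """Kernel SVM score: sum_i coef_i * K(sv_i, x) + bias.

    dual_coefs[i] is alpha_i * y_i with y_i = +1 for malicious and -1 for
    normal training samples (a labeling convention assumed for this sketch).
    """
    score = sum(coef * gaussian_kernel(sv, x, gamma)
                for sv, coef in zip(support_vectors, dual_coefs))
    return score + bias


def is_malicious(x, support_vectors, dual_coefs, bias, gamma) -> bool:
    """Classify a feature vector; a positive score is read as malicious."""
    return svm_decision(x, support_vectors, dual_coefs, bias, gamma) > 0
```

Here `x` is the count vector of the application under test, built over the same character-unit columns as the training samples.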
In summary, the above are only preferred embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (5)

1. A method for identifying malicious mobile applications based on machine learning, characterized in that it comprises the following steps:
S1. Collect a certain number of normal mobile applications and malicious mobile applications.
S2. Connect a smartphone to the network, then install and launch each application obtained in S1 on the smartphone in turn, triggering each application by manual operation. Monitor the network continuously, capture all network communication of the application, and extract all data of the requests that the application sends to remote servers.
S3. For each application, join all captured outgoing requests end to end, in order, into one long string, and record the class of the application that the string belongs to, i.e. normal mobile application or malicious mobile application. For each long string of length N, starting from the 1st, 2nd, ..., N-th character in turn, take the substring of length M that begins at that character (a character unit), find the character units that repeat, and record the number of repetitions. M is much smaller than the string length N.
S4. Take all distinct character units obtained in S3 for each application as the feature space and the repetition count of each character unit as the feature value, forming one sample per application, and label each sample with its class as recorded in S3. The samples of all applications form the training set, on which a machine learning method is trained for binary classification to obtain a classifier.
S5. For a mobile application to be judged, first obtain all data of the requests it sends using the method of S2, then obtain its character units and repetition counts using the method of S3, and finally use the classifier obtained in S4 to determine whether it is a malicious mobile application.
2. The method for identifying malicious mobile applications based on machine learning as claimed in claim 1, characterized in that the applications in S1 are obtained through public channels such as the application blacklists and whitelists shared by industry organizations like the Anti Network-Virus Alliance of China.
3. The method for identifying malicious mobile applications based on machine learning as claimed in claim 1, characterized in that the machine learning method is support vector machine (SVM) learning.
4. The method for identifying malicious mobile applications based on machine learning as claimed in claim 3, characterized in that the kernel function used in SVM learning is the Gaussian kernel.
5. The method for identifying malicious mobile applications based on machine learning as claimed in claim 1, characterized in that M is 4 or 5.
CN201610547624.4A | 2016-07-12 | 2016-07-12 | A method for identifying malicious mobile applications based on machine learning | Pending | CN106203117A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201610547624.4A | 2016-07-12 | 2016-07-12 | A method for identifying malicious mobile applications based on machine learning

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201610547624.4A | 2016-07-12 | 2016-07-12 | A method for identifying malicious mobile applications based on machine learning

Publications (1)

Publication Number | Publication Date
CN106203117A (en) | 2016-12-07

Family

ID=57477159

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201610547624.4A | A method for identifying malicious mobile applications based on machine learning (Pending, published as CN106203117A (en))

Country Status (1)

Country | Link
CN (1) | CN106203117A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107577943A (en) * 2017-09-08 2018-01-12 北京奇虎科技有限公司 Sample predictions method, apparatus and server based on machine learning
CN110147671A (en) * 2019-05-29 2019-08-20 北京奇安信科技有限公司 Text string extracting method and device in a kind of program

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009142668A1 (en) * 2007-12-20 2009-11-26 Bank Of America Corporation Detection and prevention of malicious code execution using risk scoring
CN102360408A (en) * 2011-09-28 2012-02-22 国家计算机网络与信息安全管理中心 Detecting method and system for malicious codes
CN103679030A (en) * 2013-12-12 2014-03-26 中国科学院信息工程研究所 Malicious code analysis and detection method based on dynamic semantic features
CN103853979A (en) * 2010-12-31 2014-06-11 北京奇虎科技有限公司 Program identification method and device based on machine learning
CN104751053A (en) * 2013-12-30 2015-07-01 南京理工大学常熟研究院有限公司 Static behavior analysis method of mobile smart terminal software

Similar Documents

Publication Publication Date Title
US11399288B2 (en) Method for HTTP-based access point fingerprint and classification using machine learning
CN107852410B (en) Dissect rogue access point
US20230224232A1 (en) System and method for extracting identifiers from traffic of an unknown protocol
Cunche et al. Linking wireless devices using information contained in Wi-Fi probe requests
US10033757B2 (en) Identifying malicious identifiers
Cunche et al. I know who you will meet this evening! linking wireless devices using wi-fi probe requests
US8065722B2 (en) Semantically-aware network intrusion signature generator
CN107483488A (en) A kind of malice Http detection methods and system
Li et al. Demographic information inference through meta-data analysis of Wi-Fi traffic
Bartos et al. Network entity characterization and attack prediction
CN107749859A (en) A kind of malice Mobile solution detection method of network-oriented encryption flow
CN105323247A (en) Intrusion detection system for mobile terminal
CN104462973B (en) The dynamic malicious act detecting system and method for application program in mobile terminal
CN105530265B (en) A kind of mobile Internet malicious application detection method based on frequent item set description
Grill et al. Malware detection using http user-agent discrepancy identification
US20150113651A1 (en) Spammer group extraction apparatus and method
CN109151880A (en) Mobile application flow identification method based on multilayer classifier
CN110493235A (en) A kind of mobile terminal from malicious software synchronization detection method based on network flow characteristic
Wang et al. TextDroid: Semantics-based detection of mobile malware using network flows
CN111147490A (en) Directional fishing attack event discovery method and device
CN107666464A (en) A kind of information processing method and server
Kumar et al. Light weighted CNN model to detect DDoS attack over distributed scenario
CN110472410B (en) Method and device for identifying data and data processing method
Yin et al. Identifying iot devices based on spatial and temporal features from network traffic
CN106203117A (en) A method for identifying malicious mobile applications based on machine learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20161207
