CN109104381A - Mobile application identification method based on third-party traffic HTTP message - Google Patents
- Publication number
- CN109104381A (application number CN201810670461.8A)
- Authority
- CN
- China
- Prior art keywords
- application
- message
- party
- flow
- value
- Prior art date
- Legal status
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2475—Traffic characterised by specific attributes, e.g. priority or QoS for supporting traffic characterised by the type of applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
Abstract
The invention discloses a mobile application identification method based on third-party traffic HTTP messages, comprising the following steps. The user collects traffic samples with an automated traffic collection platform, which labels the traffic automatically. The user counts the occurrence of each HTTP message keyword sequence in the data set to judge whether the messages corresponding to that sequence are third-party traffic. HTTP message composition sequences are then counted, and the occurrence of each value within one application and across different applications is used to judge whether the value has a mapping relationship with an application, so as to build a third-party fingerprint library. After a message under test is captured, it is first judged whether the message is third-party traffic; the third-party fingerprint library is then consulted to find the value that identifies the application, i.e. the application ID, and the source application of the message is identified through the mapping between ID and application. The method uses statistics to recognize third-party traffic messages and to extract the application IDs they carry, and establishes the mapping between IDs and applications in order to identify applications.
Description
Technical field
The invention belongs to the technical field of mobile application identification, and in particular relates to a mobile application identification method based on third-party traffic HTTP messages.
Background
With the spread of smart mobile terminals and the growth of the mobile application market, mobile traffic accounts for an ever larger share of overall network traffic, and how to supervise it effectively is attracting increasing attention. Fine-grained monitoring of mobile traffic requires identifying attributes of the traffic such as its source and function. Mobile application identification addresses this need and has therefore received wide attention.
A common approach to mobile application identification is to recognize application features in third-party traffic such as advertising traffic. Specifically, a third-party service often needs to identify the application it serves, either for functional reasons or for profit. As a result, third-party traffic messages often carry a value that identifies the application and serves as the application's ID. Such IDs map clearly to applications and can therefore be used to identify them. However, third-party service providers are numerous and the traffic each generates follows its own pattern, so it is difficult to establish the mapping between ID values and applications automatically. Moreover, current methods for extracting application IDs from third-party traffic are mostly based on grammatical analysis, which is time-consuming and prone to misjudgment.
Summary of the invention
The purpose of the invention is to provide a mobile application identification method based on third-party traffic HTTP messages that uses statistics to recognize third-party traffic messages and extract the application IDs they carry, and that establishes the mapping between IDs and applications in order to identify applications.
To achieve the above purpose, the solution of the invention is:
A mobile application identification method based on third-party traffic HTTP messages, comprising the following steps:
Step 1: the user collects traffic samples with an automated traffic collection platform, and the traffic is labeled automatically;
Step 2: the user counts the occurrence of each HTTP message keyword sequence in the data set to judge whether the messages corresponding to that sequence are third-party traffic;
Step 3: HTTP message composition sequences are counted, and the occurrence of each value within one application and across different applications is used to judge whether the value has a mapping relationship with an application, so as to build a third-party fingerprint library; then, after a message under test is captured, it is first judged whether the message is third-party traffic, the third-party fingerprint library is consulted to find the value that identifies the application, i.e. the application ID, and the source application of the message is identified through the mapping between ID and application.
In step 1 above, the automated test platform is built with an Android virtual machine and Monkey, and at most one application under test is installed on a given emulator at any one time, so that the traffic triggered by the test platform can be labeled by emulator number and application run-time period.
In step 2 above, a message is characterized by the keyword sequence that remains after the values are removed from the HTTP message, and the number of different applications in which the sequence occurs is used to judge whether the corresponding message comes from a third-party service.
The keyword sequence is formed from the domain name, the resource path, and the parameter names in the query and content fields.
In step 3 above, if the value at a given position of the message composition sequence is the same within one application but differs between different applications, the value at that position is regarded as an application ID that has a mapping relationship with the application.
The message composition sequence is formed from the domain name, the path, and the parameter names and parameter values in the query and content fields.
According to investigation, the traffic that one service provider generates when serving different applications is identical except for the values filled in at certain positions, and among those values is the application ID used to identify the application. On this basis, the identification method provided by the invention has the following advantages over the prior art:
(1) the invention uses a statistical method to judge whether an HTTP message is third-party service traffic;
(2) the invention uses a statistical method to extract the application ID from third-party traffic messages and uses this ID value to identify the application; the method is simple and the amount of computation is small.
Description of the drawings
Fig. 1 is a flowchart of the invention.
Detailed description
The technical solution and beneficial effects of the invention are described in detail below with reference to the drawing.
As shown in Fig. 1, the invention provides a mobile application identification method based on third-party traffic HTTP messages, comprising the following steps:
(1) Automated traffic data collection:
The user collects traffic samples with an automated traffic collection platform, and the traffic is labeled automatically. An automated test platform is built with an Android virtual machine and a fuzz testing tool such as Monkey, and at most one application under test is installed on a given emulator at any one time, so that the traffic triggered by the test platform can be labeled by emulator number and application run-time period. A large number of labeled traffic samples are required before application identification can be carried out.
(2) Identifying third-party traffic:
The user counts the occurrence of each HTTP message keyword sequence in the data set to judge whether the messages corresponding to that sequence are third-party traffic. A message is characterized by the keyword sequence that remains after the values are removed from the HTTP message: the domain name, the resource path, and the parameter names in the query and content fields of the message are concatenated in order to form the keyword sequence. Once a library of third-party message keyword sequences has been built, a message under test can be matched against it to judge whether the message belongs to a third-party service. If the same keyword sequence occurs in several different applications, the corresponding messages are regarded as traffic from a third-party service. The third-party traffic library is built on this basis; a message under test whose keyword sequence is in the library is regarded as third-party service traffic.
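The keyword sequence construction described above can be sketched in a few lines. This is only an illustrative sketch, not the patent's implementation: the function name `key_sequence`, the `|` separator, and the assumption that the content field is form-encoded (`k=v&k=v`) are all hypothetical choices made for the example.

```python
from urllib.parse import urlsplit, parse_qsl


def key_sequence(url: str, content: str = "") -> str:
    """Return domain, resource path and parameter names, with all values removed.

    Assumption: the content field uses the same k=v&k=v encoding as the query.
    """
    parts = urlsplit(url)
    query_keys = [k for k, _ in parse_qsl(parts.query, keep_blank_values=True)]
    content_keys = [k for k, _ in parse_qsl(content, keep_blank_values=True)]
    # Concatenate in order: domain name, resource path, query names, content names.
    return "|".join([parts.hostname or "", parts.path, *query_keys, *content_keys])


# Two messages that differ only in their values share one keyword sequence.
seq_a = key_sequence("http://ads.example.com/v1/track?appid=abc123&os=android", "event=show")
seq_b = key_sequence("http://ads.example.com/v1/track?appid=zzz999&os=android", "event=click")
```

Because the values are dropped, `seq_a` and `seq_b` compare equal, which is what allows the occurrences of one message format to be counted across applications.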
(3) Identifying applications from third-party traffic:
The user counts HTTP message composition sequences, and the occurrence of each value within one application and across different applications is used to judge whether the value has a mapping relationship with an application, so as to build a third-party fingerprint library. Application identification is then completed by examining the special values in third-party traffic. The domain name, the resource path, and the parameter names and parameter values in the query and content fields of the message are concatenated in order to form a sequence. The sequences are compared across multiple applications: if the value at a given position of the sequence is constant within one application but differs between applications, the value can be regarded as the application ID that the third-party service uses to identify the application. Once the mapping between application IDs and applications has been established according to this rule, the application to which a message under test belongs can be identified from its application ID.
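The comparison rule above (constant within one application, different between applications) can be illustrated with a short sketch. It assumes that all samples share the same keyword sequence and that each sample has already been reduced to a tuple of parameter values; the function name and data layout are hypothetical, not taken from the patent.

```python
from collections import defaultdict


def find_app_id_positions(samples):
    """samples: iterable of (app_name, values), where values is the tuple of
    parameter values of one message. Returns {position: {value: app_name}} for
    every position whose value is constant per application and unique across
    applications."""
    seen = defaultdict(lambda: defaultdict(set))  # position -> app -> values seen
    for app, values in samples:
        for pos, value in enumerate(values):
            seen[pos][app].add(value)

    result = {}
    for pos, by_app in seen.items():
        if len(by_app) < 2:
            continue  # at least two applications are needed for a comparison
        if any(len(vals) != 1 for vals in by_app.values()):
            continue  # the value changes inside one application
        value_to_app = {next(iter(vals)): app for app, vals in by_app.items()}
        if len(value_to_app) == len(by_app):  # every application has its own value
            result[pos] = value_to_app
    return result


samples = [
    ("appA", ("idA", "android", "1")),
    ("appA", ("idA", "android", "2")),
    ("appB", ("idB", "android", "3")),
    ("appB", ("idB", "android", "4")),
]
positions = find_app_id_positions(samples)
```

In this toy data only position 0 qualifies: position 1 holds the same value for every application, and position 2 varies inside a single application.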
Embodiment:
The mobile application identification method based on third-party traffic HTTP messages in this embodiment comprises the following steps:
1. Automated traffic data collection:
First, a large number of mobile applications are downloaded with a crawler tool. Then the mobile application automated test platform, based on an Android virtual machine and the fuzz testing tool Monkey, selects applications from the application library and automatically installs and runs them to generate traffic. The MITMPROXY man-in-the-middle proxy tool is then used to monitor and save the traffic generated by the applications on the virtual machine and to record a traffic log. Finally, a script determines the source application from the traffic log in order to label the traffic, which is then stored in the traffic database. In particular, because at most one application runs on an emulator at any one time, the application that generated a message can be determined from the time at which the message was captured and the emulator it came from, and the message can be labeled with that application.
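The labeling rule can be sketched as a lookup of the capture time in a table of application runs. The record layout (`RunRecord`, with an emulator identifier and a start and end time) is an assumption made for illustration; the text does not specify the format of the traffic log.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunRecord:
    """One application run on one emulator (hypothetical log format)."""
    emulator_id: str
    app: str
    start: float  # run start, seconds since epoch
    end: float    # run end, seconds since epoch


def label_message(emulator_id: str, timestamp: float, runs: List[RunRecord]) -> Optional[str]:
    """Return the application that was running on the emulator at capture time."""
    for run in runs:
        if run.emulator_id == emulator_id and run.start <= timestamp <= run.end:
            return run.app
    return None  # captured outside every recorded run


runs = [
    RunRecord("emu-1", "com.example.reader", 100.0, 200.0),
    RunRecord("emu-1", "com.example.player", 210.0, 300.0),
    RunRecord("emu-2", "com.example.maps", 100.0, 200.0),
]
label = label_message("emu-1", 250.0, runs)
```

The lookup is unambiguous only because at most one application runs on an emulator at a time, so run periods on the same emulator never overlap.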
2. Identifying third-party traffic:
The user counts the occurrence of each HTTP message keyword sequence in the data set to judge whether the messages corresponding to that sequence are third-party traffic. Once a library of third-party message keyword sequences has been built, a message under test can be matched against it to judge whether the message belongs to a third-party service.
Because the interaction protocol of a given third-party service is fixed, its format, i.e. its keyword sequence, does not change; only the values change, depending on the information carried. The values in each message are therefore removed, leaving the keyword sequence made up of its domain name, resource path, query and content. If the sequence occurs in 3 or more applications, the message is regarded as third-party traffic and the keyword sequence thirdPktStr is stored.
In particular, when these applications are different versions of the same application, or different products in one series from the same vendor, they are likely to share a public service internal to the company, and the similar traffic they generate should not be classified as third-party traffic. Developers usually name an apk in the form 'domain1.domain2 ... name_version.apk', and 'domain1.domain2' is often the same across the product names of one vendor. For example, com.youdao.dict_6070000 and com.youdao.note_65 are two products of NetEase Youdao, where 'com.youdao' indicates the vendor and product series, and '_6070000' and '_65' are version serial numbers. Application vendor and version are judged accordingly: if the 'domain1.domain2' parts of two applications are identical, they come from the same vendor; if the applications differ only in the '_version' part, they are regarded as the same application. Algorithm 1 describes the whole process:
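The listing of Algorithm 1 does not appear in this text. The following is only a minimal sketch of the process described above, under stated assumptions: the function names are hypothetical, and the sketch counts distinct vendors (the 'domain1.domain2' prefix) rather than distinct apk files, which is one possible reading of the rule that traffic shared by applications of one vendor should not be treated as third-party traffic.

```python
from collections import defaultdict


def split_package(apk_name: str):
    """'com.youdao.dict_6070000.apk' -> ('com.youdao', 'com.youdao.dict').

    Assumes the naming form 'domain1.domain2 ... name_version.apk'.
    """
    name = apk_name[:-4] if apk_name.endswith(".apk") else apk_name
    package, _, _version = name.rpartition("_")
    if not package:  # no '_version' suffix present
        package = name
    vendor = ".".join(package.split(".")[:2])
    return vendor, package


def third_party_sequences(observations, min_apps=3):
    """observations: iterable of (apk_name, keyword_sequence).

    A sequence is kept when it was seen under at least min_apps distinct vendors
    (assumption: versions and sibling products of one vendor count once)."""
    vendors_by_seq = defaultdict(set)
    for apk_name, seq in observations:
        vendor, _ = split_package(apk_name)
        vendors_by_seq[seq].add(vendor)
    return {seq for seq, vendors in vendors_by_seq.items() if len(vendors) >= min_apps}


observations = [
    ("com.youdao.dict_6070000.apk", "in-house"),
    ("com.youdao.note_65.apk", "in-house"),
    ("com.youdao.dict_6080000.apk", "in-house"),
    ("com.alpha.reader_1.apk", "ads"),
    ("com.beta.player_2.apk", "ads"),
    ("com.gamma.maps_3.apk", "ads"),
]
third_pkt_strs = third_party_sequences(observations)
```

Here the sequence labeled `in-house` is seen in three apk files but only under one vendor, so it is not stored, while `ads` is seen under three vendors and is.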
3. Identifying applications from third-party traffic:
HTTP message composition sequences are counted, and the occurrence of each value within one application and across different applications is used to judge whether the value has a mapping relationship with an application, so as to build a third-party fingerprint library. Application identification is then completed by examining the special values in third-party traffic.
Further study of third-party traffic shows that, for some specific messages, the value at a certain position has a one-to-one correspondence with the application, so it can serve as an effective feature identifier that helps application identification. The invention provides a third-party identifier extraction algorithm to extract it and establishes a mapping table between identifiers and applications, thereby identifying applications.
An identifier value that corresponds to an application has the following characteristics:
Here the message type can be represented by thirdPktStr. Let consistList be the key and value sequence of the message; the identifier extraction algorithm is as follows, and examples of thirdPktStr and consistList are shown in Table 1.
Table 1: Examples of thirdPktStr and consistList
The third-party identifier extraction method is shown in Algorithm 2; the mapping between IDs and applications is finally recorded in the thirdIdTable table:
In the message identification phase, the structure sequence thirdPktStr of the message is extracted first, and a query checks whether a record of that thirdPktStr exists. If it does, the elements of the message listed in consistList are extracted and concatenated to form a feature, and finally the application corresponding to the feature is obtained by querying thirdIdTable.
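The identification phase can be sketched as two dictionary lookups. The data layout is an assumption made for illustration: `consist_lists` maps each stored thirdPktStr to the parameter names whose values form the feature, and `third_id_table` maps a (thirdPktStr, feature) pair to an application. The separator, the form-encoded content, and the sample entries are likewise hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_message(url, content=""):
    """Return (thirdPktStr, [(name, value), ...]) for one HTTP message."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs += parse_qsl(content, keep_blank_values=True)
    third_pkt_str = "|".join([parts.hostname or "", parts.path, *[k for k, _ in pairs]])
    return third_pkt_str, pairs


def identify_app(url, content, consist_lists, third_id_table):
    """Return the application for a message under test, or None if unknown."""
    third_pkt_str, pairs = parse_message(url, content)
    id_names = consist_lists.get(third_pkt_str)
    if id_names is None:
        return None  # no record: not known third-party traffic
    values = dict(pairs)
    feature = "|".join(values.get(name, "") for name in id_names)
    return third_id_table.get((third_pkt_str, feature))


# Hypothetical fingerprint data for a single third-party message format.
PKT = "ads.example.com|/v1/track|appid|os"
consist_lists = {PKT: ["appid"]}
third_id_table = {(PKT, "abc123"): "com.example.reader"}

app = identify_app("http://ads.example.com/v1/track?appid=abc123&os=android", "",
                   consist_lists, third_id_table)
```

A message with an unknown structure sequence, or with a known sequence but an unseen ID value, yields `None` rather than a guess.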
In summary, the mobile application identification method based on third-party traffic HTTP messages of the invention builds an automated traffic collection platform based on an Android virtual machine and a fuzz testing tool to collect and label application traffic, so that data samples are obtained automatically. On the basis of the collected traffic data set, a statistics-based method for identifying third-party traffic HTTP messages is used to recognize third-party service traffic, and the correspondence between the value at a specific position in third-party traffic and the application is established automatically, so as to identify the application. The invention allows the user to automatically recognize the HTTP messages in mobile application traffic that belong to third-party services, and to identify applications through these messages.
The above embodiment only illustrates the technical idea of the invention and does not limit its scope of protection. Any change made on the basis of the technical solution in accordance with the technical idea provided by the invention falls within the scope of protection of the invention.
Claims (6)
1. A mobile application identification method based on third-party traffic HTTP messages, characterized by comprising the following steps:
Step 1: the user collects traffic samples with an automated traffic collection platform, and the traffic is labeled automatically;
Step 2: the user counts the occurrence of each HTTP message keyword sequence in the data set to judge whether the messages corresponding to that sequence are third-party traffic;
Step 3: HTTP message composition sequences are counted, and the occurrence of each value within one application and across different applications is used to judge whether the value has a mapping relationship with an application, so as to build a third-party fingerprint library; then, after a message under test is captured, it is first judged whether the message is third-party traffic, the third-party fingerprint library is consulted to find the value that identifies the application, i.e. the application ID, and the source application of the message is identified through the mapping between ID and application.
2. The mobile application identification method based on third-party traffic HTTP messages according to claim 1, characterized in that: in step 1, the automated test platform is built with an Android virtual machine and Monkey, and at most one application under test is installed on a given emulator at any one time, so that the traffic triggered by the test platform is labeled by emulator number and application run-time period.
3. The mobile application identification method based on third-party traffic HTTP messages according to claim 1, characterized in that: in step 2, a message is characterized by the keyword sequence that remains after the values are removed from the HTTP message, and the number of different applications in which the sequence occurs is used to judge whether the corresponding message comes from a third-party service.
4. The mobile application identification method based on third-party traffic HTTP messages according to claim 3, characterized in that: the keyword sequence is formed from the domain name, the resource path, and the parameter names in the query and content fields.
5. The mobile application identification method based on third-party traffic HTTP messages according to claim 1, characterized in that: in step 3, if the value at a given position of the message composition sequence is the same within one application but differs between different applications, the value at that position is regarded as an application ID that has a mapping relationship with the application.
6. The mobile application identification method based on third-party traffic HTTP messages according to claim 5, characterized in that: the message composition sequence is formed from the domain name, the path, and the parameter names and parameter values in the query and content fields.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810670461.8A CN109104381B (en) | 2018-06-26 | 2018-06-26 | Mobile application identification method based on third-party traffic HTTP message |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810670461.8A CN109104381B (en) | 2018-06-26 | 2018-06-26 | Mobile application identification method based on third-party traffic HTTP message |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109104381A (en) | 2018-12-28 |
CN109104381B (en) | 2021-11-02 |
Family
ID=64844985
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810670461.8A Active CN109104381B (en) | 2018-06-26 | 2018-06-26 | Mobile application identification method based on third-party traffic HTTP message |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109104381B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111222547A (en) * | 2019-12-30 | 2020-06-02 | 中国人民解放军国防科技大学 | Traffic feature extraction method and system for mobile application |
CN111371700A (en) * | 2020-03-11 | 2020-07-03 | 武汉思普崚技术有限公司 | Traffic identification method and device applied to forward proxy environment |
CN112671671A (en) * | 2021-03-16 | 2021-04-16 | 北京邮电大学 | Third party flow identification method, device and equipment based on third party library |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6870830B1 (en) * | 2000-11-30 | 2005-03-22 | 3Com Corporation | System and method for performing messaging services using a data communications channel in a data network telephone system |
CN102065017A (en) * | 2010-12-31 | 2011-05-18 | 成都市华为赛门铁克科技有限公司 | Message processing method and device |
CN103312565A (en) * | 2013-06-28 | 2013-09-18 | 南京邮电大学 | Independent learning based peer-to-peer (P2P) network flow identification method |
CN105099803A (en) * | 2014-05-15 | 2015-11-25 | 中国移动通信集团公司 | Traffic identification method, application server, and network element equipment |
US20170026407A1 (en) * | 2013-11-25 | 2017-01-26 | Imperva, Inc. | Coordinated detection and differentiation of denial of service attacks |
CN107357612A (en) * | 2017-06-27 | 2017-11-17 | 聚好看科技股份有限公司 | Application program updating detection method and device |
Non-Patent Citations (3)
Title |
---|
WENJIA WU,JIANAN WU,YANHAO WANG,ZHEN LING;MING YANG: "Efficient Fingerprinting-Based Android Device Identification With Zero-Permission Identifiers", 《IEEE ACCESS,2016,PP(99):1-1》 * |
李冰,金志刚,舒炎泰: "基于流量统计实时识别QQ语音通信的方法", 《高技术通讯》 * |
黄健文,黄健,蔡秋艳,李俊磊,严冬: "UICC卡非接触应用隐式选择识别技术研究", 《微型机与应用》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111222547A (en) * | 2019-12-30 | 2020-06-02 | 中国人民解放军国防科技大学 | Traffic feature extraction method and system for mobile application |
CN111222547B (en) * | 2019-12-30 | 2021-08-17 | 中国人民解放军国防科技大学 | Traffic feature extraction method and system for mobile application |
CN111371700A (en) * | 2020-03-11 | 2020-07-03 | 武汉思普崚技术有限公司 | Traffic identification method and device applied to forward proxy environment |
CN112671671A (en) * | 2021-03-16 | 2021-04-16 | 北京邮电大学 | Third party flow identification method, device and equipment based on third party library |
CN112671671B (en) * | 2021-03-16 | 2021-06-29 | 北京邮电大学 | Third party flow identification method, device and equipment based on third party library |
Also Published As
Publication number | Publication date |
---|---|
CN109104381B (en) | 2021-11-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105022960B (en) | Multiple features mobile terminal from malicious software detecting method and system based on network traffics | |
CN105809035B (en) | The malware detection method and system of real-time behavior is applied based on Android | |
CN107515915B (en) | User identification association method based on user behavior data | |
CN109726744A (en) | A kind of net flow assorted method | |
CN109120429B (en) | Risk identification method and system | |
CN102469117B (en) | Method and device for identifying abnormal access action | |
CN106354797B (en) | Data recommendation method and device | |
CN105491444B (en) | A kind of data identifying processing method and device | |
CN110648172B (en) | Identity recognition method and system integrating multiple mobile devices | |
CN109104381A (en) | A kind of mobile application recognition methods based on third party's flow HTTP message | |
CN106878108B (en) | Network flow playback test method and device | |
CN105376223B (en) | The reliability degree calculation method of network identity relationship | |
CN110245273B (en) | Method for acquiring APP service feature library and corresponding device | |
CN106301980A (en) | A kind of brush amount tool detection method and apparatus | |
CN106998336B (en) | Method and device for detecting user in channel | |
CN110011860A (en) | Android application and identification method based on network traffic analysis | |
CN106067879B (en) | The detection method and device of information | |
CN106301975A (en) | A kind of data detection method and device thereof | |
CN110891071A (en) | Network traffic information acquisition method, device and related equipment | |
Zungur et al. | Libspector: Context-aware large-scale network traffic analysis of android applications | |
CN107704494B (en) | User information collection method and system based on application software | |
CN102469450B (en) | Method and device for recognizing virus characteristics of mobile phone | |
CN108650145A (en) | Phone number characteristic automatic extraction method under a kind of home broadband WiFi | |
CN106161403A (en) | Application program restored method, device and system | |
CN106897619B (en) | Mobile terminal from malicious software cognitive method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||