CN102006174B - Data processing method and device based on online behavior of mobile phone user - Google Patents

Data processing method and device based on online behavior of mobile phone user

Info

Publication number
CN102006174B
Authority
CN
China
Prior art keywords
url
ticket
data
origin
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010535447.0A
Other languages
Chinese (zh)
Other versions
CN102006174A (en
Inventor
卞登奎
季波涛
蒋天超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Verce Intelligent Technology Co ltd
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201010535447.0A
Publication of CN102006174A
Priority to PCT/CN2011/075696 (WO2012062107A1)
Application granted
Publication of CN102006174B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a data processing method and device based on the online behavior of a mobile phone user, wherein the method comprises the following steps: generating a first ticket containing a uniform resource locator (URL) accessed by the user according to the online data of the user; preprocessing the data in the first ticket according to the preset rules to generate a second ticket; and statistically analyzing and processing the data in the second ticket. In the invention, before the ticket data are transferred into a database, the ticket data are firstly preprocessed, new ticket data are generated through a series of preprocessing operations and are transferred into the database, and then, the system database statistically analyzes and processes the second ticket data, so that the process of analyzing a large amount of URL data by the database is omitted, thereby greatly improving the efficiency of processing the ticket data, and solving the performance bottleneck problem of analysis of the online behavior of the mobile phone user.

Description

Data processing method and device based on the online behavior of mobile phone users
Technical field
The present invention relates to the technical field of mobile networks, and in particular to a data processing method and device based on the online behavior of mobile phone users.
Background art
At present, analyzing and mining users' online data has become a popular trend in mobile network services. As the number of service providers and of users who go online by mobile phone keeps growing, the tickets produced by mobile service systems keep increasing. In service systems with large ticket volumes, the throughput TPS (Tip-Per-Second) has reached as much as 5,000 per second, and the daily data volume is about 100 million to more than 200 million records.
An operator that needs to understand the online behavior of mobile phone users usually has to carry out the following analyses:
A) Website access type analysis: the types of website that users access most frequently;
B) Designated website traffic analysis: the access traffic of a website or of specific content within a website;
C) Advertisement access traffic analysis: the access traffic of advertisement URLs, broken down by category.
In the conventional art, mobile phone users' online data is analyzed by analyzing the URL (Uniform/Universal Resource Locator, also called the web page address) field in the tickets generated by the mobile service system. Specifically:
Website access type analysis comprises: loading the ticket data into the database; maintaining a lookup table that maps each HOST to a type; parsing the HOST out of each URL; looking up its type in the lookup table; and analyzing all URLs in this way;
Designated website traffic analysis comprises: loading the ticket data into the database; maintaining a lookup table of URL conversion rules; converting each URL; and analyzing all URLs in this way;
Advertisement access traffic analysis comprises: loading the ticket data into the database; maintaining a mapping table between URLs and advertisements; querying which kind of advertisement each URL belongs to; and analyzing all URLs in this way.
When the data volume is large, processing ticket data with the above methods creates a performance bottleneck in the system. Because the URLs in the tickets are all stored in encrypted form, each URL must be decrypted before it can be parsed, and complex string operations must then be performed on the decrypted URL, so data processing takes a long time. Table 1 below shows test data for analyzing mobile phone users' online behavior with the traditional solution:
Item                                     Measured value
Ticket generation speed                  30 million tickets per day (100,000 every 5 minutes)
Processing speed for 100,000 tickets     8 minutes per 100,000 tickets
Processing time for one day's tickets    About 1.5 days
Table 1
As Table 1 shows, tickets are generated faster than they are processed, so unprocessed tickets pile up and cannot be handled in time. This not only causes serious delays in data processing but also increases the processing load on the system database.
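The backlog implied by Table 1 can be checked with a little arithmetic. The sketch below uses only the figures from Table 1; the constant and function names are illustrative and do not come from the patent. With these rounded inputs the result is roughly 1.67 days, which is in line with the "about 1.5 days" reported in the table.

```python
# Figures taken from Table 1; all names here are illustrative only.
RECORDS_PER_DAY = 30_000_000   # tickets generated per day
BATCH_SIZE = 100_000           # tickets per processing batch
MINUTES_PER_BATCH = 8          # traditional processing time per batch
MINUTES_PER_DAY = 24 * 60


def processing_days(records: int) -> float:
    """Days needed to process `records` tickets at the traditional rate."""
    batches = records / BATCH_SIZE
    return batches * MINUTES_PER_BATCH / MINUTES_PER_DAY


def daily_backlog() -> float:
    """Tickets left unprocessed after each full day of operation."""
    processed_per_day = MINUTES_PER_DAY / MINUTES_PER_BATCH * BATCH_SIZE
    return RECORDS_PER_DAY - processed_per_day


print(round(processing_days(RECORDS_PER_DAY), 2))  # days to clear one day's tickets
print(int(daily_backlog()))                        # tickets added to the backlog daily
```

Under these assumptions the system clears 18 million tickets per day against 30 million generated, so the queue grows every day it runs.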
Summary of the invention
The main purpose of the present invention is to provide a data processing method and device based on the online behavior of mobile phone users, intended to increase the processing speed of mobile phone users' online data and to improve system performance.
The present invention proposes a data processing method based on the online behavior of mobile phone users, the method comprising:
generating, according to the user's online data, a first ticket containing the web page address (URL) accessed by the user;
preprocessing the data in the first ticket according to predetermined rules to generate a second ticket;
performing statistical analysis on the data in the second ticket.
Preferably, the step of preprocessing the data in the first ticket according to predetermined rules comprises:
performing website access type URL analysis and/or designated website traffic analysis and/or advertisement access traffic analysis on the data in the first ticket.
Preferably, the step of performing website access type URL analysis on the data in the first ticket comprises:
adding a URL type field to the first ticket, for storing the category to which the URL belongs;
parsing the original URL in the first ticket;
looking up the category corresponding to the original URL in a preset URL category lookup table, and writing it into the URL type field corresponding to the original URL in the second ticket.
Preferably, the step of performing designated website traffic analysis on the data in the first ticket comprises:
adding a new-URL field to the first ticket, for storing the new URL obtained by conversion;
converting the original URL in the first ticket according to predetermined conversion rules;
writing the converted URL into the new-URL field corresponding to the original URL in the second ticket.
Preferably, the step of performing advertisement access traffic analysis on the data in the first ticket comprises:
adding an advertisement-type URL field to the first ticket, for storing advertisement-type URLs;
separating out the advertisement-type URL according to the predefined identifier carried by the original URL in the first ticket;
writing the advertisement-type URL into the advertisement-type URL field corresponding to the original URL in the second ticket.
The present invention also proposes a data processing device based on the online behavior of mobile phone users, comprising:
an original ticket generation module, for generating, according to the user's online data, a first ticket containing the URL accessed by the user;
a new ticket generation module, for preprocessing the data in the first ticket according to predetermined rules to generate a second ticket;
a new ticket data processing module, for performing statistical analysis on the data in the second ticket.
Preferably, the new ticket generation module is further used to perform website access type URL analysis, designated website traffic analysis and/or advertisement access traffic analysis on the data in the first ticket.
Preferably, the new ticket generation module comprises:
a field adding unit, for adding to the first ticket a URL type field that stores the category to which the URL belongs;
a parsing unit, for parsing the original URL in the first ticket;
a writing unit, for looking up the category corresponding to the original URL in a preset URL category lookup table and writing it into the URL type field corresponding to the original URL in the second ticket.
Preferably, the field adding unit is further used to add to the first ticket a new-URL field that stores the new URL obtained by conversion;
the parsing unit is further used to convert the original URL in the first ticket according to predetermined conversion rules;
the writing unit is further used to write the converted URL into the new-URL field corresponding to the original URL in the second ticket; or
the field adding unit is further used to add to the first ticket an advertisement-type URL field that stores advertisement-type URLs;
the parsing unit is further used to separate out the advertisement-type URL according to the predefined identifier carried by the original URL in the first ticket;
the writing unit is further used to write the advertisement-type URL into the advertisement-type URL field corresponding to the original URL in the second ticket.
In the data processing method and device based on the online behavior of mobile phone users proposed by the present invention, before ticket data is loaded into the database, a preprocessing device such as an interface machine first preprocesses the ticket data. The preprocessing includes classifying and summarizing the URLs generated when users go online, converting URLs according to certain rules, and so on; after this series of preprocessing steps, new ticket data is generated and loaded into the database. The system database then performs statistical analysis on the second ticket data. The parsing of URLs is thus handed over to the interface machine, the parsed result data forms the new ticket, and the system database performs statistical analysis directly on the result data. This removes the need for the database to analyze large batches of URL data, which greatly improves the efficiency of ticket data processing and solves the performance bottleneck in analyzing the online behavior of mobile phone users.
Brief description of the drawings
Fig. 1 is a schematic flowchart of an embodiment of the data processing method based on the online behavior of mobile phone users according to the present invention;
Fig. 2 is a schematic flowchart of performing website access type URL analysis on the data in the first ticket in an embodiment of the data processing method according to the present invention;
Fig. 3 is a schematic flowchart of performing designated website traffic analysis on the data in the first ticket in an embodiment of the data processing method according to the present invention;
Fig. 4 is a schematic flowchart of performing advertisement access traffic analysis on the data in the first ticket in an embodiment of the data processing method according to the present invention;
Fig. 5 is a schematic structural diagram of an embodiment of the data processing device based on the online behavior of mobile phone users according to the present invention;
Fig. 6 is a schematic structural diagram of the new ticket generation module in an embodiment of the data processing device according to the present invention.
To make the technical solution of the present invention clearer, it is described in further detail below with reference to the accompanying drawings.
Detailed description of the embodiments
The solution of the embodiments of the present invention is mainly as follows: before ticket data is loaded into the database, the ticket data is first preprocessed. The preprocessing includes classifying and summarizing the URLs generated when users go online, converting URLs according to certain rules, and so on; after this series of preprocessing steps, new ticket data is generated and loaded into the database. The system database then performs statistical analysis on the second ticket data, which improves the efficiency of ticket data processing and solves the performance bottleneck in analyzing the online behavior of mobile phone users.
As shown in Fig. 1, an embodiment of the present invention proposes a data processing method based on the online behavior of mobile phone users, comprising:
Step S101: generating, according to the user's online data, a first ticket containing the URL accessed by the user;
In this embodiment, a user can go online by mobile phone and access various websites to obtain the corresponding network information. When the user goes online by mobile phone, the mobile service system obtains network data according to the web address the user accesses and produces an original ticket, which is the first ticket referred to in this embodiment. The more the users access, the more tickets the mobile service system produces.
The ticket contains the URL accessed by the user. A URL is an identification method for completely describing the address of a web page or other resource on the Internet. Each web page on the Internet has a unique name identifier, usually called its URL address. The address may be on a local disk or on a computer in a local area network, but most often it is a website on the Internet. Put simply, a URL is a Web address, commonly known as a "web address".
After the mobile service system obtains the first ticket, it needs to analyze the first ticket data statistically in order to understand the online behavior of mobile phone users from the results, for example which types of website users visit most often, the access traffic of certain designated websites, and the advertisement access traffic that merchants care about, so that corresponding business measures can subsequently be taken according to that behavior.
Step S102: preprocessing the data in the first ticket according to predetermined rules to generate a second ticket;
In this embodiment, the predetermined rules are formulated around the main issues the operator cares about, such as the types of website accessed, website access traffic and advertisement access traffic. Preprocessing the data in the first ticket according to predetermined rules comprises performing website access type URL analysis and/or designated website traffic analysis and/or advertisement access traffic analysis on the data in the first ticket. Specifically, for example, the URLs generated when users go online can be classified and summarized, URLs can be converted according to certain rules, and so on.
Depending on what information the data processing needs to obtain, the predetermined rules may also be other similar rules.
In this embodiment, the data in the first ticket can be preprocessed by an independent device such as an interface machine. The interface machine first preprocesses the ticket data, for example by classifying and summarizing the URLs generated when users go online and converting URLs according to certain rules. This series of preprocessing steps generates a new ticket, which is the second ticket of this embodiment. The new ticket data is then loaded into the database so that, in subsequent processing, the system database can perform statistical analysis on the second ticket data. In this embodiment, loading the second ticket data into the database can mean that the interface machine writes the preprocessed data into a database table designated by the system through a loading program.
Step S103: performing statistical analysis on the data in the second ticket.
As described above, the newly generated ticket data is handed to the system database for statistical analysis. For example, based on the category of each URL in the second ticket, summary data can be computed for whichever class of URL the user wants. The parsing of the URLs in the first ticket is thus handed over to the interface machine, the parsed result data forms the new ticket, and the system database performs statistical analysis directly on the result data. This removes the need for the database to analyze large batches of URL data, which greatly improves the efficiency of ticket data processing and solves the performance bottleneck in analyzing the online behavior of mobile phone users.
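The division of labor in steps S101 to S103 can be sketched as three small stages. This is only an illustration under stated assumptions: the `classify` rule (first path segment), the table name `second_ticket` and the use of an in-memory SQLite database are all hypothetical stand-ins, not details from the patent. The point it shows is that the database only aggregates a precomputed field and never parses a URL.

```python
import sqlite3
from urllib.parse import urlsplit


def classify(url: str) -> str:
    """Hypothetical stand-in rule: use the first path segment as the category."""
    segments = urlsplit(url).path.split("/")
    return segments[1] if len(segments) > 1 and segments[1] else "unknown"


def preprocess(first_ticket):
    """Interface-machine stage (S102): turn raw URLs into (url, category) rows."""
    return [(url, classify(url)) for url in first_ticket]


def load(conn, second_ticket):
    """Loading stage: store the preprocessed rows in a database table."""
    conn.execute("CREATE TABLE IF NOT EXISTS second_ticket (url TEXT, category TEXT)")
    conn.executemany("INSERT INTO second_ticket VALUES (?, ?)", second_ticket)


def count_by_category(conn):
    """Database stage (S103): aggregate on the precomputed field only."""
    cursor = conn.execute(
        "SELECT category, COUNT(*) FROM second_ticket GROUP BY category"
    )
    return dict(cursor.fetchall())


# S101 is represented here by a ready-made list of accessed URLs.
first_ticket = [
    "http://www.sina.com/sport/1001.htm",
    "http://www.sina.com/news/1004.htm",
    "http://www.sina.com/news/1005.htm",
]
conn = sqlite3.connect(":memory:")
load(conn, preprocess(first_ticket))
counts = count_by_category(conn)
print(counts)
```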
As shown in Fig. 2, in step S102, the step of performing website access type URL analysis on the data in the first ticket comprises:
Step S1021: adding to the first ticket a URL type field that stores the category to which the URL belongs;
Step S1022: parsing the original URL in the first ticket;
Step S1023: looking up the category corresponding to the original URL in a preset URL category lookup table, and writing it into the URL type field corresponding to the original URL in the second ticket.
The process of performing website access type URL analysis on the data in the first ticket is illustrated below with a concrete example. Suppose the first ticket data is as shown in Table 2 below:
No.   URL
1     http://www.sina.com/sport/1001.htm
2     http://www.sina.com/sport/1002.htm
3     http://www.sina.com/sport/1003.htm
4     http://www.sina.com/news/1004.htm
5     http://www.sina.com/news/1005.htm
6     http://www.sina.com/movie/1006.htm
Table 2
The classification standard for URLs, that is, the preset URL category lookup table, is shown in Table 3 below:
Category   URL
Sports     http://www.sina.com/sport/*
News       http://www.sina.com/news/*
Film       http://www.sina.com/movie/
Table 3
The result of preprocessing by website access type URL analysis is shown in Table 4 below:
No.   URL                                   Category
1     http://www.sina.com/sport/1001.htm    Sports
2     http://www.sina.com/sport/1002.htm    Sports
3     http://www.sina.com/sport/1003.htm    Sports
4     http://www.sina.com/news/1004.htm     News
5     http://www.sina.com/news/1005.htm     News
6     http://www.sina.com/movie/1006.htm    Film
Table 4
It follows that, based on the category of each URL in the second ticket, summary data can be computed for whichever class of URL the user wants, such as the news category. Table 4 contains two news URLs: http://www.sina.com/news/1004.htm and http://www.sina.com/news/1005.htm.
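The worked example in Tables 2 to 4 can be reproduced with a short sketch. It assumes that each entry in the lookup table acts as a URL prefix, which is one plausible reading of the trailing `*` in Table 3; the field names `seq`, `url` and `category` and the function names are illustrative and not taken from the patent.

```python
from collections import Counter
from typing import Optional

# Prefix rules modelled on Table 3 (assumption: each entry matches as a prefix).
CATEGORY_RULES = [
    ("http://www.sina.com/sport/", "Sports"),
    ("http://www.sina.com/news/", "News"),
    ("http://www.sina.com/movie/", "Film"),
]


def classify(url: str) -> Optional[str]:
    """Return the category of the first matching prefix, or None if none match."""
    for prefix, category in CATEGORY_RULES:
        if url.startswith(prefix):
            return category
    return None


def preprocess(first_ticket):
    """Steps S1021 to S1023: copy each record and add a 'category' field."""
    return [{**record, "category": classify(record["url"])} for record in first_ticket]


# First ticket data from Table 2.
first_ticket = [
    {"seq": 1, "url": "http://www.sina.com/sport/1001.htm"},
    {"seq": 2, "url": "http://www.sina.com/sport/1002.htm"},
    {"seq": 3, "url": "http://www.sina.com/sport/1003.htm"},
    {"seq": 4, "url": "http://www.sina.com/news/1004.htm"},
    {"seq": 5, "url": "http://www.sina.com/news/1005.htm"},
    {"seq": 6, "url": "http://www.sina.com/movie/1006.htm"},
]

second_ticket = preprocess(first_ticket)
counts = Counter(record["category"] for record in second_ticket)
news_urls = [r["url"] for r in second_ticket if r["category"] == "News"]
print(counts["News"], news_urls)
```

Once the category field exists, the later statistics reduce to counting on that field, which is the saving the embodiment describes.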
As shown in Fig. 3, in step S102, the step of performing designated website traffic analysis on the data in the first ticket comprises:
Step S1024: adding to the first ticket a new-URL field that stores the new URL obtained by conversion;
Step S1025: converting the original URL in the first ticket according to predetermined conversion rules;
Step S1026: writing the converted URL into the new-URL field corresponding to the original URL in the second ticket.
The predetermined conversion rules can be a conversion rule table formulated according to the configuration rules of the system's HOST file. For example, for a given HOST there may be rules such as those shown in Table 5, in which the options "whether to process the extension" and "whether to ignore parameters" are set for each URL.
Table 5
According to the above conversion rule table, the original URL in the first ticket can be converted into a new URL and written into the corresponding new-URL field in the second ticket. The access traffic of a designated website, or of specific content within a designated website, can then be computed from the information in the new-URL field of the second ticket.
It should be noted that, when the first ticket data is preprocessed, the three preprocessing modes described in this embodiment, namely website access type URL analysis, designated website traffic analysis and advertisement access traffic analysis, can be combined. From the second ticket finally generated, the types of website users access, the access traffic of designated websites, the advertisement access traffic and so on can then all be computed at the same time.
Data testing gives the following comparison between the solution of the embodiment of the present invention and the traditional solution for analyzing the online behavior of mobile phone users, as shown in Table 6 below:
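A conversion of this kind might look like the sketch below. Because the contents of Table 5 are not reproduced in this text, everything specific here is an assumption: the `ConversionRule` structure, the interpretation of "whether to process the extension" as dropping the file extension, and the interpretation of "whether to ignore parameters" as dropping the query string are illustrative guesses, not the patent's rules.

```python
import posixpath
from dataclasses import dataclass
from typing import Dict
from urllib.parse import urlsplit, urlunsplit


@dataclass(frozen=True)
class ConversionRule:
    """Hypothetical per-HOST rule with the two options named in the text."""
    process_extension: bool   # assumed meaning: strip the file extension
    ignore_parameters: bool   # assumed meaning: drop the query string


def convert(url: str, rules: Dict[str, ConversionRule]) -> str:
    """Return the new URL for `url`, or `url` unchanged if its HOST has no rule."""
    parts = urlsplit(url)
    rule = rules.get(parts.hostname or "")
    if rule is None:
        return url
    path, query = parts.path, parts.query
    if rule.process_extension:
        path, _extension = posixpath.splitext(path)
    if rule.ignore_parameters:
        query = ""
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# Hypothetical rule table keyed by HOST.
rules = {"www.sina.com": ConversionRule(process_extension=True, ignore_parameters=True)}

new_url = convert("http://www.sina.com/news/1004.htm?from=wap", rules)
print(new_url)
```

Normalizing URLs this way lets several original URLs that differ only in parameters or extension count toward the same page when traffic is totalled from the new-URL field.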
Table 6
As Table 6 shows, compared with the conventional art, the solution of the embodiment of the present invention can analyze users' online data more efficiently. It greatly increases the processing speed of ticket data, reduces the processing load on the system database, and solves the performance bottleneck in analyzing the online behavior of mobile phone users.
In this embodiment, before ticket data is loaded into the database, a preprocessing device such as an interface machine first preprocesses the ticket data. The preprocessing includes classifying and summarizing the URLs generated when users go online, converting URLs according to certain rules, and so on; after this series of preprocessing steps, new ticket data is generated and loaded into the database. The system database then performs statistical analysis on the second ticket data. The parsing of URLs is thus handed over to the interface machine, the parsed result data forms the new ticket, and the system database performs statistical analysis directly on the result data. This removes the need for the database to analyze large batches of URL data, which greatly improves the efficiency of ticket data processing and solves the performance bottleneck in analyzing the online behavior of mobile phone users.
As shown in Fig. 4, in step S102, the step of performing advertisement access traffic analysis on the data in the first ticket comprises:
Step S1027: adding to the first ticket an advertisement-type URL field that stores advertisement-type URLs;
Step S1028: separating out the advertisement-type URL according to the predefined identifier carried by the original URL in the first ticket;
Step S1029: writing the advertisement-type URL into the advertisement-type URL field corresponding to the original URL in the second ticket.
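Steps S1027 to S1029 can be sketched as follows. The text does not say what the predefined identifier looks like, so this example assumes, purely for illustration, that advertisement URLs carry a query parameter named `adid`; that name, the sample URLs and the field name `ad_url` are all hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

AD_IDENTIFIER = "adid"  # hypothetical predefined identifier (a query parameter name)


def extract_ad_url(url: str) -> str:
    """Return `url` if it carries the identifier, otherwise an empty string."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return url if AD_IDENTIFIER in query else ""


def preprocess(first_ticket):
    """Copy each record and add an 'ad_url' field holding the separated ad URL."""
    return [{**record, "ad_url": extract_ad_url(record["url"])} for record in first_ticket]


# Illustrative records; the adid parameter is an invented example.
first_ticket = [
    {"url": "http://www.sina.com/news/1004.htm"},
    {"url": "http://www.sina.com/news/1005.htm?adid=42"},
]

second_ticket = preprocess(first_ticket)
ad_visits = sum(1 for record in second_ticket if record["ad_url"])
print(ad_visits)
```

With the advertisement-type URL field filled in during preprocessing, advertisement traffic can later be counted without re-inspecting the original URLs.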
As shown in Fig. 5, an embodiment of the present invention proposes a data processing device based on the online behavior of mobile phone users, comprising an original ticket generation module 501, a new ticket generation module 502 and a new ticket data processing module 503, wherein:
the original ticket generation module 501 is used to generate, according to the user's online data, a first ticket containing the URL accessed by the user;
In this embodiment, a user can go online by mobile phone and access various websites to obtain the corresponding network information. When the user goes online by mobile phone, the original ticket generation module 501 in the mobile service system obtains network data according to the web address the user accesses and produces an original ticket, which is the first ticket referred to in this embodiment. The more the users access, the more tickets the mobile service system produces.
The ticket contains the URL accessed by the user. A URL is an identification method for completely describing the address of a web page or other resource on the Internet. Each web page on the Internet has a unique name identifier, usually called its URL address. The address may be on a local disk or on a computer in a local area network, but most often it is a website on the Internet. Put simply, a URL is a Web address, commonly known as a "web address".
After the mobile service system obtains the first ticket, it needs to analyze the first ticket data statistically in order to understand the online behavior of mobile phone users from the results, for example which types of website users visit most often, the access traffic of certain designated websites, and the advertisement access traffic that merchants care about, so that corresponding business measures can subsequently be taken according to that behavior.
the new ticket generation module 502 is used to preprocess the data in the first ticket according to predetermined rules to generate a second ticket;
In this embodiment, the preprocessing that the new ticket generation module 502 performs on the data in the first ticket according to predetermined rules specifically comprises website access type URL analysis, designated website traffic analysis and/or advertisement access traffic analysis of the data in the first ticket.
The predetermined rules are formulated around the main issues the operator cares about, such as the types of website accessed, website access traffic and advertisement access traffic. Preprocessing the data in the first ticket according to predetermined rules comprises performing website access type URL analysis and/or designated website traffic analysis and/or advertisement access traffic analysis on the data in the first ticket. Specifically, for example, the URLs generated when users go online can be classified and summarized, URLs can be converted according to certain rules, and so on.
Depending on what information the data processing needs to obtain, the predetermined rules may also be other similar rules.
In this embodiment, the data in the first ticket can be preprocessed by an independent device such as an interface machine. The interface machine first preprocesses the ticket data, for example by classifying and summarizing the URLs generated when users go online and converting URLs according to certain rules. After this series of preprocessing steps, the new ticket generation module 502 generates new ticket data and loads it into the database so that, in subsequent processing, the system database can perform statistical analysis on the second ticket data. In this embodiment, loading the second ticket data into the database can mean that the interface machine writes the preprocessed data into a database table designated by the system through a loading program.
the new ticket data processing module 503 is used to perform statistical analysis on the data in the second ticket.
As described above, the newly generated ticket data is handed to the new ticket data processing module 503 of the system database for statistical analysis. For example, based on the category of each URL in the second ticket, summary data can be computed for whichever class of URL the user wants.
The parsing of the URLs in the first ticket is thus handed over to the interface machine, the parsed result data forms the new ticket, and the system database performs statistical analysis directly on the result data. This removes the need for the database to analyze large batches of URL data, which greatly improves the efficiency of ticket data processing and solves the performance bottleneck in analyzing the online behavior of mobile phone users.
As shown in Figure 6, new ticket generation module 502 comprises: field increases unit 5021, resolution unit 5022 and writing unit 5023, wherein:
Field increases unit 5021, is used for the field of the URL type depositing URL generic for increasing in the first ticket;
Resolution unit 5022, for resolving the origin url in the first ticket;
Writing unit 5023, for searching generic corresponding to origin url from the URL classification contrast relationship table preset, writes the field of URL type corresponding with origin url in the second ticket.
Further, field increases unit 5021, is also used for the new url field of the new URL after depositing conversion for increasing in the first ticket;
Resolution unit 5022, also for changing the origin url in the first ticket according to intended conversion rule;
Writing unit 5023, also for the origin url after conversion is write new url field corresponding with origin url in the second ticket.
Further, the field adding unit 5021 is also configured to add, to the first ticket, an advertisement URL field for storing an advertisement URL;
The parsing unit 5022 is also configured to separate out the advertisement URL according to a predefined identifier carried by the original URL in the first ticket;
The writing unit 5023 is also configured to write the advertisement URL into the advertisement URL field corresponding to the original URL in the second ticket.
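The separation performed by unit 5022 might look like the sketch below. It assumes, without support from the text, that the predefined identifier is a query parameter named adurl whose value is the embedded URL, and that a ticket with no such identifier gets an empty field; the field name ad_url is likewise hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

AD_IDENTIFIER = "adurl"  # hypothetical predefined identifier


def extract_ad_url(origin_url):
    """Return the URL carried under AD_IDENTIFIER, or "" if none is present."""
    query = parse_qs(urlsplit(origin_url).query)
    values = query.get(AD_IDENTIFIER)
    return values[0] if values else ""


def add_ad_url(first_ticket):
    """Return a second-ticket record with the separated URL in ad_url."""
    second_ticket = dict(first_ticket)
    second_ticket["ad_url"] = extract_ad_url(first_ticket["origin_url"])
    return second_ticket


ticket = {
    "user_id": "u1",
    "origin_url": "http://portal.example.com/go"
                  "?adurl=http%3A%2F%2Fads.example.net%2Fbanner1&x=1",
}
print(add_ad_url(ticket)["ad_url"])
```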
In the data processing method and device based on mobile phone users' internet behavior according to the embodiments of the present invention, before ticket data is loaded into the database, a pre-processing device such as an interface message processor (IMP) first pre-processes the ticket data. The pre-processing includes classifying and summarizing the URLs generated by users' internet access, converting URLs according to certain rules, and so on. After this series of pre-processing steps, new ticket data is generated and loaded into the database. The system database then performs statistical analysis on the second ticket data. Thus, the parsing of URLs is handed over to the IMP, the parsed result data is used to generate a new ticket, and the system database performs statistical analysis directly on that result data. This removes the step of analyzing large batches of URL data in the database, greatly improves the efficiency of ticket data processing, and resolves the performance bottleneck in analyzing mobile phone users' internet behavior.
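The overall flow, in which tickets are transformed before loading rather than parsed inside the database, can be sketched as a streaming step from a first-ticket file to a second-ticket file. Everything concrete here is an assumption for illustration: the CSV layout, the origin_url column, the steps mapping, and the stand-in classification rule are hypothetical and not taken from the patent.

```python
import csv
import io


def preprocess_tickets(first_ticket_file, second_ticket_file, steps):
    """Stream first tickets through pre-processing and write second tickets.

    `steps` maps each new field name to a function of the original URL.
    """
    reader = csv.DictReader(first_ticket_file)
    # Second-ticket layout: the original columns followed by the added fields.
    fieldnames = list(reader.fieldnames or []) + list(steps)
    writer = csv.DictWriter(
        second_ticket_file, fieldnames=fieldnames, lineterminator="\n"
    )
    writer.writeheader()
    for ticket in reader:
        for field, func in steps.items():
            ticket[field] = func(ticket["origin_url"])
        writer.writerow(ticket)


# Stand-in rule for illustration only; it is not the patent's classification.
steps = {"url_type": lambda url: "news" if "news." in url else "other"}

first = io.StringIO(
    "user_id,origin_url\n"
    "u1,http://news.example.com/a\n"
    "u2,http://shop.example.com/b\n"
)
second = io.StringIO()
preprocess_tickets(first, second, steps)
print(second.getvalue())
```

Only the second-ticket output would then be loaded, so the database sees ready-made fields instead of raw URLs.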
The foregoing describes only preferred embodiments of the present invention and does not thereby limit the patent scope of the present invention. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, likewise falls within the scope of patent protection of the present invention.

Claims (7)

1. A data processing method based on the internet behavior of mobile phone users, the method comprising:
Generating, from the user's internet access data, a first ticket containing the web page addresses (URLs) accessed by the user;
Pre-processing the data in the first ticket according to a predetermined rule to generate a second ticket, which specifically comprises: performing internet access type URL analysis and/or designated website traffic analysis and/or advertisement access traffic analysis on the data in the first ticket; wherein an independent device is used to pre-process the data in the first ticket, and the pre-processing comprises classifying and summarizing the URLs generated by the user's internet access and converting URLs according to a certain rule;
Performing statistical analysis on the data in the second ticket.
2. The method according to claim 1, characterized in that the step of performing internet access type URL analysis on the data in the first ticket comprises:
Adding a URL type field to the first ticket for storing the category to which a URL belongs;
Parsing the original URL in the first ticket;
Looking up the category corresponding to the original URL in a preset URL classification lookup table, and writing it into the URL type field corresponding to the original URL in the second ticket.
3. method according to claim 1 and 2, is characterized in that, describedly comprises the step that data in the first ticket carry out appointed website flow analysis process:
New url field is increased, for depositing the new URL after conversion in described first ticket;
According to the origin url in intended conversion rule conversion the first ticket;
Origin url after conversion is write new url field corresponding with origin url in the second ticket.
4. The method according to claim 1, characterized in that the step of performing advertisement access traffic analysis on the data in the first ticket comprises:
Adding an advertisement URL field to the first ticket for storing an advertisement URL;
Separating out the advertisement URL according to a predefined identifier carried by the original URL in the first ticket;
Writing the advertisement URL into the advertisement URL field corresponding to the original URL in the second ticket.
5. A data processing device based on the internet behavior of mobile phone users, characterized by comprising:
An original ticket generation module, configured to generate, from the user's internet access data, a first ticket containing the URLs accessed by the user;
A new ticket generation module, configured to pre-process the data in the first ticket according to a predetermined rule to generate a second ticket, and specifically configured to perform internet access type URL analysis, designated website traffic analysis and/or advertisement access traffic analysis on the data in the first ticket; wherein an independent device is used to pre-process the data in the first ticket, and the pre-processing comprises classifying and summarizing the URLs generated by the user's internet access and converting URLs according to a certain rule;
A new ticket data processing module, configured to perform statistical analysis on the data in the second ticket.
6. The device according to claim 5, characterized in that the new ticket generation module comprises:
A field adding unit, configured to add, to the first ticket, a URL type field for storing the category to which a URL belongs;
A parsing unit, configured to parse the original URL in the first ticket;
A writing unit, configured to look up the category corresponding to the original URL in a preset URL classification lookup table, and to write it into the URL type field corresponding to the original URL in the second ticket.
7. The device according to claim 6, characterized in that:
The field adding unit is also configured to add, to the first ticket, a new URL field for storing the new URL obtained after conversion;
The parsing unit is also configured to convert the original URL in the first ticket according to a predetermined conversion rule;
The writing unit is also configured to write the converted URL into the new URL field corresponding to the original URL in the second ticket; or
The field adding unit is also configured to add, to the first ticket, an advertisement URL field for storing an advertisement URL;
The parsing unit is also configured to separate out the advertisement URL according to a predefined identifier carried by the original URL in the first ticket;
The writing unit is also configured to write the advertisement URL into the advertisement URL field corresponding to the original URL in the second ticket.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201010535447.0A CN102006174B (en) 2010-11-08 2010-11-08 Data processing method and device based on online behavior of mobile phone user
PCT/CN2011/075696 WO2012062107A1 (en) 2010-11-08 2011-06-13 Method and apparatus for data processing based on surfing behavior of mobile telephone user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010535447.0A CN102006174B (en) 2010-11-08 2010-11-08 Data processing method and device based on online behavior of mobile phone user

Publications (2)

Publication Number Publication Date
CN102006174A CN102006174A (en) 2011-04-06
CN102006174B true CN102006174B (en) 2015-01-28

Family

ID=43813268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010535447.0A Expired - Fee Related CN102006174B (en) 2010-11-08 2010-11-08 Data processing method and device based on online behavior of mobile phone user

Country Status (2)

Country Link
CN (1) CN102006174B (en)
WO (1) WO2012062107A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102006174B (en) * 2010-11-08 2015-01-28 中兴通讯股份有限公司 Data processing method and device based on online behavior of mobile phone user
CN102547663B (en) * 2012-03-09 2016-05-11 北京思特奇信息技术股份有限公司 A kind of surfing Internet with cell phone optimization method based on traffic matrix
CN104331404B (en) * 2013-07-22 2018-05-01 中国科学院深圳先进技术研究院 A kind of user's behavior prediction method and apparatus based on user mobile phone Internet data
CN104978341A (en) * 2014-04-08 2015-10-14 北京奇虎科技有限公司 File processing method and equipment, and network system
CN105791613A (en) * 2014-12-24 2016-07-20 中兴通讯股份有限公司 Call bill processing method and device
CN104866909A (en) * 2015-04-29 2015-08-26 国网智能电网研究院 Method and system for finishing air ticket booking function URL
CN105827432A (en) * 2015-12-29 2016-08-03 广东亿迅科技有限公司 SHELL script-based traffic log statistical method and statistical system
CN108287831B (en) * 2017-01-09 2022-08-05 阿里巴巴集团控股有限公司 URL classification method and system and data processing method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1353371A (en) * 2000-11-10 2002-06-12 思网科技股份有限公司 Dynamic real-time data analyzing and processing system and method
WO2003005169A2 (en) * 2001-07-06 2003-01-16 Clickfox, Llc System and method for analyzing system visitor activities
CN101562538A (en) * 2009-04-15 2009-10-21 计世在线网络技术(北京)有限公司 System for analyzing website access
CN101872347A (en) * 2009-04-22 2010-10-27 富士通株式会社 Method and device for judging type of webpage

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102006174B (en) * 2010-11-08 2015-01-28 中兴通讯股份有限公司 Data processing method and device based on online behavior of mobile phone user

Also Published As

Publication number Publication date
WO2012062107A1 (en) 2012-05-18
CN102006174A (en) 2011-04-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20170801

Address after: 510620 Units 1109 and 1110, No. 136 and 138 Sports East Road, Tianhe District, Guangzhou, Guangdong (for office use only)

Patentee after: Guangzhou Verce Intelligent Technology Co.,Ltd.

Address before: 518057 Legal Department, ZTE Building, Science and Technology South Road, Hi-Tech Industrial Park, Nanshan District, Guangdong

Patentee before: ZTE Corp.

CP02 Change in the address of a patent holder

Address after: 510620 Room 901, Radio and Television Science and Technology Building, 163 Pingyun Road, Tianhe District, Guangzhou City, Guangdong Province

Patentee after: Guangzhou Verce Intelligent Technology Co.,Ltd.

Address before: 510620 Tianhe District, Guangzhou, Guangdong Sports East Road 136, 138 1109, 1110 units (for office use only)

Patentee before: Guangzhou Verce Intelligent Technology Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150128