CN103246725A - Wireless network based data traffic pushing system and method - Google Patents


Info

Publication number
CN103246725A
Authority
CN
China
Prior art keywords
user
interest
web page
submodule
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013101682183A
Other languages
Chinese (zh)
Inventor
刘臻
吕琳媛
肖思源
刘润然
佘莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI HEGUANG INFORMATION TECHNOLOGY Co Ltd
Original Assignee
SHANGHAI HEGUANG INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI HEGUANG INFORMATION TECHNOLOGY Co Ltd filed Critical SHANGHAI HEGUANG INFORMATION TECHNOLOGY Co Ltd
Priority to CN2013101682183A priority Critical patent/CN103246725A/en
Publication of CN103246725A publication Critical patent/CN103246725A/en
Pending legal-status Critical Current

Abstract

The invention provides a data traffic pushing system based on a wireless network. Through a wireless gateway, the system acquires log information from a user's mobile terminal, such as a cellphone, then filters and processes the user's cellphone usage behavior within a given time range to obtain user behavior features. The content the user is interested in and the user's behavioral habits are combined to form the user's interest preference, which is associated in real time with the position information of the mobile terminal so that information can be pushed to the terminal. Information the user needs can thus be sent in a targeted way, which improves the credibility of data traffic pushing, improves user acceptance, and reduces data noise.

Description

A data service push system and method based on a wireless network
Technical field
The present invention relates to a system and method that discovers a user's current interest preference from the user's behavior and the text content of browsed web pages, and pushes data in combination with the user's current location. It is particularly suitable for wireless networks.
Background art
Data service push began to flourish across the board in 2011, and numerous providers emerged in the industry. The push approach has evolved from the first stage, website combination (media selection is paramount; media are combined and selected according to audience characteristics), through the second stage, contextual targeting (content optimization is paramount; audiences are attracted by the type of content), to the current third stage, audience-targeted push built around audience-targeting technology, which places more emphasis on identifying the audience. In addition, location-based data service push has been developing and maturing along a separate dimension.
The object of the present invention is to establish a new model of data service push: track each user's behavioral habits, analyze the user's behavior and browsed content, predict the user's interest preference, and concentrate the recipients of information on users who are interested and have a need, thereby achieving targeted data service push.
The significance of the invention is that information the user needs is sent in a targeted way according to the user's interests, which improves the credibility of data service push, improves user acceptance, and reduces data noise.
Summary of the invention
The invention provides a data service push system based on a wireless network. The system obtains, through a wireless gateway, log information on the user's use of a mobile terminal such as a mobile phone, filters the user's phone usage behavior within a preceding time range to obtain user behavior features, combines the content the user is interested in with the user's behavioral habits to form the user's interest preference, associates that preference in real time with the position information of the mobile terminal, and pushes information to the mobile terminal. The system comprises a time window adjustment and web page data classification and statistics module, a user interest extraction module, a data service push module, and a location analysis module, wherein:
The time window adjustment and web page data classification and statistics module receives the URLs of browsed pages from the wireless gateway and filters the pages the user browsed within the preceding time range to obtain the user's interest-related web pages and user behavior features. The user interest extraction module obtains the user's current interest from the interest-related web pages and the user behavior features. The location analysis module obtains the user's current browsing position information through the GMLC gateway.
The data service push module takes the current user interest output by the user interest extraction module and applies a rule association strategy to decide whether to provide a localized information push service. For a current user interest that does not have localized-service characteristics, the push module matches it against the corresponding pre-push information and selects the push information with the highest matching degree. For a current user interest that has localized-service characteristics, the module obtains location-associated information from the user's current browsing position supplied by the location analysis module, matches the user's current interest against that location-associated information using a matching strategy, selects the location-associated information with the highest matching degree as the push information, and pushes it to the mobile terminal.
In addition, the time window adjustment and web page data classification and statistics module comprises a time window adjustment submodule and a web page data classification and statistics submodule. The time window adjustment submodule automatically adjusts the time window range so that the system processes the pages this user browsed within the time window. The web page data classification and statistics submodule comprises a behavior information statistics submodule, which obtains the user behavior features, and a web page classification submodule, which obtains the user's interest-related web pages.
Furthermore, the user interest extraction module comprises a behavior information analysis submodule, a content information analysis submodule, and an ensemble learning submodule. The behavior information analysis submodule takes the user behavior features, performs statistics, screening, and dimensionality reduction on the time series, forms the user's behavior interest, and outputs it as the user's current behavior interest. The content information analysis submodule takes the URLs of the user's interest-related web pages, performs text processing on the page content, extracts the page topics, and forms the user's content interest from those topics and other attribute information of the pages, outputting it as the user's current content interest. The ensemble learning submodule applies ensemble learning to the current behavior interest and current content interest to form the user interest, which it outputs as the user's current interest.
In addition, the web page classification submodule comprises a web page text acquisition submodule, a web page text classification submodule, a visit frequency statistics submodule, and a user current content interest determination submodule. The web page text acquisition submodule filters the pages the user browsed within the time window to obtain a group of related pages and fetches the text content of each page from its URL. The web page text classification submodule classifies the text content. The visit frequency statistics submodule counts the visit frequency of each class, and the user current content interest determination submodule takes the set of pages in the class with the highest visit frequency as the user's interest-related web pages.
The invention also provides a data service push method based on a wireless network. The method obtains, through a wireless gateway, log information on the user's use of a mobile terminal such as a mobile phone, filters the user's phone usage behavior within a preceding time range to obtain user behavior features, combines the content the user is interested in with the user's behavioral habits to form the user's interest preference, associates that preference in real time with the position information of the mobile terminal, and pushes information to the mobile terminal. The method comprises: receiving the URLs of browsed pages from the wireless gateway and filtering the pages the user browsed within the preceding time range to obtain the user's interest-related web pages and user behavior features; obtaining the user's current interest from the interest-related web pages and the user behavior features; obtaining the user's current browsing position information through the GMLC gateway; applying a rule association strategy to the current user interest to decide whether to provide a localized information push service; for a current user interest that does not have localized-service characteristics, matching it against the corresponding pre-push information and selecting the push information with the highest matching degree; and for a current user interest that has localized-service characteristics, obtaining location-associated information from the user's current browsing position, matching the user's current interest against the location-associated information using a matching strategy, selecting the location-associated information with the highest matching degree as the push information, and pushing it to the mobile terminal.
The system can automatically adjust the time window range so that it processes the pages this user browsed within the time window.
Further, the step of obtaining the user's current interest comprises: performing statistics, screening, and dimensionality reduction on the time series according to the user behavior features to form the user's behavior interest, output as the user's current behavior interest; performing text processing on the page content at the URLs of the user's interest-related web pages, extracting the page topics, and forming the user's content interest from those topics and other attribute information of the pages, output as the user's current content interest; and applying ensemble learning to the current behavior interest and current content interest to form the user interest, output as the user's current interest.
Furthermore, the step of obtaining the user's interest-related web pages comprises: filtering the pages the user browsed within the time window to obtain a group of related pages; fetching the text content of each page from its URL; classifying the text content; counting the visit frequency of each class; and taking the set of pages in the class with the highest visit frequency as the user's interest-related web pages.
Brief description of the drawings
Fig. 1 is a system structure diagram of a mobile terminal browsing pages through a wireless gateway;
Fig. 2 shows a method for obtaining, in real time, the interest preference of a mobile terminal user on a mobile server through a wireless gateway;
Fig. 3 is the operation flowchart of the time window adjustment and web page data classification and statistics module of the invention;
Fig. 4 is the operation flowchart of the web page classification/content information processing submodule of the invention;
Fig. 5a shows the method of building the web page text classifier of the invention;
Fig. 5b shows how the web page text classifier of the invention is used;
Fig. 6 is the operation flowchart of the user content interest extraction submodule of the invention;
Fig. 7 is an exemplary tree structure of user interest preferences of the invention;
Fig. 8 is the operation flowchart of the data service push module;
Fig. 9 is the operation flowchart of the location analysis module of the invention;
Fig. 10 is the flowchart of position information association of the invention.
Embodiments
The preferred embodiments of the invention are further described below with reference to Figs. 1 to 10.
Fig. 1 is a system structure diagram of a mobile terminal browsing pages through a wireless gateway such as a WAP gateway.
The invention provides a data service push system based on a wireless network. The system obtains, through a wireless gateway, log information on the user's use of a mobile terminal such as a mobile phone, filters the user's phone usage behavior within a preceding time range to obtain user behavior features, combines the content the user is interested in with the user's behavioral habits to form the user's interest preference, associates that preference in real time with the position information of the mobile terminal, and pushes information to the mobile terminal. The system is the part marked by the dashed frame in Fig. 1 and comprises a time window adjustment and web page data classification and statistics module, a user interest extraction module, a data service push module, and a location analysis module, wherein:
The time window adjustment and web page data classification and statistics module receives the URLs of browsed pages from the wireless gateway and filters the pages the user browsed within the preceding time range to obtain the user's interest-related web pages and user behavior features;
The user interest extraction module comprises a behavior information analysis submodule, a content information analysis submodule, and an ensemble learning submodule;
The behavior information analysis submodule takes the user behavior features, performs statistics, screening, and dimensionality reduction on the time series, forms the user's behavior interest, and outputs it as the user's current behavior interest;
The content information analysis submodule takes the URLs of the user's interest-related web pages, performs text processing on the page content, extracts the page topics, and forms the user's content interest from those topics and other attribute information of the pages, outputting it as the user's current content interest;
The ensemble learning submodule applies ensemble learning to the current behavior interest and current content interest to form the user interest, which it outputs as the user's current interest;
The location analysis module obtains the user's current browsing position information through the GMLC gateway;
The data service push module takes the current user interest output by the user interest extraction module and applies a rule association strategy to decide whether to provide a localized information push service. For a current user interest that does not have localized-service characteristics, the push module matches it against the corresponding pre-push information and selects the push information with the highest matching degree. For a current user interest that has localized-service characteristics, the module obtains location-associated information from the user's current browsing position supplied by the location analysis module, matches the user's current interest against that location-associated information using a matching strategy, selects the location-associated information with the highest matching degree as the push information, and pushes it to the mobile terminal.
The wireless gateway includes devices such as a WAP GW, an enhanced GGSN, or a standalone integrated gateway; the description below uses a common WAP GW as the example to present the whole invention.
The browsed pages are provided by SP/CP servers in the network, and the mobile terminal accesses these pages through the wireless gateway.
The invention also provides a data service push method based on a wireless network, shown in Fig. 2. The method obtains, through a wireless gateway, log information on the user's use of a mobile terminal such as a mobile phone, filters the user's phone usage behavior within a preceding time range to obtain user behavior features, combines the content the user is interested in with the user's behavioral habits to form the user's interest preference, associates that preference in real time with the position information of the mobile terminal, and pushes information to the mobile terminal. The method comprises:
Receiving the URLs of browsed pages from the wireless gateway and filtering the pages the user browsed within the preceding time range to obtain the user's interest-related web pages and user behavior features;
Performing statistics, screening, and dimensionality reduction on the time series according to the user behavior features to form the user's behavior interest, taken as the user's current behavior interest; performing text processing on the page content at the URLs of the interest-related web pages, extracting the page topics, and forming the user's content interest from those topics and other attribute information of the pages, taken as the user's current content interest; and applying ensemble learning to the current behavior interest and current content interest to form the user interest, taken as the user's current interest;
Obtaining the user's current browsing position information through the GMLC gateway;
Applying a rule association strategy to the current user interest to decide whether to provide a localized information push service. A current user interest that does not have localized-service characteristics is matched against the corresponding pre-push information, and the push information with the highest matching degree is selected. For a current user interest that has localized-service characteristics, location-associated information is obtained from the user's current browsing position, the user's current interest is matched against that location-associated information using a matching strategy, and the location-associated information with the highest matching degree is selected as the push information and pushed to the mobile terminal.
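The push decision in the last step can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Candidate` type, the `LOCALIZED_CATEGORIES` rule table, and the Jaccard-overlap `match_score` are all assumptions standing in for the rule association strategy and matching strategy, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A piece of push information with descriptive keywords (hypothetical type)."""
    text: str
    keywords: frozenset


# Hypothetical rule table: interest categories assumed to suit localized push.
LOCALIZED_CATEGORIES = {"catering", "real estate", "tourism"}


def match_score(interest_keywords, candidate):
    """Jaccard overlap, used here only as a stand-in for the matching strategy."""
    union = interest_keywords | candidate.keywords
    if not union:
        return 0.0
    return len(interest_keywords & candidate.keywords) / len(union)


def choose_push(category, interest_keywords, pre_push, by_location, location):
    """Pick the best-matching candidate from the location pool or the pre-push pool."""
    if category in LOCALIZED_CATEGORIES:
        pool = by_location.get(location, [])   # location-associated information
    else:
        pool = pre_push                        # ordinary pre-push information
    if not pool:
        return None
    return max(pool, key=lambda c: match_score(interest_keywords, c))


pre_push = [
    Candidate("phone sale", frozenset({"it", "phone"})),
    Candidate("car show", frozenset({"car"})),
]
by_location = {
    "cell-1": [
        Candidate("noodle shop nearby", frozenset({"noodles", "catering"})),
        Candidate("hotel deal", frozenset({"hotel"})),
    ]
}
local_pick = choose_push("catering", {"noodles"}, pre_push, by_location, "cell-1")
plain_pick = choose_push("IT", {"phone"}, pre_push, by_location, "cell-1")
```

Keeping the localization rule in a separate table makes it possible to change which interests trigger location lookup without touching the matching code.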
The time window adjustment and web page data classification and statistics module comprises a time window adjustment submodule and a web page data classification and statistics submodule; the latter comprises a behavior information statistics submodule and a web page classification submodule. Fig. 3 is the operation flowchart of this module.
The time window adjustment submodule executes the time window adjustment method: according to the user's browsing speed and habits, it determines and adjusts the time window so that the window reflects the user's concentrated interest in the current period.
To obtain the user's interest-related web pages and user behavior features, the system must filter the pages the user browsed within a preceding time range. In the prior art, the time range subject to statistical processing is usually a fixed value. One approach processes the user's interest preference over a long period, such as one day, one month, or even one year; this analyzes user interest more comprehensively and accurately, but the volume of page content to analyze is huge and real-time performance is poor. Another approach uses a single internet action or a single browsed page as the trigger, making one recommendation each time the user goes online or browses a page; this is real-time, but the system returns too much recommended content, which increases the load on the wireless communication network and reduces the user's enjoyment.
To address these problems of the prior art, the invention adopts a time window adjustment method that balances the user's long-term and short-term interest preferences. The number of pages obtained is controlled by adjusting the time window, and adjusting the window size achieves a real-time effect, making the result more timely and accurate.
The time window adjustment method can be executed by the time window adjustment submodule.
The purpose of the method is to take the user's current online time as the starting point and, using a time range that matches the user's browsing speed and habits as the baseline, analyze the interest categories reflected by the user's online activity within that range.
The method sets the initial value of the time window according to each user's browsing speed and habits; thereafter the window length adjusts automatically as the user's online habits change. The steps are:
Compute the user's historical online density
$$D = \frac{M}{T}$$
where $T$ is a historical period and $M$ is the number of the user's internet actions during $T$;
The initial set time value $t$ is given by the formula in
Figure BDA00003146319500092
where $\alpha$ is an empirical value used to adjust the time window size. The set time range ensures the user has a certain amount of online activity and online time; a shorter range makes the user's interest more concentrated and the user's range of movement smaller;
After a certain period, recompute the user's online density over the new time period, where $M'$ and $T'$ are the action count and length of that period:
$$d = \frac{M'}{T'}$$
The set time value is then updated to:
$$t' = t + \frac{D - d}{D + d}$$
Here $\alpha$ is adjustable: after a longer period, the total online activity is tallied and $\alpha$ is adjusted according to the formula above.
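The update step can be sketched in a few lines. This sketch reads the historical density as D = M / T and the update rule as t' = t + (D - d) / (D + d); the formula for the initial value of t is not recoverable from the text, so the current window length is simply passed in. The function names are illustrative.

```python
def online_density(event_count, period):
    """Online density: internet actions per unit time (M / T)."""
    if period <= 0:
        raise ValueError("period must be positive")
    return event_count / period


def adjust_window(t, hist_density, new_density):
    """Update the window length: t' = t + (D - d) / (D + d)."""
    total = hist_density + new_density
    if total == 0:
        return t  # no activity in either period: leave the window unchanged
    return t + (hist_density - new_density) / total


D = online_density(120, 60.0)      # historical density
d = online_density(90, 30.0)       # density in the newer, busier period
t_new = adjust_window(10.0, D, d)  # window shrinks because d > D
```

Under this reading, a user who has become more active than the historical average gets a shorter window, and a less active user gets a longer one; each update changes the window by less than one time unit.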
The web page data classification processing submodule comprises a behavior information processing submodule and a web page classification/content information processing submodule. It processes the behavior information and the web page classification/content information to obtain the user's interest-related web pages and user behavior features.
The behavior information processing submodule comprises an SMS behavior statistics submodule, a communication behavior statistics submodule, an internet behavior statistics submodule, a submodule that prunes user behavior features by the PCA method, and a user current behavior feature determination submodule. Based on the browsed pages obtained, it compiles time statistics on these user behaviors within the time window to obtain the user's behavior features.
The operation steps of the behavior information processing submodule are: count SMS behavior; count communication behavior; count internet behavior; prune the user behavior features by the PCA method; determine the user's current behavior features.
The web page classification/content information processing submodule comprises a web page text acquisition submodule, a web page text classification submodule, a visit frequency statistics submodule, and a user current content interest determination submodule. It filters the pages the user browsed within the time window to obtain a group of related pages, fetches the text content of each page from its URL, classifies the text content, counts the visit frequency of each class, and takes the set of pages in the class with the highest visit frequency as the user's interest-related web pages. Fig. 4 is the operation flowchart of the web page classification/content information processing submodule.
The operation steps of the web page classification/content information processing submodule are: obtain the web page text; classify the web page text; count the visit frequency; determine the user's interest-related web pages.
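The last two steps (counting visit frequency per class and keeping the most visited class) can be sketched as below. The input format, a list of `(url, category)` pairs produced by some earlier classification step, is an assumption made for illustration.

```python
from collections import Counter


def interest_pages(classified_pages):
    """Return (top category, URLs in it) from a list of (url, category) pairs."""
    counts = Counter(category for _, category in classified_pages)
    if not counts:
        return None, []
    top_category, _ = counts.most_common(1)[0]
    urls = [url for url, category in classified_pages if category == top_category]
    return top_category, urls


visits = [
    ("http://example.com/a", "sports"),
    ("http://example.com/b", "automobiles"),
    ("http://example.com/c", "sports"),
]
top, pages = interest_pages(visits)
```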
The web page text acquisition submodule takes the input URLs, discards useless pages and pages that cannot be accessed, follows the URLs that remain after screening, and extracts the title and body text.
The textual information in a web page source file is generally distributed as follows:
Figure BDA00003146319500111
Here link 4 and link 5 are link information that also counts as body text.
Through format analysis, the <title> tag is matched to obtain the title; useless link information is discarded, leaving the body text and useful link information, such as text 1, link 4, text 2, link 5, and text 3.
The web page text acquisition submodule outputs the title and body text of each page to the web page text classification submodule.
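Title and text extraction of this kind can be sketched with Python's standard `html.parser`. This is only an assumed approach: it keeps all visible text, including link text, and drops script and style content. It does not attempt the filtering of useless links described above, for which the text gives no criterion.

```python
from html.parser import HTMLParser


class TitleTextExtractor(HTMLParser):
    """Collects the <title> text and the visible text chunks of a page."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif not self._skip_depth:
            self.chunks.append(text)


def extract(html):
    """Return (title, list of text chunks) for an HTML string."""
    parser = TitleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, parser.chunks


sample = ("<html><head><title>News</title><style>p{}</style></head>"
          "<body><p>text 1</p><a href='/x'>link 4</a>"
          "<script>var a=1;</script><p>text 2</p></body></html>")
title, chunks = extract(sample)
```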
The web page text classification submodule assigns a class to each web document in the document set according to predefined topic categories, such as sports, catering, IT, real estate, automobiles, and tourism. Fig. 5a shows the method of building the web page text classifier; Fig. 5b shows how the classifier is used.
The web page classifier comprises the following two parts:
The construction and training part of the web page classifier. Its input is a training text set; through text representation and feature selection against a feature dictionary, it builds the classifier model, and its output is a set of classification rules in a tree-like structure, as shown in Fig. 5a;
Training the web page classifier means repeatedly partitioning the training samples: a classification prediction model of the target variable with respect to each input variable is built, and the samples are grouped recursively by the different values of the input and target variables, so that the model can then classify and predict new data objects.
In the training process, the gain ratio is used as the attribute selection criterion when choosing the attribute at each level of decision tree nodes.
The classification part of the web page classifier. Its input is the text to be classified (a web document object) after processing by the text preprocessing module; through text representation and feature selection against the feature dictionary, it classifies the text with the classification rules of the trained classifier model, and its output is the class of each text, as shown in Fig. 5b.
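The gain-ratio criterion named for training can be sketched as follows, using the standard definition (information gain divided by split information, as in C4.5). The text names only the criterion, so treating attributes as discrete values is an assumption of this sketch.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


labels = ["sports", "sports", "IT", "IT"]
useful = gain_ratio(["a", "a", "b", "b"], labels)   # attribute separates the classes
useless = gain_ratio(["a", "b", "a", "b"], labels)  # attribute tells nothing
```

At each node, the attribute with the largest gain ratio would be chosen for the split.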
The web page classifier uses the decision tree classification method, whose steps are:
1. Express the test sample in the same form as the training samples;
2. t ← root node of the decision tree;
3. Using the test attribute and threshold of decision tree node t, compare the value of the corresponding feature of the sample under test with the threshold; then, according to the split criterion of node t, set t ← left child of t or t ← right child of t;
4. Repeat step 3 recursively until t is a leaf node;
5. The class of the test sample is the class represented by leaf t.
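The traversal in steps 2 to 5 can be sketched as below. The `Node` layout and the convention that values less than or equal to the threshold go to the left child are assumptions; the text says only that the branch follows the node's split criterion.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """Decision tree node; `label` is set only on leaves (hypothetical layout)."""
    label: Optional[str] = None
    feature: int = 0            # index of the test attribute
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def classify(root: Node, sample: Sequence[float]) -> str:
    """Walk from the root to a leaf and return the leaf's class."""
    t = root
    while t.label is None:
        # Assumed convention: value <= threshold goes left, otherwise right.
        t = t.left if sample[t.feature] <= t.threshold else t.right
    return t.label


tree = Node(feature=0, threshold=2.0,
            left=Node(label="sports"),
            right=Node(feature=1, threshold=0.5,
                       left=Node(label="IT"),
                       right=Node(label="tourism")))
result = classify(tree, [3.0, 0.9])
```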
In the text representation step, text features are represented in a feature vector space; document i can be expressed as the feature vector:
$$W_i = (W_{i1}, W_{i2}, \ldots, W_{im})$$
where $W_{ij}$ is a function of the frequency $f_{ij}$ with which term j occurs in document i. The term's frequency of occurrence in the document is used directly as the feature value:
$$W_{ij} = f_{ij}$$
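As a small illustration of this representation, the sketch below builds the vector for one document from a token list and a fixed vocabulary. It takes "frequency of occurrence" to mean the raw count, which is an assumption; the function and variable names are illustrative.

```python
from collections import Counter


def tf_vector(tokens, vocabulary):
    """Feature vector of one document: raw count of each vocabulary term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


vocabulary = ["car", "price", "hotel"]
vector = tf_vector(["car", "price", "car"], vocabulary)
```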
In the feature selection step, a feature dimensionality reduction method based on an improved χ² statistic and pattern aggregation is used. The steps are:
⑴ Using the formula
$$\chi'^{2}_{ij} = \operatorname{sign}(n_{11} n_{22} - n_{12} n_{21}) \times \frac{n\,(n_{11} n_{22} - n_{12} n_{21})^2}{(n_{11}+n_{12})(n_{21}+n_{22})(n_{11}+n_{21})(n_{12}+n_{22})}, \qquad \operatorname{sign}(x) = \begin{cases} 1, & x \ge 0 \\ -1, & x < 0 \end{cases}$$
compute the improved χ² statistic of each term with respect to each class;
⑵ Using the formula
Figure BDA00003146319500133
compute the CHI value of each term, sort the features from high to low by CHI value, and select the M feature terms with the largest CHI values; the resulting feature matrix then has M patterns;
⑶ To compare whether the patterns contribute to the classes in the same proportions, first normalize the improved statistics of each pattern into the interval [-1, 1], as follows:
$$A_{ij} = \frac{\chi'^{2}_{ij}}{\max - \min}$$
where max and min are the maximum and minimum of the improved χ² statistics of pattern i;
⑷ Apply a simple clustering algorithm to the patterns of A (each row of A represents one pattern), aggregating the patterns of one cluster into a new pattern. This yields L new patterns, where L is much smaller than M. Agglomerative hierarchical clustering is used, with the commonly used Euclidean distance as the distance measure:
$$d(i,j) = \sqrt{(A_{i1} - A_{j1})^2 + (A_{i2} - A_{j2})^2 + \cdots + (A_{is} - A_{js})^2}$$
Patterns whose Euclidean distance d(i, j) is below a given threshold are clustered together. The clustering process is:
1. Using matrix A, find the patterns whose distance is below the threshold and cluster them;
2. After clustering, merge the patterns in each cluster into one pattern. The merged pattern contains all the terms in the cluster, and its term frequency is the sum of the term frequencies of those terms. Recompute the improved statistic of the new pattern and rebuild matrix A from the new patterns;
Repeat steps 1 and 2 until no more patterns can be aggregated;
⑸ Recompute the CHI value of each feature item and select the top L′ feature items by CHI value.
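The signed χ² statistic of step ⑴, the range normalization of step ⑶, and the Euclidean distance of step ⑷ can be sketched as below. The text does not define the counts n11 to n22; this sketch assumes the usual 2×2 term/class contingency table (n11: documents in the class containing the term, n12: documents outside the class containing the term, n21: documents in the class without the term, n22: documents outside the class without the term) and takes n as their sum.

```python
import math


def signed_chi2(n11, n12, n21, n22):
    """Improved (signed) chi-square of a term for a class from a 2x2 table."""
    n = n11 + n12 + n21 + n22
    diff = n11 * n22 - n12 * n21
    denom = (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
    if denom == 0:
        return 0.0  # degenerate table: term or class is empty or universal
    sign = 1 if diff >= 0 else -1
    return sign * n * diff ** 2 / denom


def normalize_rows(chi):
    """Divide each pattern (row) by its own max - min range."""
    result = []
    for row in chi:
        span = max(row) - min(row)
        result.append([x / span if span else 0.0 for x in row])
    return result


def euclidean(a, b):
    """Euclidean distance between two normalized patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


positive = signed_chi2(10, 0, 0, 10)   # term occurs only inside the class
negative = signed_chi2(0, 10, 10, 0)   # term occurs only outside the class
A = normalize_rows([[positive, negative], [negative, positive]])
distance = euclidean(A[0], A[1])
```

The sign keeps the direction of the association, so a term that is typical of a class and a term that is typical of its complement end up far apart after normalization instead of receiving the same score.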
The user interest extraction module comprises a behavior information analysis submodule, a content information analysis submodule, and an ensemble learning submodule.
The behavior information analysis submodule takes the user behavior features, performs statistics, screening, and dimensionality reduction on the time series, forms the user's behavior interest, and outputs it as the user's current behavior interest.
The content information analysis submodule takes the URLs of the user's interest-related web pages, performs text processing on the page content, extracts the page topics, and forms the user's content interest from those topics and other attribute information of the pages, outputting it as the user's current content interest.
The ensemble learning submodule applies ensemble learning to the current behavior interest and current content interest to form the user interest, which it outputs as the user's current interest.
User interest is thus divided into two parts, behavior interest and content interest, which are extracted by the behavior information analysis submodule and the user content interest analysis submodule respectively and finally combined by the ensemble learning submodule.
User usage behavior analysis submodule: classifies the user's current behavior features with a decision tree algorithm to obtain the user's current behavior interest.
User content interest is extracted submodule: the webpage to the current category of interest of user carries out text analyzing, obtains the web page text attribute information, according to the web page text attribute information, obtains the current content interest of user, and step is:
(1) obtains corresponding keyword and index thereof;
(2) calculate the user to the attention rate of keyword;
(3) according to the attention rate threshold value, obtain the current content interest of user.
The keyword acquisition process comprises:
1. segment the full text into words (so that Chinese words are separated by spaces, as in English, which makes processing easier);
2. filter out stop words (words with little semantic meaning, such as function words and some high-frequency words; because stop words appear in many documents, they contribute nothing to information analysis);
3. extract the text title and store the title word set in vector V_h;
4. extract the first paragraph, second paragraph and last paragraph of the text and store the content word set in vector V_c;
5. if |V_h ∩ V_c| < P, judge the text title to be an "abstract type" title, where P is a given threshold, set to 3 on the basis of experiment;
6. if x ∈ {query dictionary}, the text title is likewise judged to be an "abstract type" title, where x denotes any word taken from the title word set V_h (formula image BDA00003146319500151 is not reproduced here);
7. if the title has neither the feature in 5 nor the feature in 6, judge it to be a "concrete type" title;
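The title test in steps 3 to 7 could be sketched as below. This is a minimal sketch under stated assumptions: the function name, the sample word lists and the contents of the query dictionary are hypothetical, and because the formula image for step 6 is missing, the second test follows the translated wording (a title word that belongs to the query dictionary) rather than a confirmed definition.

```python
def classify_title(title_words, content_words, query_dictionary, p=3):
    """Return "abstract" or "concrete" for a segmented, stop-word-filtered text."""
    v_h = set(title_words)    # title word set
    v_c = set(content_words)  # words of the first, second and last paragraphs
    # Step 5: too little overlap between title and content.
    if len(v_h & v_c) < p:
        return "abstract"
    # Step 6 (assumed reading): some title word is in the query dictionary.
    if any(word in query_dictionary for word in v_h):
        return "abstract"
    # Step 7: neither feature applies.
    return "concrete"

kind = classify_title(
    ["apple", "phone", "price", "review"],
    ["apple", "phone", "price", "battery"],
    query_dictionary=set(),
)
```

The threshold p defaults to 3, the value the text reports as chosen by experiment.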
For an "abstract type" title, the TF-IDF method is used to find words in the text whose weight exceeds a given threshold as candidate words; whether a candidate word is a keyword is then decided from its position (the higher the weight of the sentence that contains it, the more likely it is to be a keyword).
For a "concrete type" title, the nouns and verbs obtained after segmenting the title are taken as the keywords of the text. When sentence weights are calculated, words in the title word list are given a larger weighting factor.
With the above method the weight of each sentence can be calculated, which provides a basis for subsequent processing, and the weights in the keyword list are updated; the keyword linked list of each article holds that article's keywords sorted by weight.
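The candidate-word selection for "abstract type" titles can be illustrated with a small TF-IDF routine. This is a sketch only: the smoothed IDF formula, the function names and the toy corpus are assumptions, and the sentence-position test that follows candidate selection is not modelled.

```python
import math
from collections import Counter

def tfidf(doc_tokens, corpus):
    """TF-IDF weight of every word in doc_tokens, given a list of tokenised documents."""
    counts = Counter(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for word, count in counts.items():
        df = sum(1 for doc in corpus if word in doc)  # document frequency
        # Assumption: add-one smoothing so that df == 0 cannot divide by zero.
        idf = math.log((1 + n_docs) / (1 + df))
        weights[word] = (count / len(doc_tokens)) * idf
    return weights

def candidate_words(doc_tokens, corpus, threshold):
    """Words whose TF-IDF weight exceeds threshold, heaviest first."""
    weights = tfidf(doc_tokens, corpus)
    selected = [w for w, x in weights.items() if x > threshold]
    return sorted(selected, key=lambda w: weights[w], reverse=True)

corpus = [["wifi", "push", "wifi"], ["push", "news"], ["push", "sports"]]
candidates = candidate_words(corpus[0], corpus, threshold=0.1)
```

A word that occurs in every document, such as "push" in the toy corpus, receives a weight of zero and is therefore never selected as a candidate.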
Attention calculation: by analysing the content the user browses and the user's browsing behaviour, the user's attention to each interest topic can be calculated quantitatively. The calculation steps are:
1. add the keywords of all topic vectors under the same category A to that category's keyword list K;
2. unify the duplicate keywords that occur while keywords are added under the same category; a duplicate keyword triggers the aggregation of candidate similar topics, and all web pages under that word are gathered to form a candidate similar-topic group;
3. for the candidate similar-topic group of each duplicate keyword, compare the original weights of the word in the topic vectors of the group and take the topic vector with the largest weight as the core topic that represents the group (and add it to K);
4. calculate the similarity between the core topic and each topic vector in its candidate similar-topic group and set a threshold; every topic vector that exceeds the threshold is added to topic group Ki, which forms a similar-topic group, that is, a topic Ki;
5. take the core topic found above as the representative of topic Ki, sum the frequencies of the topics of all topic vectors in Ki to obtain the adjusted heat of the core topic, and add the adjusted core topic to the candidate hot-topic list;
6. calculate the attention of each topic in K using the heat measurement method described above.
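Steps 2 to 5 above can be sketched as follows. The sketch makes several assumptions that are not stated in the original: topic vectors are represented as keyword-to-weight dictionaries, similarity is measured with the cosine, and the field names `vector` and `freq` as well as the sample data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse keyword-weight vectors."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def build_topic(keyword, topics, threshold):
    """Form topic Ki around one duplicate keyword.

    Returns (core_topic, members, heat).
    """
    # Step 2: every topic vector containing the keyword is a candidate.
    group = [t for t in topics if keyword in t["vector"]]
    # Step 3: the vector with the largest weight for the keyword is the core.
    core = max(group, key=lambda t: t["vector"][keyword])
    # Step 4: keep the candidates that are similar enough to the core.
    members = [t for t in group if cosine(core["vector"], t["vector"]) > threshold]
    # Step 5: the adjusted heat is the summed frequency of the members.
    heat = sum(t["freq"] for t in members)
    return core, members, heat

topics = [
    {"vector": {"hotel": 0.9, "beach": 0.1}, "freq": 3},
    {"vector": {"hotel": 0.5, "beach": 0.2}, "freq": 2},
    {"vector": {"hotel": 0.1, "stock": 0.9}, "freq": 5},
    {"vector": {"football": 1.0}, "freq": 7},
]
core, members, heat = build_topic("hotel", topics, threshold=0.8)
```

In this example the third vector shares the keyword "hotel" but is dominated by another term, so it falls below the similarity threshold and does not contribute to the heat of the topic.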
The ensemble learning submodule trains different classifiers, namely decision-tree weak classifiers, on the same training set and then combines these weak classifiers into a stronger final classifier, which produces the final classification of user interest. The AdaBoost algorithm is used to iteratively adjust the results of the user behaviour classifier and the user content interest classifier, giving the weights of the different decision-tree weak classifiers and hence the user's current interest.
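For orientation, a generic AdaBoost loop with one-level decision trees (decision stumps) as weak classifiers is sketched below. It is a textbook two-class version written for this illustration, not the classifier of the embodiment: the feature layout, the labels +1/-1, the number of rounds and all names are assumptions.

```python
import math

def stump_predict(row, feature, threshold, sign):
    """One-level decision tree: sign if the feature is at or below threshold, else -sign."""
    return sign if row[feature] <= threshold else -sign

def train_stump(X, y, w):
    """Return (error, feature, threshold, sign) with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, feature, threshold, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost(X, y, rounds=5):
    """Train a weighted set of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, feature, threshold, sign = train_stump(X, y, w)
        if err >= 0.5:
            break  # the weak classifier is no better than chance
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this weak classifier
        model.append((alpha, feature, threshold, sign))
        # Increase the weight of misclassified samples, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(row, f, t, s) for alpha, f, t, s in model)
    return 1 if score >= 0 else -1

# Hypothetical features: [behaviour score, content score]; label +1 = interested.
X = [[0.1, 1.0], [0.2, 0.9], [0.8, 0.2], [0.9, 0.1]]
y = [1, 1, -1, -1]
model = adaboost(X, y)
```

The value `alpha` computed in each round plays the role of the weak-classifier weight that the text says AdaBoost obtains.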
A user interest preference comprises the interest item, interest category, attention and generation time. In a concrete implementation, the user's interest preference can be expressed as a tree: the upper level of the tree represents the type of interest preference, and the lower level represents interest subclasses or topics. The tree structure can store both the confidence of the user's interest types and the user's interest feature words. Fig. 7 shows an exemplary tree structure of the user interest preference of the present invention.
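One possible in-memory form of such a tree is sketched below. The class name, field names and sample categories are hypothetical and chosen only for illustration; Fig. 7 is not reproduced in this text, so the sketch is not claimed to match it.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InterestNode:
    """A category (upper level) or a subclass/topic (lower level) of the tree."""
    name: str
    attention: float = 0.0      # attention paid to this item
    generated_at: str = ""      # generation time, kept as a plain string here
    keywords: List[str] = field(default_factory=list)  # interest feature words
    children: List["InterestNode"] = field(default_factory=list)

def topics_above(node, min_attention):
    """Names of leaf topics whose attention is at least min_attention."""
    if not node.children:
        return [node.name] if node.attention >= min_attention else []
    found = []
    for child in node.children:
        found.extend(topics_above(child, min_attention))
    return found

root = InterestNode("user-interest", children=[
    InterestNode("travel", attention=0.7, children=[
        InterestNode("hotels", attention=0.6, keywords=["hotel", "booking"]),
        InterestNode("flights", attention=0.2, keywords=["ticket"]),
    ]),
    InterestNode("sports", attention=0.3, children=[
        InterestNode("football", attention=0.5, keywords=["league"]),
    ]),
])
strong_topics = topics_above(root, 0.5)
```

Keeping the attention on every node allows the same traversal to serve either level, for example to list whole categories instead of individual topics.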
Data service push module: uses the rule association strategy to judge whether the user's interest and preference are suited to a local service. If the conditions for a local service are met, the location analysis module is triggered to obtain the current browsing position; otherwise, a general relevance-based information push service is performed.
The conditions for judging a local service may be:
(1) the category of the website the user is currently browsing, such as catering, shopping, lodging or traffic service websites, or the city edition of a value-added service provider;
(2) the category of the user's current interest, such as weather, traffic queries, ticket booking, discounts, tourist attractions or speciality products.
The above conditions can be combined. For example, if the website the user is currently browsing is the city edition of a housing search website and the interest reflected by the browsed pages is renting a home, a localized service recommendation is suitable.
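The rule check can be sketched as a simple lookup. The category labels below are hypothetical identifiers derived from the examples in the text, and the `require_both` switch is an assumption that models the remark that the conditions can be combined.

```python
# Hypothetical category labels taken from the examples in the description.
LOCAL_SITE_CATEGORIES = {
    "catering", "shopping", "lodging", "traffic", "city_value_added_service",
}
LOCAL_INTEREST_CATEGORIES = {
    "weather", "traffic_query", "ticket_booking", "discount",
    "tourist_attraction", "speciality_product", "renting",
}

def is_local_service(site_category, interest_category, require_both=False):
    """Decide whether the current interest should trigger the location analysis."""
    site_hit = site_category in LOCAL_SITE_CATEGORIES
    interest_hit = interest_category in LOCAL_INTEREST_CATEGORIES
    if require_both:
        return site_hit and interest_hit  # combined rule
    return site_hit or interest_hit       # either rule alone is enough

decision = is_local_service("lodging", "renting", require_both=True)
```

When the function returns False, the flow described above falls back to the general relevance-based push.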
The location analysis module obtains the current browsing position through the GMLC gateway, that is, the geographic position of the user while browsing the current web page. Fig. 9 is the operational flowchart of the location analysis module of the present invention.
Before the location analysis module sends position information to the service push module, there is a further step in which the location analysis module, based on the obtained browsing position information, customizes a URL, or URL page content, associated with the current position of the mobile terminal user. Fig. 10 is the flowchart of position information association of the present invention.
Location association information base: records the service information offered by geographically identical or nearby places, site attribute information, and the like. (The example table is an image in the original, Figure BDA00003146319500181, and is not reproduced here.)
Location search matching: the process of matching the user interest preference and the user position information against the corresponding location association information specifically comprises:
(1) use the user's current position information as the query keyword for a location association query, and obtain the position information records consistent with the keyword;
(2) match the category of the user's current interest preference against the service information provided in the location association information and calculate the matching degree; if the matching degree exceeds a given threshold, output that location association information;
1. if there are many matching results, match the topic of the user's current interest preference against the service information provided in the location association information and calculate the matching degree;
2. sort by matching degree;
3. output the position information whose matching degree exceeds the threshold.
(3) otherwise, use the core position in the user position information as the query keyword for a location association query, obtain the position information records consistent with the keyword, and go to (2);
The above steps analyse and associate positions that are identical or close to the user's current position.
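The query-then-fallback flow of steps (1) to (3) might look like the sketch below. The record layout, the place names and the use of the Jaccard coefficient as the matching degree are assumptions made for this illustration; the original does not specify how the matching degree is computed.

```python
def match_degree(interest_terms, service_terms):
    """Assumed matching degree: Jaccard overlap of two term sets."""
    a, b = set(interest_terms), set(service_terms)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_location(records, location, core_location, interest_terms, threshold):
    """Try the exact position first, then the coarser core position."""
    for key in (location, core_location):
        hits = [r for r in records if r["location"] == key]            # step (1)/(3)
        scored = [(match_degree(interest_terms, r["services"]), r) for r in hits]
        scored = [(s, r) for s, r in scored if s > threshold]          # step (2)
        if scored:
            scored.sort(key=lambda pair: pair[0], reverse=True)        # sort by degree
            return [r for _, r in scored]
    return []  # nothing suitable at or near the current position

records = [
    {"location": "Plaza A", "services": ["coffee", "wifi", "discount"]},
    {"location": "Plaza A", "services": ["cinema"]},
    {"location": "District 1", "services": ["hotel", "discount"]},
]
nearby = match_location(records, "Plaza A", "District 1", ["discount", "coffee"], 0.3)
```

An empty result corresponds to the case in which every matching degree is below the threshold, so that a suitable place has to be searched for by interest rather than by position.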
If every matching degree above is below the preset threshold, there is no place or service suited to the interest preference at the user's current position. A suitable place or service therefore has to be found according to the user's interest and preference.
Target location analysis: the target location is information that matches the user's interest and preference, and includes an address or a scene. The process comprises:
(1) use the topic of the user's current interest preference as the query keyword for a location association query, obtain the position information records consistent with the keyword, and output that location association information;
(2) if there is no consistent position information record, calculate the matching degree between the topic of the user's current interest preference and the service information provided in the location association information:
1. sort by matching degree;
2. output the position information whose matching degree exceeds the threshold.
(3) pass the output position information to the route recommendation unit.
The route recommendation unit comprises:
(1) a recommended route generation unit for calculating and selecting route data;
(2) outputting the route data, thereby generating the recommended route for travelling from the departure place to the destination;
(3) a display unit for showing the display information.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention may be modified or replaced by equivalents, and that such modifications or equivalent replacements do not cause the modified technical solution to depart from the spirit and scope of the technical solution of the present invention.

Claims (8)

1. A data service pushing system based on a wireless network, characterized by comprising a time window adjustment and web page data classification statistics module, a user interest extraction module, a data service push module and a location analysis module, wherein:
the time window adjustment and web page data classification statistics module receives the URLs of browsed pages from the wireless gateway and filters the web pages the user browsed within a preceding period of time, obtaining the user's interest-related web pages and the user behaviour features;
the user interest extraction module obtains the user's current interest according to the user's interest-related web pages and the user behaviour features;
the location analysis module obtains the user's current browsing position information through the GMLC gateway;
the data service push module, according to the current user interest output by the user interest extraction module, uses a rule association strategy to judge whether to perform a localized information push service; for a current user interest that does not meet the localized service features, the service push module matches it against the corresponding pre-push information and chooses the pushed information with the highest matching degree according to the matching result; for a current user interest that meets the localized service features, location association information is obtained according to the user's current browsing position information from the location analysis module, a matching strategy is then used to match the user's current interest against the location association information, and the location association information with the highest matching degree is selected as the pushed information according to the matching result and pushed to the mobile terminal.
2. The data service pushing system based on a wireless network as claimed in claim 1, characterized in that the time window adjustment and web page data classification statistics module comprises a time window adjustment submodule and a web page data classification statistics submodule;
the time window adjustment submodule automatically adjusts the range of the time window so that the system processes the web pages the user browsed within the time window;
the web page data classification statistics submodule comprises a behavioural information statistics submodule and a web page classification submodule; the behavioural information statistics submodule obtains the user behaviour features, and the web page classification submodule obtains the user's interest-related web pages.
3. The data service pushing system based on a wireless network as claimed in claim 1, characterized in that the user interest extraction module comprises a behavioural information analysis submodule, a content information analysis submodule and an ensemble learning submodule;
the behavioural information analysis submodule takes the user behaviour features, performs statistics, screening and dimensionality reduction on the time series, forms the user behaviour interest, and outputs the user's current behaviour interest;
the content information analysis submodule takes the URLs of the user's interest-related web pages, performs text processing on the page content, extracts the page topic, and forms the user content interest according to said page topic and the other attribute information of the page, outputting the user's current content interest;
the ensemble learning submodule applies ensemble learning to the user's current behaviour interest and current content interest, forms the user interest, and outputs the user's current interest.
4. The data service pushing system based on a wireless network as claimed in claim 3, characterized in that the web page classification submodule comprises a web page text acquisition submodule, a web page text classification submodule, a visit frequency statistics submodule and a user current content interest determination submodule; the web page text acquisition submodule filters the web pages the user browsed within the above time window to obtain a group of related web pages, and obtains the text content of each page from the URL of the visited page; the web page text classification submodule classifies the text content; the visit frequency statistics submodule counts the visit frequency of each class; and the user current content interest determination submodule takes the web page set with the highest visit frequency as the user's interest-related web pages.
5. A data service pushing method based on a wireless network, characterized by comprising the steps of:
receiving the URLs of browsed pages from the wireless gateway and filtering the web pages the user browsed within a preceding period of time, to obtain the user's interest-related web pages and the user behaviour features;
obtaining the user's current interest according to the user's interest-related web pages and the user behaviour features;
obtaining the user's current browsing position information through the GMLC gateway;
according to the current user interest output by the user interest extraction module, using a rule association strategy to judge whether to perform a localized information push service; for a current user interest that does not meet the localized service features, the service push module matches it against the corresponding pre-push information and chooses the pushed information with the highest matching degree according to the matching result; for a current user interest that meets the localized service features, obtaining location association information according to the user's current browsing position information from the location analysis module, then using a matching strategy to match the user's current interest against the location association information, selecting the location association information with the highest matching degree as the pushed information according to the matching result, and pushing it to the mobile terminal.
6. The data service pushing method based on a wireless network as claimed in claim 5, characterized in that:
the range of the time window can be adjusted automatically so that the system processes the web pages the user browsed within the time window.
7. The data service pushing method based on a wireless network as claimed in claim 5, characterized in that the step of obtaining the user's current interest comprises:
according to the user behaviour features, performing statistics, screening and dimensionality reduction on the time series to form the user behaviour interest, and outputting the user's current behaviour interest;
according to the URLs of the user's interest-related web pages, performing text processing on the page content, extracting the page topic, and forming the user content interest according to said page topic and the other attribute information of the page, and outputting the user's current content interest;
according to the user's current behaviour interest and current content interest, applying ensemble learning to form the user interest, and outputting the user's current interest.
8. The data service pushing method based on a wireless network as claimed in claim 7, characterized in that the step of obtaining the user's interest-related web pages comprises: filtering the web pages the user browsed within the above time window to obtain a group of related web pages; obtaining the text content of each page from the URL of the visited page; classifying the text content; counting the visit frequency of each class; and taking the web page set with the highest visit frequency as the user's interest-related web pages.
CN2013101682183A 2013-05-06 2013-05-06 Wireless network based data traffic pushing system and method Pending CN103246725A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013101682183A CN103246725A (en) 2013-05-06 2013-05-06 Wireless network based data traffic pushing system and method

Publications (1)

Publication Number Publication Date
CN103246725A true CN103246725A (en) 2013-08-14

Family

ID=48926245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013101682183A Pending CN103246725A (en) 2013-05-06 2013-05-06 Wireless network based data traffic pushing system and method

Country Status (1)

Country Link
CN (1) CN103246725A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130814