CN110110219A - Method and device for determining user preference according to network behavior - Google Patents

Method and device for determining user preference according to network behavior

Info

Publication number
CN110110219A
Authority
CN
China
Prior art keywords
user
webpage
category
access
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810108024.7A
Other languages
Chinese (zh)
Other versions
CN110110219B (en)
Inventor
陈实如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FOUNDER BROADBAND NETWORK SERVICE Co Ltd
Peking University Founder Group Co Ltd
Original Assignee
FOUNDER BROADBAND NETWORK SERVICE Co Ltd
Peking University Founder Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by FOUNDER BROADBAND NETWORK SERVICE Co Ltd, Peking University Founder Group Co Ltd filed Critical FOUNDER BROADBAND NETWORK SERVICE Co Ltd
Priority to CN201810108024.7A priority Critical patent/CN110110219B/en
Publication of CN110110219A publication Critical patent/CN110110219A/en
Application granted granted Critical
Publication of CN110110219B publication Critical patent/CN110110219B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention provides a method and device for determining user preference according to network behavior. The method includes: obtaining access information of a user, where the access information includes webpage information and access time; determining, according to the webpage information, the category of each webpage the user accessed; determining, according to the access time and the webpage categories, the number of times the user accessed each category of webpage on each day within a preset period; determining, according to these counts, the user's average access count and access-count variance for each category of webpage; and determining the user's preference within the preset period according to the average count and count variance for each category. The scheme makes full use of the access information generated when the user goes online to determine user preference, so that a network operator can better understand what users like and provide more targeted services.

Description

Method and device for determining user preference according to network behavior
Technical field
The present invention relates to Internet technology, and in particular to a method and device for determining user preference according to network behavior.
Background
With the development of Internet technology, users' demands on network services keep rising. A network operator needs to understand user preferences and, based on them, upgrade and optimize the network and design precisely targeted service packages, thereby raising its service level to meet growing user demand.
The inventors found that when a user accesses the network, the broadband operator can record the user's Internet access information and store it in a database. How to use this information to determine the user's preference is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The present invention provides a method and device for determining user preference according to network behavior. The access information of a user is obtained; from it, the number of times the user accesses each category of webpage within a preset period is counted; from these counts, the user's average access count and access-count variance for each category are calculated; and from the calculation results, the user's preference within the preset period is determined. The scheme makes full use of the access information generated when the user goes online to determine user preference, so that a network operator can better understand what users like and provide more targeted services.
A first aspect of the invention provides a method for determining user preference according to network behavior, comprising:
obtaining access information of a user, wherein the access information includes webpage information and access time;
determining, according to the webpage information, the category to which each webpage accessed by the user belongs;
determining, according to the access time and the category to which each webpage belongs, the number of times the user accesses each category of webpage per day within a preset period;
determining, according to the counts, the user's average access count for each category of webpage and the variance of the access counts for each category;
determining the user's preference within the preset period according to the average count and the count variance for each category of webpage.
Another aspect of the invention provides a device for determining user preference according to network behavior, comprising:
an obtaining module, configured to obtain access information of a user, wherein the access information includes webpage information and access time;
a category determining module, configured to determine, according to the webpage information, the category to which each webpage accessed by the user belongs;
a count determining module, configured to determine, according to the access time and the category to which each webpage belongs, the number of times the user accesses each category of webpage per day within a preset period;
a computing module, configured to determine, according to the counts, the user's average access count for each category of webpage and the variance of the access counts for each category;
a preference determining module, configured to determine the user's preference within the preset period according to the average count and the count variance for each category of webpage.
The technical effects of the method and device provided by the invention are as follows:
The method and device provided in this embodiment obtain the access information generated when a user accesses webpages, where the access information includes webpage information and access time; determine, according to the webpage information, the category of each webpage the user accessed; determine, according to the access time and the webpage categories, the number of times the user accesses each category of webpage per day within a preset period; determine, according to the counts, the user's average access count and access-count variance for each category; and determine the user's preference within the preset period according to the average count and count variance for each category. The method and device make full use of the access information generated while the user browses webpages and determine the user's preference accurately, so that the network provider understands the user's preference and can raise its service level.
Brief description of the drawings
Fig. 1 is a flowchart of a method for determining user preference according to network behavior according to an exemplary embodiment of the invention;
Fig. 2 is a flowchart of a method for determining user preference according to network behavior according to another exemplary embodiment of the invention;
Fig. 3 is a flowchart of a method for determining user preference according to network behavior according to a further exemplary embodiment of the invention;
Fig. 4 is a flowchart of a method for determining user preference according to network behavior according to yet another exemplary embodiment of the invention;
Fig. 5 is a structural diagram of a device for determining user preference according to network behavior according to an exemplary embodiment of the invention;
Fig. 6 is a structural diagram of a device for determining user preference according to network behavior according to another exemplary embodiment of the invention.
Detailed description of the embodiments
Fig. 1 is a flowchart of a method for determining user preference according to network behavior according to an exemplary embodiment of the invention.
As shown in Fig. 1, the method provided in this embodiment includes:
Step 101: obtain the access information of a user, where the access information includes webpage information and access time.
Specifically, a user generates a large amount of access information while browsing webpages. What users browse online is usually content they are interested in, so user preferences can be determined from the access information generated while they are online.
The user's access information may be recorded on the broadband operator side and then stored in a database. The browser used by the user may also collect the user's access information and send it to the browser's back-end database, so that the back-end database stores it. The recorded data may include the online time, user ID, source IP address, destination IP address, url list, access terminal type, and so on. When preference analysis is needed for a user, this information can be read directly from the database. In addition, the method provided in this embodiment may be stored in the memory of a server and run by the server's processor, so that the server can execute it. The method may also be packaged as an application program and installed on a server so that the server can run it.
Further, the access information generated when the user accesses webpages can be extracted from the collected data; either all of the user's access information or only the required part may be obtained. The access information should also be grouped by user identifier, so that each user's preference is determined from the access information corresponding to that identifier. The user identifier may be the user's mobile phone number, account, IP address, and so on.
In practice, it is sufficient to obtain only the webpage information and access time generated when the user accesses webpages. The webpage information specifically includes the url, the webpage content, and so on; for example, the url is http://games.sina.com. The access time is the time at which the user accessed the webpage; for example, the user accessed a Sina games webpage on November 23, 2017.
The user's access information may also be obtained daily, with each day's access information analyzed as it is obtained.
Step 102: determine, according to the webpage information, the category to which each webpage accessed by the user belongs.
The obtained webpage information can be parsed. If the webpage information is a url, keywords can be extracted from the url according to preset rules, and the webpage category determined from the keywords. For example, symbols such as "." and "//" can first be removed from the url to obtain the word combination {http games sina com}; the category corresponding to this word combination is then looked up in a preset class library and found to be "game", which determines the category of the webpage the user accessed.
The class library may hold all webpage categories or only some of them, for example only the categories that need to be investigated. If no corresponding category can be found in the class library when the webpage information is parsed, the access record may be discarded, or it may be stored as abnormal data for maintenance personnel to handle. If the class library holds every webpage category that can currently be determined, together with the corresponding keywords, then a failure to find a category probably means the class library is not rich enough. In that case the access record can be stored as abnormal data, and maintenance personnel can use it to add categories or keywords to the class library; in other words, the class library can be enriched from the access information that is obtained. If, on the other hand, the class library holds only the categories under investigation and their keywords, webpage information that matches no category can be regarded as outside the scope of the investigation, and the access record need not be counted.
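The lookup described in step 102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `CLASS_LIBRARY` contents, the function name `classify_url`, and the rule "first matching token wins" are all assumptions made for the example.

```python
import re

# Hypothetical class library: keyword -> webpage category (illustrative entries only).
CLASS_LIBRARY = {"games": "game", "news": "news", "sports": "sports"}


def classify_url(url, library=CLASS_LIBRARY):
    """Return the category of the first URL token found in the library, or None."""
    # Split on every run of non-alphanumeric characters, so that
    # "http://games.sina.com" becomes ["http", "games", "sina", "com"].
    tokens = [t for t in re.split(r"[^0-9A-Za-z]+", url.lower()) if t]
    for token in tokens:
        if token in library:
            return library[token]
    # No match: the caller may discard the record or store it as abnormal data.
    return None


category = classify_url("http://games.sina.com")
```

Returning `None` for an unmatched url leaves the choice between discarding the record and logging it for maintenance to the caller, which mirrors the two options the text describes.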
Step 103: determine, according to the access time and the category to which each webpage belongs, the number of times the user accesses each category of webpage per day within the preset period.
Specifically, all webpage information accessed by the user on the same day can be selected according to the access time. Using the result of step 102, the category of each selected item is determined, and the number of occurrences of each category is counted; this is the number of times the user accessed that category of webpage within the day. If the user browsed several webpages in one day, several webpage categories may be determined for that day. If the user browsed no webpages on a given day, the data for that day can be filled with 0. For example, the user accessed game websites 5 times on the first day and 0 times on the second day.
Further, the categories corresponding to the webpage information accessed by the user on one day can be traversed. When a category is encountered for the first time, its access count is set to 1; each time it is encountered again, the count is increased by 1. Alternatively, the access count of every category can be initialized to 0 and increased by 1 each time that category is encountered. Other ways of counting the user's daily accesses per category may also be used; no limitation is imposed here.
The preset period can be set in advance as required, for example five days, one week, or one month. The number of times the user accesses each category of webpage per day is then determined with the preset period as the unit. For example, with a unit of five days, the daily access count for each category is determined over five consecutive days.
Specifically, to make it easier to calculate with or analyze the statistics, a first matrix can be built from the determined data:
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
where $a_{ij}$ is the number of times the user accesses the i-th category of webpage on the j-th day.
Each row of matrix A holds the number of times per day that the user accesses the category of webpage corresponding to that row; that is, all data in one row belong to the same webpage category, and the first row, for example, holds the data for the first category. Each column of matrix A holds the number of times the user accesses each category on one day; that is, all data in one column were generated on the same date, and the first column, for example, holds the access counts for every category on the first day of the preset period. Matrix A covers m webpage categories in total, and the preset period is n days.
Further, the value of m can be determined by the webpage categories that need to be investigated. For example, if 5 categories are to be investigated, m can be set to 5, and m equals 5 in every preset period.
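Building the first matrix from classified access records can be sketched as below. The record layout (a list of `(date, category)` pairs), the function name `build_count_matrix`, and the sample data are assumptions made for illustration only.

```python
from datetime import date


def build_count_matrix(records, categories, start, n_days):
    """Return A as m rows by n columns; A[i][j] counts visits to category i on day j."""
    row_of = {c: i for i, c in enumerate(categories)}
    # Initialize every count to 0 so that days without visits stay 0.
    matrix = [[0] * n_days for _ in categories]
    for day, category in records:
        j = (day - start).days  # column index: offset from the first day of the period
        # Skip categories that are not investigated and days outside the period.
        if category in row_of and 0 <= j < n_days:
            matrix[row_of[category]][j] += 1
    return matrix


# Hypothetical classified records for one user over a five-day period.
records = [
    (date(2017, 11, 23), "game"),
    (date(2017, 11, 23), "game"),
    (date(2017, 11, 24), "news"),
    (date(2017, 11, 27), "game"),
]
A = build_count_matrix(records, ["game", "news"], date(2017, 11, 23), 5)
```

Here the first row of `A` holds the daily counts for "game" and the second row those for "news", matching the row and column meaning given for matrix A.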
Step 104: determine, according to the counts, the user's average access count for each category of webpage and the variance of the access counts for each category.
The user's average access count for each category of webpage can be calculated as:
$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} a_{ij}$$
That is, the numbers of times the user accesses the i-th category of webpage over the n days are summed and divided by n, which gives the user's average daily access count for the i-th category.
The variance of the user's access counts for each category of webpage can be calculated as:
$$\sigma_i^2 = \frac{1}{n}\sum_{j=1}^{n}\left(a_{ij} - \mu_i\right)^2$$
where $\sigma_i^2$ is the variance of the user's access counts for the i-th category of webpage.
The variance $\sigma_i^2$ shows how far the user's daily access counts for the i-th category deviate from the average daily access count for that category.
To make the statistics easier to handle, a second matrix can also be built from the average counts and count variances. It holds, for each category of webpage, the user's average access count and access-count variance within one preset period.
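Step 104 can be sketched with the Python standard library as follows. The sketch assumes the population variance (division by n) and represents the per-category results as a list of `(mean, variance)` pairs; both choices, the function name `row_statistics`, and the sample counts are assumptions for illustration.

```python
from statistics import fmean, pvariance


def row_statistics(matrix):
    """Return one (mean, variance) pair per row, i.e. per webpage category."""
    # pvariance divides by n (population variance); this is an assumption.
    return [(fmean(row), pvariance(row)) for row in matrix]


# Hypothetical five-day counts: a steady category and a one-day burst.
A = [
    [5, 0, 3, 4, 3],
    [40, 0, 0, 0, 0],
]
stats = row_statistics(A)
```

The second row shows why the variance matters: a single day with 40 visits still yields an average of 8 per day, but the variance is far larger than for the steadily visited first row.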
Step 105: determine the user's preference within the preset period according to the average count and count variance for each category of webpage.
One option is to keep counting the user's daily accesses to each category of webpage, calculate the user's total access count for each category, and take the categories with the highest totals as the user's preference. The categories can also be ranked by the user's average daily access count. With these approaches, once the user's daily access counts per category have been computed, the access information stored in the database can be handled as usual, for example overwritten or cleared. The method provided in this embodiment thus makes full use of the information for preference analysis before it is disposed of, and disposing of it afterwards does not affect the analysis results.
Specifically, the user's preference can also be determined from the daily access counts for each category within one preset period, and then refined using the results determined over multiple preset periods.
Further, in the scheme provided in this embodiment, the user's preference within the preset period can be determined from the user's average access count and count variance for each category of webpage.
The average count directly characterizes how often the user accesses each category of webpage: if the user's average access count for a category is high, the user can be considered to pay relatively close attention to that kind of content. The count variance can also be used to determine the user's preference. When the user's daily access counts for the i-th category are fairly even and stable, σi is small. Conversely, if the daily access counts are uneven, for example the user accessed webpages of the i-th category 40 times on the first day but did not view that category on any other day, σi is large. The average count and the count variance can therefore be considered together to decide whether the user pays close attention to a category of content. For example, when the average count is greater than a count threshold and the count variance is less than a variance threshold, the user is determined to be interested in that category of webpage content, which determines the user's preference.
In practice, among the many categories of webpage the user accesses within the preset period, several may be determined to interest the user, in which case multiple user preferences are obtained. For example, the user may be interested in both sports and real estate.
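The two-threshold rule from the example above can be sketched as follows. The threshold values, the category names, and the per-category statistics are hypothetical; the text does not specify any of them.

```python
def preferred_categories(stats, count_threshold, variance_threshold):
    """Keep categories whose mean is high enough and whose variance is low enough."""
    return [
        category
        for category, (mean, variance) in stats.items()
        if mean > count_threshold and variance < variance_threshold
    ]


# Hypothetical (average count, count variance) per category for one preset period.
stats = {
    "sports": (6.0, 2.0),
    "real estate": (4.0, 1.5),
    "game": (8.0, 256.0),  # high average caused by a single-day burst
    "news": (0.4, 0.2),
}
prefs = preferred_categories(stats, count_threshold=3.0, variance_threshold=10.0)
```

With these illustrative numbers, "game" is excluded despite having the highest average because its variance exceeds the variance threshold, and more than one category can be returned as a preference.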
The method provided in this embodiment obtains the access information generated when a user accesses webpages, where the access information includes webpage information and access time; determines, according to the webpage information, the category of each webpage the user accessed; determines, according to the access time and the webpage categories, the number of times the user accesses each category of webpage per day within the preset period; determines, according to the counts, the user's average access count and access-count variance for each category; and determines the user's preference within the preset period according to the average count and count variance for each category. The method makes full use of the access information generated while the user browses webpages. Because it considers the average count and the count variance together, it determines the user's preference more accurately, so that the network provider understands the user's preference and can raise its service level.
Fig. 2 is a flowchart of a method for determining user preference according to network behavior according to another exemplary embodiment of the invention.
As shown in Fig. 2, the method provided in this embodiment includes:
Step 201: obtain the access information generated when a user accesses webpages, where the access information includes webpage information and access time.
The principle and implementation of step 201 are the same as those of step 101 and are not repeated here.
Step 202: extract keywords from the uniform resource locator (url) of the webpage and/or from the content of the webpage.
The keyword of a webpage can be determined from its url. The url of a webpage is a concise representation of the location of a resource available on the Internet and of the method for accessing it; it is the address of a standard resource on the Internet. A url contains at least a scheme (protocol) part and the address of the host that holds the resource. The obtained url can be processed according to preset rules. For example, the preset rules may remove symbols such as "." and "//"; they may remove the scheme or protocol part, such as "http", "https", or "ftp"; and they may remove generic domain suffixes and the World Wide Web label, such as "com", "cn", or "www", so that only the useful information in the url remains. For example, for http://www.iqiyi.com/, the word left after the generic content is removed according to the preset rules is "iqiyi", which can be used as the keyword of the webpage.
Keywords can also be extracted from the content of the webpage. In general, a webpage contains the name of its website, so the website name can be obtained from the webpage and used as its keyword. For example, every webpage on the youku.com website carries the Youku logo at the top; the logo can be recognized to determine the keyword of the webpage.
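The url preset rules in step 202 can be sketched with the standard `urllib.parse` module. The set of generic labels below and the function name `url_keywords` are assumptions for illustration; the sketch looks only at the host part of the url and ignores the path and query string.

```python
from urllib.parse import urlsplit

# Illustrative set of generic host labels to drop; a real rule set would be larger.
GENERIC_LABELS = {"www", "com", "cn", "net", "org"}


def url_keywords(url):
    """Return the host labels of a url that are not generic."""
    # urlsplit separates the scheme ("http", "https", "ftp") from the host,
    # and .hostname is already lower-cased; it is None when there is no host.
    host = urlsplit(url).hostname or ""
    return [label for label in host.split(".") if label and label not in GENERIC_LABELS]


keywords = url_keywords("http://www.iqiyi.com/")
```

For the example url from the text this leaves the single keyword "iqiyi"; for http://games.sina.com it leaves both "games" and "sina", so a caller still has to decide which of several keywords to look up.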
Step 203: determine the category to which the webpage belongs by looking up the keyword in a preset class library, and/or use the keyword itself as the category to which the webpage belongs.
A class library can be set up in advance that holds the correspondence between keywords and webpage categories. For example, one webpage category may correspond to several keywords. To make the class library easier to maintain, a correspondence table between keywords and webpage categories can also be provided to store this correspondence, as in Table 1.
Table 1
In Table 1 the rows are arranged by url keyword, that is, one row per url keyword. The rows may instead be arranged by webpage category, that is, one row per category, in which case the url keywords that belong to the same category are placed in one cell.
In practice, Table 1 can be maintained by deleting, adding, or modifying the url keywords and webpage categories in it, as well as the correspondences between them. For example, the webpage categories that need to be investigated and their url keywords can be added to the table. When no keyword contained in the table can be parsed from an item of the user's access information, that access is not recorded. This implementation keeps the class library small and easy to maintain.
If a class library needs to be set up in advance, the method provided in this embodiment may further include:
receiving a correspondence between a keyword and a category, and storing the correspondence in the preset class library.
The correspondence between keyword and category may be uploaded by a user; after it is received, it can be saved in the preset class library.
The correspondence between keyword and category can also be determined by machine learning. When the user's access information is being counted, the preset class library can be checked for the webpage keyword. If the keyword is absent, it can be fed into a self-learning system on the computer, which automatically determines the category corresponding to the keyword, and the correspondence is then stored in the preset class library. An existing machine learning framework can be used for this.
Alternatively, the preset class library can be checked for the keyword and, if the keyword is absent, the keyword is added to the library and its category is set to the keyword itself.
In other words, the keyword can be used directly as the webpage category: when a webpage keyword is extracted that the preset class library does not contain, it is saved directly into the class library with the keyword as its category.
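The fallback just described, in which an unknown keyword becomes its own category, can be sketched with a plain dictionary standing in for the class library. The function name `category_for` and the sample entries are hypothetical.

```python
def category_for(keyword, library):
    """Look up a keyword; if it is unknown, register it as its own category."""
    if keyword not in library:
        library[keyword] = keyword  # the keyword itself serves as the category
    return library[keyword]


# Hypothetical class library with a single known keyword.
library = {"iqiyi": "video"}
known = category_for("iqiyi", library)
unknown = category_for("youku", library)
```

After the second call the library contains an entry for "youku" as well, so the class library grows as new keywords are seen, which is the behavior the text attributes to this variant.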
If the keyword is used directly as the category to which the webpage belongs, the class library can be laid out as a table, as shown in Table 2.
Table 2
With this implementation the class library can cover all webpage categories, and its data are richer.
Similarly, the same approach can be used when webpages are classified by keywords extracted from webpage content; only the class library that is maintained differs, and the details are not repeated here.
Step 204, the classification according to belonging to access time, webpage determines that user accesses every class daily in predetermined period The number of webpage.
The concrete principle and implementation of step 204 and step 103 are all the same, and details are not described herein.
Step 205, according to number, determine that user accesses the average time of every class webpage and the number of the every class webpage of access Variance yields.
The concrete principle and implementation of step 205 and step 104 are all the same, and details are not described herein.
Step 206, according in predetermined period, user accesses the average time of every class webpage, number variance yields determines user Affiliated first category.
Specifically, the average time that according to user in predetermined period, can access every class webpage determines belonging to it The most type of webpage of user's Average visits can be determined as first category belonging to the user, for example, μ by one classification2 It is larger in multiple average values, and webpage classification corresponding with the 2nd class webpage is News class, then being assured that user belongs to News, preference are news.
Further, can also determine first category belonging to multiple users, for example, user simultaneously belong to News class, Videos class.
When practical application, there is also user only due to certain reason, the webpage for browsing a type on the same day compared with The case where more, and other times do not browse such webpage, causes user to access such webpage due to the access times of this day Average time it is more, in such a scenario, directly determine the affiliated first category of user according to Average visits, it will cause The problem for inaccuracy of classifying.
Therefore, the first category may also be determined with reference to the count variance. When the number of times the user accesses category-i webpages each day is fairly even and stable, $\sigma_i$ is small. Conversely, if the daily counts are uneven, for example 40 category-i webpages are accessed on the first day and none on the other days, $\sigma_i$ is large. Therefore, $\sigma_i$ can be compared with a preset value to further determine whether the user belongs to the first category.
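The effect of a one-day spike can be illustrated with a short example. Both series below have the same average of 8 visits per day over a 5-day period, but very different spreads. The helper `is_stable` and the limit of 4.0 are hypothetical stand-ins for the preset value mentioned above, and the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev


def is_stable(daily, max_sigma):
    """Return True when the spread of the daily counts is within max_sigma.

    max_sigma plays the role of the preset value (hypothetical choice).
    """
    return pstdev(daily) <= max_sigma


steady = [8, 8, 8, 8, 8]   # even browsing on every day
spike = [40, 0, 0, 0, 0]   # 40 visits on the first day, none afterwards

same_mean = fmean(steady) == fmean(spike)
steady_ok = is_stable(steady, 4.0)
spike_ok = is_stable(spike, 4.0)
```

Here the steady series has a standard deviation of 0 while the spike series has a standard deviation of 16, so only the steady series passes the check even though the averages are identical.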
In addition, this embodiment provides another method for determining the first category to which the user belongs.
It may be judged whether $\mu_i$ is greater than $\frac{a}{m-1}\sum_{k=1,\,k\neq i}^{m}(\mu_k+\sigma_k)$. If so, the user is determined to belong to first category i. Here, a is a preset correction value, usually set to 0.5; m is the total number of webpage categories; $\mu_i$ is the average number of times the user accesses category-i webpages; $\mu_k$ is the average number of times the user accesses category-k webpages; and $\sigma_k$ is the square root of the count variance for category-k webpages.
The access averages and the variance roots of all webpage categories other than category i are summed and divided by the number of those categories, m-1, which gives the mean of (μ+σ); multiplying by the correction value 0.5 gives the mean of μ and σ. With the parameters of category i removed, $\mu_i$ is then compared with this mean of μ and σ of the other categories; if $\mu_i$ is greater than the computed mean, the user can be considered to belong to first category i. In this way, the user's access to category-i webpages is compared with the overall access pattern before deciding whether the user belongs to first category i, which makes the classification more accurate.
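The comparison described above can be sketched as follows: for each category i, the averages and variance roots of all other categories are summed, divided by m-1, multiplied by the correction value a, and compared with the average for category i. The function name `first_categories` and the dictionary layout are assumptions made for the example.

```python
def first_categories(mu, sigma, a=0.5):
    """Return the categories i whose average exceeds the corrected mean of the rest.

    mu, sigma: {category: average count} and {category: variance root}.
    a: preset correction value (0.5 is the usual setting named in the text).
    """
    m = len(mu)
    if m < 2:
        raise ValueError("at least two webpage categories are required")
    selected = []
    for i in mu:
        # Sum (mu_k + sigma_k) over every category k other than i.
        others = sum(mu[k] + sigma[k] for k in mu if k != i)
        threshold = a * others / (m - 1)
        if mu[i] > threshold:
            selected.append(i)
    return selected


mu = {"News": 10.0, "Videos": 2.0, "Games": 1.0}
sigma = {"News": 1.0, "Videos": 1.0, "Games": 1.0}
result = first_categories(mu, sigma)
```

With these sample values the threshold for News is 0.5 × (3 + 2) / 2 = 1.25, so News qualifies, while the thresholds for Videos (3.25) and Games (3.5) exceed their averages.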
Step 207: determine the preference of the user according to the first category.
After the first category to which the user belongs has been determined, the preference of the user is determined from it. For example, if the user belongs to the News category and the Videos category, the preference of the user can be determined to be news and film and television.
The method provided in this embodiment determines, from the user's access records, the average count and the count variance with which the user accesses each category of webpage within a preset period, then determines from these values the category to which the user belongs, and thereby determines the user preference accurately.
Fig. 3 is a flowchart of a method for determining user preference according to network behavior according to another exemplary embodiment of the present invention.
The method provided in this embodiment can determine the preference of the user according to the user's access information over multiple preset periods.
As shown in Fig. 3, the method for determining user preference according to network behavior provided in this embodiment comprises:
Step 301: determine, for each of P preset periods, the first category to which the user belongs and the average count and count variance of the user's accesses to each category of webpage.
Here, the number P of preset periods to be examined is determined in advance, for example 10 periods or 15 periods; the preference of the user may also be examined over the long term as required. Because the user's access information cannot be stored in the database permanently, when P is large the access counts can be recorded from the access information as soon as it is generated, and after each preset period ends the first category to which the user belongs can be determined from the data of that period. This avoids the situation in which data are purged from the database before they have been used to analyze the user's preference. In addition, when the preference of the user is examined over the long term, P may be a dynamic value: whenever a preset period ends, P is incremented by 1. The first category is then determined for the newly added preset period, and the second category is determined from it together with the first categories determined in the earlier preset periods.
Specifically, the average count and count variance of accesses to each category of webpage in each preset period can also be determined; the method may follow steps 201 to 206 of the embodiment shown in Fig. 2 and is not repeated here.
Step 302: according to the average count and count variance of accesses to each category of webpage determined in each preset period, determine, among the multiple first categories, the second category to which the user belongs.
Further, since at least one first category can be determined in each preset period, multiple first categories can be determined over the P preset periods. For example, the first categories determined in the first preset period are categories 1 and 3, those determined in the second preset period are categories 1 and 4, and those determined in the third preset period are categories 1 and 5.
The first category determined from a single preset period is a short-term result, that is, the content the user is more interested in during that period. However, the user may be interested in one kind of content during one preset period and in other content some time later. Determining the user preference from a single preset period therefore cannot capture the user's interests over the long term. For this reason, in order to examine the user's preference over a longer time span, this embodiment takes into account the average counts and count variances of accesses to each category of webpage over multiple preset periods and screens the second category out of the determined first categories.
In practice, the average count and the count variance corresponding to the first-category webpages in each preset period can be obtained, and whether a first category meets the requirement for a second category is then determined from these values. For example, with P=3, the first categories determined in the first preset period are categories 1 and 3, those in the second are categories 1 and 4, and those in the third are categories 1 and 5. The average counts and count variances of accesses to categories 1, 3, 4 and 5 in the first to third preset periods can then be obtained, and for any first category it is checked whether its average count and count variance are both greater than their corresponding preset values; if so, that first category can be confirmed as a second category to which the user belongs.
All first categories can be traversed and the second categories that satisfy the condition screened out of them.
If the user's preference is examined over the long term, that is, the value of P grows over time, then each time P changes the second category to which the user belongs can be re-determined from the first category and the average count and count variance of accesses to each category of webpage in the most recently determined preset period, so that the second category is updated as P changes.
Step 303: determine the preference of the user according to the second category.
Step 303 is implemented similarly to step 207 and is not repeated here.
It should be noted that the determined user preference may change as the second category changes.
The method for determining user preference according to network behavior provided in this embodiment can examine the user's preference over the long term according to the access information generated when the user accesses webpages, so that the user's preference and its changes can be tracked over time and personalized services can be provided according to the user's interests.
Fig. 4 is a flowchart of a method for determining user preference according to network behavior according to yet another exemplary embodiment of the present invention.
As shown in Fig. 4, the method for determining user preference according to network behavior provided in this embodiment comprises:
Step 401: determine, for each of P preset periods, the first category to which the user belongs and the average count and count variance of the user's accesses to each category of webpage.
Step 401 has the same principle and implementation as step 301 and is not repeated here.
Step 402: obtain the first category i in each preset period, and determine the number P′ of preset periods in which category i was determined as a first category.
Since the user's access information differs from one preset period to another, the first categories determined in the individual preset periods may differ as well. The first category i in each preset period can be obtained; for example, with P=3, in the first preset period the first categories i are webpage categories 1 and 3; in the second preset period they are webpage categories 1 and 4; in the third preset period they are webpage categories 1 and 5. The number P′ of preset periods in which each category i was determined as a first category is then determined. For example, the number of preset periods in which webpage category 1 was determined as a first category is 3, and the number for each of webpage categories 3, 4 and 5 is 1.
To make it easier to count P′, a set C′ can be built from the first categories determined in each preset period, as follows:
A subset $C'_n$ is built from the first categories i determined in each preset period. The subset $C'_n$ contains all first categories i determined in that preset period, and n identifies the preset period; for the first preset period, n=1. Since there are P preset periods in total, P subsets are obtained:
$C'_1 = \{\ldots\ C_e\ \ldots\ C_f\ \ldots\}_1$
$C'_2 = \{\ldots\ C_g\ \ldots\ C_h\ \ldots\}_2$
……
$C'_P = \{\ldots\ C_c\ \ldots\ C_d\ \ldots\}_P$
Here, $C_e$, $C_f$, $C_g$, $C_h$, etc. are the first categories to which the user belongs in the respective preset periods.
The set C′ is determined from these subsets:
$C' = \{\{\ldots\ C_e\ \ldots\ C_f\ \ldots\}_1\ \{\ldots\ C_g\ \ldots\ C_h\ \ldots\}_2\ \ldots\ \{\ldots\ C_c\ \ldots\ C_d\ \ldots\}_P\}$
Further, since each category i can be determined as a first category at most once in each preset period, that is, category i can appear at most once in each subset of the above set, the number P′ of preset periods in which category i was determined as a first category can be obtained from the number of times category i appears in the set.
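Counting P′ from the per-period subsets can be sketched with a counter. The function name `periods_as_first_category` is hypothetical; the sample data reproduce the P=3 example given for step 402.

```python
from collections import Counter


def periods_as_first_category(subsets):
    """Return {category: P'} given one collection of first categories per period.

    Each subset is converted to a set so that a category is counted at most
    once per preset period.
    """
    occurrences = Counter()
    for subset in subsets:
        occurrences.update(set(subset))
    return occurrences


# Periods 1 to 3 yield first categories {1, 3}, {1, 4} and {1, 5}.
p_prime = periods_as_first_category([{1, 3}, {1, 4}, {1, 5}])
```

For this input, category 1 appears in all three subsets, while categories 3, 4 and 5 each appear once.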
Step 403: determine, according to the number P′ and the number P of preset periods, whether the first category i meets a first condition.
Further, a preset rule can be set; if the first category i is determined to satisfy the preset rule according to P′ and P, the first category i is determined to meet the first condition.
When practical application, it can be determined that whether quantity P ' is greater thanIf so, judgement meets preset rules, otherwise sentence It is disconnected to be unsatisfactory for preset rules.
Wherein,It for correction value, can be configured according to demand, for example, willIt is set as 1/3.
If the first category i meets the first condition, step 404 is executed. Otherwise, the next first category is checked against the first condition.
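A possible shape of the first-condition check is sketched below. The exact threshold expression is not recoverable from the text, so the code assumes, purely for illustration, that the threshold is the correction value multiplied by P; the function name `meets_first_condition` is likewise hypothetical.

```python
from fractions import Fraction


def meets_first_condition(p_prime, p, correction=Fraction(1, 3)):
    """Check whether category i was a first category in enough preset periods.

    ASSUMPTION: the threshold is taken to be correction * P. The source only
    states that P' is compared with a threshold involving a correction value
    (for example 1/3); the product form is a guess made for this sketch.
    """
    return p_prime > correction * p


# With P = 3: a category seen in all 3 periods passes, one seen once does not.
always_seen = meets_first_condition(3, 3)
seen_once = meets_first_condition(1, 3)
```

Exact fractions are used so that the comparison at the boundary (here 1 against 1/3 × 3) is not affected by floating-point rounding.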
Step 404: determine the overall average number of times the user accesses webpages of category i over the P preset periods.
Specifically, each preset period records the number of times $a_{ij}$ the user accesses each category of webpage each day (see matrix A). From these values, the total number of times the user accesses webpages of category i over the P preset periods can be determined; dividing this total by the total number of days in the P preset periods gives the overall average count $\mu_i'$ for category i. The total number of days is the number of days in a preset period multiplied by P; for example, if each preset period is 5 days, the total number of days is 5 × P.
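The overall average count can be sketched as the total number of visits divided by the total number of days. The function name `overall_average` and the nested-list layout are assumptions made for the example; all preset periods are assumed to have the same length, as in the text.

```python
def overall_average(daily_counts_per_period):
    """Average daily visits to one category over all P preset periods.

    daily_counts_per_period: list of P lists, each holding the daily counts
    of category i within one preset period (hypothetical layout).
    """
    total_visits = sum(sum(period) for period in daily_counts_per_period)
    # With equal-length periods this equals (days per period) * P.
    total_days = sum(len(period) for period in daily_counts_per_period)
    if total_days == 0:
        raise ValueError("no days to average over")
    return total_visits / total_days


# Two 5-day periods: 10 visits in the first, 5 in the second.
mu_overall = overall_average([[2, 4, 0, 0, 4], [1, 1, 1, 1, 1]])
```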
Step 405: according to the number P′, the average count and count variance of accesses to each category of webpage in each preset period, and the overall average count, determine whether the first category i meets a second condition; if so, determine that the first category i is a second category.
Further, the average counts and count variances of accesses to each category of webpage in each preset period can be processed first. With P preset periods, P matrices A are available, and one matrix B can be obtained from each matrix A, so that P matrices B are obtained.
The P matrices B can be processed to obtain a third matrix $B_i$.
Whether the first category i meets the second condition is then determined from the values $\mu_{ij}$ and $\sigma_{ij}$ contained in the third matrix $B_i$, the number P′ and the overall average count $\mu_i'$; specifically, it can be judged whether the overall average count $\mu_i'$ satisfies a preset relation to these values.
If so, category i is judged to meet the second condition.
If category i meets both the first condition and the second condition, category i is judged to be a second category. In other words, among the multiple first categories produced over the multiple preset periods, the second category to which the user belongs is determined from the user's long-term access data, which makes the final result more accurate.
After it has been determined whether category i meets the first and second conditions, the remaining categories i can be checked against the first and second conditions in the same way.
Step 406: determine the preference of the user according to the second category.
Step 406 is implemented similarly to step 207 and is not repeated here.
Step 407: if the first category i is determined to be a second category, determine the probability that the user belongs to category i according to a preset algorithm, and output the probability.
The preset algorithm is: $q = P'/P$
Here, P′ is the number of preset periods in which category i was determined as a first category, that is, the number of times category i appears in the set C′. Dividing P′ by P gives q, the probability that category i is determined as a first category within the P periods; the value q can therefore represent the probability that the user is assigned to category i.
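The probability described here is the ratio of P′ to P, which can be sketched in a few lines. The function name `membership_probability` and the input validation are assumptions added for the example.

```python
def membership_probability(p_prime, p):
    """Return q = P' / P, the share of preset periods in which category i
    was determined as a first category."""
    if p <= 0 or not 0 <= p_prime <= p:
        raise ValueError("expected 0 <= P' <= P and P > 0")
    return p_prime / p


q_always = membership_probability(3, 3)  # first category in every period
q_once = membership_probability(1, 3)    # first category in one of three periods
```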
The order of steps 406 and 407 is not restricted: step 406 may be executed first, step 407 may be executed first, or the two may be executed at the same time.
The method for determining user preference according to network behavior provided in this embodiment determines, on the basis of the first category to which the user belongs in each preset period and according to the data of each preset period, the second category to which the user belongs over the longer time span of P preset periods, so that the user can be classified over a longer period. Moreover, by introducing the means and variances of multiple preset periods, the stability with which the user accesses category-i webpages can be examined, so that the user is classified more accurately and the user's preference is analyzed more accurately.
Fig. 5 is a structural diagram of a device for determining user preference according to network behavior according to an exemplary embodiment of the present invention.
As shown in Fig. 5, the device for determining user preference according to network behavior provided in this embodiment comprises:
an obtaining module 51, configured to obtain access information of a user, wherein the access information includes webpage information and access time;
a category determination module 52, configured to determine, according to the webpage information, the category to which a webpage accessed by the user belongs;
a count determination module 53, configured to determine, according to the access time and the category to which the webpage belongs, the number of times the user accesses each category of webpage per day within a preset period;
a computing module 54, configured to determine, according to the counts, the average number of times the user accesses each category of webpage and the count variance of accesses to each category of webpage;
a preference determination module 55, configured to determine the user preference according to the number of times the user accesses each category of webpage per day.
The device for determining user preference according to network behavior provided in this embodiment obtains the access information generated when the user accesses webpages, wherein the access information includes webpage information and access time; determines, according to the webpage information, the category to which each webpage accessed by the user belongs; determines, according to the access time and the category to which the webpage belongs, the number of times the user accesses each category of webpage per day within a preset period; determines, according to the counts, the average count and the count variance of the user's accesses to each category of webpage; and determines the user preference within the preset period according to the average count and the count variance. The device makes full use of the access information generated when the user browses webpages and determines from it the average count and variance of accesses to each category of webpage, so that both values are taken into account to determine the user preference more accurately; the network provider can thus understand the user preference and improve its service level.
The principle and implementation of the device for determining user preference according to network behavior provided in this embodiment are similar to those of the embodiment shown in Fig. 1 and are not repeated here.
Fig. 6 is a structural diagram of a device for determining user preference according to network behavior according to another exemplary embodiment of the present invention.
As shown in Fig. 6, on the basis of the above embodiment, in the device for determining user preference according to network behavior provided in this embodiment,
the preference determination module 55 comprises:
a first category determination unit 551, configured to determine the first category to which the user belongs according to the average count and the count variance of accesses to each category of webpage;
a preference determination unit 552, configured to determine the preference of the user according to the first category.
The first category determination unit 551 is specifically configured to:
judge whether $\mu_i$ is greater than $\frac{a}{m-1}\sum_{k=1,\,k\neq i}^{m}(\mu_k+\sigma_k)$;
if so, determine that the user belongs to first category i;
where m is the total number of webpage categories, a is a preset correction value, $\mu_i$ is the average number of times the user accesses category-i webpages, $\mu_k$ is the average number of times the user accesses category-k webpages, and $\sigma_k$ is the square root of the count variance for category-k webpages.
Specifically, the device provided in this embodiment further comprises a multi-period determination module 56, configured to:
determine, for each of the P preset periods, the first category to which the user belongs and the average count and count variance of accesses to each category of webpage;
determine, according to the average count and count variance of accesses to each category of webpage determined in each preset period, the second category to which the user belongs among the multiple first categories;
determine the preference of the user according to the second category.
Optionally, the multi-period determination module 56 comprises:
an acquiring unit 561, configured to obtain the first category i in each preset period and determine the number P′ of preset periods in which category i was determined as a first category;
a first determination unit 562, configured to determine, according to the number P′ and the number P of preset periods, whether the first category i meets the first condition;
if so, the first determination unit 562 determines the overall average number of times the user accesses webpages of category i over the P preset periods;
the first determination unit 562 is further configured to determine, according to the number P′, the average count and count variance of accesses to each category of webpage in each preset period, and the overall average count, whether the first category i meets the second condition, and if so, to determine that the first category i is the second category.
Optionally, the device provided in this embodiment further comprises a probability output module 57, configured to:
if the first category i is determined to be the second category, determine the probability that the user belongs to category i according to a preset algorithm, and output the probability;
wherein the preset algorithm is: $q = P'/P$
where q is the probability.
In addition, the category determination module 52 may further comprise:
an extraction unit 521, configured to extract keywords from the url of the webpage and/or the content of the webpage;
a second determination unit 522, configured to determine the category to which the webpage belongs in a preset category library according to the keyword, and/or to take the keyword as the category to which the webpage belongs.
Optionally, the preset category library contains the correspondence between keywords and categories;
correspondingly, the category determination module 52 further comprises:
a receiving unit 523, configured to receive the correspondence between the keyword and the category and store the correspondence in the preset category library;
and/or an adding unit 524, configured to detect whether the preset category library contains the keyword and, if not, to add the keyword to the preset category library and set the category of the keyword to the keyword itself.
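The interplay of the extraction, determination and adding units can be sketched as follows. This is an illustrative outline only: the tokenization rule, the skip list, the choice of the first keyword as the fallback, and the names `extract_keywords` and `classify` are all assumptions, and the category library is modeled as a plain dictionary.

```python
import re
from urllib.parse import urlparse

# Tokens that carry no topical meaning (hypothetical skip list).
_SKIP = {"www", "com", "cn", "net", "org", "html", "htm"}


def extract_keywords(url):
    """Split the host name and path of a url into lowercase keywords."""
    parsed = urlparse(url)
    text = f"{parsed.hostname or ''} {parsed.path}".lower()
    return [t for t in re.split(r"[^a-z0-9]+", text) if t and t not in _SKIP]


def classify(url, library):
    """Look the keywords up in the category library.

    If no keyword is known, the first keyword is added to the library with
    itself as its category (assumed fallback choice) and returned.
    """
    keywords = extract_keywords(url)
    for keyword in keywords:
        if keyword in library:
            return library[keyword]
    if not keywords:
        return None
    library[keywords[0]] = keywords[0]
    return keywords[0]


library = {"news": "News", "video": "Videos"}
known = classify("https://news.example.com/world/2018.html", library)
unknown = classify("https://www.example.org/shop", library)
```

After the second call the library also contains the entry `example`, so later pages from the same site fall into the same self-named category.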
The principle and implementation of the device for determining user preference according to network behavior provided in this embodiment are similar to those of the embodiments shown in Figs. 2 to 4 and are not repeated here.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be implemented by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical discs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. A method for determining user preference according to network behavior, characterized by comprising:
obtaining access information of a user, wherein the access information includes webpage information and access time;
determining, according to the webpage information, the category to which a webpage accessed by the user belongs;
determining, according to the access time and the category to which the webpage belongs, the number of times the user accesses each category of webpage per day within a preset period;
determining, according to the counts, the average number of times the user accesses each category of webpage and the count variance of accesses to each category of webpage;
determining the preference of the user within the preset period according to the average count and the count variance of accesses to each category of webpage.
2. The method according to claim 1, characterized in that determining the preference of the user within the preset period according to the average count and the count variance of accesses to each category of webpage comprises:
determining the first category to which the user belongs according to the average count and the count variance with which the user accesses each category of webpage within the preset period;
determining the preference of the user according to the first category.
3. The method according to claim 2, characterized in that determining the first category to which the user belongs according to the average count and the count variance with which the user accesses each category of webpage within the preset period comprises:
judging whether $\mu_i$ is greater than $\frac{a}{m-1}\sum_{k=1,\,k\neq i}^{m}(\mu_k+\sigma_k)$;
if so, determining that the user belongs to first category i;
wherein m is the total number of webpage categories, a is a preset correction value, $\mu_i$ is the average number of times the user accesses category-i webpages, $\mu_k$ is the average number of times the user accesses category-k webpages, and $\sigma_k$ is the square root of the count variance for category-k webpages.
4. The method according to claim 2 or 3, characterized by further comprising:
determining, for each of P preset periods, the first category to which the user belongs and the average count and count variance of accesses to each category of webpage;
determining, according to the average count and the count variance of accesses to each category of webpage determined in each preset period, the second category to which the user belongs among the multiple first categories;
determining the preference of the user according to the second category.
5. The method according to claim 4, characterized in that determining, according to the average count and the count variance of accesses to each category of webpage determined in each preset period, the second category to which the user belongs among the multiple first categories comprises:
obtaining the first category i in each preset period, and determining the number P′ of preset periods in which category i was determined as a first category;
determining, according to the number P′ and the number P of preset periods, whether the first category i meets a first condition;
if so, determining the overall average number of times the user accesses webpages of category i over the P preset periods;
determining, according to the number P′, the average count and the count variance of accesses to each category of webpage in each preset period, and the overall average count, whether the first category i meets a second condition, and if so, determining that the first category i is the second category.
6. The method according to claim 5, characterized in that, if the first category i is determined to be the second category, the probability that the user belongs to category i is determined according to a preset algorithm, and the probability is output;
wherein the preset algorithm is: $q = P'/P$
where q is the probability.
7. The method according to any one of claims 1 to 3, 5 and 6, characterized in that determining, according to the webpage information, the category to which a webpage accessed by the user belongs comprises:
extracting a keyword from the uniform resource locator (url) of the webpage and/or the content of the webpage;
determining the category to which the webpage belongs in a preset category library according to the keyword, and/or taking the keyword as the category to which the webpage belongs.
8. The method according to claim 7, characterized in that the preset category library contains the correspondence between the keyword and the category;
the method further comprises:
receiving the correspondence between the keyword and the category, and storing the correspondence in the preset category library;
and/or detecting whether the preset category library contains the keyword, and if not, adding the keyword to the preset category library and setting the category of the keyword to the keyword itself.
9. A device for determining user preference according to network behavior, characterized by comprising:
an obtaining module, configured to obtain access information of a user, wherein the access information includes webpage information and access time;
a category determination module, configured to determine, according to the webpage information, the category to which a webpage accessed by the user belongs;
a count determination module, configured to determine, according to the access time and the category to which the webpage belongs, the number of times the user accesses each category of webpage per day within a preset period;
a computing module, configured to determine, according to the counts, the average number of times the user accesses each category of webpage and the count variance of accesses to each category of webpage;
a preference determination module, configured to determine the preference of the user within the preset period according to the average count and the count variance of accesses to each category of webpage.
10. The device according to claim 9, characterized in that the preference determination module comprises:
a first category determination unit, configured to determine the first category to which the user belongs according to the average count and the count variance of accesses to each category of webpage;
a preference determination unit, configured to determine the preference of the user according to the first category.
11. The device according to claim 10, characterized by further comprising a multi-period determination module, configured to:
determine, for each of P preset periods, the first category to which the user belongs and the average count and count variance of accesses to each category of webpage;
determine, according to the average count and the count variance of accesses to each category of webpage determined in each preset period, the second category to which the user belongs among the multiple first categories;
determine the preference of the user according to the second category.
CN201810108024.7A 2018-02-02 2018-02-02 Method and device for determining user preference according to network behavior Expired - Fee Related CN110110219B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810108024.7A CN110110219B (en) 2018-02-02 2018-02-02 Method and device for determining user preference according to network behavior

Publications (2)

Publication Number Publication Date
CN110110219A (en) 2019-08-09
CN110110219B (en) 2022-02-18

Family

ID=67483141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810108024.7A Expired - Fee Related CN110110219B (en) 2018-02-02 2018-02-02 Method and device for determining user preference according to network behavior

Country Status (1)

Country Link
CN (1) CN110110219B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077714A (en) * 2014-06-16 2014-10-01 微梦创科网络科技(中国)有限公司 Method and system for acquiring preference of user visiting website and pushing advertisements to user visiting website
CN104217091A (en) * 2013-06-05 2014-12-17 北京齐尔布莱特科技有限公司 Website page view prediction method based on historical tendency weights

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
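Yang Chi: "Biostatistics" (《生物统计学》), 31 May 1996, Hohhot: Inner Mongolia University Press *
Han Wenling et al.: "Analysis of the influence of thunderstorm days in Qinghai Province on the classification of lightning protection grades", Modern Agricultural Science and Technology (《现代农业科技》) *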

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110650161A (en) * 2019-10-30 2020-01-03 华南师范大学 Safe website and working method thereof
CN110650161B (en) * 2019-10-30 2021-09-24 华南师范大学 Safe website and working method thereof
CN112131561A (en) * 2020-09-11 2020-12-25 北京北信源软件股份有限公司 Access boundary determination method, device, electronic device and storage medium
CN114780882A (en) * 2022-03-26 2022-07-22 武汉楷瀚文化传媒有限公司 Internet webpage display management method, equipment and computer storage medium
CN114780882B (en) * 2022-03-26 2023-12-05 深圳市安睿信科技有限公司 Internet webpage display management method, equipment and computer storage medium

Also Published As

Publication number Publication date
CN110110219B (en) 2022-02-18

Similar Documents

Publication Publication Date Title
CN108197331B (en) User interest exploration method and device
JP5802745B2 (en) Intelligent navigation method, apparatus and system
CN110489633B (en) Intelligent brain service system based on library data
CN106503014B (en) Real-time information recommendation method, device and system
US7610276B2 (en) Internet site access monitoring
CN106504099A (en) System for building user portraits
CN106372249B (en) Click-through rate estimation method, device and electronic equipment
US10776431B2 (en) System and method for recommending content based on search history and trending topics
US8990241B2 (en) System and method for recommending queries related to trending topics based on a received query
CN109190024A (en) Information recommendation method, device, computer equipment and storage medium
US20100100537A1 (en) System and method for identifying trends in web feeds collected from various content servers
CN107707545A (en) Abnormal web page access fragment detection method, device, equipment and storage medium
CN111008265A (en) Enterprise information searching method and device
CN105930507B (en) Method and device for obtaining a user's web browsing interest
CN102890689A (en) Method and system for building user interest model
CN106776860A (en) Search abstract generation method and device
JP2013125468A (en) Advertisement distribution device
CN110110219A (en) Method and device for determining user preference according to network behavior
US20170228378A1 (en) Extracting topics from customer review search queries
US9830344B2 (en) Evaluation of nodes
JP2011227721A (en) Interest extraction device, interest extraction method, and interest extraction program
CN117235586B (en) Hotel customer portrait construction method, system, electronic equipment and storage medium
CN110795613A (en) Commodity searching method, device and system and electronic equipment
Antoniou et al. A Semantic Web Personalizing Technique: The Case of Bursts in Web Visits
CN111178421A (en) Method, device, medium and electronic equipment for detecting user state

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220218