CN108109035A - Webpage recommending method based on Web personalizations - Google Patents

Publication number
CN108109035A
Authority
CN
China
Prior art keywords
page
user
web
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711302953.3A
Other languages
Chinese (zh)
Inventor
李宇佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dianji University
Original Assignee
Shanghai Dianji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dianji University
Priority to CN201711302953.3A
Publication of CN108109035A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0641: Shopping interfaces

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a webpage recommendation method based on Web personalization. It sets out a complete personalized-service architecture based on Web mining and improves the transaction clustering algorithm in two respects, so that the method remains simple and efficient while achieving higher accuracy, and can be applied directly to commercial personalized websites in production.

Description

Webpage recommendation method based on Web personalization
Technical field
The present invention relates to a personalized web page navigation method and belongs to the field of computer science and technology.
Background art
In March 1995, Robert Armstrong et al. of Carnegie Mellon University presented the personalized navigation system WebWatcher at the spring symposium of the American Association for Artificial Intelligence (AAAI), and Marko Balabanovic et al. of Stanford University presented the personalized recommendation system LIRA at the same meeting. In August of the same year, Henry Lieberman of the Massachusetts Institute of Technology presented the personalized navigation agent Letizia at the International Joint Conference on Artificial Intelligence (IJCAI). These three systems are widely regarded as the classic systems of the early stage of personalized services and mark the beginning of the field.
In the following years, personalized service systems appeared one after another. In 1996, Brian Starr et al. of the University of California, Irvine proposed Do-I-Care, a personalized service agent that detects valuable changes to pages a user is interested in and notifies the user to revisit them. In the same year, Dunja Mladenic of Carnegie Mellon University improved on WebWatcher and proposed the personalized recommendation system Personal WebWatcher. Also in 1996, the well-known Internet company Yahoo! recognized the advantages and commercial potential of personalized services and launched the personalized portal MyYahoo!. In 1997, AT&T Labs proposed the collaborative personalized recommendation systems PHOAKS and Referral Web, and Marko Balabanovic and Yoav Shoham of Stanford University proposed Fab, a personalized recommendation system combining content-based and collaborative methods. In March of the same year, Communications of the ACM published a special issue on personalized recommendation systems, indicating that personalized services had begun to receive considerable attention.
In 1999, Tanja Joerding of Dresden University of Technology in Germany implemented TELLIM, a prototype personalized e-commerce system. Henry Lieberman of the Massachusetts Institute of Technology proposed the collaborative personalized navigation system Let's Browse. Liliana Ardissono and Anna Goy of the University of Turin in Italy proposed the personalized online store SETA. Personalized services began to develop worldwide.
In 2000, Kurt D. Bollacker et al. of the NEC Research Institute added personalized recommendation to the search engine CiteSeer. Also in 2000, Mobasher proposed building a recommender system by transaction clustering and achieved good results. Schechter et al. predicted a user's likely future HTTP requests from the user's path traversal patterns, allowing a proxy server to prefetch the relevant Web pages into its cache to speed up access. Cooley et al. and Buchner et al. used data mining to extract user access patterns from access log files for marketing decisions and intelligent recommendation services. Nasraoui et al. predicted users' future access behavior by clustering user access patterns. Barry Smyth and Paul Cotter of University College Dublin in Ireland proposed the personalized television website PTV. In the same year, the U.S. National Science Foundation (NSF) began funding research on personalized services. In April of that year, a number of personalization research institutions and Internet companies, mostly based in the United States, founded a personalization consortium to promote the development of personalized services while protecting the user privacy involved. Research on personalized services also began in China that year: Lu Haiming et al. of Tsinghua University proposed personalized recommendation based on multi-agent hybrid intelligence.
In 2001, Gediminas Adomavicius and Alexander Tuzhilin of New York University implemented 1:1Pro, a user modeling system for personalized e-commerce websites. IBM added personalization features to its e-commerce platform WebSphere so that merchants could develop personalized e-commerce websites. Eric Glover et al. of the NEC Research Institute proposed the personalized metasearch engine prototype Inquirus2. Research on personalized services was also carried out widely in China, and several prototype systems were proposed: Feng Ao et al. of Tsinghua University proposed the agent-based personalized information filtering system Open Bookmark, and Pan Jingui et al. of Nanjing University designed and implemented the personalized information gathering agent DOLTRI-Agent.
However, existing systems all suffer from the following problems:
(1) Performance
Web personalization systems all extend the traditional browser/server architecture to some degree: Web information is returned to the client only after additional processing, which inevitably lengthens the response time. Real-time personalization systems place high demands on response time. Current Web mining algorithms usually process data offline, but shortcomings in the existing algorithms still degrade performance. For example, if the support and confidence thresholds of an association-rule method are chosen poorly, the result is either excessive computation time or poor recommendations. A typical e-commerce website has a very large number of pages, so recommending with association rules makes the system complex and inefficient.
(2) Privacy
This problem is unavoidable: building a personalized Web system requires user participation and analysis of user feedback, and techniques such as cookies may touch on user privacy. Current Web personalization technology has not yet solved this problem well, that is, it cannot yet deliver personalized services without infringing on user privacy.
(3) Quality evaluation
Web personalized services are built with Web mining techniques, and different systems use different techniques, so how to evaluate their modeling effectiveness and the final service quality of the system is also an important problem. At present, different systems are evaluated in different ways and on different test data, so the service quality of multiple personalization systems cannot be compared. A general performance metric and corresponding benchmarks are needed to evaluate the various Web mining techniques.
Summary of the invention
The purpose of the present invention is to improve accuracy, on the basis of Web mining personalization technology, while retaining the simplicity and efficiency of the transaction clustering algorithm proposed by Mobasher in 2000, so that the method can be applied directly to commercial personalized websites in production.
To achieve the above purpose, the technical solution of the present invention is a webpage recommendation method based on Web personalization, characterized in that it comprises the following steps:
Step 1: perform cluster analysis on historical data to obtain different transaction types, comprising the following steps:
Step 1.1: the set of browsed pages is denoted $P = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ and the set of user transactions is denoted $T = \{t_1, t_2, \ldots, t_i, \ldots, t_m\}$, where $p_i$ is the $i$-th browsed page and $t_i$ is the $i$-th user transaction;
Step 1.2: each transaction is represented as an $n$-dimensional vector over the page set $P$, so that for the $i$-th user transaction $t_i$ we have $t_i = \langle w(p_1, t_i), w(p_2, t_i), \ldots, w(p_j, t_i), \ldots, w(p_n, t_i) \rangle$, where $w(p_j, t_i)$ is the weight of the $j$-th page $p_j$ in transaction $t_i$;
Step 1.3: $w(p_j, t_i)$ is determined by the time the user spends browsing the $j$-th page $p_j$:
Step 2: let the user's current session be $S = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$, where $s_i$ corresponds to the $i$-th page, and let the aggregate usage profile (feature) be $C = \{w_1^C, w_2^C, \ldots, w_i^C, \ldots, w_n^C\}$, where $w_i^C$ is the $i$-th feature value of $C$. The value of $s_i$ is set according to the time the user spends on page $\mathrm{URL}_i$ in the current session $S$:
The $i$-th feature value of $C$ is then $w_i^C = \mathrm{weight}(p_i, C)$, where $\mathrm{weight}(p_i, C)$ is the weight of browsed page $p_i$, given by:
Step 3: the matching coefficient $\mathrm{match}(S, C)$ between feature $C$ and the current session $S$ is computed with the classical cosine similarity function. If $\mathrm{match}(S, C) \ge t$, where $t$ is the minimum matching threshold, then $C$ is a matching cluster of the current session $S$. The recommendation coefficient of each page in the matching cluster is then computed; for the $i$-th page $s_i$ in the matching cluster, the recommendation coefficient $\mathrm{Rec}(s_i, p)$ is:
$$\mathrm{Rec}(s_i, p) = \sqrt{\mathrm{weight}(s_i, C) \cdot \mathrm{match}(s_i, C)};$$
Step 4: in each matching cluster, all pages whose recommendation coefficient is greater than or equal to the minimum recommendation threshold form the recommendation set of the current session $S$, and the links to all pages in the recommendation set are recommended in order of recommendation coefficient.
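Step 1 can be sketched in Python as follows. The weighting formula of step 1.3 is not reproduced in this text, so the sketch assumes, purely for illustration, that the weight of a page is its share of the total viewing time in the transaction. The helper names `transaction_vector` and `cluster_profile` are hypothetical, and taking the profile as the mean of the member vectors is likewise an assumption. The clustering step that groups transactions is not shown.

```python
from typing import Dict, List


def transaction_vector(pages: List[str], view_seconds: Dict[str, float]) -> List[float]:
    """Represent one transaction as an n-dimensional vector over the page set P.

    ASSUMPTION: the weight of a page is its share of the total viewing time
    in the transaction; the formula of the original filing is not available.
    """
    total = sum(view_seconds.values())
    if total == 0:
        return [0.0] * len(pages)
    return [view_seconds.get(p, 0.0) / total for p in pages]


def cluster_profile(vectors: List[List[float]]) -> List[float]:
    """Aggregate the transaction vectors of one cluster into a profile C.

    ASSUMPTION: weight(p_i, C) is the mean weight of p_i over the cluster.
    """
    if not vectors:
        return []
    n = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(n)]


# Hypothetical page set and viewing times (seconds) for two transactions.
pages = ["/home", "/courses", "/news"]
t1 = transaction_vector(pages, {"/home": 10.0, "/courses": 30.0})
t2 = transaction_vector(pages, {"/courses": 20.0, "/news": 20.0})
profile = cluster_profile([t1, t2])
```

Normalizing by total time keeps long and short transactions comparable; other time-based weightings would fit the same vector representation.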
A webpage recommendation method based on Web personalization as described in claim 1, characterized in that, in step 3, match(S, C) is calculated as:
The present invention sets out a complete personalized-service architecture based on Web mining and improves the transaction clustering algorithm in two respects, so that the method remains simple and efficient while achieving higher accuracy and can be applied directly to commercial personalized websites in production.
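For reference, the classical cosine similarity between a session vector $S = (s_1, \ldots, s_n)$ and a cluster profile $C = (w_1^C, \ldots, w_n^C)$ has the textbook form below. It is offered only as an illustration of the matching in step 3; whether the formula in the original filing matches it term for term cannot be confirmed from this text.

$$\mathrm{match}(S, C) = \frac{\sum_{k=1}^{n} w_k^C \, s_k}{\sqrt{\sum_{k=1}^{n} s_k^2 \cdot \sum_{k=1}^{n} \left(w_k^C\right)^2}}$$

With the illustrative values $S = (1, 1, 0)$ and $C = (1, 1, 0.5)$, the numerator is $2$ and the denominator is $\sqrt{2 \times 2.25} \approx 2.121$, so $\mathrm{match}(S, C) \approx 0.943$.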
Description of the drawings
Fig. 1 compares the precision of the method of the present invention with that of the original algorithm;
Fig. 2 compares the recall of the method of the present invention with that of the original algorithm.
Specific embodiment
The present invention is further explained below with reference to a specific embodiment. It should be understood that the embodiment is intended only to illustrate the invention and not to limit its scope. It should further be understood that, after reading the teaching of the present invention, those skilled in the art may make various changes or modifications to it, and such equivalent forms likewise fall within the scope defined by the claims appended to this application.
The present invention provides a webpage recommendation method based on Web personalization, comprising the following steps:
Step 1: perform cluster analysis on historical data to obtain different transaction types, comprising the following steps:
Step 1.1: the set of browsed pages is denoted $P = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ and the set of user transactions is denoted $T = \{t_1, t_2, \ldots, t_i, \ldots, t_m\}$, where $p_i$ is the $i$-th browsed page and $t_i$ is the $i$-th user transaction;
Step 1.2: each transaction is represented as an $n$-dimensional vector over the page set $P$, so that for the $i$-th user transaction $t_i$ we have $t_i = \langle w(p_1, t_i), w(p_2, t_i), \ldots, w(p_j, t_i), \ldots, w(p_n, t_i) \rangle$, where $w(p_j, t_i)$ is the weight of the $j$-th page $p_j$ in transaction $t_i$;
Step 1.3: $w(p_j, t_i)$ is determined by the time the user spends browsing the $j$-th page $p_j$:
Step 2: let the user's current session be $S = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$, where $s_i$ corresponds to the $i$-th page, and let the aggregate usage profile (feature) be $C = \{w_1^C, w_2^C, \ldots, w_i^C, \ldots, w_n^C\}$, where $w_i^C$ is the $i$-th feature value of $C$. The value of $s_i$ is set according to the time the user spends on page $\mathrm{URL}_i$ in the current session $S$:
The $i$-th feature value of $C$ is then $w_i^C = \mathrm{weight}(p_i, C)$, where $\mathrm{weight}(p_i, C)$ is the weight of browsed page $p_i$, given by:
Step 3: the matching coefficient $\mathrm{match}(S, C)$ between feature $C$ and the current session $S$ is computed with the classical cosine similarity function. If $\mathrm{match}(S, C) \ge t$, where $t$ is the minimum matching threshold, then $C$ is a matching cluster of the current session $S$. The recommendation coefficient of each page in the matching cluster is then computed; for the $i$-th page $s_i$ in the matching cluster, the recommendation coefficient $\mathrm{Rec}(s_i, p)$ is:
$$\mathrm{Rec}(s_i, p) = \sqrt{\mathrm{weight}(s_i, C) \cdot \mathrm{match}(s_i, C)};$$
Step 4: in each matching cluster, all pages whose recommendation coefficient is greater than or equal to the minimum recommendation threshold form the recommendation set of the current session $S$, and the links to all pages in the recommendation set are recommended in order of recommendation coefficient.
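Steps 3 and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the filing: the function names `cosine_match` and `recommend` are hypothetical, the matching term inside the square root is read as the session-level match(S, C), and a page that appears in several matching clusters keeps its highest coefficient.

```python
import math
from typing import Dict, List, Tuple


def cosine_match(session: Dict[str, float], profile: Dict[str, float]) -> float:
    """Cosine similarity between a session vector S and a cluster profile C."""
    dot = sum(w * session.get(page, 0.0) for page, w in profile.items())
    norm_s = math.sqrt(sum(v * v for v in session.values()))
    norm_c = math.sqrt(sum(w * w for w in profile.values()))
    if norm_s == 0.0 or norm_c == 0.0:
        return 0.0
    return dot / (norm_s * norm_c)


def recommend(
    session: Dict[str, float],
    profiles: List[Dict[str, float]],
    match_threshold: float,
    rec_threshold: float,
) -> List[Tuple[str, float]]:
    """Return (page, coefficient) pairs sorted by descending coefficient."""
    scores: Dict[str, float] = {}
    for profile in profiles:
        m = cosine_match(session, profile)
        if m < match_threshold:
            continue  # step 3: this profile is not a matching cluster
        for page, weight in profile.items():
            # Rec = sqrt(weight * match); weights are assumed non-negative.
            rec = math.sqrt(weight * m)
            if rec >= rec_threshold:
                # ASSUMPTION: keep the best score across matching clusters.
                scores[page] = max(scores.get(page, 0.0), rec)
    # Step 4: order the recommendation set by coefficient.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical session and cluster profiles.
session = {"/home": 1.0, "/courses": 1.0}
profiles = [
    {"/home": 1.0, "/courses": 1.0, "/syllabus": 0.5},
    {"/news": 1.0},
]
ranked = recommend(session, profiles, match_threshold=0.5, rec_threshold=0.6)
```

In this sketch pages already visited in the session stay in the result; whether a deployment would filter them out is not stated in the text.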
The experiments use data from the "online resource" website of DePaul University in the United States to compare the coverage and accuracy of recommendations made by the transaction clustering algorithm and by the improved method. The data set contains 683 URLs and 13745 session records. We select the access records for the first 200 URLs and discard URLs whose access frequency is below 0.1% or above 85%, as well as session records of length less than 4. After this processing, 2/3 of the data set is used as the training set, on which Web mining is performed to generate the clustering results used for recommendation, and the remaining 1/3 is used as the test set.
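The preprocessing described above can be sketched in Python as follows. The sketch assumes that the 0.1% and 85% limits refer to the fraction of sessions in which a URL appears, and that the split is taken in record order; neither detail is stated in the text. The function names and toy data are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def preprocess(
    sessions: List[List[str]],
    min_frac: float = 0.001,
    max_frac: float = 0.85,
    min_len: int = 4,
) -> List[List[str]]:
    """Drop very rare and very common URLs, then drop short sessions."""
    n = len(sessions)
    # ASSUMPTION: frequency = fraction of sessions that contain the URL.
    counts = Counter(url for s in sessions for url in set(s))
    keep = {u for u, c in counts.items() if min_frac <= c / n <= max_frac}
    filtered = [[u for u in s if u in keep] for s in sessions]
    return [s for s in filtered if len(s) >= min_len]


def split_train_test(
    sessions: List[List[str]], train_frac: float = 2 / 3
) -> Tuple[List[List[str]], List[List[str]]]:
    """Use the first train_frac of the sessions for training, the rest for testing."""
    cut = int(len(sessions) * train_frac)
    return sessions[:cut], sessions[cut:]


# Toy data: "/home" appears in every session and is therefore discarded.
raw = [["/home", "/a", "/b", "/c", "/d"]] * 8 + [["/home", "/x", "/y"]] * 2
clean = preprocess(raw)
train, test = split_train_test(clean)
```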
When comparing the transaction clustering algorithm with the improved algorithm, the number of clusters starts from 10. In the transaction clustering algorithm, obtaining a larger recommendation set first requires a reasonable minimum matching threshold. If more novel information is desired, the matching threshold can be lowered gradually as the recommendation process proceeds to enlarge the recommendation set, which improves coverage. Conversely, if the goal is to locate information of interest quickly, the matching threshold can be raised gradually as the recommendation process proceeds, which improves recommendation accuracy. In addition, the clustering method does not take the order of the user's accesses into account when computing matches. Because the experiments here are intended mainly to verify the performance of the improved algorithm, the matching threshold is kept fixed.
Based on the data and test method described above, the transaction clustering algorithm and the improved algorithm are compared on two metrics, coverage and accuracy. The results are shown in Fig. 1 and Fig. 2. The improved transaction clustering algorithm is clearly better than the original algorithm on accuracy and roughly the same on coverage. The experiments show that the improved algorithm raises accuracy while retaining the advantages of the original algorithm.
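The text does not define the two metrics. A common convention, assumed here, takes accuracy as the share of recommended pages that the user actually visits in the held-out part of the session, and coverage as the share of actually visited pages that were recommended. The function name and sample pages below are hypothetical.

```python
from typing import Iterable, Tuple


def precision_and_coverage(
    recommended: Iterable[str], actual: Iterable[str]
) -> Tuple[float, float]:
    """Return (accuracy, coverage) for one test session.

    ASSUMPTION: accuracy = |R ∩ A| / |R| and coverage = |R ∩ A| / |A|,
    where R is the recommendation set and A the pages actually visited.
    """
    rec, act = set(recommended), set(actual)
    hits = len(rec & act)
    precision = hits / len(rec) if rec else 0.0
    coverage = hits / len(act) if act else 0.0
    return precision, coverage


precision, coverage = precision_and_coverage(
    ["/a", "/b", "/c"], ["/b", "/c", "/d", "/e"]
)
```

Under these definitions, lowering the matching threshold tends to enlarge the recommendation set and raise coverage, while raising it tends to raise accuracy, which matches the trade-off described for the experiments.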

Claims (2)

1. A webpage recommendation method based on Web personalization, characterized in that it comprises the following steps:
Step 1: perform cluster analysis on historical data to obtain different transaction types, comprising the following steps:
Step 1.1: the set of browsed pages is denoted $P = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ and the set of user transactions is denoted $T = \{t_1, t_2, \ldots, t_i, \ldots, t_m\}$, where $p_i$ is the $i$-th browsed page and $t_i$ is the $i$-th user transaction;
Step 1.2: each transaction is represented as an $n$-dimensional vector over the page set $P$, so that for the $i$-th user transaction $t_i$ we have $t_i = \langle w(p_1, t_i), w(p_2, t_i), \ldots, w(p_j, t_i), \ldots, w(p_n, t_i) \rangle$, where $w(p_j, t_i)$ is the weight of the $j$-th page $p_j$ in transaction $t_i$;
Step 1.3: $w(p_j, t_i)$ is determined by the time the user spends browsing the $j$-th page $p_j$:
Step 2: let the user's current session be $S = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$, where $s_i$ corresponds to the $i$-th page, and let the aggregate usage profile (feature) be $C = \{w_1^C, w_2^C, \ldots, w_i^C, \ldots, w_n^C\}$, where $w_i^C$ is the $i$-th feature value of $C$. The value of $s_i$ is set according to the time the user spends on page $\mathrm{URL}_i$ in the current session $S$:
The $i$-th feature value of $C$ is then $w_i^C = \mathrm{weight}(p_i, C)$, where $\mathrm{weight}(p_i, C)$ is the weight of browsed page $p_i$, given by:
Step 3: the matching coefficient $\mathrm{match}(S, C)$ between feature $C$ and the current session $S$ is computed with the classical cosine similarity function. If $\mathrm{match}(S, C) \ge t$, where $t$ is the minimum matching threshold, then $C$ is a matching cluster of the current session $S$. The recommendation coefficient of each page in the matching cluster is then computed; for the $i$-th page $s_i$ in the matching cluster, the recommendation coefficient $\mathrm{Rec}(s_i, p)$ is:
$$\mathrm{Rec}(s_i, p) = \sqrt{\mathrm{weight}(s_i, C) \cdot \mathrm{match}(s_i, C)};$$
Step 4: in each matching cluster, all pages whose recommendation coefficient is greater than or equal to the minimum recommendation threshold form the recommendation set of the current session $S$, and the links to all pages in the recommendation set are recommended in order of recommendation coefficient.
2. The webpage recommendation method based on Web personalization according to claim 1, characterized in that, in step 3, match(S, C) is calculated as:
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711302953.3A CN108109035A (en) 2017-12-08 2017-12-08 Webpage recommending method based on Web personalizations

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711302953.3A CN108109035A (en) 2017-12-08 2017-12-08 Webpage recommending method based on Web personalizations

Publications (1)

Publication Number Publication Date
CN108109035A true CN108109035A (en) 2018-06-01

Family

ID=62208197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711302953.3A Pending CN108109035A (en) 2017-12-08 2017-12-08 Webpage recommending method based on Web personalizations

Country Status (1)

Country Link
CN (1) CN108109035A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678652A (en) * 2013-12-23 2014-03-26 山东大学 Information individualized recommendation method based on Web log data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
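Zhang Haipeng (张海鹏): "Research on Personalized Recommendation Based on Web Log Mining" (基于Web日志挖掘的个性化推荐研究), China Master's Theses Full-text Database, Information Science and Technology *
Yi Ming (易明): "Research on the Mechanism and Methods of E-commerce Personalized Recommendation Based on Web Mining" (基于Web挖掘的电子商务个性化推荐机理与方法研究), China Doctoral Dissertations Full-text Database, Information Science and Technology *
Xie Nannan (解男男) et al.: "Webpage Recommendation Method Based on Web Log Mining" (基于Web日志挖掘的网页推荐方法), Journal of Jilin University (Science Edition) *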

Similar Documents

Publication Publication Date Title
Wang et al. Location recommendation in location-based social networks using user check-in data
US9535897B2 (en) Content recommendation system using a neural network language model
CN112785397A (en) Product recommendation method, device and storage medium
Liu et al. GNNRec: gated graph neural network for session-based social recommendation model
CN106407381A (en) Method and device for pushing information based on artificial intelligence
Petridou et al. Time-aware web users' clustering
CN104866490B (en) A kind of video intelligent recommended method and its system
Hao et al. Service recommendation based on description reconstruction in cloud manufacturing
Rao et al. An optimal machine learning model based on selective reinforced Markov decision to predict web browsing patterns
Sumathi et al. Automatic Recommendation of Web Pages in Web Usage Mining C
TWI480749B (en) Method of identifying organic search engine optimization
Papadimitriou et al. Geo-social recommendations
Han et al. Data preprocessing method based on user characteristic of interests for web log mining
CN108109035A (en) Webpage recommending method based on Web personalizations
Khonsha et al. New hybrid web personalization framework
Yang et al. A new interest extraction method based on multi-head attention mechanism for CTR prediction
CN112559905B (en) Conversation recommendation method based on dual-mode attention mechanism and social similarity
Ramya et al. Building web personalization system with time-driven web usage mining
Silva et al. USTAR: Online multimodal embedding for modeling user-guided spatiotemporal activity
Wang et al. A Tri‐Attention Neural Network Model‐Based Recommendation
Sun et al. Deep Session Interest Network Based on the Time Interval Encoding for the Click-through Rate Prediction
Ji et al. Multi‐channel Convolutional Neural Network Feature Extraction for Session Based Recommendation
Jalali et al. A recommender system approach for classifying user navigation patterns using longest common subsequence algorithm
Maheswari et al. Algorithm for Tracing Visitors' On-Line Behaviors for Effective Web Usage Mining
CN109074365A (en) Parameterize network communication path

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180601