CN102819575B - Personalized search method for Web service recommendation - Google Patents

Info

Publication number
CN102819575B
Authority
CN
China
Prior art keywords
word
user
service
interest
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210253884.2A
Other languages
Chinese (zh)
Other versions
CN102819575A (en)
Inventor
窦万春
胡蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Huakang Information Technology Co Ltd
Ten Party Health Management (jiangsu) Ltd
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201210253884.2A
Publication of CN102819575A
Application granted
Publication of CN102819575B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a personalized search method for Web service recommendation. The method comprises the following steps: 1, preprocessing WSDL (Web Services Description Language) files, i.e., forming a bag of words through two preprocessing steps, removing stop words and extracting stems; 2, extracting user interest, i.e., calculating the weight of each word in the bag of words with an improved TF-IDF (Term Frequency-Inverse Document Frequency) formula and multiplying it by the word's time decay factor to obtain a new weight, then selecting the top k words in descending order of weight, together with their weights, as the user's interest words to form a k-dimensional user interest vector; 3, calculating interest similarity, i.e., setting a similarity threshold and selecting the users whose interest similarity to a target user exceeds the threshold as that user's neighbor users; and 4, ranking service search results, i.e., calculating a recommendation predicted value for each service from the similarity of the neighbor users and the number of times they selected the service, and arranging the search results in descending order of the predicted value, thereby obtaining the personalized search result.

Description

A personalized search method for Web service recommendation
Technical field
The present invention relates to Web search and recommendation in the field of computer software technology, and in particular to a personalized search method for Web service recommendation.
Background art
To keep meeting the demands for flexibility, extensibility, correctness and robustness in software systems, software engineering practice has gradually developed methods that let software systems be built from existing software resources rather than developed entirely from scratch. These methods have accelerated software development and improved productivity. At the technical level, decomposing the functions a piece of software implements into relatively simple, reusable functional modules also gives software engineering a better software management technique.
At present, the widely accepted software reuse technology is component-based software development (Component-Based Software Engineering, CBSE). Service-oriented computing (Service Oriented Computing, SOC) is a new component-based design paradigm; the infrastructure of SOC is the service-oriented architecture (Service Oriented Architecture, SOA); and Web services together with SOA are one realization of SOC.
As an emerging, Internet-oriented distributed computing model, SOC provides better enabling tools for building loosely coupled, cross-organizational integrated applications. The service-oriented architecture guarantees access to service resources through its "publish-find-bind" pattern. However, service users and service providers are separated, which makes it harder for users to understand, obtain and use the services they need. In particular, when a user's needs change as application construction evolves, how to let the user obtain suitable services is a problem that must be solved. Traditional service discovery techniques address it mainly by having the user actively issue query requests that state the service requirements, or by letting the user browse the resource collection manually according to some classification scheme. As the resource collection keeps growing, manual lookup becomes tedious, time-consuming and error-prone. At present, Web services are searched in four ways: through UDDI registries, through Web service websites (such as XMethods and RemoteMethods), through general-purpose search engines (such as Google and Yahoo), and through specialized search engines (such as seekda and Merobase). These approaches mainly support keyword search, and the user does not take part in the retrieval process, so the results are unrelated to user interest and cannot follow changes in it.
Unlike conventional search techniques, personalized search analyzes the service pages in the search results, compares them with the user's interests, helps the user pick out the services of greater interest, and presents these first in the result list, thereby improving the user's search efficiency. In Google personalized search, for example, the system lets users customize the look and feel they prefer (including the level of content filtering, language selection, and query suggestion customization), and Google's personalized Subscribed Links let users create custom results in their own Google search engine that present service links to customers. Other released personalized search products let users search for information of interest according to their own behavior patterns, and support managing and sharing retrieval results; users can annotate Web pages and classify and sort them as they need.
Personalized recommendation technology mines users' personal preferences in depth and actively "pushes" information that meets individual needs, so that users no longer have to dig the content they care about out of massive Web information, which makes information acquisition more efficient. In 1992 the first recommender system, Tapestry, appeared; it applied collaborative filtering to e-mail and achieved good results. Since then, recommender systems have drawn growing attention because of their wide application value. In 1996, Yahoo introduced a recommender system into its portal by adding the personalized entry MyYahoo, which offers personalized services to different users. In 1997, AT&T Labs proposed the collaborative-filtering-based personalized recommender systems Referral Web and PHOAKS. In 2001, IBM added a personalized recommender system to its e-commerce platform WebSphere so that merchants could develop personalized e-commerce websites. Similar products include GroupLens, Amazon and Netflix, with applications spanning e-mail filtering, e-commerce sites, news sites, search engines, online DVD rental sites and some Web 2.0 social sites.
Personalized search draws heavily on the basic principles of personalized recommendation, and personalized recommendation in turn borrows many basic techniques from personalized search. As the two most closely related and most central technologies in personalized services, they can largely satisfy the differing information needs of different users and have broad application prospects.
As a tool for effective information retrieval, a search engine helps users obtain the content they need from massive Web resources quickly and efficiently, greatly improving the efficiency of information acquisition. With the continuing growth of Web service resources and the further development of search engine technology, and driven by users' actual needs, personalized search has gradually become a research focus in the search field. For Web services, the core of personalized search is to filter and rank service retrieval results differently for each user according to that user's personal interests and preferences, so that different users receive differentiated results that meet their individual needs.
However, finding a reasonably objective and accurate search method over Web resources, one that pushes services accurately and meets the needs of different users, remains difficult.
Summary of the invention
Object of the invention: the technical problem to be solved by the present invention is that search in the prior art is imprecise and time-consuming; to address this defect, a personalized search method for Web service recommendation is provided.
To solve the above technical problem, the invention discloses a personalized search method for Web service recommendation, comprising the following steps:
Step 1, preprocess WSDL (Web Services Description Language) documents: obtain from the user's usage records the WSDL documents of the services the user has selected, and form a word bag (bag of words) through two preprocessing steps, removing stop words and extracting stems.
Step 2, extract user interest: calculate the weight of each word in the word bag with the improved TF-IDF formula and multiply it by the time decay factor to obtain a new weight $\delta_{ij}$; select the top k words in descending order of $\delta_{ij}$ as the user's interest words and, together with the corresponding weights $\delta_{ij}$, form a k-dimensional user interest vector. Keeping only the k highest-weighted words reduces the dimension of the user interest vector space, gives every vector the same dimension, and allows the interest similarity between any two users to be computed efficiently.
Step 3, calculate similarity: use the vector cosine formula to compute the cosine similarity between every two users; set a similarity threshold, and select the users whose similarity to the target user exceeds the threshold as the target user's neighbor users. The similarity threshold is set in the range 0 to 1.
Step 4, rank the service retrieval results: the target user submits a service request, and a Web service search engine retrieves all services that satisfy it; from the number of times the neighbor users selected these services and their similarity to the target user, a weighted-average prediction formula calculates the recommendation predicted value of each retrieval result; the results are sorted in descending order of predicted value, yielding the personalized search results.
In the present invention, the improved TF-IDF (Term Frequency-Inverse Document Frequency) formula is as follows:
$$\mathrm{tf}(t_{ij}) = \frac{\mathrm{freq}(t_{ij}, D_i)}{|D_i|},$$
$$\mathrm{idf}(t_{ij}) = \log\frac{|D|}{|\{D_i : t_{ij} \in D_i\}|},$$
$$\omega_{ij} = \mathrm{tf}(t_{ij}) \cdot \mathrm{idf}^2(t_{ij}),$$
where $t_{ij}$ is the j-th word in the word bag of the i-th user, $\mathrm{tf}(t_{ij})$ is the term frequency of $t_{ij}$, $D_i$ is the word bag of the i-th user, $\mathrm{freq}(t_{ij}, D_i)$ is the number of times $t_{ij}$ occurs in $D_i$, $|D_i|$ is the number of words in $D_i$, $\mathrm{idf}(t_{ij})$ is the inverse document frequency of $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users whose word bags contain $t_{ij}$, and $\omega_{ij}$ is the weight of $t_{ij}$;
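The weighting above can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: the function name `improved_tfidf` is hypothetical, the logarithm is assumed to be base 10 (which agrees with log(200/68) = 0.47 in the worked example of the embodiment), $|D|$ is taken as the number of user word bags, and $|D_i|$ as the number of distinct words in a bag (the worked example divides 84 by 11).

```python
import math
from collections import Counter

def improved_tfidf(bags):
    """bags maps a user id to a Counter of stemmed words (that user's word bag).

    Returns {user: {word: weight}} with weight = tf * idf**2.
    """
    n_bags = len(bags)  # |D|, assumed here to be the number of user word bags
    # For each word, the number of user word bags that contain it.
    bag_freq = Counter(word for bag in bags.values() for word in bag)
    weights = {}
    for user, bag in bags.items():
        size = len(bag)  # |D_i|, assumed to be the number of distinct words
        weights[user] = {
            word: (count / size) * math.log10(n_bags / bag_freq[word]) ** 2
            for word, count in bag.items()
        }
    return weights
```

Squaring the idf term means a word that occurs in every user's word bag gets weight 0, and words concentrated in few bags are emphasized more strongly than under plain TF-IDF.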
The time decay factor is calculated as follows:
$$\mathrm{Decay} = 2 - e^{\alpha t},$$
where Decay is the time decay factor and e is the base of the natural logarithm (approximately 2.718). α is the decay rate, with range [0, 0.1]; it can be set to 0.1, for example. When α is 0, Decay = 1 and the weight does not decay over time; the larger α is, the faster the decay. t is the time elapsed between the user's most recent service selection and the current time. The time decay factor reflects the fact that user interest fades over time. The new weight is the product of the original weight and the time decay factor, so the weight of a word that has gone unselected for a long time gradually decays to 0.
The new weight $\delta_{ij}$ of word $t_{ij}$ in each user's word bag is calculated as:
$$\delta_{ij} = \omega_{ij} \cdot \mathrm{Decay}.$$
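A minimal Python sketch of the decay and the top-k selection follows. The names `time_decay` and `interest_vector` are hypothetical, and two points are assumptions rather than statements of the source: the factor is floored at 0 (the formula $2 - e^{\alpha t}$ reaches 0 at $t = \ln 2 / \alpha$ and would turn negative afterwards, while the text says the weight decays to 0), and the elapsed time is supplied per word.

```python
import math

def time_decay(alpha, t):
    """Decay = 2 - e^(alpha * t), floored at 0 (the floor is an assumption)."""
    return max(0.0, 2.0 - math.exp(alpha * t))

def interest_vector(weights, elapsed, alpha=0.05, k=6):
    """weights: {word: omega}; elapsed: {word: time since the last selection}.

    Returns the k words with the largest decayed weight as (word, delta) pairs.
    """
    decayed = {
        word: omega * time_decay(alpha, elapsed[word])
        for word, omega in weights.items()
    }
    ranked = sorted(decayed.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]
```

For α = 0.05 the factor reaches 0 after about 13.9 time units (ln 2 / 0.05); the source does not state the unit of t.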
In the present invention, the similarity is calculated as follows:
$$\mathrm{sim}(u_a, u_b) = \frac{\sum_{j=1}^{k} \delta_{aj} \cdot \delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^2} \cdot \sqrt{\sum_{j=1}^{k} \delta_{bj}^2}},$$
where $u_a$ and $u_b$ are two different users, $\mathrm{sim}(u_a, u_b)$ is the similarity between these two users, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the word bags of user $u_a$ and user $u_b$ respectively, and k is the number of user interest words.
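The cosine formula can be sketched in Python over sparse interest vectors. The function name `cosine_similarity` is hypothetical. One assumption is made explicit: the two vectors are matched by interest word, so only words present in both vectors contribute to the numerator, which is how the worked example in the embodiment computes it.

```python
import math

def cosine_similarity(vec_a, vec_b):
    """vec_a, vec_b: {interest word: weight}; a missing word counts as weight 0."""
    dot = sum(w * vec_b[word] for word, w in vec_a.items() if word in vec_b)
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty interest vector is treated as similar to nobody
    return dot / (norm_a * norm_b)
```

Because all weights are non-negative, the result lies between 0 and 1, which matches the stated range of the similarity threshold.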
In the present invention, the weighted-average prediction formula for the recommendation predicted value of each retrieval result is as follows:
$$P_{u_t,s_t} = \bar{c}_{u_t} + \frac{\sum_{u_i \in N} (c_{u_i,s_t} - \bar{c}_{u_i}) \cdot \mathrm{sim}(u_t, u_i)}{\sqrt{\sum_{u_i \in N} \mathrm{sim}(u_t, u_i)^2}},$$
where $u_t$ is the target user; $s_t$ is the target service, i.e. the service whose recommendation predicted value is to be calculated; $P_{u_t,s_t}$ is the recommendation predicted value of target user $u_t$ for target service $s_t$; $\bar{c}_{u_t}$ and $\bar{c}_{u_i}$ are the average numbers of times that target user $u_t$ and neighbor user $u_i$, respectively, select a service; $c_{u_i,s_t}$ is the number of times neighbor user $u_i$ selected target service $s_t$; $\mathrm{sim}(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$; and N is the neighbor set of target user $u_t$.
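A short Python sketch of the prediction follows. The function name `predict` is hypothetical. The denominator is taken as the square root of the sum of squared similarities, the reading that reproduces the value 3.5 in the embodiment's BookStoreService example; returning the target user's own average when there are no neighbors is an added assumption.

```python
import math

def predict(target_mean, neighbors):
    """neighbors: list of (count_for_service, neighbor_mean, similarity_to_target)."""
    norm = math.sqrt(sum(sim * sim for _, _, sim in neighbors))
    if norm == 0:
        # No usable neighbor: fall back to the target's own average (assumption).
        return target_mean
    deviation = sum((count - mean) * sim for count, mean, sim in neighbors)
    return target_mean + deviation / norm
```

With the embodiment's numbers, `predict(2, [(3, 1.5, 0.17)])` evaluates 2 + (3 - 1.5) * 0.17 / 0.17, i.e. 3.5 up to floating-point rounding.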
In the present invention, stop word removal is as follows. In information retrieval, stop words are words that occur too frequently and carry little meaning for retrieval. Stop word handling is one step of tokenization in the knowledge extraction process; performing it as a separate step improves the speed and quality of document processing. Several English stop word lists have been published, the better known being the list published by Van Rijsbergen and the Brown Corpus stop word list. Well-known Chinese stop word lists include the Harbin Institute of Technology stop word dictionary, the Baidu stop word list, and the stop word list of the Machine Intelligence Laboratory of Sichuan University. The stop word list used here contains not only general stop words such as a, by, is and at, but also words that occur frequently in the Web service domain, such as service, soap, response, request, set and get; these words do little to distinguish Web services and easily introduce noise. Words contained in this list are removed from the WSDL documents. A WSDL document has seven important elements: types, import, message, portType, operation, binding and service, all nested in the definitions root element. WSDL4J (Web Services Description Language for Java Toolkit) is used to parse the WSDL documents the user has selected; stop words are removed from the parsed content and stems are extracted, forming the user's word bag.
In the present invention, a stem is the part of a word that remains after all inflectional affixes are removed, and stemming is the process of stripping affixes to obtain the root. The invention uses the Porter stemming algorithm, devised by Dr. Martin Porter in 1979 at the Computer Laboratory of the University of Cambridge, UK, to stem the words in the WSDL documents, so that interest words are extracted more accurately and without duplicates.
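The two preprocessing steps can be sketched in Python as below. This is only an illustration under stated assumptions: the input is a list of names already extracted from a parsed WSDL document (the source parses with WSDL4J in Java), splitting camelCase identifiers into tokens is an added assumption, the stop list is a tiny sample built from the words quoted in the text, and `crude_stem` is a placeholder suffix stripper, not the Porter algorithm the source uses.

```python
import re
from collections import Counter

# Small sample stop list: general words plus Web-service terms named in the text.
STOP_WORDS = {"a", "by", "is", "at", "service", "soap",
              "response", "request", "set", "get"}

def crude_stem(word):
    """Placeholder suffix stripper standing in for a real stemmer (NOT Porter)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def word_bag(identifiers):
    """identifiers: element and operation names taken from a parsed WSDL document."""
    bag = Counter()
    for name in identifiers:
        # Split camelCase / PascalCase names; keep acronyms such as "ISBN" whole.
        for token in re.findall(r"[A-Z]+(?![a-z])|[A-Za-z][a-z]*", name):
            token = token.lower()
            if token not in STOP_WORDS:
                bag[crude_stem(token)] += 1
    return bag
```

In practice the placeholder would be replaced by a real Porter stemmer from a library, and the stop list by a published list extended with domain words.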
Compared with existing personalized search methods, this method has three features. First, it not only extracts each user's interest implicitly but also obtains the relationships among different users' interests by calculating interest similarity, and it applies collaborative filtering to rank service search results by interest, which improves the accuracy and relevance of the results to some extent. Second, it adds a time decay factor to interest formation, which captures more accurately how user interest evolves over time. Third, steps 1, 2 and 3 of the method can all be completed offline, so the impact on retrieval efficiency is very small.
The present invention applies the basic principles of personalized recommendation, bringing collaborative filtering to the personalized search of Web services to improve user satisfaction and retrieval precision. Specifically, the invention collects users' search records, extracts user interest from the description documents of the Web services they selected, and forms interest vectors; it measures the similarity of user interests by the cosine similarity of the interest vectors and selects the users whose similarity to the target user exceeds a threshold as that user's neighbors. When the target user submits a service search request, the service recommendation system uses one of the search techniques described above to retrieve the services that match the keywords. Instead of returning these results to the user directly, it calculates a recommendation predicted value for each result from the neighbors' selection experience and their interest similarity, sorts the results in descending order, and then returns them. In this way the user takes part in customizing the search results transparently, and personalized service search is accomplished through service recommendation.
Beneficial effects: The extraction of user interest is transparent to the user; there is no need to question the user frequently or to obtain explicit feedback, so the method can win acceptance and use by more users. User interest is tied to time: the weight of an interest that has not been selected again for a long time decays gradually until the interest drops out of the user interest vector, while interests in recently and frequently selected services are added to the vector promptly, so changes in user interest are expressed and tracked more accurately. Collaborative filtering is used to predict recommendations for and rank the search results, so even a target user with no experience of the currently needed service can obtain personalized recommendations from the experience of similar users. The method can be widely applied to personalizing Web service search and to supporting service recommendation, and belongs to the field of computer software technology.
Brief description of the drawings
The present invention is further described below with reference to the drawing and specific embodiments, from which the above and other advantages of the invention will become clearer.
Fig. 1 is a flowchart of the personalized search method for Web service recommendation of the present invention.
Detailed description of the embodiments
As shown in Fig. 1, the invention discloses a personalized search method for Web service recommendation, comprising the following steps:
Step 1, preprocess WSDL documents: obtain from the user's usage records the WSDL documents of the services the user has selected, and form a word bag through two preprocessing steps, removing stop words and extracting stems.
Step 2, extract user interest: calculate the weight of each word in the word bag with the improved TF-IDF formula and multiply it by the word's time decay factor to obtain a new weight; select the top k words in descending order of weight as the user's interest words and, together with the corresponding weights, form a k-dimensional user interest vector.
Step 3, calculate similarity: use the vector cosine formula to compute the cosine similarity between every two users; set a similarity threshold, and select the users whose similarity to the target user exceeds the threshold as the target user's neighbor users.
Step 4, rank the service retrieval results: the target user submits a service request, and a Web service search engine retrieves all services that satisfy it; from the number of times the neighbor users selected these services and their similarity to the target user, a weighted-average prediction formula calculates the recommendation predicted value of each retrieval result; the results are sorted in descending order of predicted value, yielding the personalized search results.
The improved TF-IDF formula is as follows:
$$\mathrm{tf}(t_{ij}) = \frac{\mathrm{freq}(t_{ij}, D_i)}{|D_i|},$$
$$\mathrm{idf}(t_{ij}) = \log\frac{|D|}{|\{D_i : t_{ij} \in D_i\}|},$$
$$\omega_{ij} = \mathrm{tf}(t_{ij}) \cdot \mathrm{idf}^2(t_{ij}),$$
where $t_{ij}$ is the j-th word in the word bag of the i-th user, $\mathrm{tf}(t_{ij})$ is the term frequency of $t_{ij}$, $D_i$ is the word bag of the i-th user, $\mathrm{freq}(t_{ij}, D_i)$ is the number of times $t_{ij}$ occurs in $D_i$, $|D_i|$ is the number of words in $D_i$, $\mathrm{idf}(t_{ij})$ is the inverse document frequency of $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users whose word bags contain $t_{ij}$, and $\omega_{ij}$ is the weight of $t_{ij}$.
The time decay factor is calculated as follows:
$$\mathrm{Decay} = 2 - e^{\alpha t},$$
where Decay is the time decay factor, e is the base of the natural logarithm, and α is the decay rate, with range [0, 0.1]. When α is 0, Decay = 1 and the weight does not decay over time; the larger α is, the faster the decay. t is the time elapsed between the user's most recent service selection and the current time.
The new weight of word $t_{ij}$ in each user's word bag is calculated as:
$$\delta_{ij} = \omega_{ij} \cdot \mathrm{Decay}.$$
In the present invention, the similarity is calculated as follows:
$$\mathrm{sim}(u_a, u_b) = \frac{\sum_{j=1}^{k} \delta_{aj} \cdot \delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^2} \cdot \sqrt{\sum_{j=1}^{k} \delta_{bj}^2}},$$
where $u_a$ and $u_b$ are two different users, $\mathrm{sim}(u_a, u_b)$ is the similarity between these two users, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the word bags of user $u_a$ and user $u_b$ respectively, and k is the number of user interest words.
In the present invention, the weighted-average prediction formula for the recommendation predicted value of each retrieval result is as follows:
$$P_{u_t,s_t} = \bar{c}_{u_t} + \frac{\sum_{u_i \in N} (c_{u_i,s_t} - \bar{c}_{u_i}) \cdot \mathrm{sim}(u_t, u_i)}{\sqrt{\sum_{u_i \in N} \mathrm{sim}(u_t, u_i)^2}},$$
where $u_t$ is the target user; $s_t$ is the target service, i.e. the service whose recommendation predicted value is to be calculated; $P_{u_t,s_t}$ is the recommendation predicted value of target user $u_t$ for target service $s_t$; $\bar{c}_{u_t}$ and $\bar{c}_{u_i}$ are the average numbers of times that target user $u_t$ and neighbor user $u_i$, respectively, select a service; $c_{u_i,s_t}$ is the number of times neighbor user $u_i$ selected target service $s_t$; $\mathrm{sim}(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$; and N is the neighbor set of target user $u_t$.
Example
The data of this example come from the back-end database of the Web Service Supermarket (http://125.221.225.2:8080/WSSM/).
This example comprises the following four steps:
(1) Preprocess the WSDL documents
The usage records of 200 users are extracted from the back-end database of the Web Service Supermarket as raw data. The usage records of some of the users are as follows:
Table 1. User usage records (excerpt)
Table 1 lists four users, named "taolaoliu", "fangfang", "zww" and "skh", each of whom has selected some Web services. The description documents of the Web services each user selected are downloaded and preprocessed: stop words are removed according to the stop word list published by Van Rijsbergen, and stems are extracted with Dr. Martin Porter's stemming algorithm, forming the word bag. For example, "taolaoliu" selected three Web services named "BookingService", "JasonsBooking" and "HotelBookingEngine". The WSDL documents of the three services are downloaded from the corresponding service websites, and after stop word removal and stemming the resulting word bag is "render (84), hotel (99), reservation (40), invoice (33), room (269), city (81), client (13), book (194), ticket (13), basket (42), rate (25)". This word bag contains 11 words in total; the number in parentheses after each word is the number of times the word occurs in the documents.
(2) Extract user interest
The word bags of all users form the corpus, and the improved TF-IDF formula is used to calculate the weight of each word in a word bag; the weight of each word in each user's word bag is multiplied by the time decay factor to obtain the new weight. The k highest-weighted words and their weights form the user interest vector. For example, the word "render" occurs 84 times in the word bag of "taolaoliu", and among the word bags of the 200 users, 68 contain this word. The weight of "render" is therefore calculated with the improved TF-IDF formula of step 2 as follows:
$$\mathrm{tf}(\text{"render"}) = \frac{84}{11} = 7.64,$$
$$\mathrm{idf}(\text{"render"}) = \log\frac{200}{68} = 0.47,$$
$$\omega_{\text{"render"}} = 7.64 \cdot 0.47^2 = 1.68.$$
Next the time decay factor is calculated. With α = 0.05 and t at its initial value 1, Decay is:
$$\mathrm{Decay} = 2 - e^{0.05 \cdot 1} = 0.95,$$
and the new weight is:
$$\delta_{\text{"render"}} = 1.68 \cdot 0.95 = 1.59.$$
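The arithmetic for "render" can be checked with a few lines of Python. The base-10 logarithm is an assumption; it is the base that yields the stated 0.47. Intermediate values are kept unrounded here and rounded only for display.

```python
import math

tf = 84 / 11                    # occurrences of "render" / words in the bag
idf = math.log10(200 / 68)      # 200 user word bags, 68 of them contain "render"
omega = tf * idf ** 2           # improved TF-IDF weight
decay = 2 - math.exp(0.05 * 1)  # alpha = 0.05, t = 1
delta = omega * decay           # new, time-decayed weight

print(f"{tf:.2f} {idf:.2f} {omega:.2f} {decay:.2f} {delta:.2f}")
```

This prints `7.64 0.47 1.68 0.95 1.59`, matching the values in the text. The stated 1.68 and 1.59 follow from the unrounded chain; multiplying the already rounded figures would give slightly different last digits.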
Similarly, the weights of the remaining words in the word bag are calculated and the 6 words with the largest weights are taken (i.e. k = 6). The resulting interest vectors, in which the number after each interest word is the word's new weight, are:
User "taolaoliu": <(basket, 6.42), (hotel, 4.03), (room, 3.15), (book, 2.82), (render, 1.59), (information, 1.24)>
User "fangfang": <(book, 3.31), (price, 3.26), (title, 3.23), (author, 3.17), (ISBN, 2.15), (information, 1.13)>
User "zww": <(weather, 4.42), (city, 3.33), (forecast, 3.29), (replication, 2.12), (add, 1.12), (id, 1.11)>
User "skh": <(weather, 3.39), (comment, 3.31), (forecast, 2.27), (city, 2.22), (replication, 1.20), (add, 1.10)>
(3) Calculate interest similarity
The vector cosine formula is used to compute the cosine similarity between every two users; a similarity threshold is set, and the users whose similarity exceeds the threshold are selected as neighbors of the target user. For example, the similarity between user "taolaoliu" and user "fangfang" is calculated as follows, where only the two shared interest words, book and information, contribute to the numerator:
$$\mathrm{sim}(\text{"taolaoliu"}, \text{"fangfang"}) = \frac{2.82 \cdot 3.31 + 1.24 \cdot 1.13}{\sqrt{6.42^2 + 4.03^2 + 3.15^2 + 2.82^2 + 1.59^2 + 1.24^2} \cdot \sqrt{3.31^2 + 3.26^2 + 3.23^2 + 3.17^2 + 2.15^2 + 1.13^2}} = 0.17.$$
With the similarity threshold set to 0.15, "taolaoliu" and "fangfang" are neighbor users of each other.
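Neighbor selection over the four example users can be sketched in Python as follows. The names `VECTORS`, `cosine` and `neighbors` are hypothetical. The word "information" is spelled the same way in both vectors so that the shared word matches, as the source's own calculation assumes.

```python
import math

# Interest vectors from the example (k = 6).
VECTORS = {
    "taolaoliu": {"basket": 6.42, "hotel": 4.03, "room": 3.15,
                  "book": 2.82, "render": 1.59, "information": 1.24},
    "fangfang": {"book": 3.31, "price": 3.26, "title": 3.23,
                 "author": 3.17, "ISBN": 2.15, "information": 1.13},
    "zww": {"weather": 4.42, "city": 3.33, "forecast": 3.29,
            "replication": 2.12, "add": 1.12, "id": 1.11},
    "skh": {"weather": 3.39, "comment": 3.31, "forecast": 2.27,
            "city": 2.22, "replication": 1.20, "add": 1.10},
}

def cosine(a, b):
    """Cosine similarity of two sparse vectors, matched by interest word."""
    dot = sum(w * b[word] for word, w in a.items() if word in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)

def neighbors(target, threshold=0.15):
    """Users whose interest similarity to `target` exceeds the threshold."""
    return sorted(
        user for user in VECTORS
        if user != target and cosine(VECTORS[target], VECTORS[user]) > threshold
    )
```

Under these vectors, `neighbors("taolaoliu")` returns `["fangfang"]`, and the two weather-oriented users "zww" and "skh" come out as neighbors of each other, since they share five of their six interest words.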
(4) Rank the service retrieval results
The target user submits a service request, and the Web Service Supermarket retrieves all services that satisfy it. From the neighbors' service selection experience and their similarity to the target user, the weighted-average prediction formula calculates the recommendation predicted value of each retrieval result. For example, suppose the retrieval results for a request submitted by target user "taolaoliu" include a Web service named "BookStoreService", that among the neighbors only "fangfang" has selected this service, 3 times, that the average number of times "taolaoliu" selects a service is 2, and that the average for "fangfang" is 1.5. The recommendation predicted value of this service is then:
$$P_{\text{"BookStoreService"}} = 2 + \frac{(3 - 1.5) \cdot 0.17}{\sqrt{0.17^2}} = 3.5.$$
The retrieval results are sorted in descending order of recommendation predicted value, so that the user can quickly find personalized results matching his or her interests on the first page of results.
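The ranking step can be sketched in Python as follows. The names `predict` and `rank` are hypothetical, the denominator of the prediction is read as the square root of the sum of squared similarities (the reading that yields 3.5 above), and a neighbor who never selected a service is assumed to count as 0 selections. "WeatherService" in the usage note is an invented placeholder name.

```python
import math

def predict(target_mean, votes):
    """votes: (times the neighbor selected the service, neighbor mean, similarity)."""
    norm = math.sqrt(sum(sim ** 2 for _, _, sim in votes))
    if norm == 0:
        return target_mean  # no neighbors: fall back to the target's mean (assumption)
    return target_mean + sum((c - mean) * sim for c, mean, sim in votes) / norm

def rank(results, target_mean, neighbor_stats, counts):
    """Order retrieved services by descending recommendation predicted value.

    neighbor_stats: {neighbor: (mean selections, similarity to the target user)}
    counts: {(neighbor, service): times selected}; missing pairs count as 0.
    """
    def score(service):
        votes = [
            (counts.get((name, service), 0), mean, sim)
            for name, (mean, sim) in neighbor_stats.items()
        ]
        return predict(target_mean, votes)
    return sorted(results, key=score, reverse=True)
```

For instance, `rank(["WeatherService", "BookStoreService"], 2, {"fangfang": (1.5, 0.17)}, {("fangfang", "BookStoreService"): 3})` places "BookStoreService" (predicted value 3.5) ahead of the placeholder service that the neighbor never selected.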
Implementation results:
User "zww", as the current target user, wants to find an online book shopping service. After the service request keyword "book" is submitted to the seekda search system (http://webservices.seekda.com/, prior art) and to the Web Service Supermarket, the top 10 search results obtained are shown in Table 2 and Table 3 respectively.
Table 2. Top 10 services in the seekda search results
In Table 2, only services 2, 4 and 5 provide an online book shopping function. After receiving the system's results, user "zww" still has to look manually for the service that meets the need, a process that is often time-consuming, tedious and error-prone.
Table 3. Top 10 services in the Web Service Supermarket search results
In Table 3, all services except service 9 are related to online book shopping. Thus personalized search, by using collaborative filtering to calculate the recommendation predicted values of services, improves the accuracy of service search and the user's search efficiency, and raises user satisfaction with the Web service search engine.
The invention provides a personalized search method for Web service recommendation. There are many ways to implement this technical solution, and the above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art may make improvements and refinements without departing from the principles of the invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the invention. Any component not specified in this embodiment can be implemented with existing technology.

Claims (1)

1. A personalized search method for Web service recommendation, characterized in that it comprises the following steps:
Step 1, preprocessing Web Services Description Language (WSDL) documents: obtaining, from the user's usage records, the WSDL documents of the services the user has selected, and forming a bag of words through two preprocessing steps, removing stop words and extracting stems;
Step 2, extracting user interest: calculating the weight of each word in the bag of words and multiplying it by a time decay factor to obtain a new weight $\delta_{ij}$; selecting the top k words in descending order of new weight $\delta_{ij}$ as the user's interest words which, together with the corresponding weight $\delta_{ij}$ of each word, form a k-dimensional user interest vector;
Step 3, calculating interest similarity: calculating the cosine similarity between every two user interest vectors as their interest similarity; setting a similarity threshold, and selecting the users whose similarity exceeds the threshold as the neighbor users of the target user;
Step 4, ranking the service search results: the target user submits a service request, and the Web service search engine retrieves all services that satisfy the request; according to the number of times the neighbor users have selected these services and their similarity to the target user, a weighted-mean prediction formula is used to calculate the recommendation prediction value of each search result; the search results are sorted in descending order of recommendation prediction value, thereby obtaining the personalized search results;
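The preprocessing of step 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the element names have already been extracted from the WSDL documents, and the stop list `STOP_WORDS`, the helper names, and the suffix-stripping `crude_stem` (a stand-in for a real stemmer such as Porter's) are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "for", "by", "in", "get", "set"}


def split_identifier(name):
    """Split an identifier such as 'getBookPrice' into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)
    return [p.lower() for p in parts]


def crude_stem(word):
    """Strip one common suffix; a stand-in for a proper stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_bag(identifiers):
    """Return the bag of words (duplicates kept, since term counts matter later)."""
    bag = []
    for name in identifiers:
        for word in split_identifier(name):
            if word not in STOP_WORDS:
                bag.append(crude_stem(word))
    return bag


bag = build_bag(["getBookPrice", "searchBooks", "orderingService"])
```

Duplicates are kept in the bag on purpose, because the term frequency used in step 2 depends on how often each word occurs.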
In step 2, calculating the weight of each word in the bag of words and multiplying it by the time decay factor of the word to obtain the new weight $\delta_{ij}$ comprises the following steps:
Using the improved TF-IDF formula to calculate the weight $\omega_{ij}$:
$$tf(t_{ij}) = \frac{freq(t_{ij}, D_i)}{|D_i|},$$
$$idf(t_{ij}) = \log \frac{|D|}{|\{D_i : t_{ij} \in D_i\}|},$$
$$\omega_{ij} = tf(t_{ij}) \cdot idf^2(t_{ij}),$$
where $t_{ij}$ is the j-th word in the bag of words of the i-th user, $tf(t_{ij})$ is the term frequency of word $t_{ij}$, $D_i$ is the bag of words of the i-th user, $freq(t_{ij}, D_i)$ is the number of times word $t_{ij}$ occurs in bag of words $D_i$, $|D_i|$ is the number of words in $D_i$, $idf(t_{ij})$ is the inverse document frequency of word $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users whose bag of words contains word $t_{ij}$, and $\omega_{ij}$ is the weight of word $t_{ij}$;
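As an illustration of the improved TF-IDF weight, the sketch below computes $\omega_{ij} = tf \cdot idf^2$ for every word in every user's bag. It rests on two assumptions that the text does not settle: $|D|$ is taken to be the number of user bags of words, and the logarithm is natural. The function name `improved_tfidf` and the sample data are hypothetical.

```python
import math
from collections import Counter


def improved_tfidf(bags):
    """bags: dict mapping user -> list of words. Returns user -> {word: weight}."""
    n_docs = len(bags)  # assumption: |D| is the number of user bags
    doc_freq = Counter()
    for words in bags.values():
        doc_freq.update(set(words))  # count each word once per bag

    weights = {}
    for user, words in bags.items():
        counts = Counter(words)
        total = len(words)
        weights[user] = {
            word: (count / total) * math.log(n_docs / doc_freq[word]) ** 2
            for word, count in counts.items()
        }
    return weights


bags = {
    "u1": ["book", "book", "price"],
    "u2": ["book", "weather"],
    "u3": ["weather", "map"],
}
weights = improved_tfidf(bags)
```

Squaring the idf term penalises common words more strongly than plain TF-IDF does: in this sample, "price" outweighs "book" for `u1` even though "book" occurs twice as often, because "book" appears in two of the three bags.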
The time decay factor is calculated as follows:
$$Decay = 2 - e^{\alpha \cdot t}$$
where Decay is the time decay factor, e is the base of the natural logarithm, and $\alpha$ is the decay rate, with value range [0, 0.1]; when $\alpha$ is 0, Decay = 1, meaning that the weight does not decay over time; the larger $\alpha$ is, the faster the decay; t is the difference between the current time and the time at which the user most recently selected a service;
The new weight $\delta_{ij}$ of word $t_{ij}$ in each user's bag of words is calculated as:
$$\delta_{ij} = \omega_{ij} \cdot Decay;$$
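The decay factor, the new weight, and the top-k selection of step 2 can be sketched together. This is only an illustration: it assumes a single elapsed time t per user, as in the definition of t above, and the names `time_decay` and `interest_vector` and the sample weights are hypothetical.

```python
import math


def time_decay(alpha, t):
    """Decay = 2 - e^(alpha * t), with alpha restricted to [0, 0.1]."""
    if not 0.0 <= alpha <= 0.1:
        raise ValueError("alpha must lie in [0, 0.1]")
    return 2.0 - math.exp(alpha * t)


def interest_vector(weights, alpha, t, k):
    """Scale each word weight by the decay factor and keep the k largest."""
    decay = time_decay(alpha, t)
    decayed = {word: w * decay for word, w in weights.items()}
    top = sorted(decayed.items(), key=lambda item: item[1], reverse=True)[:k]
    return dict(top)


vector = interest_vector({"book": 0.4, "price": 0.2, "map": 0.1}, alpha=0.0, t=3, k=2)
```

With $\alpha$ in [0, 0.1], the factor equals 1 at t = 0 and becomes negative once $\alpha \cdot t$ exceeds ln 2 (about 0.693), so the unit chosen for t matters; the claim does not state one.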
In step 4, the weighted-mean prediction formula used to calculate the recommendation prediction value of each search result is as follows:
$$P_{u_t,s_t} = \bar{c}_{u_t} + \frac{\sum_{u_i \in N} (c_{u_i,s_t} - \bar{c}_{u_i}) \cdot sim(u_t,u_i)}{\sum_{u_i \in N} sim(u_t,u_i)^2},$$
where $u_t$ is the target user; $s_t$ is the target service, i.e. the service whose recommendation prediction value is to be calculated; $P_{u_t,s_t}$ is the recommendation prediction value of target user $u_t$ for target service $s_t$; $\bar{c}_{u_t}$ and $\bar{c}_{u_i}$ are the average numbers of times that target user $u_t$ and neighbor user $u_i$, respectively, select services; $c_{u_i,s_t}$ is the number of times neighbor user $u_i$ has selected target service $s_t$; $sim(u_t,u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$; and N is the set of neighbor users of target user $u_t$;
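A minimal sketch of the prediction and ranking in step 4 follows. It assumes that the denominator of the formula is the sum of squared similarities, which is how the formula reads here; the tuple layout of `neighbors`, the function names, the sample numbers, and the fallback to the target user's own average when no neighbor has nonzero similarity are all hypothetical choices.

```python
def predict(target_mean, neighbors, service):
    """Recommendation prediction value of one service for the target user.

    neighbors: list of (similarity, mean_count, {service: count}) tuples.
    """
    numerator = 0.0
    denominator = 0.0
    for sim, mean_count, counts in neighbors:
        numerator += (counts.get(service, 0) - mean_count) * sim
        denominator += sim ** 2  # assumption: sum of squared similarities
    if denominator == 0.0:
        return target_mean  # no usable neighbors: fall back to the user's mean
    return target_mean + numerator / denominator


def rank(target_mean, neighbors, services):
    """Sort retrieved services by descending recommendation prediction value."""
    return sorted(
        services,
        key=lambda s: predict(target_mean, neighbors, s),
        reverse=True,
    )


neighbors = [
    (0.5, 1.0, {"bookshop": 3}),
    (1.0, 2.0, {"bookshop": 2, "weather": 4}),
]
ordered = rank(1.0, neighbors, ["bookshop", "weather"])
```

Subtracting each neighbor's own average before weighting means that a neighbor who selects every service often does not dominate the prediction; only above-average use of the target service raises its score.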
In step 3, the user interest similarity is calculated as follows:
$$sim(u_a,u_b) = \frac{\sum_{j=1}^{k} \delta_{aj} \cdot \delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^2} \cdot \sqrt{\sum_{j=1}^{k} \delta_{bj}^2}},$$
where $u_a$ and $u_b$ are two different users, $sim(u_a,u_b)$ is the similarity between the two users, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the bags of words of user $u_a$ and user $u_b$ respectively, and k is the number of user interest words.
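The similarity calculation and the neighbor selection of step 3 can be sketched as below. The sketch assumes that interest vectors are stored as word-to-weight mappings and that a word missing from one user's vector counts as weight 0, which is one way to align two users' interest words; the names `cosine_similarity` and `select_neighbors` and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two interest vectors given as {word: weight} dicts."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty or all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def select_neighbors(target, vectors, threshold):
    """Return the users whose similarity to the target exceeds the threshold."""
    return [
        user
        for user, vector in vectors.items()
        if user != target
        and cosine_similarity(vectors[target], vector) > threshold
    ]


vectors = {
    "u1": {"book": 0.4, "price": 0.2},
    "u2": {"book": 0.3, "price": 0.1},
    "u3": {"weather": 0.5, "map": 0.2},
}
neighbors_of_u1 = select_neighbors("u1", vectors, threshold=0.5)
```

Here `u2` shares both interest words with `u1` and passes the threshold, while `u3` shares none and has similarity 0.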
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210253884.2A CN102819575B (en) 2012-07-20 2012-07-20 Personalized search method for Web service recommendation

Publications (2)

Publication Number Publication Date
CN102819575A CN102819575A (en) 2012-12-12
CN102819575B true CN102819575B (en) 2015-06-17

Family

ID=47303686

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101685456A (en) * 2008-09-26 2010-03-31 华为技术有限公司 Search method, system and device
CN101996200A (en) * 2009-08-19 2011-03-30 华为技术有限公司 Method and device for searching file
CN102156733A (en) * 2011-03-25 2011-08-17 清华大学 Search engine and method based on service oriented architecture

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160310

Address after: No. 19 Jinqiao Road, Lianyungang Economic and Technological Development Zone, Lianyungang, Jiangsu 222000

Patentee after: Ten Party health management (Jiangsu) Limited

Patentee after: JIANGSU HUAKANG INFORMATION TECHNOLOGY CO., LTD.

Address before: Nanjing University, No. 163 Xianlin Avenue, Qixia District, Nanjing, Jiangsu Province 210093

Patentee before: Nanjing University