CN102819575B - Personalized search method for Web service recommendation - Google Patents
- Publication number
- CN102819575B CN201210253884.2A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a personalized search method for Web service recommendation. The method comprises the following steps: (1) preprocessing WSDL (Web Services Description Language) files, i.e., forming a bag of words through two preprocessing steps, stop-word removal and stemming; (2) extracting user interest, i.e., calculating the weight of each word in the bag of words with an improved TF-IDF (Term Frequency-Inverse Document Frequency) formula and multiplying it by the word's time decay factor to obtain a new weight, then selecting the top k words by descending weight as the user's interest words, which together with their weights form a k-dimensional user interest vector; (3) calculating interest similarity, i.e., setting a similarity threshold and selecting the users whose interest similarity with the target user exceeds the threshold as the target user's neighbor users; and (4) ranking the service search results, i.e., calculating a recommendation prediction value for each service from the similarity of the neighbor users and the number of times they selected the service, and arranging the search results in descending order of that value to obtain the personalized search result.
Description
Technical field
The present invention relates to Web search and recommendation in the technical field of computer software, and in particular to a personalized search method for Web service recommendation.
Background technology
To keep meeting the demands for flexibility, extensibility, correctness and robustness of software systems, software engineering practice has gradually developed methods that let software systems be built on existing software resources rather than developed entirely from scratch. These methods have accelerated the development of software systems and improved production efficiency. At the technical level, decomposing the functions realized by software into relatively simple, reusable functional modules also gives software engineering a better software management technique.
At present, the widely accepted software reuse technology is component-based software development (Component-Based Software Engineering, CBSE). Service-oriented computing (Service Oriented Computing, SOC) is a new component-based design paradigm; the infrastructure of SOC is the service-oriented architecture (Service Oriented Architecture, SOA); Web services together with SOA are one realization of SOC.
As an emerging, Internet-oriented distributed computing model, SOC provides better enabling technology for building loosely coupled, cross-organizational integrated applications. The service-oriented architecture provides a basic guarantee for using service resources through its "publish-find-bind" pattern. However, service users and service providers are separated, which makes it harder for users to understand, obtain and use the services they need. In particular, when a user's needs change as the application construction process evolves, how to let the user obtain suitable services is a problem that needs to be solved. For this problem, traditional service discovery techniques obtain the user's service needs mainly by having the user actively submit a query, or simply let the user browse the resource collection manually according to some classification scheme. As the resource collection keeps growing, manually looking up services becomes tedious, time-consuming and error-prone. At present, Web services can be searched in four ways: through a UDDI registry, through Web service websites (such as XMethods and RemoteMethods), through general-purpose search engines (such as Google and Yahoo), and through specialized search engines (such as seekda and Merobase). These search methods mainly support keyword retrieval, and the user does not take part in the retrieval process, so the results are unrelated to the user's interests and cannot change as those interests change.
Unlike conventional search techniques, personalized search technology can analyze the service pages in the search results and compare them with the user's interests, helping the user pick out the more interesting services and presenting them first in the result list, thereby improving the user's search efficiency. For example, Google personalized search lets users customize the look and feel they like (including the level of content filtering, language selection and query suggestion customization); Google's personalized Subscribed Links lets users create custom results in their own Google search engine and present service links to clients. A released personalized search service lets users search for information of interest according to their own behavior patterns and supports managing and sharing retrieval results. Users can annotate Web pages and classify and sort them according to their individual needs.
Personalized recommendation technology mines users' personalized preferences in depth and actively "pushes" information: it automatically provides information that meets a user's individual needs instead of requiring the user to find interesting content in the massive Web, thereby improving the efficiency with which users obtain useful information. In 1992 the first recommender system, Tapestry, was created; it applied collaborative filtering to e-mail and achieved good results. Since then, recommender systems have attracted growing attention because of their broad application value. In 1996, Yahoo introduced a recommender system into its portal by adding the personalized user entry MyYahoo, which offered personalized services to different users; in 1997, AT&T Labs proposed the collaborative-filtering-based personalized recommender systems Referral Web and PHOAKS; in 2001, IBM added a personalized recommender system to its e-commerce platform WebSphere so that merchants could develop personalized e-commerce websites. Similar products include GroupLens, Amazon and Netflix, with applications covering e-mail filtering, e-commerce websites, news websites, search engines, online DVD rental websites and some Web 2.0 social websites.
Personalized search makes heavy use of the basic principles of personalized recommendation, and personalized recommendation in turn borrows heavily from the basic techniques of personalized search. As the two most closely related and central technologies of personalized services, they can largely meet the differentiated information needs of different users and have broad application prospects.
As effective information retrieval tools, search engines help users obtain the content they need from massive Web resources quickly and efficiently, greatly improving the efficiency with which users obtain information. With the continuing growth of Web service resources and the further development of search engine technology, and driven by users' actual needs, personalized search has gradually become a research focus in the search field. The core of a personalized search method for Web services is to filter and rank the service retrieval results differently for each person according to the user's individual interests and preferences, so that different users receive differentiated results that meet their individual needs.
However, finding a reasonably objective and accurate search method for Web resources that pushes services precisely and meets the needs of different users remains difficult.
Summary of the invention
Object of the invention: the technical problem to be solved by the present invention is the inaccurate and time-consuming search of the prior art, for which a personalized search method for Web service recommendation is provided.
To solve the above technical problem, the invention discloses a personalized search method for Web service recommendation, comprising the following steps:
Step 1, preprocess the WSDL (Web Services Description Language) documents: obtain from the user's usage records the WSDL documents of the services the user has selected, and form a bag of words through two preprocessing steps, stop-word removal and stemming.
Step 2, extract user interest: calculate the weight of each word in the bag of words with the improved TF-IDF formula and multiply it by the time decay factor to obtain the new weight $\delta_{ij}$; select the top k words in descending order of $\delta_{ij}$ as the user's interest words, which together with their weights $\delta_{ij}$ form a k-dimensional user interest vector. Keeping only the k highest-weighted words reduces the dimension of the user interest vector space and makes the dimension the same for all users, which helps to calculate the interest similarity between every two users efficiently.
Step 3, calculate similarity: use the vector cosine formula to calculate the cosine of the angle between every two users' interest vectors as their similarity; set a similarity threshold, and select the users whose similarity with the target user exceeds the threshold as the target user's neighbor users. The similarity threshold is set within the range 0 to 1.
Step 4, rank the service retrieval results: the target user submits a service request, and the Web service search engine retrieves all services that match the request; from the number of times the neighbor users selected these services and the neighbors' similarity with the target user, the recommendation prediction value of each retrieved result is calculated with the weighted average prediction formula; the results are sorted in descending order of recommendation prediction value, yielding the personalized search results.
In the present invention, the improved TF-IDF (Term Frequency-Inverse Document Frequency) formula is as follows:
$$\omega_{ij} = \mathrm{tf}(t_{ij}) \cdot \mathrm{idf}^{2}(t_{ij}),$$
$$\mathrm{tf}(t_{ij}) = \frac{\mathrm{freq}(t_{ij}, D_i)}{|D_i|}, \qquad \mathrm{idf}(t_{ij}) = \log\frac{|D|}{|\{D_i : t_{ij} \in D_i\}|},$$
where $t_{ij}$ is the j-th word in the bag of words of the i-th user, $\mathrm{tf}(t_{ij})$ is the term frequency of word $t_{ij}$, $D_i$ is the bag of words of the i-th user, $\mathrm{freq}(t_{ij}, D_i)$ is the number of times $t_{ij}$ occurs in $D_i$, $|D_i|$ is the number of words in $D_i$, $\mathrm{idf}(t_{ij})$ is the inverse document frequency of $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users whose bag of words contains $t_{ij}$, and $\omega_{ij}$ is the weight of $t_{ij}$.
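The improved weighting can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `improved_tfidf` and the dict-based bag representation are hypothetical, $|D_i|$ is taken as the number of distinct words in a bag, $|D|$ as the number of user bags, and the logarithm as base 10. The last three choices are inferred from the worked "render" example in the embodiment (84 occurrences, 11 words, 68 of 200 users).

```python
import math
from typing import Dict, List

Bag = Dict[str, int]  # word -> number of occurrences in one user's bag


def improved_tfidf(bags: List[Bag]) -> List[Dict[str, float]]:
    """Return omega_ij = tf * idf**2 for every word of every user's bag."""
    n_bags = len(bags)  # |D|, assumed here to be the number of user bags
    # Count in how many bags each word occurs at least once.
    containing: Dict[str, int] = {}
    for bag in bags:
        for word in bag:
            containing[word] = containing.get(word, 0) + 1
    weights = []
    for bag in bags:
        size = len(bag)  # |D_i|, assumed to be the number of distinct words
        weights.append({
            # base-10 logarithm is an assumption taken from the worked example
            word: (count / size) * math.log10(n_bags / containing[word]) ** 2
            for word, count in bag.items()
        })
    return weights


# Synthetic corpus shaped like the embodiment: 200 bags, 68 contain "render".
first = {"render": 84, "hotel": 99, "reservation": 40, "invoice": 33,
         "room": 269, "city": 81, "client": 13, "book": 194,
         "ticket": 13, "basket": 42, "rate": 25}
corpus = [first] + [{"render": 1}] * 67 + [{"other": 1}] * 132
print(round(improved_tfidf(corpus)[0]["render"], 2))  # 1.68
```

Squaring the idf term makes words shared by many users lose weight faster than in plain TF-IDF; a word present in every bag gets weight 0.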
The time decay factor is calculated as follows:
$$Decay = 2 - e^{\alpha t},$$
where Decay is the time decay factor; e is the base of the natural logarithm, approximately 2.718; $\alpha$ is the decay rate, with value range [0, 0.1], and can for example be set to 0.1; and t is the time elapsed between the user's most recent service selection and the current time. When $\alpha = 0$, Decay = 1 and the weight does not decay over time; the larger $\alpha$ is, the faster the decay. The time decay factor is designed to reflect the fact that user interest decays over time. The new weight is the product of the original weight and the time decay factor; for a word that has not been selected for a long time, the weight gradually decays to 0.
The new weight $\delta_{ij}$ of word $t_{ij}$ in each user's bag of words is calculated as:
$$\delta_{ij} = \omega_{ij} \cdot Decay.$$
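A short Python sketch of the decay step follows. The function names are hypothetical, the unit of t is not specified in the text, and the clamp at 0 is an added assumption: the formula itself becomes negative once $\alpha t > \ln 2$, whereas the text describes the weight as decaying to 0.

```python
import math


def decay_factor(alpha: float, t: float) -> float:
    """Decay = 2 - e^(alpha * t), clamped at 0 (the clamp is an assumption)."""
    if not 0.0 <= alpha <= 0.1:
        raise ValueError("alpha is expected in the range [0, 0.1]")
    return max(0.0, 2.0 - math.exp(alpha * t))


def new_weight(omega: float, alpha: float, t: float) -> float:
    """delta_ij = omega_ij * Decay."""
    return omega * decay_factor(alpha, t)


print(round(decay_factor(0.05, 1), 2))      # 0.95
print(round(new_weight(1.68, 0.05, 1), 2))  # 1.59
```

With $\alpha = 0.05$ the unclamped factor reaches 0 at $t = \ln 2 / 0.05 \approx 13.9$ time units, which gives a feel for how quickly an unused interest word drops out.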
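In the present invention, the similarity is calculated as the cosine of the angle between the two users' interest vectors:
$$sim(u_a, u_b) = \frac{\sum_{j=1}^{k} \delta_{aj}\,\delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^{2}}\;\sqrt{\sum_{j=1}^{k} \delta_{bj}^{2}}},$$
where $u_a$ and $u_b$ are two different users, $sim(u_a, u_b)$ is the similarity between these two users, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the bags of words of user $u_a$ and user $u_b$ respectively, and k is the number of user interest words.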
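In the present invention, the recommendation prediction value of each retrieved result is calculated with a weighted average prediction formula. Written in the usual mean-centered form, which matches the quantities defined below, it is:
$$P(u_t, s_t) = \bar{r}_{u_t} + \frac{\sum_{u_i \in N} sim(u_t, u_i)\,\bigl(r_{u_i, s_t} - \bar{r}_{u_i}\bigr)}{\sum_{u_i \in N} \lvert sim(u_t, u_i) \rvert},$$
where $u_t$ is the target user; $s_t$ is the target service, i.e. the service whose recommendation prediction value is to be calculated; $P(u_t, s_t)$ is the recommendation prediction value of target user $u_t$ for target service $s_t$; $\bar{r}_{u_t}$ and $\bar{r}_{u_i}$ are the average numbers of service selections of target user $u_t$ and neighbor user $u_i$ respectively; $r_{u_i, s_t}$ is the number of times neighbor user $u_i$ selected target service $s_t$; $sim(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$; and N is the set of neighbors of target user $u_t$.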
In the present invention, stop-word removal works as follows. In information retrieval, stop words are words that occur too frequently to carry much meaning for retrieval. Stop-word processing is one step of tokenization in the knowledge extraction process; handling it separately speeds up document processing and improves its quality. Several English stop-word lists have been published, the better known being the list published by Van Rijsbergen and the Brown Corpus stop-word list. Well-known Chinese stop-word lists include the Harbin Institute of Technology stop-word dictionary, the Baidu stop-word list and the stop-word list of the Machine Intelligence Laboratory of Sichuan University. The stop-word list used here contains not only general stop words such as a, by, is and at, but also words that occur frequently in the Web service domain, such as service, soap, response, request, set and get; these words do little to distinguish Web services and easily introduce noise. Words contained in this list are removed from the WSDL documents. A WSDL document has seven important elements: types, import, message, portType, operation, binding and service, all nested in the definitions root element. WSDL4J (Web Services Description Language for Java Toolkit) is used to parse the WSDL documents of the services a user has selected; stop words are removed from the parsed content and stems are extracted, forming the user's bag of words.
In the present invention, a stem is the part of a word that remains after all inflectional affixes are removed, and stemming is the process of removing affixes to obtain the root. The invention uses the Porter stemming algorithm, devised by Dr. Martin Porter in 1979 at the Computer Laboratory of the University of Cambridge, UK, to stem the words in the WSDL documents, so that interest words are extracted more accurately and without duplicates.
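The two preprocessing steps can be illustrated with the Python sketch below. It is deliberately simplified and rests on several assumptions: the input is a list of element names already extracted from a WSDL document (the text uses WSDL4J for parsing, which is not reproduced here), the stop list is a tiny sample rather than Van Rijsbergen's list, and `crude_stem` is a naive suffix stripper standing in for the Porter algorithm, not an implementation of it. All function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop list: a few general words plus the Web-service
# terms named in the text. A real list would be much longer.
STOP_WORDS = {"a", "by", "is", "at", "service", "soap",
              "response", "request", "set", "get"}


def split_identifier(name):
    """Split names such as 'getHotelBookings' into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)
    return [part.lower() for part in parts]


def crude_stem(word):
    """Naive suffix stripping; a stand-in for the Porter algorithm."""
    for suffix in ("ings", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_bag(identifiers):
    """Remove stop words, stem the rest, and count occurrences."""
    bag = Counter()
    for name in identifiers:
        for token in split_identifier(name):
            if token not in STOP_WORDS:
                bag[crude_stem(token)] += 1
    return bag


print(build_bag(["getHotelBookings", "HotelBookingService", "bookRoom"]))
```

Stemming merges "Bookings", "Booking" and "book" into one entry, which is the duplicate-free extraction of interest words that the text aims for.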
Compared with existing personalized search methods, this method has three features. First, it not only extracts each user's interest implicitly but also obtains the relationships between different users' interests by calculating interest similarity, and uses collaborative filtering to rank the service search results by interest, which improves the accuracy and relevance of the results to some extent. Second, it adds a time decay factor when forming interests, which reflects more accurately how user interest evolves over time. Third, steps 1, 2 and 3 of the method can all be completed offline, so the impact on retrieval efficiency is small.
The present invention uses the basic principles of personalized recommendation and applies collaborative filtering to the personalized search of Web services, improving user satisfaction and retrieval precision. Specifically, the invention collects users' search records, extracts user interest from the description documents of the Web services they selected, and forms interest vectors; it measures the similarity of user interests by the cosine of the angle between interest vectors, and selects the users whose similarity with the target user exceeds a certain threshold as that user's neighbors. When the target user submits a service search request, the service recommendation system uses one of the search techniques described above to retrieve the services that match the keywords, but does not return the results to the user directly; instead, it calculates the recommendation prediction values of the results from the neighbors' selection experience and interest similarity, sorts the results in descending order, and then returns them to the user. In this way the user takes part in customizing the search results transparently, and service recommendation is used to accomplish personalized service search.
Beneficial effects: user interest is extracted transparently, without frequently querying the user or requiring explicit feedback, so the method can be accepted and used by more users. User interest is time-dependent: the weight of an interest that has not been selected again for a long time decays gradually until it drops out of the user interest vector, while interests in services that were selected recently and frequently are added to the vector in time, so changes in user interest are expressed and tracked more accurately. Collaborative filtering is used to predict recommendations for and rank the search results, so even a target user with no relevant experience of the required service can obtain personalized recommendations from the experience of other similar users. The invention can be widely applied to the personalization of Web service search and supports service recommendation; it belongs to the technical field of computer software.
Brief description of the drawings
The present invention is further described below with reference to the drawing and specific embodiments, from which the above and other advantages of the invention will become clearer.
Fig. 1 is a flow chart of the personalized search method for Web service recommendation of the present invention.
Detailed description of the embodiments
As shown in Fig. 1, the invention discloses a personalized search method for Web service recommendation, comprising the following steps:
Step 1, preprocess the WSDL documents: obtain from the user's usage records the WSDL documents of the services the user has selected, and form a bag of words through two preprocessing steps, stop-word removal and stemming.
Step 2, extract user interest: calculate the weight of each word in the bag of words with the improved TF-IDF formula and multiply it by the word's time decay factor to obtain the new weight; select the top k words in descending order of weight as the user's interest words, which together with their weights form a k-dimensional user interest vector.
Step 3, calculate similarity: use the vector cosine formula to calculate the cosine of the angle between every two users' interest vectors as their similarity; set a similarity threshold, and select the users whose similarity with the target user exceeds the threshold as the target user's neighbor users.
Step 4, rank the service retrieval results: the target user submits a service request, and the Web service search engine retrieves all services that match the request; from the number of times the neighbor users selected these services and the neighbors' similarity with the target user, the recommendation prediction value of each retrieved result is calculated with the weighted average prediction formula; the results are sorted in descending order of recommendation prediction value, yielding the personalized search results.
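The improved TF-IDF formula is as follows:
$$\omega_{ij} = \mathrm{tf}(t_{ij}) \cdot \mathrm{idf}^{2}(t_{ij}),$$
$$\mathrm{tf}(t_{ij}) = \frac{\mathrm{freq}(t_{ij}, D_i)}{|D_i|}, \qquad \mathrm{idf}(t_{ij}) = \log\frac{|D|}{|\{D_i : t_{ij} \in D_i\}|},$$
where $t_{ij}$ is the j-th word in the bag of words of the i-th user, $\mathrm{tf}(t_{ij})$ is the term frequency of word $t_{ij}$, $D_i$ is the bag of words of the i-th user, $\mathrm{freq}(t_{ij}, D_i)$ is the number of times $t_{ij}$ occurs in $D_i$, $|D_i|$ is the number of words in $D_i$, $\mathrm{idf}(t_{ij})$ is the inverse document frequency of $t_{ij}$, $|D|$ is the number of WSDL documents in the corpus, $|\{D_i : t_{ij} \in D_i\}|$ is the number of users whose bag of words contains $t_{ij}$, and $\omega_{ij}$ is the weight of $t_{ij}$.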
The time decay factor is calculated as follows:
$$Decay = 2 - e^{\alpha t},$$
where Decay is the time decay factor, e is the base of the natural logarithm, $\alpha$ is the decay rate with value range [0, 0.1], and t is the time elapsed between the user's most recent service selection and the current time. When $\alpha = 0$, Decay = 1 and the weight does not decay over time; the larger $\alpha$ is, the faster the decay.
The new weight of word $t_{ij}$ in each user's bag of words is calculated as:
$$\delta_{ij} = \omega_{ij} \cdot Decay.$$
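In the present invention, the similarity is calculated as the cosine of the angle between the two users' interest vectors:
$$sim(u_a, u_b) = \frac{\sum_{j=1}^{k} \delta_{aj}\,\delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^{2}}\;\sqrt{\sum_{j=1}^{k} \delta_{bj}^{2}}},$$
where $u_a$ and $u_b$ are two different users, $sim(u_a, u_b)$ is the similarity between these two users, $\delta_{aj}$ and $\delta_{bj}$ are the weights of the j-th word in the bags of words of user $u_a$ and user $u_b$ respectively, and k is the number of user interest words.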
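In the present invention, the recommendation prediction value of each retrieved result is calculated with a weighted average prediction formula. Written in the usual mean-centered form, which matches the quantities defined below, it is:
$$P(u_t, s_t) = \bar{r}_{u_t} + \frac{\sum_{u_i \in N} sim(u_t, u_i)\,\bigl(r_{u_i, s_t} - \bar{r}_{u_i}\bigr)}{\sum_{u_i \in N} \lvert sim(u_t, u_i) \rvert},$$
where $u_t$ is the target user; $s_t$ is the target service, i.e. the service whose recommendation prediction value is to be calculated; $P(u_t, s_t)$ is the recommendation prediction value of target user $u_t$ for target service $s_t$; $\bar{r}_{u_t}$ and $\bar{r}_{u_i}$ are the average numbers of service selections of target user $u_t$ and neighbor user $u_i$ respectively; $r_{u_i, s_t}$ is the number of times neighbor user $u_i$ selected target service $s_t$; $sim(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$; and N is the set of neighbors of target user $u_t$.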
Embodiment
The data of this embodiment come from the background database of the Web Service Supermarket (http://125.221.225.2:8080/WSSM/).
This embodiment comprises the following four steps:
(1) Preprocessing the WSDL documents
The usage records of 200 users are extracted from the background database of the Web Service Supermarket to obtain the raw data. The usage records of some of the users are as follows:
Table 1. User usage records (excerpt)
Table 1 lists four users, named "taolaoliu", "fangfang", "zww" and "skh", each of whom has selected several Web services. The description documents of the Web services each user selected are downloaded and preprocessed: stop words are removed according to the stop-word list published by Van Rijsbergen, and stems are extracted with Dr. Martin Porter's Porter stemming algorithm, forming the bag of words. For example, "taolaoliu" selected three Web services named "BookingService", "JasonsBooking" and "HotelBookingEngine"; the WSDL documents of the three services are downloaded from the corresponding service websites, and after stop-word removal and stemming the resulting bag of words is "render(84), hotel(99), reservation(40), invoice(33), room(269), city(81), client(13), book(194), ticket(13), basket(42), rate(25)". The bag contains 11 words in total, and the number in parentheses after each word is the number of times the word occurs in the documents.
(2) Extracting user interest
The bags of words of all users make up the corpus, and the weight of each word in each bag is calculated with the improved TF-IDF formula; the weight of each word in a user's bag is then multiplied by the time decay factor to obtain the new weight. The k highest-weighted words and their weights form the user interest vector. For example, the word "render" occurs 84 times in the bag of words of "taolaoliu", which contains 11 words, and it appears in the bags of 68 of the 200 users, so $\mathrm{tf} = 84/11 \approx 7.64$ and $\mathrm{idf} = \log_{10}(200/68) \approx 0.47$. According to the improved TF-IDF formula of step 2, the weight of "render" is (computed from the unrounded intermediate values)
$$\omega_{\text{render}} = 7.64 \times 0.47^{2} = 1.68.$$
Next the time decay factor is calculated. With $\alpha = 0.05$ and t at its initial value 1:
$$Decay = 2 - e^{0.05 \times 1} = 0.95.$$
The new weight is then
$$\delta_{\text{render}} = 1.68 \times 0.95 = 1.59.$$
Similarly, the weights of the remaining words in the bag are calculated and the 6 words with the largest weights are taken (i.e. k = 6). The resulting interest vectors are:
user "taolaoliu": <(basket, 6.42), (hotel, 4.03), (room, 3.15), (book, 2.82), (render, 1.59), (information, 1.24)>
user "fangfang": <(book, 3.31), (price, 3.26), (title, 3.23), (author, 3.17), (ISBN, 2.15), (infomation, 1.13)>
user "zww": <(weather, 4.42), (city, 3.33), (forecast, 3.29), (replication, 2.12), (add, 1.12), (id, 1.11)>
user "skh": <(weather, 3.39), (comment, 3.31), (forecast, 2.27), (city, 2.22), (replication, 1.20), (add, 1.10)>
The number after each interest word is the new weight of that word.
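Selecting the k highest-weighted words can be sketched as below. The function name `interest_vector` is hypothetical, and the input reuses the six published weights of "taolaoliu" with k = 3 purely to show the truncation; the embodiment itself uses k = 6 on the full bag.

```python
import heapq


def interest_vector(new_weights, k):
    """Return the k (word, weight) pairs with the largest weights, descending."""
    return heapq.nlargest(k, new_weights.items(), key=lambda item: item[1])


taolaoliu = {"basket": 6.42, "hotel": 4.03, "room": 3.15,
             "book": 2.82, "render": 1.59, "information": 1.24}
print(interest_vector(taolaoliu, 3))
# [('basket', 6.42), ('hotel', 4.03), ('room', 3.15)]
```

Fixing k for all users keeps every interest vector the same length, which is what makes the pairwise similarity computation in the next step cheap.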
(3) Calculating interest similarity
The cosine of the angle between every two users' interest vectors is calculated with the vector cosine formula and used as their similarity; a similarity threshold is set, and users whose similarity with the target user exceeds the threshold are selected as the target user's neighbors. For example, the similarity between user "taolaoliu" and user "fangfang" is calculated with the similarity formula.
With the similarity threshold set to 0.15, "taolaoliu" and "fangfang" are neighbor users of each other.
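The neighbor selection in this step can be reproduced with the sketch below, which applies cosine similarity to the four published interest vectors. It assumes that words are matched by exact string, so the source's "infomation" is kept as printed and does not match "information"; the function names are hypothetical, and the exact similarity value from the original document is not available, so only the threshold comparison is shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word -> weight vectors."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def neighbors(target, vectors, threshold):
    """Users whose similarity with the target exceeds the threshold."""
    result = {}
    for user, vector in vectors.items():
        if user == target:
            continue
        similarity = cosine_similarity(vectors[target], vector)
        if similarity > threshold:
            result[user] = similarity
    return result


vectors = {
    "taolaoliu": {"basket": 6.42, "hotel": 4.03, "room": 3.15,
                  "book": 2.82, "render": 1.59, "information": 1.24},
    "fangfang": {"book": 3.31, "price": 3.26, "title": 3.23,
                 "author": 3.17, "ISBN": 2.15, "infomation": 1.13},
    "zww": {"weather": 4.42, "city": 3.33, "forecast": 3.29,
            "replication": 2.12, "add": 1.12, "id": 1.11},
    "skh": {"weather": 3.39, "comment": 3.31, "forecast": 2.27,
            "city": 2.22, "replication": 1.20, "add": 1.10},
}
print(sorted(neighbors("taolaoliu", vectors, 0.15)))  # ['fangfang']
```

Under these assumptions "taolaoliu" and "fangfang" share only the word "book", which is already enough to pass the 0.15 threshold; treating the two spellings of "information" as the same word would raise the similarity further.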
(4) Ranking the service retrieval results
The target user submits a service request, and the Web Service Supermarket retrieves all services that match it. From the neighbors' service selection experience and their similarity with the target user, the recommendation prediction value of each retrieved result is calculated with the weighted average prediction formula. For example, for a service request submitted by target user "taolaoliu", the retrieval results include a Web service named "BookStoreService". Suppose this service has been selected only by "fangfang", 3 times, that the average number of service selections of user "taolaoliu" is 2, and that of user "fangfang" is 1.5; the recommendation prediction value of the service is then calculated from these values with the weighted average prediction formula.
The retrieval results are sorted in descending order of recommendation prediction value, so the user can quickly find personalized results that match his or her interests on the first page of results.
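The ranking step can be sketched as follows. The prediction formula itself did not survive in this text, so the code assumes the common mean-centered weighted-average form (target user's mean plus the similarity-weighted deviation of each neighbor from that neighbor's mean); this is an assumption, not a confirmed reading of the patent. The function names, the similarity value 0.2 and the service "OtherService" are hypothetical placeholders.

```python
def predict(target_mean, neighbor_data):
    """Mean-centered weighted average (assumed form of the predictor).

    neighbor_data: iterable of (similarity, times_selected, neighbor_mean).
    """
    neighbor_data = list(neighbor_data)
    numerator = sum(sim * (count - mean) for sim, count, mean in neighbor_data)
    denominator = sum(abs(sim) for sim, _, _ in neighbor_data)
    if denominator == 0:
        return target_mean  # no neighbor evidence: fall back to the mean
    return target_mean + numerator / denominator


def rank(scores):
    """Service names sorted by descending recommendation prediction value."""
    return sorted(scores, key=scores.get, reverse=True)


# Embodiment numbers: target mean 2, one neighbor with mean 1.5 who selected
# the service 3 times. The similarity 0.2 is a placeholder; with a single
# neighbor it cancels out of the result.
scores = {
    "BookStoreService": predict(2.0, [(0.2, 3, 1.5)]),
    "OtherService": predict(2.0, []),  # hypothetical, no neighbor selected it
}
print(rank(scores))  # ['BookStoreService', 'OtherService']
```

Under the assumed formula the example yields a value of about 3.5 for "BookStoreService", so a service picked by a similar user moves ahead of one with no neighbor evidence.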
Implementation results:
User "zww", as the current target user, wishes to obtain an online book shopping service. After the service request keyword "book" is submitted to the seekda search system (http://webservices.seekda.com/, prior art) and to the Web Service Supermarket, the top 10 search results obtained are shown in Table 2 and Table 3 respectively.
Table 2. Top 10 services in the seekda search results
In Table 2, only the services numbered 2, 4 and 5 provide online book shopping; after receiving the results, user "zww" still has to look manually for a service that meets the need, a process that is often time-consuming, tedious and error-prone.
Table 3. Top 10 services in the Web Service Supermarket search results
In Table 3, all services except number 9 are related to online book shopping. Thus personalized search, which uses collaborative filtering to calculate service recommendation prediction values, can improve the accuracy of service search and the user's search efficiency, and increase user satisfaction with the Web service search engine.
The invention provides a personalized search method for Web service recommendation. There are many ways to implement this technical solution, and the above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art can make improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also fall within the scope of protection of the invention. Any component not specified in this embodiment can be implemented with existing technology.
Claims (1)
1. for an individuation search method for Web service recommendation, it is characterized in that, comprise the following steps:
Step 1, pre-service Web Services Description Language (WSDL) WSDL document: use record the WSDL document obtaining it and selected from user, by removing stop words and extracting stem two pre-treatment step, forms word bag;
Step 2, extracts user interest: the weight calculating each word in word bag, and is multiplied by time decay factor, obtain new weight δ
ij; Select new weight δ
ija front k word is as the interest word of user from large to small, and the respective weights δ of each word
ij, the user interest vector of composition k dimension;
Step 3, calculate interest similarity: calculate the cosine distance between every two user interest vectors as their interest similarity; set a similarity threshold, and select the users whose similarity exceeds the threshold as neighbor users of the target user;
Step 4, rank the service retrieval results: the target user submits a service request, and the Web service search engine retrieves all services that satisfy the request; according to the number of times the neighbor users selected these services and their similarity to the target user, calculate the recommendation prediction value of each retrieval result using the weighted mean prediction formula; sort the retrieval results in descending order of recommendation prediction value to obtain the personalized search results;
In step 2, calculating the weight of each word in the bag of words and multiplying it by the word's time decay factor to obtain the new weight $\delta_{ij}$ comprises the following steps:
Calculate the weight $\omega_{ij}$ using the improved TF-IDF formula:
$\omega_{ij} = tf(t_{ij}) \cdot idf^{2}(t_{ij})$,
$tf(t_{ij}) = \dfrac{freq(t_{ij}, D_i)}{|D_i|}$,
$idf(t_{ij}) = \log \dfrac{|D|}{|\{D_i : t_{ij} \in D_i\}|}$,
Wherein $t_{ij}$ is the $j$-th word in the bag of words of the $i$-th user; $tf(t_{ij})$ is the term frequency of word $t_{ij}$; $D_i$ is the bag of words of the $i$-th user; $freq(t_{ij}, D_i)$ is the number of times word $t_{ij}$ appears in bag of words $D_i$; $|D_i|$ is the number of words in $D_i$; $idf(t_{ij})$ is the inverse document frequency of word $t_{ij}$; $|D|$ is the number of WSDL documents in the corpus; $|\{D_i : t_{ij} \in D_i\}|$ is the number of users whose bag of words contains word $t_{ij}$; and $\omega_{ij}$ is the weight of word $t_{ij}$;
The time decay factor is calculated as follows:
$Decay = 2 - e^{\alpha \cdot t}$,
Wherein $Decay$ is the time decay factor; $e$ is the base of the natural logarithm; $\alpha$ is the decay rate, with value range [0, 0.1]; when $\alpha = 0$, $Decay = 1$, meaning the weights do not decay over time, and the larger $\alpha$ is, the faster the decay; $t$ is the difference between the current time and the time the user last selected a service;
The new weight $\delta_{ij}$ of word $t_{ij}$ in each user's bag of words is calculated as:
$\delta_{ij} = \omega_{ij} \cdot Decay$;
In step 4, the weighted mean prediction formula used to calculate the recommendation prediction value of each retrieval result is as follows:
$P(u_t, s_t) = \bar{r}_{u_t} + \dfrac{\sum_{u_i \in N} sim(u_t, u_i) \cdot (r_{u_i, s_t} - \bar{r}_{u_i})}{\sum_{u_i \in N} sim(u_t, u_i)}$,
Wherein $u_t$ is the target user; $s_t$ is the target service, i.e. the service whose recommendation prediction value is to be calculated; $P(u_t, s_t)$ is the recommendation prediction value of target user $u_t$ for target service $s_t$; $\bar{r}_{u_t}$ and $\bar{r}_{u_i}$ are the average numbers of service selections of target user $u_t$ and neighbor user $u_i$, respectively; $r_{u_i, s_t}$ is the number of times neighbor user $u_i$ selected target service $s_t$; $sim(u_t, u_i)$ is the interest similarity between target user $u_t$ and neighbor user $u_i$; and $N$ is the set of neighbors of target user $u_t$;
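In step 3, the user interest similarity is calculated as follows:
$sim(u_a, u_b) = \dfrac{\sum_{j=1}^{k} \delta_{aj} \cdot \delta_{bj}}{\sqrt{\sum_{j=1}^{k} \delta_{aj}^{2}} \cdot \sqrt{\sum_{j=1}^{k} \delta_{bj}^{2}}}$,
Wherein $u_a$ and $u_b$ are two different users; $sim(u_a, u_b)$ is the similarity between these two users; $\delta_{aj}$ and $\delta_{bj}$ are the weights of the $j$-th word in the bags of words of user $u_a$ and user $u_b$, respectively; and $k$ is the number of user interest words.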
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210253884.2A CN102819575B (en) | 2012-07-20 | 2012-07-20 | Personalized search method for Web service recommendation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210253884.2A CN102819575B (en) | 2012-07-20 | 2012-07-20 | Personalized search method for Web service recommendation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102819575A CN102819575A (en) | 2012-12-12 |
CN102819575B true CN102819575B (en) | 2015-06-17 |
Family
ID=47303686
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210253884.2A Active CN102819575B (en) | 2012-07-20 | 2012-07-20 | Personalized search method for Web service recommendation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102819575B (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104102648B (en) * | 2013-04-07 | 2017-12-01 | 腾讯科技(深圳)有限公司 | Interest based on user behavior data recommends method and device |
CN104111959B (en) * | 2013-04-22 | 2017-06-20 | 浙江大学 | Service recommendation method based on social networks |
CN103324690A (en) * | 2013-06-03 | 2013-09-25 | 焦点科技股份有限公司 | Mixed recommendation method based on factorization condition limitation Boltzmann machine |
CN103473291B (en) * | 2013-09-02 | 2017-01-18 | 中国科学院软件研究所 | Personalized service recommendation system and method based on latent semantic probability models |
CN103678652B (en) * | 2013-12-23 | 2017-02-01 | 山东大学 | Information individualized recommendation method based on Web log data |
US9953060B2 (en) | 2014-03-31 | 2018-04-24 | Maruthi Siva P Cherukuri | Personalized activity data gathering based on multi-variable user input and multi-dimensional schema |
CN104318268B (en) * | 2014-11-11 | 2017-09-08 | 苏州晨川通信科技有限公司 | A kind of many trading account recognition methods based on local distance metric learning |
CN104899266B (en) * | 2015-05-22 | 2017-10-24 | 广东欧珀移动通信有限公司 | Method and device is recommended in one kind application |
CN105205139B (en) * | 2015-09-17 | 2019-06-14 | 罗旭斌 | A kind of personalization document retrieval method |
CN106055594A (en) * | 2016-05-23 | 2016-10-26 | 成都陌云科技有限公司 | Information providing method based on user interests |
CN106126669B (en) * | 2016-06-28 | 2019-07-16 | 北京邮电大学 | User collaborative filtering content recommendation method and device based on label |
US10147335B2 (en) * | 2016-07-15 | 2018-12-04 | Lakshmi Arthi Krishnaswami | Education data platform to support a holistic model of a learner |
CN106708920A (en) * | 2016-10-09 | 2017-05-24 | 南京双运生物技术有限公司 | Screening method for personalized scientific research literature |
CN107463683B (en) * | 2017-08-09 | 2018-07-24 | 深圳壹账通智能科技有限公司 | The naming method and terminal device of code element |
CN108268584A (en) * | 2017-08-25 | 2018-07-10 | 广州市动景计算机科技有限公司 | Message push method, device and server |
CN107562919B (en) * | 2017-09-13 | 2020-07-17 | 云南大学 | Multi-index integrated software component retrieval method and system based on information retrieval |
CN109978642A (en) * | 2017-12-27 | 2019-07-05 | 中移(杭州)信息技术有限公司 | A kind of information recommendation method, device and communication equipment |
CN109408713B (en) * | 2018-10-09 | 2020-12-04 | 哈尔滨工程大学 | Software demand retrieval system based on user feedback information |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101685456A (en) * | 2008-09-26 | 2010-03-31 | 华为技术有限公司 | Search method, system and device |
CN101996200A (en) * | 2009-08-19 | 2011-03-30 | 华为技术有限公司 | Method and device for searching file |
CN102156733A (en) * | 2011-03-25 | 2011-08-17 | 清华大学 | Search engine and method based on service oriented architecture |
Also Published As
Publication number | Publication date |
---|---|
CN102819575A (en) | 2012-12-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102819575B (en) | Personalized search method for Web service recommendation | |
US8200617B2 (en) | Automatic mapping of a location identifier pattern of an object to a semantic type using object metadata | |
Zhong et al. | Time-aware service recommendation for mashup creation in an evolving service ecosystem | |
CN104866554B (en) | Personalized search method and system based on social tagging | |
Fang et al. | Towards automatic tagging for web services | |
Gao et al. | SeCo-LDA: Mining service co-occurrence topics for recommendation | |
CN102156747B (en) | Method and device for forecasting collaborative filtering mark by introduction of social tag | |
Gao et al. | SeCo-LDA: Mining service co-occurrence topics for composition recommendation | |
KR100954842B1 (en) | Method and System of classifying web page using category tag information and Recording medium using by the same | |
CN105468649A (en) | Method and apparatus for determining matching of to-be-displayed object | |
Li et al. | CoWS: An Internet-enriched and quality-aware Web services search engine | |
Zhang et al. | MMOY: Towards deriving a metallic materials ontology from Yago | |
CN105677825A (en) | Analysis method for client browsing operation | |
Rawat et al. | Topic modelling of legal documents using NLP and bidirectional encoder representations from transformers | |
Bin et al. | A neural multi-context modeling framework for personalized attraction recommendation | |
Wang et al. | Towards services discovery based on service goal extraction and recommendation | |
Yochum et al. | Tourist attraction recommendation based on knowledge graph | |
Zhuo | Consumer demand behavior mining and product recommendation based on online product review mining and fuzzy sets | |
Ma et al. | Api prober–a tool for analyzing web api features and clustering web apis | |
Du et al. | Scientific users' interest detection and collaborators recommendation | |
Vo | An integrated topic modeling and auto-encoder for semantic-rich network embedding and news recommendation | |
Hu et al. | A personalised search approach for web service recommendation | |
KR101277300B1 (en) | Method and apparatus for presenting personalized advertisements | |
Ibrahim et al. | A Scientometric Approach for Personalizing Research Paper Retrieval. | |
Layfield et al. | Experiments with document retrieval from small text collections using latent semantic analysis or term similarity with query coordination and automatic relevance feedback |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C41 | Transfer of patent application or patent right or utility model | ||
TR01 | Transfer of patent right |
Effective date of registration: 20160310; Address after: 222000 Jinqiao Road 19, Lianyungang economic and Technological Development Zone, Jiangsu, Lianyungang; Patentee after: Ten Party health management (Jiangsu) Limited; Patentee after: JIANGSU HUAKANG INFORMATION TECHNOLOGY CO., LTD.; Address before: Qixia Xianlin Avenue District of Nanjing City, Jiangsu Province, Nanjing University No. 163 210093; Patentee before: Nanjing University |