CN101719145A - Individuation searching method based on book domain ontology - Google Patents

Publication number
CN101719145A
Authority
CN
China
Prior art keywords
user
interest
individuation
algorithm
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200910238155A
Other languages
Chinese (zh)
Other versions
CN101719145B (en)
Inventor
张铭
孙韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN2009102381558A
Publication of CN101719145A
Application granted
Publication of CN101719145B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a personalized search method based on a book domain ontology and belongs to the field of personalized web search services. The method comprises the following steps: building the domain ontology, introducing the idea of collaborative filtering, and adding a semantic relation that represents the influence among users; analyzing and processing logs to build a user model based on user interests and preferences; computing a personalized score by Spreading Activation (SA) based on the user model and the domain ontology; and re-ranking the results returned by the original search engine in descending order of personalized score before returning them to the user. By introducing collaborative filtering into the domain ontology, building a user model that promptly reflects changes in user interest, and analyzing user needs accurately with SA, the invention effectively eliminates keyword ambiguity and substantially improves user satisfaction with the search results.

Description

Personalized search method based on book domain ontology
Technical field
The present invention relates to personalized services for libraries, and in particular to a method of providing personalized search for a library. It belongs to the field of computer application technology and information management.
Background art
In the information age, as the volume of information grows explosively, "information overload" has become an increasingly serious problem. General-purpose search engines such as Google and Baidu can return thousands of results, which makes it hard for users to sift out what they need. Users also often submit ambiguous keywords. For example, user A, who likes thrillers, uses the keyword "Da Vinci" to search for Dan Brown's novel "The Da Vinci Code", while user B, who is keen on Renaissance art, uses the same keyword "Da Vinci" to search for Leonardo da Vinci's paintings. The two clearly have different information needs, yet Google and Baidu return the same results to both. Digital libraries face the same problem: because the number of digital documents keeps growing and keywords are ambiguous, users have to spend more and more time picking out what they actually need from the returned results. Personalized search addresses "information overload" by analyzing the user's historical records, building a user model, and returning results that match the user's real needs more accurately.
At present, researchers in China and abroad have carried out extensive and in-depth work on personalized search. The main approaches include: methods based on the personalized PageRank algorithm (T. H. Haveliwala, Topic-sensitive PageRank, Proceedings of the 11th International Conference on World Wide Web, New York, USA, 2002), which analyze user interest from the user's browsing history and then bias the random walk (Random Walk) toward documents of particular categories; and methods based on clustering (Ferragina, Gulli, A Personalized Search Engine Based On Web Snippet Hierarchical Clustering, Software Practice and Experience, Volume 38, 2008), which cluster the documents and then highlight the categories the user is interested in. Although these methods achieve a degree of personalization based on user interest, they cannot effectively eliminate keyword ambiguity, and they do not consider the semantic knowledge of the searched domain, so the semantic relations between documents are lost.
Yolanda Blanco-Fernández et al. proposed a method that uses semantic reasoning to provide personalized services (Yolanda Blanco-Fernández, José J. Pazos Arias, et al., Semantic Reasoning: A Path to New Possibilities of Personalization, the 5th Annual European Semantic Web Conference, Tenerife, Spain, 2008). The method first builds a domain ontology from domain knowledge, then uses the ρ-path, ρ-join, and ρ-cp semantic reasoning techniques to obtain latent semantic relations between instances (Instance). After the ontology has been extended by reasoning, the similarity between each instance and the user's preferences is computed on the domain ontology. Although this method can use the domain ontology to eliminate keyword ambiguity effectively, it has the following shortcomings:
1. It does not consider the mutual influence among users and computes similarity only from the perspective of instances;
2. It does not consider that user interest changes over time, so it cannot track the user's latest needs.
Summary of the invention
To overcome the deficiencies of the prior art, the invention provides a personalized search method and system based on a book domain ontology. The method first builds a book domain ontology and, taking the influence among users into account, adds a new semantic relation. It then builds a user model (User Profile) in which interests are classified and weighted in chronological order. Finally, it uses the graph mining algorithm SA to compute a personalized score and re-ranks the results returned by the search engine accordingly, thereby achieving personalized search. By using the domain ontology, the method effectively eliminates keyword ambiguity and promptly reflects changes in user interest, so that user needs are analyzed accurately and user satisfaction with the search results improves significantly. The technical solution adopted by the invention to solve the technical problem is as follows:
A personalized search method based on a book domain ontology comprises an offline part, in which the user model and the domain ontology are built, and an online part, in which the personalized score is computed and the search results are re-ranked. The specific steps are as follows:
Step 1, build the domain ontology: taking the classified thesaurus of a specific domain as the description object, and incorporating the idea of collaborative filtering according to the users' history, build a domain ontology that provides the concept definitions of the domain and the semantic relations between the concepts.
The domain ontology provides rich semantic information and strengthens the semantic relations between entities. It thereby overcomes polysemy, synonymy, and word dependence in the original search results, serves to disambiguate, and can reflect how users' interests influence one another. This domain ontology serves as the propagation network of the search algorithm of the invention.
Step 2, build the user model: analyze and process the logs and the users' historical records. Because a reader's interest changes continuously over time, user interest is classified and weighted in chronological order. According to the borrowing period, interest is divided into three classes, namely instant interest, recent interest, and long-term interest, and the three classes are given weights from high to low, thereby building a user model based on user interest preferences.
The user model promptly reflects the update and drift of user interest.
Step 3, compute the personalized score: based on the domain ontology and the user model already built, compute the personalized score with the graph mining algorithm SA. The specific calculation steps are as follows:
First, the domain ontology is treated as a graph and the SA algorithm is run on it, with the books that have been given weights in the user model as the initial nodes (Initial Nodes) of the spreading; then, an iteration limit, a propagation path limit, and a propagation endpoint limit are set for the SA algorithm to improve its efficiency; finally, the activation value of every node is updated iteratively with the score update formula until the algorithm terminates.
After each iteration of the graph mining algorithm SA, user feedback is collected and the activation values of the initial nodes for the next propagation are updated. The collected feedback comprises the books the user has newly clicked and browsed and the books the user has newly borrowed; both sets of books are treated as instant interest and given weights.
Step 4, re-rank the search results: according to the personalized scores obtained by SA, the results returned by the original search engine are re-ranked in descending order and then returned to the user.
A personalized search system based on a book domain ontology comprises:
A domain ontology module, in which the system builds a domain ontology by taking the ontology of a specific domain as the description object, thereby providing the concept definitions of the domain and the semantic relations between the concepts;
A user model module, which analyzes and processes the logs and the users' historical records; because user interest changes continuously over time, it classifies and weights user interest in chronological order, thereby building a user model based on user interest preferences;
A personalized score calculation module, which computes the score with the graph mining algorithm SA based on the domain ontology and the user model already built; and
A search result re-ranking module, which re-ranks the results returned by the original search engine in descending order of the personalized scores obtained by SA and then returns them to the user.
The graph mining algorithm SA in the personalized score calculation module comprises the following units:
Initial value determination unit: treats the domain ontology as a graph and takes the books that have been given weights in the user model as the initial nodes of the spreading;
SA loop setting unit: sets the iteration limit, the propagation path limit, and the propagation endpoint limit of the SA algorithm to improve its efficiency; and
Iteration unit: updates the activation value of every node iteratively with the score update formula until the algorithm terminates.
Beneficial effects of the invention:
1. The method of the invention combines the idea of collaborative filtering with the SA algorithm. When the domain ontology is built, a new semantic relation, borrowIntent, is introduced, which reflects the similarity between two entities from the perspective of user interest. This greatly enriches and improves the expressive power of the domain ontology while ensuring the information completeness of the SA propagation network.
2. User interest is analyzed more accurately and in finer detail to build the user model. When the user model is built, the user's interests in different periods are distinguished and given different weights, so that knowledge of user interest is represented objectively and comprehensively, changes in user interest are reflected and tracked, and the initial node weights of the SA algorithm are reasonable.
The method of the invention was evaluated on real log data from Peking University Library (http://www.lib.pku.edu.cn). The experimental data show that re-ranking results with the personalized search of the invention effectively eliminates keyword ambiguity in the returned results and significantly raises the rank of the books the user is interested in, which saves browsing time and improves user satisfaction. The method is also not limited to the library domain; it can be extended to other domains and has high experimental value.
Description of drawings
Fig. 1 is a schematic diagram of the domain ontology of the method of the invention;
Fig. 2 is a schematic diagram of the process of providing a personalized search service to a user according to the invention;
Fig. 3 compares the mean Norm DCG of the Top N results of the method of the invention and three other methods;
Fig. 4 compares the number of results of interest to the user among the Top N results of the method of the invention and three other methods.
Embodiments
The invention is described in further detail below with reference to the drawings and specific embodiments:
Embodiment 1: Taking the real log data of Peking University Library from January 2008 to June 2008 as an example, a specific embodiment of the invention is described in detail with reference to the domain ontology diagram of Fig. 1.
This embodiment describes a method of providing a personalized search service to library users. The goal is that, for a query with the same keyword, different users each obtain the information closest to their own needs, which gives a better user experience. The system architecture of the personalized search method based on the book domain ontology in this embodiment is shown in Fig. 2.
The details are as follows:
First, the offline part. The work of the offline part is completed offline, that is, before the user submits a keyword search. The specific steps are as follows:
Step 1, build the domain ontology: taking the ontology of a specific domain as the description object, build a domain ontology that provides the concept definitions of the domain and the semantic relations between the concepts. Specifically, the domain ontology is built as follows:
The domain ontology is shown schematically in Fig. 1. When the ontology is built, the idea of collaborative filtering (Badrul Sarwar, George Karypis, Joseph Konstan, John Riedl, Item-Based Collaborative Filtering Recommendation Algorithms, Proceedings of the 10th International Conference on World Wide Web, Hong Kong, 2001) is introduced: the influence among users is taken into account and a new semantic relation is added. A user model is then built in which interests are classified and weighted in chronological order. Finally, the Spreading Activation (SA) model (A. M. Collins, E. F. Loftus, A spreading-activation theory of semantic processing, Psychological Review, vol. 82, pp. 407-428, 1975) is used to re-rank the results returned by the search engine and achieve personalized search. In the domain ontology built by the invention, the concepts (concept) comprise book classes (class) and book entities (instance), and the relations between concepts comprise rdfs:subClassOf, rdf:type, dc:creator, and dc:subject, as recommended by the W3C, together with borrowIntent, which is newly proposed by the invention.
Specifically, for the book domain, the top-level classes "F Economics", "I Literature", "J Art", etc., and subclasses such as "J2 Painting" and "J22 Chinese painting works" are built on the Chinese Library Classification (CLC). A class and its subclass are linked by rdfs:subClassOf. Each book is assigned to the lowest-level CLC class according to its CLC call number. For example, the call number of "The Da Vinci Code" is I712.45/598, so it is assigned to "I7". A book and its class are linked by rdf:type. Based on the author and subject information of the books, dc:creator and dc:subject links are then added to the domain ontology; for example, "The Da Vinci Code" and its author "Dan Brown" are linked by dc:creator, and "Paintings of Leonardo da Vinci" and its subject "Renaissance art" are linked by dc:subject.
Next, drawing on the idea of collaborative filtering and starting from readers' reading interest, a weighted, directed, asymmetric relation borrowIntent is introduced between each pair of books.
borrowIntent is defined as follows: let n_1 be the set of readers who have borrowed book b_1 and n_2 the set of readers who have borrowed book b_2. The weight of the edge b_1 → b_2 (link weight) is borrowIntent(b_1, b_2) = |n_1 ∩ n_2| / |n_1|. Likewise, the weight of the edge b_2 → b_1 is borrowIntent(b_2, b_1) = |n_1 ∩ n_2| / |n_2|.
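The borrowIntent definition can be illustrated with a minimal Python sketch. The function name `borrow_intent` and the input shape (a list of `(user_id, book_id)` pairs) are assumptions made for illustration; the patent does not specify an implementation.

```python
from collections import defaultdict
from itertools import permutations

def borrow_intent(loans):
    """Compute directed borrowIntent weights from (user_id, book_id) pairs.

    Returns {(b1, b2): |readers(b1) & readers(b2)| / |readers(b1)|}
    for every ordered pair of books that share at least one reader.
    """
    readers = defaultdict(set)
    for user_id, book_id in loans:
        readers[book_id].add(user_id)

    weights = {}
    # Every ordered pair is visited, so both directions get their own weight.
    for b1, b2 in permutations(readers, 2):
        shared = len(readers[b1] & readers[b2])
        if shared:
            weights[(b1, b2)] = shared / len(readers[b1])
    return weights

# Hypothetical toy log: two readers borrowed A, one of them also borrowed B.
loans = [("u1", "A"), ("u2", "A"), ("u1", "B")]
weights = borrow_intent(loans)
print(weights[("A", "B")], weights[("B", "A")])
```

The toy data shows why the relation is asymmetric: half of the readers of A also borrowed B, while every reader of B also borrowed A. Enumerating all ordered pairs is quadratic in the number of books, so a real log of this size would likely need an index from readers to books instead.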
After the domain ontology has been built, the logs are cleaned up:
In general, the format of log records is fairly messy and contains a large amount of useless information. The logs are therefore cleaned first, removing illegal or erroneous records, such as those containing "MISSING" or "??". All log information is then organized into the form of Table 1 and stored in a relational database. Here entry_id is the record number, book_id is the CLC call number of the book, user_id is the user number (the user number is used by the system only to distinguish users; no user information can be inferred from it, so user privacy is not involved), and timestamp is the date of the record.
Table 1: Log information table
entry_id    book_id        user_id     timestamp
1           B516.47/9.2    00000001    2008-01-02
...         ...            ...         ...
389,138     C37/2          00010009    2008-06-30
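The clean-up step that produces Table 1 might look like the following sketch. The function names, the dictionary record shape, and the date-format check are assumptions; only the "MISSING" and "??" markers and the four column names come from the description.

```python
from datetime import datetime

BAD_MARKERS = ("MISSING", "??")          # markers named in the description
FIELDS = ("book_id", "user_id", "timestamp")

def is_valid(record):
    """Reject records with empty or marked fields, or an unparsable date."""
    for field in FIELDS:
        value = record.get(field, "")
        if not value or any(marker in value for marker in BAD_MARKERS):
            return False
    try:
        # Assumption: dates use the YYYY-MM-DD form shown in Table 1.
        datetime.strptime(record["timestamp"], "%Y-%m-%d")
    except ValueError:
        return False
    return True

def clean_log(raw_records):
    """Keep valid records and number them consecutively as entry_id."""
    cleaned = []
    for record in raw_records:
        if is_valid(record):
            row = {"entry_id": len(cleaned) + 1}
            row.update({field: record[field] for field in FIELDS})
            cleaned.append(row)
    return cleaned

# Hypothetical raw records, three of which should be dropped.
raw = [
    {"book_id": "B516.47/9.2", "user_id": "00000001", "timestamp": "2008-01-02"},
    {"book_id": "MISSING", "user_id": "00000002", "timestamp": "2008-01-03"},
    {"book_id": "C37/2", "user_id": "??", "timestamp": "2008-06-30"},
    {"book_id": "C37/2", "user_id": "00010009", "timestamp": "2008-13-40"},
]
cleaned = clean_log(raw)
```

The cleaned rows can then be written to whatever relational database the system uses.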
In addition, a second table of book information must be maintained, in which book_title is the complete title of the book.
Table 2: Book information table
book_id         book_title
I712.45/598     The Da Vinci Code
...             ...
K835.4657/6e    Pictorial Biography of Da Vinci = Da vinci
Step 2, build the user model: analyze and process the logs and the users' historical records. Because a reader's interest changes continuously over time, the books the user has borrowed are classified and weighted in chronological order, thereby building a user model based on user interest preferences. The detailed process is as follows:
By analyzing the log information table, the borrowing history of a specific user can be obtained, from which the user's interest is analyzed and a user interest model is built. The invention divides user interest into three classes by time period: instant interest (the other books borrowed in the current loan), recent interest (books borrowed within the last month), and long-term interest (all others). Each book i in the user interest model is given a weight A[i] by the following formula:
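$$A[i] = \begin{cases} \alpha, & \text{book } i \text{ is an instant interest} \\ \beta, & \text{book } i \text{ is a recent interest} \\ \gamma, & \text{book } i \text{ is a long-term interest} \end{cases}$$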
In the formula above, the parameters finally selected by the invention through comparative experiments are α = 4, β = 2, γ = 1. Intuitively, instant interest is twice as important as recent interest, and recent interest is twice as important as long-term interest.
Because a user's interest is not fixed, the user interest model of the invention is updated continuously over time.
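A minimal sketch of the three-class weighting follows. The values α = 4, β = 2, γ = 1 come from the description; treating "the current loan" as "borrowed on the same date" and "within the last month" as 30 days are assumptions, as are the function and parameter names.

```python
from datetime import date, timedelta

ALPHA, BETA, GAMMA = 4, 2, 1   # instant, recent, long-term weights

def interest_weights(history, current_loan_date, recent_days=30):
    """Assign A[i] to each borrowed book.

    history: list of (book_id, borrow_date) for one user.
    Assumption: books borrowed on current_loan_date count as instant
    interest, and 'within one month' is approximated by recent_days.
    """
    weights = {}
    for book_id, borrowed_on in history:
        if borrowed_on == current_loan_date:
            weight = ALPHA
        elif current_loan_date - borrowed_on <= timedelta(days=recent_days):
            weight = BETA
        else:
            weight = GAMMA
        # If a book was borrowed several times, keep its highest weight.
        weights[book_id] = max(weight, weights.get(book_id, 0))
    return weights

# Hypothetical borrowing history for one user.
history = [
    ("I712.45/598", date(2008, 6, 30)),
    ("K835.4657/6e", date(2008, 6, 10)),
    ("B516.47/9.2", date(2008, 1, 2)),
]
profile = interest_weights(history, current_loan_date=date(2008, 6, 30))
```

Recomputing the profile whenever a new loan arrives is one simple way to keep the model up to date, since a book's class depends only on the gap between its borrow date and the current loan.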
Second, the online part. The work of the online part is completed online, that is, after the user submits a keyword search. The specific steps of the online part are as follows:
Step 3, compute the personalized score: based on the domain ontology and the user model already built, compute the personalized score with the graph mining algorithm SA. The personalized score is computed with SA as follows:
The personalized score is the activation value (Activation Score) computed by SA. The domain ontology built in the offline part is the propagation network of SA. In this network, a node (node) represents a book, class, author, or subject, and link(i, j) represents an edge connecting node i and node j. During SA propagation, all edges other than borrowIntent are treated as undirected, which ensures that activation can propagate from book b_1 to the class, author, or subject of b_1 and then on to book b_2. In other words, among all edges only borrowIntent is a directed weighted edge. The user interest model obtained in the offline part supplies the weighted initial nodes of SA; every other node has an initial activation value of 0. During SA propagation, the activation value A[j] of node j is updated as follows.
$$A[j] = A[j] + \sum_{i \in \{ i \mid link(i, j) \}} A[i] \cdot DecayFactor$$
In the formula above, DecayFactor is the decay factor; A[i] * DecayFactor is the activation that remains after decay when node i propagates to its neighbor node j. Following the parameter setting used by Ming-Hung Hsu (Ming-Hung Hsu, Hsin-Hsi Chen, A method to predict social annotations, CIKM, Napa Valley, CA, USA, 2008), the default decay factor is set to 0.8. Unlike Ming-Hung Hsu, when the edge is a borrowIntent edge, the decay factor is the weight of that edge.
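$$DecayFactor = \begin{cases} borrowIntent(i, j), & link(i, j) \text{ is a borrowIntent edge} \\ 0.8, & \text{otherwise} \end{cases}$$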
To improve efficiency during SA propagation, the invention sets the following limits for SA:
(1) Iteration limit. In this embodiment, the number of SA iterations is limited to 3.
(2) Propagation path limit. The propagation distance is controlled; in this embodiment, the maximum propagation distance is limited to 2.
(3) Propagation endpoint limit. In this embodiment, propagation stops when it reaches a designated node.
Note that after each SA iteration the invention can collect user feedback and update the activation values of the initial nodes for the next propagation. The feedback that can be collected comprises the books the user has newly clicked and browsed and the books the user has newly borrowed. All of these books are treated as instant interest and given weights.
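The propagation described above can be sketched as follows. This is one possible reading, not the patented implementation: the edge tuple format, the function names, interpreting the distance limit as "nodes more than `max_dist` hops from any initial node never receive activation", and interpreting the endpoint limit as "stop nodes receive activation but do not pass it on" are all assumptions. User feedback between iterations is omitted.

```python
from collections import defaultdict, deque

DEFAULT_DECAY = 0.8   # default decay factor from the description

def build_adjacency(edges):
    """edges: (src, dst, relation, weight). Returns {src: [(dst, decay)]}."""
    adj = defaultdict(list)
    for src, dst, relation, weight in edges:
        if relation == "borrowIntent":
            adj[src].append((dst, weight))          # directed, decay = edge weight
        else:
            adj[src].append((dst, DEFAULT_DECAY))   # other relations are undirected
            adj[dst].append((src, DEFAULT_DECAY))
    return adj

def reachable_within(adj, seeds, max_dist):
    """Breadth-first search: nodes at most max_dist hops from any seed."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue
        for neighbor, _ in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return set(dist)

def spread_activation(edges, seeds, iterations=3, max_dist=2,
                      stop_nodes=frozenset()):
    """seeds: {node: initial activation} taken from the user model."""
    adj = build_adjacency(edges)
    allowed = reachable_within(adj, seeds, max_dist)
    activation = defaultdict(float, seeds)
    for _ in range(iterations):
        incoming = defaultdict(float)
        for i, a_i in list(activation.items()):
            if a_i == 0 or i in stop_nodes:
                continue
            for j, decay in adj[i]:
                if j in allowed:
                    incoming[j] += a_i * decay      # A[i] * DecayFactor
        for j, delta in incoming.items():
            activation[j] += delta                  # A[j] = A[j] + sum(...)
    return dict(activation)

# Hypothetical toy ontology: two books share a class, plus one borrowIntent edge.
edges = [
    ("book1", "classX", "rdf:type", 1.0),
    ("book2", "classX", "rdf:type", 1.0),
    ("book1", "book3", "borrowIntent", 0.5),
]
one_step = spread_activation(edges, {"book1": 4.0}, iterations=1)
two_steps = spread_activation(edges, {"book1": 4.0}, iterations=2)
```

Each iteration is synchronous: all increments are computed from the previous activation values before any node is updated. In the toy graph, book2 only receives activation in the second iteration, by way of the shared class node.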
Step 4, re-rank the search results: according to the personalized scores obtained by SA, the results returned by the original search engine are re-ranked in descending order and then returned to the user. The specific implementation is as follows:
After SA finishes, personalized re-ranking is simply the process of reordering the original search results by the final personalized scores obtained from SA.
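Re-ranking then reduces to a sort, as in this short sketch. The function name and the choice to give unscored results a score of 0 are assumptions; the description does not say how results without an activation value are handled.

```python
def rerank(results, scores):
    """Reorder engine results by personalized score, highest first.

    results: book ids in the original engine order.
    scores:  {book_id: activation value from SA}; missing ids count as 0.0.
    Python's sort is stable, so tied results keep the engine's order.
    """
    return sorted(results, key=lambda book: scores.get(book, 0.0), reverse=True)

# Hypothetical example: only two of the four results were activated.
ranked = rerank(["a", "b", "c", "d"], {"c": 2.0, "a": 0.5})
print(ranked)
```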
The evaluation of the method of the invention is as follows:
Determine the evaluation metric. The evaluation metric is Discounted Cumulative Gain (DCG). Jaime Teevan of the Massachusetts Institute of Technology, in J. Teevan, S. T. Dumais, E. Horvitz, Personalizing Search via Automated Analysis of Interests and Activities, Proceedings of the 28th Annual International ACM SIGIR, Salvador, 2005, New York, ACM Press, 2005: 178~185, proposed evaluating personalized retrieval systems by combining manual scoring of query results with the DCG formula. The method gives results different importance according to their rank: the higher a result ranks, the more important it is, and the more the user's score for it affects the measured system performance. The DCG formula therefore combines the user's scores with the rank positions of the results, and the resulting value is used as the measure of system performance.
In the evaluation of the invention, among the returned search results, a book that the user actually borrowed has G(i) = 3 and every other result has G(i) = 1. DCG is computed iteratively as follows:
$$DCG(i) = \begin{cases} G(1), & i = 1 \\ DCG(i-1) + G(i) / \log(i), & i > 1 \end{cases}$$
Because the number of results returned differs from search to search, normalization is also needed. The ideal search ranks the relevant results highest; its DCG(i) is taken as idealDCG(i), and the final evaluation formula is as follows.
$$normalizedDCG(i) = \frac{DCG(i)}{idealDCG(i)}$$
Clearly, the higher the normalized DCG (Norm DCG for short), the better the search results match the user's interest, and the better the personalized search performs.
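The metric can be checked with a small Python sketch. The base of the logarithm is not stated in the description; base 2 is assumed here, which makes the discount at rank 2 equal to 1. The function names are illustrative.

```python
import math

def dcg(gains):
    """DCG(1) = G(1); DCG(i) = DCG(i-1) + G(i) / log2(i) for i > 1."""
    total = 0.0
    for rank, gain in enumerate(gains, start=1):
        total += gain if rank == 1 else gain / math.log2(rank)
    return total

def norm_dcg(gains):
    """Divide by the DCG of the ideal ordering (gains sorted descending)."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

# Gains follow the evaluation setup: 3 for a borrowed book, 1 otherwise.
best = norm_dcg([3, 1, 1])    # borrowed book ranked first
worse = norm_dcg([1, 1, 3])   # borrowed book ranked third
```

Under the base-2 assumption, ranks 1 and 2 carry the same discount, so moving the borrowed book from first to second place would not change the score; only lower positions are penalized.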
The test data for the evaluation are the real log data of Peking University Library from January 2008 to June 2008. The method of the invention is compared with three other methods, which are described as follows:
"Lucene/VSM" method: the original search results produced by the Lucene Score API of the open-source search engine Lucene, which uses the Vector Space Model (VSM). Lucene's results are used for comparison because Lucene is adopted as the index and search engine by many digital libraries worldwide, such as the National Library of Florence (http://www.planetware.com/florence/national-library-i-to-fbc.htm) and the New York Public Library (http://www.nypl.org/). Lucene's results therefore simulate, as closely as possible, the search results of real libraries.
"SA" method: similar to the method used by Ahu Sieg in Ahu Sieg, Bamshad Mobasher, Robin Burke, Web search personalization with ontological user profiles, CIKM, Lisbon, Portugal, 2007: the domain ontology has no borrowIntent relation, and user interest is not classified.
"SA+B" method: borrowIntent is added to the domain ontology, but user interest is not classified.
The method of the invention, "SA+B+S": borrowIntent is added to the domain ontology, and user interest is also classified. The performance comparison of the methods is shown in the following table:
Table 3: Performance comparison of the methods
[Table 3 appears as an image in the original publication.]
The table above shows that the method of the invention performs best. Compared with Lucene's original results, re-ranking with the SA method improves the mean Norm DCG by 12.9%. By introducing borrowIntent and by classifying and weighting user interest when the user interest model is built, the method of the invention reaches a mean Norm DCG of 0.848.
In actual searches, users often browse only the highest-ranked Top N results listed on the first two pages. Experiments were therefore designed to compare the method of the invention with the three methods above when only the Top N results are taken; the results are shown in Fig. 3 and Fig. 4. The curves in Fig. 3 and Fig. 4 show clearly that the method of the invention outperforms the other methods and greatly raises the rank of the results the user is interested in.
The invention is not limited to the embodiment above; the method of the invention is equally applicable to user-relevance-based extended sales of products such as electronic products, e-books, and mobile phones. Furthermore, the above is only a preferred embodiment of the invention and is not intended to limit its scope of practice. That is, all equivalent changes and modifications made within the scope of the claims of the invention are covered by the claims of the invention.

Claims (9)

1. A personalized search method based on a book domain ontology, characterized in that it comprises, in an offline part, building a user model and a domain ontology and, in an online part, calculating the personalized feature and re-ranking the search results, the specific steps being as follows:
Step 1, build the domain ontology: taking the classified thesaurus of a specific domain as the description object, and incorporating the idea of collaborative filtering according to the users' history, build a domain ontology that provides the concept definitions of the domain and the semantic relations between the concepts;
Step 2, build the user model: analyze and process the logs and the users' historical records, and build the user model in chronological order;
Step 3, compute the personalized score: based on the domain ontology and the user model already built, compute the personalized score with the graph mining algorithm SA;
Step 4, re-rank the search results: according to the personalized scores obtained by SA, re-rank the results returned by the original search engine in descending order and then return them to the user.
2. The personalized search method based on a book domain ontology according to claim 1, characterized in that the domain ontology in step 1 is the propagation network of the search algorithm; by providing rich semantic information and strengthening the semantic relations between entities, it overcomes polysemy, synonymy, and word dependence in the original search results and thereby eliminates semantic ambiguity; and, by introducing the idea of collaborative filtering, it can reflect the mutual influence among users.
3. The personalized search method based on a book domain ontology according to claim 1, characterized in that in step 2, because a reader's interest changes continuously over time, user interest is classified and weighted in chronological order, thereby building a user model that reflects user interest preferences.
4. The personalized search method based on a book domain ontology according to claim 3, characterized in that classifying and weighting user interest in chronological order means dividing interest, according to the borrowing period, into three classes, namely instant interest, recent interest, and long-term interest, and giving the three classes weights from high to low.
5. The personalized search method based on a book domain ontology according to claim 1 or 4, characterized in that in step 2 the user model analyzes the user's historical records, identifies the user's behavioral preferences during retrieval from the user's behavioral habits, and reflects the update and drift of user interest, so that user interest preferences are used to provide a personalized service.
6. The personalized search method based on book domain ontology according to claim 1, characterized in that: in step 3, the specific calculation steps of the graph mining algorithm SA are as follows:
First, the domain ontology is viewed as a graph, the graph mining algorithm SA is applied on the domain ontology, and the books that have been assigned weights in the user model are taken as the initial nodes of spreading activation;
Then, an iteration-count limit, a propagation-path limit and a propagation-endpoint limit are set for the SA algorithm to improve its efficiency;
Finally, the score of each node is updated iteratively by the score update formula until the whole algorithm terminates.
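The three steps of claim 6 can be sketched in Python as follows. This is a generic spreading-activation loop, not the patent's exact procedure: the claim does not state the score update formula, so the update used here (source activation times edge weight times a decay factor) is an assumption, as are the parameter names, the activation threshold used as the path limit, and the toy graph.

```python
def spreading_activation(graph, seeds, max_iterations=3, decay=0.5,
                         fire_threshold=0.1, terminal_nodes=frozenset()):
    """Spread activation from seed nodes over a weighted directed graph.

    graph: {node: {neighbor: edge_weight}}, the ontology viewed as a graph.
    seeds: {node: initial_activation}, the weighted books of the user model.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_iterations):            # iteration-count limit
        next_frontier = {}
        for node, value in frontier.items():
            if node in terminal_nodes:         # endpoint limit: never forwards
                continue
            if value < fire_threshold:         # path limit: weak signals stop
                continue
            for neighbor, weight in graph.get(node, {}).items():
                # Assumed update rule: add the decayed, edge-weighted signal.
                delta = value * weight * decay
                activation[neighbor] = activation.get(neighbor, 0.0) + delta
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + delta
        if not next_frontier:                  # nothing left to propagate
            break
        frontier = next_frontier
    return activation


# Toy ontology fragment: one borrowed book linked to a topic, which links to
# two further books.
graph = {
    "book_a": {"topic_x": 1.0},
    "topic_x": {"book_b": 0.8, "book_c": 0.4},
}
seeds = {"book_a": 1.0}
scores = spreading_activation(graph, seeds)
```

With these assumed settings, activation reaches `book_b` and `book_c` through `topic_x`, and `book_b` ends up with the higher score because its edge weight is larger. Passing `terminal_nodes={"topic_x"}` stops the spread at the topic node.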
7. The personalized search method based on book domain ontology according to claim 1 or 6, characterized in that: after each iteration of the graph mining algorithm SA ends, user feedback is collected and the initial-node activation values for the next propagation are updated; the collected user feedback comprises the books the user has newly clicked and browsed and the books the user has newly borrowed, both of which are treated as immediate interest and assigned weights.
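The feedback step of claim 7 might look like the following Python sketch. The function name `update_seeds`, the immediate-interest weight of 1.0 and the rule of keeping the larger of the old and new values are all assumptions; the claim only says that newly clicked and newly borrowed books are treated as immediate interest and given weights.

```python
IMMEDIATE_WEIGHT = 1.0  # hypothetical weight for immediate interest


def update_seeds(seeds, clicked_books, borrowed_books,
                 immediate_weight=IMMEDIATE_WEIGHT):
    """Return new initial activation values for the next propagation.

    Newly clicked and newly borrowed books are both treated as immediate
    interest; existing seeds are kept, and never lowered.
    """
    updated = dict(seeds)
    for book_id in set(clicked_books) | set(borrowed_books):
        updated[book_id] = max(updated.get(book_id, 0.0), immediate_weight)
    return updated


seeds = {"book_a": 0.6, "book_b": 0.3}
next_seeds = update_seeds(seeds, clicked_books=["book_b"], borrowed_books=["book_d"])
```

Here `book_b` is promoted from a long-term weight to the immediate weight, `book_d` enters the seed set, and `book_a` is left unchanged.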
8. A personalized search system based on book domain ontology, characterized by comprising:
A domain ontology module, which builds a domain ontology by taking the ontology of a specific domain as the object of description, thereby providing the concept definitions of that domain and the semantic relations between the concepts;
A user model, which analyzes and processes logs and analyzes the user's history records; because user interests change continuously over time, it classifies and weights the user interests in chronological order, thereby building a user model based on user interest preferences;
A personalized score computing module, which computes the personalized score by the graph mining algorithm SA on the basis of the established domain ontology and user model; and
A search result re-ranking module, which re-ranks the results returned by the original search engine in descending order of the personalized score obtained by SA, and then returns them to the user.
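The re-ranking module of claim 8 reduces to a sort by personalized score, as in this minimal Python sketch. The function name `rerank`, the default score of 0.0 for books that spreading activation never reached, and the sample data are assumptions for illustration.

```python
def rerank(results, scores):
    """Order search results by personalized score, highest first.

    results: book ids in the order returned by the original search engine.
    scores:  {book_id: personalized_score}; unscored books default to 0.0.
    Python's sort is stable, so books with equal scores keep the original
    engine's relative order.
    """
    return sorted(results, key=lambda book_id: scores.get(book_id, 0.0), reverse=True)


engine_results = ["b1", "b2", "b3", "b4"]
personal_scores = {"b3": 0.9, "b1": 0.2}
reranked = rerank(engine_results, personal_scores)
```

Relying on a stable sort means the original engine's ranking still decides the order among books the user model says nothing about.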
9. The personalized search system based on book domain ontology according to claim 8, characterized in that: the graph mining algorithm SA in the personalized score computing module comprises the following units:
An initial value determining unit, which views the domain ontology as a graph and takes the books that have been assigned weights in the user model as the initial nodes of spreading activation;
An SA loop setting unit, which sets an iteration-count limit, a propagation-path limit and a propagation-endpoint limit for the SA algorithm to improve its efficiency; and
An iteration unit, which iteratively updates the activation value of each node by the score update formula until the whole algorithm terminates.
CN2009102381558A 2009-11-17 2009-11-17 Individuation searching method based on book domain ontology Expired - Fee Related CN101719145B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009102381558A CN101719145B (en) 2009-11-17 2009-11-17 Individuation searching method based on book domain ontology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009102381558A CN101719145B (en) 2009-11-17 2009-11-17 Individuation searching method based on book domain ontology

Publications (2)

Publication Number Publication Date
CN101719145A true CN101719145A (en) 2010-06-02
CN101719145B CN101719145B (en) 2011-08-10

Family

ID=42433719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102381558A Expired - Fee Related CN101719145B (en) 2009-11-17 2009-11-17 Individuation searching method based on book domain ontology

Country Status (1)

Country Link
CN (1) CN101719145B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4124115B2 (en) * 2003-12-02 2008-07-23 ソニー株式会社 Information processing apparatus, information processing method, and computer program
CN101540874A (en) * 2009-04-23 2009-09-23 中山大学 Interactive TV program recommendation method based on collaborative filtration
CN101556603A (en) * 2009-05-06 2009-10-14 北京航空航天大学 Coordinate search method used for reordering search results

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102376057A (en) * 2010-08-16 2012-03-14 富士通株式会社 Method and device for processing consumer generated media information
CN102456019A (en) * 2010-10-18 2012-05-16 腾讯科技(深圳)有限公司 Retrieval method and device
CN102081668B (en) * 2011-01-24 2012-07-25 熊晶 Information retrieval optimizing method based on domain ontology
CN102081668A (en) * 2011-01-24 2011-06-01 熊晶 Information retrieval optimizing method based on domain ontology
CN102637170A (en) * 2011-02-10 2012-08-15 北京百度网讯科技有限公司 Question pushing method and system
CN102693225A (en) * 2011-03-21 2012-09-26 赵红利 Internet consultation collaborative filtering method
CN102609465B (en) * 2012-01-16 2014-04-16 武汉大学 Information recommendation method based on potential communities
CN102609465A (en) * 2012-01-16 2012-07-25 武汉大学 Information recommendation method based on potential communities
CN102868737A (en) * 2012-08-30 2013-01-09 浪潮(北京)电子信息产业有限公司 Safe scheduling method and system
CN102868737B (en) * 2012-08-30 2015-09-02 浪潮(北京)电子信息产业有限公司 Security dispatching method and system
CN102902744A (en) * 2012-09-17 2013-01-30 杭州东信北邮信息技术有限公司 Book recommendation method
CN102902744B (en) * 2012-09-17 2015-02-11 杭州东信北邮信息技术有限公司 Book recommendation method
CN102880728A (en) * 2012-10-31 2013-01-16 中国科学院自动化研究所 Individualized ordering method for video searching results of famous persons
CN102880728B (en) * 2012-10-31 2015-10-28 中国科学院自动化研究所 The method of famous person's video search result personalized ordering
CN103235802A (en) * 2013-04-16 2013-08-07 武汉理工大学 Method and system for obtaining complex demands of user
CN104216884A (en) * 2013-05-29 2014-12-17 酷盛(天津)科技有限公司 Collaborative filtering system and method on basis of time decay
CN104216884B (en) * 2013-05-29 2020-07-07 上海连尚网络科技有限公司 Collaborative filtering system and method based on time attenuation
CN103440242A (en) * 2013-06-26 2013-12-11 北京亿赞普网络技术有限公司 User search behavior-based personalized recommendation method and system
CN104679743A (en) * 2013-11-26 2015-06-03 阿里巴巴集团控股有限公司 Method and device for determining preference model of user
CN103984721A (en) * 2014-05-13 2014-08-13 中国矿业大学 Personalized book searching method based on interactive evolutionary optimization
CN103984721B (en) * 2014-05-13 2018-04-17 中国矿业大学 Books individuation search method based on interactive evolutionary optimization
CN104484433B (en) * 2014-12-19 2017-06-30 东南大学 A kind of books Ontology Matching method based on machine learning
CN104484433A (en) * 2014-12-19 2015-04-01 东南大学 Book body matching method based on machine learning
CN104933116A (en) * 2015-06-04 2015-09-23 科大讯飞股份有限公司 Grading method and device of movie and television relevant information
CN105023178A (en) * 2015-08-12 2015-11-04 电子科技大学 Main body-based electronic commercere commendation method
CN105023178B (en) * 2015-08-12 2018-08-03 电子科技大学 A kind of electronic commerce recommending method based on ontology
CN106844436A (en) * 2016-12-15 2017-06-13 北京小度信息科技有限公司 The sort method and device of Query Result
CN106844436B (en) * 2016-12-15 2020-07-31 北京星选科技有限公司 Query result sorting method and device
CN106951684B (en) * 2017-02-28 2020-10-09 北京大学 Method for entity disambiguation in medical disease diagnosis record
CN106951684A (en) * 2017-02-28 2017-07-14 北京大学 A kind of method of entity disambiguation in medical conditions idagnostic logout
CN107908650B (en) * 2017-10-12 2019-11-05 浙江大学 Knowledge train of thought method for auto constructing based on mass digital books
CN107908650A (en) * 2017-10-12 2018-04-13 浙江大学 Knowledge train of thought method for auto constructing based on mass digital books
CN109670922A (en) * 2018-12-29 2019-04-23 北京工业大学 Books are worth discovery method on a kind of line based on composite character
CN109670922B (en) * 2018-12-29 2022-02-08 北京工业大学 Online book value discovery method based on mixed features
CN110489633A (en) * 2019-08-22 2019-11-22 广州图创计算机软件开发有限公司 A kind of wisdom brain service platform based on library data
CN110489633B (en) * 2019-08-22 2020-03-24 广州图创计算机软件开发有限公司 Intelligent brain service system based on library data
CN113590945A (en) * 2021-07-26 2021-11-02 西安工程大学 Book recommendation method and device based on user borrowing behavior-interest prediction
CN113590945B (en) * 2021-07-26 2023-07-28 西安工程大学 Book recommendation method and device based on user borrowing behavior-interest prediction

Also Published As

Publication number Publication date
CN101719145B (en) 2011-08-10

Similar Documents

Publication Publication Date Title
CN101719145B (en) Individuation searching method based on book domain ontology
Xu et al. Exploring folksonomy for personalized search
Cai et al. Personalized search by tag-based user profile and resource profile in collaborative tagging systems
Qin et al. Query-level loss functions for information retrieval
Yuan et al. Make your travel smarter: Summarizing urban tourism information from massive blog data
CN102982042B (en) A kind of personalization content recommendation method, platform and system
Ceri et al. An introduction to information retrieval
Zheng et al. A survey of query result diversification
Gupta et al. An overview of social tagging and applications
Baliński et al. Re-ranking method based on inter-document distances
Carman et al. Tag data and personalized information retrieval
Shani et al. Mining recommendations from the web
Duwairi et al. An enhanced CBAR algorithm for improving recommendation systems accuracy
Dahir et al. A query expansion method based on topic modeling and DBpedia features
CN116595246A (en) Book recommendation retrieval system based on knowledge graph and reader portrait
Bellogín et al. Information retrieval and recommender systems
Soo Kim Text recommender system using user's usage patterns
Yang et al. Design and application of handicraft recommendation system based on improved hybrid algorithm
Yu et al. A novel framework to alleviate the sparsity problem in context-aware recommender systems
Fuentes-Lorenzo et al. Improving large-scale search engines with semantic annotations
Peng et al. Personalized web search using clickthrough data and web page rating
Agrawal et al. Similarity search using concept graphs
Guo et al. AOL4PS: A large-scale data set for personalized search
Du et al. Scientific users' interest detection and collaborators recommendation
Liang et al. A hybrid recommender systems based on weighted tags

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110810

Termination date: 20161117