CN106649738A - Method and device for aggregating personage information message in search engine result page - Google Patents

Info

Publication number
CN106649738A
Authority
CN
China
Prior art keywords
information
word
default
search
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611213441.5A
Other languages
Chinese (zh)
Inventor
王艳丽
陈营营
马华蓉
佟思颖
高苏丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201611213441.5A priority Critical patent/CN106649738A/en
Publication of CN106649738A publication Critical patent/CN106649738A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a method and device for aggregating person-related information in a search results page. The method comprises: receiving a person-related target search term input by a user on a search engine; judging whether the target search term hits a preset person word list; if the target search term hits the preset person word list, searching the Internet for the target search term and, at the same time, looking up information matching the target search term in a structured person-related information content database, wherein the database is created by capturing, from multiple UGC websites, the information related to each preset person term in the preset person word list, and processing the captured information to generate the structured person-related information content database having preset person term and information attributes; and aggregating the matched information into the search results page corresponding to the target search term and presenting it to the user. With this method and device, more information can be provided in the search results page, which broadens content coverage.

Description

Method and device for aggregating person-related information in a search results page
Technical field
The present invention relates to the technical field of Internet applications, and in particular to a method and device for aggregating person-related information in a search results page.
Background
With the rapid development of information technology, society has entered an era of information explosion. People increasingly rely on the network to find the information they need, so retrieval has become an indispensable part of work and daily life.
People usually use a search engine for retrieval. A search engine is a system that collects information from the Internet with specific computer programs according to a certain strategy, organizes and processes that information, provides a retrieval service to users, and presents the information relevant to the user's query.
The modern web contains a large amount of user-contributed content, such as forum posts, WeChat official accounts, Toutiao accounts, and Interest Tribe posts. Websites of this kind are referred to as user-generated content (UGC, User-generated Content) or professionally generated content (PGC, Professionally-generated Content) websites; in this application they are collectively called UGC websites. These UGC websites hold a great deal of high-quality information, but current search engine products do not fully mine it, so search results do not fully cover the content of these UGC websites.
In the course of making the present invention, the inventors found that the information on some high-quality UGC websites has its own advantages in terms of content. For example: 1. the data is exclusive, since it comes from individuals; 2. it resonates with readers: similar to mhkc, a good post attracts many comments; 3. it complements ordinary search results: for the same query (search term), UGC data can supplement the engine results while extending readability. For person-related queries in particular, the information on some UGC websites can better meet users' needs.
At present, there is no effective solution to the problem of how to provide users with search results that include person-related information from UGC websites.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method, and a corresponding device, for aggregating person-related information in a search results page that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, there is provided a method for aggregating person-related information in a search results page, comprising: receiving a person-related target search term input by a user on a search engine; judging whether the target search term hits a preset person word list, wherein N preset person terms are recorded in the preset person word list, N is an integer, and N is greater than 1; if so, while searching the Internet for the target search term, looking up information matching the target search term in a structured person-related information content database, wherein the person-related information content database is generated by the following steps: collecting multiple user-generated content (UGC) websites for the person category, and capturing from the multiple UGC websites the information related to each preset person term in the preset person word list; processing the captured information and classifying it according to the preset person term to which each piece of information relates, to generate the structured person-related information content database having preset person term and information attributes; and aggregating the matched information into the search results page corresponding to the target search term and presenting it to the user.
Optionally, before receiving the person-related target search term input by the user on the search engine, the method further comprises: obtaining the N preset person terms ranked highest by click rate and/or search rate in a predetermined database, to form the preset person word list.
Optionally, processing the captured information, classifying it according to the preset person term to which each piece of information relates, and generating the structured person-related information content database having preset person term and information attributes comprises: classifying each captured piece of information according to its related preset person term, and optimizing the ordering according to the information attributes of each piece, to generate the structured person-related information content database having preset person term and information attributes.
Optionally, the information attributes include: the publication time of the information and/or the number of comments on the information.
Optionally, for UGC websites of the professional information publishing platform type, capturing from the multiple UGC websites the information related to the N preset person terms comprises: entering the N preset person terms one by one into the search box of the UGC website of the professional information publishing platform type, and capturing, by publication time, the information related to each of the N preset person terms; or, labeling person-related information among the information published on the UGC website of the professional information publishing platform type, and capturing the information related to the N preset person terms from the labeled person-related information.
Optionally, for UGC websites of the online theme community type, capturing from the multiple UGC websites the information related to the N preset person terms comprises: for each of the N preset person terms, determining, in the UGC website of the theme community type, the theme communities in which the users corresponding to that preset person term are located, and capturing, from the largest of those theme communities, the information whose title or body text contains the preset person term.
Optionally, for UGC websites of the online question-and-answer community type, capturing from the multiple UGC websites the information related to the N preset person terms comprises: judging whether the category of each question posted on the UGC website of the question-and-answer community type is related to the person category; if so, judging whether the posted question and its answers contain keywords corresponding to one or more of the N preset person terms; and if so, capturing the posted question and its answers as information related to those one or more preset person terms.
Optionally, aggregating the matched information into the search results page corresponding to the target search term and presenting it to the user comprises: presenting, on the left side of the search results page, the results of searching the Internet for the target search term; judging whether the matched information contains any information identical to the results presented on the left side of the search results page and, if so, removing the identical information from the matched information; and aggregating the matched information, with the identical information removed, into the right-hand area of the search results page corresponding to the target search term and presenting it to the user.
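The de-duplication between the left-side web results and the aggregated right-side information can be sketched as follows. This is a minimal illustration rather than the patented implementation: comparing items by URL is an assumption (the text only speaks of identical information), and the names `Item` and `dedupe_against_web_results` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    url: str


def dedupe_against_web_results(matched, web_results):
    """Drop matched database items that already appear among the web results.

    Identity is decided by URL here, which is an assumption of this sketch.
    """
    seen_urls = {result.url for result in web_results}
    return [item for item in matched if item.url not in seen_urls]


# Left side: ordinary web results. Right side: database matches minus duplicates.
web = [Item("A", "http://example.com/a"), Item("B", "http://example.com/b")]
ugc = [Item("A again", "http://example.com/a"), Item("C", "http://example.com/c")]
right_panel = dedupe_against_web_results(ugc, web)
```

Only the item not already shown on the left remains for the right-hand area.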
Optionally, after aggregating the matched information into the search results page corresponding to the target search term and presenting it to the user, the method further comprises: counting the user's trigger operations on the matched information presented in the search results page, to obtain statistical data; and determining, according to the statistical data, whether to present the matched information in the pages corresponding to subsequent search requests.
Optionally, determining, according to the statistical data, whether to present the matched information in the pages corresponding to subsequent search requests comprises: if the statistical data shows that the number of trigger operations is below a specified threshold, determining that the matched information is no longer presented in the pages corresponding to subsequent search requests.
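The threshold rule above can be illustrated with a short sketch. The function name, the shape of the trigger log, and the threshold value of 5 are all hypothetical; the text does not fix any of them.

```python
def should_keep_presenting(trigger_count: int, threshold: int) -> bool:
    """Return False when trigger operations fall below the specified threshold."""
    return trigger_count >= threshold


# Hypothetical log of trigger operations (for example clicks) on the
# aggregated information shown in the search results page.
trigger_log = ["click", "click", "click"]
keep = should_keep_presenting(len(trigger_log), threshold=5)
```

With three recorded operations and a threshold of five, the aggregated information would no longer be presented for subsequent search requests.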
According to another aspect of the present invention, there is provided a device for aggregating person-related information in a search results page, comprising: a receiving module, configured to receive a person-related target search term input by a user on a search engine; a judging module, configured to judge whether the target search term hits a preset person word list, wherein N preset person terms are recorded in the preset person word list, N is an integer, and N is greater than 1; a search module, configured to, when the judging module determines that the target search term hits the preset person word list, look up information matching the target search term in a structured person-related information content database while searching the Internet for the target search term, wherein the person-related information content database is generated by the following steps: collecting multiple user-generated content (UGC) websites for the person category, and capturing from the multiple UGC websites the information related to each preset person term in the preset person word list; processing the captured information and classifying it according to the preset person term to which each piece of information relates, to generate the structured person-related information content database having preset person term and information attributes; and a display module, configured to aggregate the matched information into the search results page corresponding to the target search term and present it to the user.
Optionally, the device further comprises: an acquisition module, configured to obtain the N preset person terms ranked highest by click rate and/or search rate in a predetermined database, to form the preset person word list.
Optionally, processing the captured information, classifying it according to the preset person term to which each piece of information relates, and generating the structured person-related information content database having preset person term and information attributes comprises: classifying each captured piece of information according to its related preset person term, and optimizing the ordering according to the information attributes of each piece, to generate the structured person-related information content database having preset person term and information attributes.
Optionally, for UGC websites of the professional information publishing platform type, capturing from the multiple UGC websites the information related to the N preset person terms comprises: entering the N preset person terms one by one into the search box of the UGC website of the professional information publishing platform type, and capturing, by publication time, the information related to each of the N preset person terms; or, labeling person-related information among the information published on the UGC website of the professional information publishing platform type, and capturing the information related to the N preset person terms from the labeled person-related information.
Optionally, for UGC websites of the online theme community type, capturing from the multiple UGC websites the information related to the N preset person terms comprises: for each of the N preset person terms, determining, in the UGC website of the theme community type, the theme communities in which the users corresponding to that preset person term are located, and capturing, from the largest of those theme communities, the information whose title or body text contains the preset person term.
Optionally, for UGC websites of the online question-and-answer community type, capturing from the multiple UGC websites the information related to the N preset person terms comprises: judging whether the category of each question posted on the UGC website of the question-and-answer community type is related to the person category; if so, judging whether the posted question and its answers contain keywords corresponding to one or more of the N preset person terms; and if so, capturing the posted question and its answers as information related to those one or more preset person terms.
Optionally, the display module is specifically configured to aggregate the matched information into the search results page corresponding to the target search term and present it to the user in the following manner: presenting, on the left side of the search results page, the results of searching the Internet for the target search term; judging whether the matched information contains any information identical to the results presented on the left side of the search results page and, if so, removing the identical information from the matched information; and aggregating the matched information, with the identical information removed, into the right-hand area of the search results page corresponding to the target search term and presenting it to the user.
Optionally, the device further comprises: a statistics module, configured to count the user's trigger operations on the matched information presented in the search results page, to obtain statistical data; and a determining module, configured to determine, according to the statistical data, whether to present the matched information in the pages corresponding to subsequent search requests.
Optionally, the determining module is specifically configured to determine whether to present the matched information in the pages corresponding to subsequent search requests in the following manner: if the statistical data shows that the number of trigger operations is below a specified threshold, determining that the matched information is no longer presented in the pages corresponding to subsequent search requests.
In the embodiments of the present invention, when a person-related target search term input by a user on a search engine is received, it is first judged whether the target search term hits the preset person word list. If it does, information matching the target search term is looked up in the structured person-related information content database built from data captured from UGC websites, and the information found in that database is aggregated into the search results page corresponding to the target search term and presented to the user. It can thus be seen that, in the technical solution provided by the embodiments of the present invention, person-related information from UGC websites can be aggregated into the search results page, so that users are given more comprehensive information and content coverage is broadened. Moreover, because the person-related information content database is structured by preset person term and information attributes, it is readable and helps users quickly find the information they need. Further, the person-related information content database is drawn from the individual UGC websites, and the data from each UGC website is surfaced directly in the search results page, so the user does not need to visit each website and perform multiple operations to find the relevant information, which reduces the user's retrieval cost.
The above description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be understood more clearly and implemented in accordance with the contents of the specification, and in order that the above and other objects, features, and advantages of the present invention may be more readily apparent, specific embodiments of the present invention are set out below.
From the following detailed description of specific embodiments of the present invention in conjunction with the accompanying drawings, those skilled in the art will better understand the above and other objects, advantages, and features of the present invention.
Brief description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered as limiting the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of a method for aggregating person-related information in a search results page according to an embodiment of the present invention;
Fig. 2 shows a schematic diagram of a search results page in which person-related information has been aggregated, according to another embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of a device for aggregating person-related information in a search results page according to an embodiment of the present invention; and
Fig. 4 shows a schematic structural diagram of a device for aggregating person-related information in a search results page according to another embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
To solve the above technical problem, an embodiment of the present invention provides a method for aggregating person-related information in a search results page. The method can be applied on terminal devices such as PCs, smartphones, and tablet computers. Fig. 1 shows a flow chart of the method for aggregating person-related information in a search results page according to an embodiment of the present invention. As shown in Fig. 1, the method may include at least the following steps S102 to S108.
Step S102: receive a person-related target search term input by a user on a search engine.
Step S104: judge whether the target search term hits the preset person word list. If it does, perform step S106; otherwise, search in the normal search mode, that is, search the Internet only for the target search term. N preset person terms are recorded in the preset person word list, where N is an integer and N is greater than 1.
In an optional implementation of this embodiment, the preset person word list may be obtained before step S102. That is, in this optional implementation, before step S102 the method may further include: obtaining the N preset person terms ranked highest by click rate and/or search rate in a predetermined database, to form the preset person word list. The predetermined database can be specified as the case requires; for example, it may be the 360 hot list and search logs, in which case the N celebrity names with the highest click rate and/or search rate are obtained from the 360 hot list and search logs to form the preset person word list. The value of N may be determined by the specific application and is not limited in this embodiment.
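Selecting the top N names by click rate can be sketched as below. This is an illustrative assumption about how such a ranking might be computed: the input mapping, the sample names, and the function `build_preset_word_list` are hypothetical, and a real system would read these figures from hot lists and search logs.

```python
def build_preset_word_list(click_rates: dict, n: int) -> list:
    """Return the n names with the highest click rate, best first."""
    if n <= 1:
        # The text requires N to be an integer greater than 1.
        raise ValueError("N must be an integer greater than 1")
    ranked = sorted(click_rates, key=click_rates.get, reverse=True)
    return ranked[:n]


# Hypothetical click rates for three names.
click_rates = {"Li Er": 0.9, "Zhang San": 0.5, "Wang Wu": 0.7}
preset_words = build_preset_word_list(click_rates, n=2)
```

The same function could rank by search rate, or by a combination of both figures, by passing a different mapping.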
Step S106: while searching the Internet for the target search term, look up information matching the target search term in the structured person-related information content database.
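Steps S104 and S106 can be sketched as a single dispatch function. This is a simplified illustration under stated assumptions: a "hit" is modeled as exact membership in the word list, the two lookups run sequentially rather than at the same time, and `handle_query`, `web_search`, and `db_lookup` are hypothetical names.

```python
def handle_query(term, preset_words, web_search, db_lookup):
    """Run the web search and, on a word-list hit, also query the database."""
    web_results = web_search(term)
    if term not in preset_words:
        # No hit: normal search mode, web results only.
        return {"web": web_results, "aggregated": []}
    return {"web": web_results, "aggregated": db_lookup(term)}


# Stub back ends stand in for the real search engine and database.
page = handle_query(
    "Li Er",
    {"Li Er", "Wang Wu"},
    web_search=lambda t: ["web:" + t],
    db_lookup=lambda t: ["ugc:" + t],
)
```

A production system would more likely issue both lookups concurrently, as the step describes.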
In this embodiment, the person-related information content database is generated by the following steps:
Step 1: collect multiple UGC websites for the person category, and capture from the multiple UGC websites the information related to each preset person term in the preset person word list.
In this step, UGC (User Generated Content), also called UCC (User Created Content), can include text written by users, pictures and videos shot by users, audio recorded by users, and so on. PGC (Professional Generated Content) is a concept derived from UGC. The advantage of UGC is that users can freely upload content, which enriches a website; the disadvantage is that the quality of the content is uneven. Compared with UGC, PGC is more professionally categorized and its content quality is better assured, with professional content curation and product editing. In fact, UGC and PGC are not in conflict; far from being mutually exclusive, they need to complement each other. A mature Internet content product, whether a website or community, a video platform, or even an audio platform or a new form of media, needs both depth and breadth. Given their respective characteristics, UGC is responsible for breadth of content, mainly contributing traffic and participation, while PGC maintains depth of content, mainly building the brand and creating value; both are indispensable. Because PGC is a concept derived from UGC, the embodiments of the present invention treat PGC as a part of UGC.
In a specific application, because the quality of the content provided by UGC is uneven, and in order to increase the credibility of the person-related information content, when capturing person-related information from multiple UGC websites in this step, at least one high-quality UGC website may be screened out from the multiple UGC websites, and the person-related information is then captured from the at least one high-quality UGC website.
Further, when screening out at least one high-quality UGC website from the multiple UGC websites, the screening may be based on a number of measurement factors. Specifically, one or more measurement factors are determined, the quality of the multiple UGC websites is assessed according to the determined factors, and at least one UGC website whose quality meets a specified quality condition is selected as a high-quality UGC website. The measurement factors here may be, for example, the credibility of the website, the number of users registered on the website, and the number of visits to the website.
When there are multiple measurement factors and the quality of the multiple UGC websites is assessed according to them, an embodiment of the present invention provides an optional scheme. In this scheme, the weight of each measurement factor can be determined by a weighting algorithm, and the value of each measurement factor is obtained for each UGC website. The values of the measurement factors of each UGC website are then summed, weighted by the corresponding weights, to obtain a composite value, and the quality of the multiple UGC websites is assessed according to their respective composite values.
For example, suppose the UGC websites are website 1, website 2, website 3, website 4, and website 5, and the measurement factors are the credibility of the website, the number of users registered on the website, and the number of visits to the website. The values of the measurement factors are p11, p12, p13 for website 1; p21, p22, p23 for website 2; p31, p32, p33 for website 3; p41, p42, p43 for website 4; and p51, p52, p53 for website 5. The weights of the measurement factors are determined to be w1, w2, w3, and the values of the measurement factors of each UGC website are summed, weighted by these weights, to obtain the composite value of each UGC website. Taking website 1 and website 2 as examples, after weighted summation the composite value of website 1 is p11 × w1 + p12 × w2 + p13 × w3 and the composite value of website 2 is p21 × w1 + p22 × w2 + p23 × w3. Websites 3, 4, and 5 follow by analogy and are not described one by one here.
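The weighted summation in this example can be written out as a short sketch. The factor values, the weights, and the quality cut-off of 0.7 are made-up numbers for illustration, and the factor values are assumed to be normalized to a common scale beforehand, which the text does not specify.

```python
def composite_score(values, weights):
    """Weighted sum p_i1*w1 + p_i2*w2 + p_i3*w3 for one website."""
    return sum(p * w for p, w in zip(values, weights))


# Hypothetical weights w1, w2, w3 for credibility, registered users, visits.
weights = (0.5, 0.3, 0.2)

# Hypothetical normalized factor values (p_i1, p_i2, p_i3) per website.
sites = {
    "website 1": (0.8, 0.6, 0.9),
    "website 2": (0.4, 0.9, 0.5),
}

scores = {name: composite_score(p, weights) for name, p in sites.items()}

# Keep the websites whose composite value meets an assumed quality condition.
quality_sites = [name for name, score in scores.items() if score >= 0.7]
```

Here website 1 scores about 0.76 and website 2 about 0.57, so only website 1 passes the assumed cut-off.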
In addition, in this embodiment, different capture strategies can be used for different types of UGC websites.
For example, in an optional implementation of this embodiment, for UGC websites of the professional information publishing platform type, such as Toutiao accounts, capturing from the multiple UGC websites the information related to the N preset person terms includes:
entering the N preset person terms one by one into the search box of the UGC website of the professional information publishing platform type, and capturing, by publication time, the information related to each of the N preset person terms; for example, each preset person term in the preset person word list can be entered into the Toutiao account search box and searched, and the information related to each preset person term is captured by publication time; or,
labeling person-related information among the information published on the UGC website of the professional information publishing platform type, and capturing the information related to the N preset person terms from the labeled person-related information. For example, Toutiao accounts in the celebrity gossip category can be labeled manually, data can be captured from these accounts, and the captured information can then be classified according to the person names contained in its titles.
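Classifying captured items by the names found in their titles might look like the following sketch. Plain substring matching is an assumption made for brevity, and the item fields and the function `classify_by_title` are hypothetical.

```python
def classify_by_title(items, preset_words):
    """Group captured items under every preset term that occurs in the title."""
    buckets = {word: [] for word in preset_words}
    for item in items:
        for word in preset_words:
            if word in item["title"]:
                buckets[word].append(item)
    return buckets


# Hypothetical items captured from manually labeled accounts.
items = [
    {"title": "Li Er attends a premiere", "published": "2016-12-01"},
    {"title": "Weekly roundup", "published": "2016-12-02"},
]
buckets = classify_by_title(items, ["Li Er", "Zhang San"])
```

An item whose title names no preset term is simply left out of every bucket.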
As another example, in another optional implementation of this embodiment, for UGC websites of the online theme community type, such as Interest Tribe or Douban, capturing from the multiple UGC websites the information related to the N preset person terms may include: for each of the N preset person terms, determining, in the UGC website of the theme community type, the theme communities in which the users corresponding to that preset person term are located, and capturing, from the largest of those theme communities, the information whose title or body text contains the preset person term. For example, in Interest Tribe, for each preset person term in the preset person word list, say Li Er, first determine how many tribes the target person has, then select the largest tribe (for example, by number of followers) and capture the information whose title or article body contains the keyword (for example, Li Er).
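The "largest community, then keyword filter" strategy can be sketched as follows. Measuring community size by follower count follows the example in the text; the data layout and the function names are hypothetical.

```python
def largest_community(communities):
    """Pick the community with the most followers."""
    return max(communities, key=lambda c: c["followers"])


def posts_mentioning(posts, keyword):
    """Keep posts whose title or body text contains the keyword."""
    return [p for p in posts if keyword in p["title"] or keyword in p["body"]]


# Hypothetical communities found for the preset term "Li Er".
communities = [
    {"name": "tribe-a", "followers": 1200, "posts": [
        {"title": "Li Er's new film", "body": "Release date announced."},
        {"title": "Off topic", "body": "Nothing relevant here."},
    ]},
    {"name": "tribe-b", "followers": 300, "posts": [
        {"title": "Fan meetup", "body": "Li Er will attend."},
    ]},
]
largest = largest_community(communities)
captured = posts_mentioning(largest["posts"], "Li Er")
```

Posts in smaller communities are ignored even when they mention the term, which matches the strategy of capturing only from the largest community.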
As another example, in another optional embodiment of the present invention, for UGC websites of the online question-and-answer community class, such as Zhihu, capturing the information related to the N person-category preset words from the plurality of UGC websites may include: judging whether the category of each question published on the UGC website of the question-and-answer community class is related to persons; if so, judging whether the published question and its answers contain keywords corresponding to one or more of the N person-category preset words; and if they do, capturing the published question and its answers as information related to those one or more preset words. For example, if the person-category preset words in the preset vocabulary relate to entertainment celebrities, it may first be judged whether the category of a question published on Zhihu is related to entertainment; if it is, it is further judged whether the question and its answers contain a person-category preset word from the preset vocabulary; and if they do, the question and its answers are captured as information related to the corresponding preset word.
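The two-stage filter for question-and-answer sites (category first, then keyword) might look like the sketch below. The question fields `category`, `title`, and `answers` are assumed for illustration, as is the plain substring match.

```python
def capture_qa(questions, preset_words, person_categories):
    """Keep questions in a person-related category that mention a preset word."""
    captured = {}
    for question in questions:
        # Stage 1: skip questions whose category is unrelated to persons.
        if question["category"] not in person_categories:
            continue
        # Stage 2: look for preset words in the question and its answers.
        text = " ".join([question["title"], *question["answers"]])
        for word in preset_words:
            if word in text:
                captured.setdefault(word, []).append(question)
    return captured

questions = [
    {"category": "entertainment", "title": "Which film made Li Er famous?",
     "answers": ["Probably the 2010 one."]},
    {"category": "cooking", "title": "What does Li Er like to eat?",
     "answers": []},
    {"category": "entertainment", "title": "Best film of the year?",
     "answers": ["No idea."]},
]
qa_info = capture_qa(questions, ["Li Er"], {"entertainment"})
```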
Step 2: process the captured information, classify it according to the person-category preset word to which each information item relates, and generate the structured person-category information content database having person-category preset words and information attributes.
That is, in this embodiment, after the information has been captured, each captured item is first classified according to the person-category preset word to which it relates, and a structured person-category information content database having person-category preset words and information attributes is generated. The database may include three attribute columns: the person-category preset word, the information attributes, and the information content. The information attributes may include several entries, for example the publication time of the item and its number of comments, and the information content may include the title of the item and the link address of the item. Table 1 shows an example of the structure of the person-category information content database in this embodiment.
Table 1
In an optional embodiment of the present invention, when the captured person-category information is processed to generate the structured person-category information content database, this step may include: classifying each captured item according to the person-category preset word to which it relates, sorting the items according to their information attributes, and generating the structured person-category information content database having person-category preset words and information attributes. The information attributes may include the publication time and/or the number of comments; that is, the items in the database may be ordered by timeliness and/or popularity so as to improve search efficiency.
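One way to picture the structured database and its ordering is an in-memory SQLite table. The table name, column names, sample rows, and the choice to sort by publication time first and comment count second are all assumptions made for this sketch; the patent does not prescribe a storage engine or schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person_info ("
    "preset_word TEXT, published TEXT, comments INTEGER, title TEXT, url TEXT)"
)
rows = [
    ("Li Er", "2016-12-20", 35, "Li Er attends premiere", "https://example.com/a"),
    ("Li Er", "2016-12-22", 12, "Li Er announces tour", "https://example.com/b"),
    ("Zhang San", "2016-12-21", 80, "Zhang San interview", "https://example.com/c"),
]
conn.executemany("INSERT INTO person_info VALUES (?, ?, ?, ?, ?)", rows)

def lookup(preset_word):
    """Return (title, url) pairs for a preset word, newest and most commented first."""
    cursor = conn.execute(
        "SELECT title, url FROM person_info WHERE preset_word = ? "
        "ORDER BY published DESC, comments DESC",
        (preset_word,),
    )
    return cursor.fetchall()

li_er_items = lookup("Li Er")
```

Storing dates as ISO 8601 strings keeps lexical order equal to chronological order, which is what makes the `ORDER BY published DESC` clause behave as intended here.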
An embodiment of the present invention provides an optional scheme in which the person-category preset word used to process the captured person-category information is determined, and the corresponding attribute content is then extracted from the captured person-category information based on the determined preset word. In this embodiment, a person-category preset word may be a person's name, a person's title, or a combined appellation of several persons; embodiments of the present invention are not limited thereto.
Step S108: aggregate the matched information into the search results page corresponding to the target search word and display it to the user.
In this embodiment, the results found in the person-category information content database may supplement the search results that the search engine obtains from the Internet. Therefore, in an optional embodiment of the present invention, step S108 may include the following steps:
Step 1: display, on the left side of the search results page, the results of searching the Internet for the target search word;
Step 2: judge whether the matched information contains any item identical to a result displayed on the left side of the search results page, and if so, remove the identical item from the matched information;
Step 3: aggregate the matched information remaining after removal of the identical items into the right-side area of the search results page corresponding to the target search word and display it to the user.
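Steps 1 to 3 above can be sketched as a small function. The patent does not say how two items are judged identical; comparing by URL is an assumption made here, as are the `left` and `right` keys of the returned page.

```python
def aggregate_page(web_results, matched_items):
    """Place web results on the left and de-duplicated database items on the right."""
    left_urls = {result["url"] for result in web_results}
    # Drop database items that already appear among the left-side results.
    right = [item for item in matched_items if item["url"] not in left_urls]
    return {"left": web_results, "right": right}

web_results = [{"title": "Li Er biography", "url": "https://example.com/bio"}]
matched_items = [
    {"title": "Li Er biography", "url": "https://example.com/bio"},
    {"title": "Li Er fan discussion", "url": "https://example.com/fans"},
]
page = aggregate_page(web_results, matched_items)
```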
That is, in the above optional embodiment, the search results page contains two areas: a left-side area and a right-side area. In this embodiment, the left-side area displays the results that the search engine obtains by searching the Internet for the target search word, for example the content currently displayed on the left side of the results pages of search engines such as Baidu and Google. The right-side area displays the results found in the person-category information content database, which extends the content of the right-side area of the search results page and provides the user with more complete search results.
In an optional embodiment of the present invention, when the matched information is aggregated into the right-side area of the search results page corresponding to the target search word and displayed to the user, as shown in Fig. 2, pictures related to the matched information may be displayed in the upper part of the right-side area, with text links to the matched information displayed below the pictures.
In the above optional embodiment, to further ensure that the content displayed on the right side meets the user's needs, in an optional embodiment of the present invention, after the matched information is aggregated into the search results page corresponding to the target search word and displayed to the user, the method further includes: counting the user's trigger operations on the matched information displayed in the search results page to obtain a statistical result; and determining, according to the statistical result, whether to display the matched information in the page corresponding to a subsequent search request. The trigger operations may be measured as the CTR (click-through rate) of the displayed matched information; that is, the click-through rate observed after an item from the person-category information content database has been displayed determines whether that item is still displayed on the right side of the search page when it is matched by a later search.
Further, in the above optional embodiment, when determining according to the statistical result whether to display the matched information in the page corresponding to a subsequent search request, it may be judged whether the statistical result indicates that the number of trigger operations is less than a specified threshold; if so, it is determined that the matched information is no longer displayed in the page corresponding to the subsequent search request.
In a specific application, the CTR of an information item may be evaluated over a specified time period (for example, 1 or 2 hours), and the item handled according to the result of that evaluation.
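The threshold decision can be illustrated as below. The threshold value of 0.02, the treatment of items with no impressions, and the function name are assumptions; the patent only states that an item stops being displayed when its trigger count or CTR falls below a specified threshold after the evaluation period.

```python
def keep_displaying(clicks, impressions, min_ctr=0.02):
    """Decide, at the end of an evaluation period, whether to keep showing an item."""
    if impressions == 0:
        # Assumption: with no impressions there is no evidence, so keep showing.
        return True
    return clicks / impressions >= min_ctr

popular = keep_displaying(clicks=5, impressions=100)
ignored = keep_displaying(clicks=1, impressions=100)
```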
In addition, in this embodiment, the captured UGC websites may be checked periodically to determine whether the information related to each person-category preset word in the preset vocabulary has been updated. If it has, the new information is captured and the person-category information content database is updated. After the update, the CTR of every item in the database is cleared. That is, when an item in the database is matched after an update, the item is displayed in the search results page regardless of whether its previous CTR was high or low; the CTR of each item is then counted afresh, and once the specified time period has elapsed it is judged whether the item's CTR exceeds the threshold, which in turn determines whether the item is displayed in subsequent search results.
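The update-and-reset behaviour might be sketched as follows, assuming the database is a list of items keyed by URL and the CTR statistics are a dictionary; both representations and the name `refresh` are hypothetical.

```python
def refresh(database, ctr_stats, fetched_items):
    """Add newly fetched items and clear all CTR statistics if anything changed."""
    known_urls = {item["url"] for item in database}
    new_items = [item for item in fetched_items if item["url"] not in known_urls]
    if not new_items:
        return False
    database.extend(new_items)
    # After an update every item gets a fresh chance, whatever its earlier CTR.
    ctr_stats.clear()
    return True

database = [{"url": "https://example.com/a"}]
ctr_stats = {"https://example.com/a": {"clicks": 0, "impressions": 500}}
updated = refresh(database, ctr_stats,
                  [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}])
```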
In embodiments of the present invention, when a target search word related to persons that a user has entered in a search engine is received, it is first judged whether the target search word hits the person-category preset vocabulary. If it does, information matching the target search word is looked up in the structured person-category information content database built from data captured from UGC websites, and the information found in that database is aggregated into the search results page corresponding to the target search word and displayed to the user. Thus, in the technical scheme provided by embodiments of the present invention, person-category information from UGC websites can be aggregated in the search results page, which provides the user with more comprehensive information and widens content coverage. Moreover, because the person-category information content database is structured by person-category preset word and information attributes, it is readable and helps the user find the needed information quickly. Further, because the database draws on multiple UGC websites, data from each UGC website is brought forward and displayed in the search results page, so the user does not need to visit each website and perform multiple operations to find the relevant information, which reduces the user's retrieval cost.
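The overall flow summarized above can be sketched end to end. Treating a "hit" as exact membership of the query in the preset vocabulary is an assumption, and `web_search` is a hypothetical stand-in for the ordinary Internet search.

```python
def handle_query(query, preset_vocabulary, info_db, web_search):
    """Run the normal web search and, on a vocabulary hit, add database items."""
    web_results = web_search(query)
    if query not in preset_vocabulary:
        return {"left": web_results, "right": []}
    return {"left": web_results, "right": info_db.get(query, [])}

def fake_web_search(query):
    # Stand-in for the real search engine backend.
    return [{"title": f"Web result for {query}", "url": "https://example.com/web"}]

info_db = {"Li Er": [{"title": "Li Er fan discussion",
                      "url": "https://example.com/fans"}]}
hit_page = handle_query("Li Er", {"Li Er"}, info_db, fake_web_search)
miss_page = handle_query("weather", {"Li Er"}, info_db, fake_web_search)
```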
It should be noted that, in practical applications, all of the above optional embodiments may be combined in any manner to form alternative embodiments of the present invention, which are not described one by one here.
Based on the method of aggregating person-category information in a search results page provided by the embodiments above, and on the same inventive concept, an embodiment of the present invention further provides a device for aggregating person-category information in a search results page.
Fig. 3 is a schematic structural diagram of a device for aggregating person-category information in a search results page according to an embodiment of the present invention. As shown in Fig. 3, the device may include at least a receiving module 310, a judging module 320, a search module 330, and a display module 340.
The functions of the components of the device for aggregating person-category information in a search results page, and the connections between them, are now described:
The receiving module 310 is configured to receive a target search word related to persons that a user enters in a search engine.
The judging module 320 is configured to judge whether the target search word hits a person-category preset vocabulary, wherein the person-category preset vocabulary records N person-category preset words, N being an integer greater than 1.
The value of N may be determined by the specific application and is not limited in this embodiment.
The search module 330 is configured, when the judging module determines that the target search word hits the person-category preset vocabulary, to look up information matching the target search word in a structured person-category information content database while the target search word is searched on the Internet, wherein the person-category information content database is generated by the following steps: collecting a plurality of user generated content (UGC) websites concerning persons, and capturing from the plurality of UGC websites the information related to each person-category preset word in the person-category preset vocabulary; and processing the captured information, classifying it according to the person-category preset word to which each item relates, and generating the structured person-category information content database having person-category preset words and information attributes.
The display module 340 is configured to aggregate the matched information into the search results page corresponding to the target search word and display it to the user.
In an optional embodiment of the present invention, as shown in Fig. 4, the device may further include an acquisition module 350, configured to obtain from a predetermined database the N person-category preset words ranked highest by click rate and/or search rate, which together form the person-category preset vocabulary. In a specific application, the acquisition module 350 may combine the 360 hot lists with search logs to obtain the N celebrity names with the highest click rate and/or search rate and form the person-category preset vocabulary from them.
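Selecting the top N names can be sketched with a frequency count. The sketch assumes the search log has already been reduced to person names, which the patent does not describe; the function name and sample data are hypothetical.

```python
from collections import Counter

def build_preset_vocabulary(name_log, n):
    """Return the n most frequently searched person names."""
    counts = Counter(name_log)
    return [name for name, _ in counts.most_common(n)]

# Hypothetical log of person-name queries.
name_log = ["Li Er", "Zhang San", "Li Er", "Wang Wu", "Li Er", "Zhang San"]
vocabulary = build_preset_vocabulary(name_log, 2)
```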
In an optional embodiment of the present invention, processing the captured information, classifying it according to the person-category preset word to which each item relates, and generating the structured person-category information content database having person-category preset words and information attributes includes: classifying each captured item according to the person-category preset word to which it relates, sorting the items according to their information attributes, and generating the structured person-category information content database having person-category preset words and information attributes. The information attributes may include the publication time and/or the number of comments; that is, the items in the database may be ordered by timeliness and/or popularity so as to improve search efficiency.
In addition, in this embodiment, different capture strategies may be used for different types of UGC websites.
In an optional embodiment of the present invention, for UGC websites of the professional information publishing platform class, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
Entering each of the N person-category preset words into the search box of the UGC website of the professional information publishing platform class, and capturing the information related to each of the N person-category preset words in order of publication time. For example, each person-category preset word in the person-category preset vocabulary may be entered into the search box of Toutiao Hao and searched, and the information related to each preset word captured by publication time; or,
Marking person-category information among the information published on the UGC website of the professional information publishing platform class, and capturing the information related to the N person-category preset words from the marked person-category information. For example, the Toutiao Hao accounts in the celebrity gossip category may be labeled manually, data captured from those accounts, and the captured items then sorted according to the person names contained in their titles.
In an optional embodiment of the present invention, for UGC websites of the online theme community class, capturing the information related to the N person-category preset words from the plurality of UGC websites includes: for each of the N person-category preset words, determining, in the UGC website of the theme community class, the theme communities in which users corresponding to that preset word are located, and capturing, from the largest of those theme communities, the information whose title or body text contains the preset word. For example, in Interest Tribe, for each person-category preset word in the person-category preset vocabulary, such as Li Er, first determine how many tribes the target person has, then select the largest tribe (for example, by number of followers) and capture from it the information whose title or article text contains the keyword (for example, Li Er).
In an optional embodiment of the present invention, for UGC websites of the online question-and-answer community class, capturing the information related to the N person-category preset words from the plurality of UGC websites includes: judging whether the category of each question published on the UGC website of the question-and-answer community class is related to persons; if so, judging whether the published question and its answers contain keywords corresponding to one or more of the N person-category preset words; and if they do, capturing the published question and its answers as information related to those one or more preset words. For example, if the person-category preset words in the preset vocabulary relate to entertainment celebrities, it may first be judged whether the category of a question published on Zhihu is related to entertainment; if it is, it is further judged whether the question and its answers contain a person-category preset word from the preset vocabulary; and if they do, the question and its answers are captured as information related to the corresponding preset word.
In an optional embodiment of the present invention, the display module 340 aggregates the matched information into the search results page corresponding to the target search word and displays it to the user specifically in the following manner:
Displaying, on the left side of the search results page, the results of searching the Internet for the target search word;
Judging whether the matched information contains any item identical to a result displayed on the left side of the search results page, and if so, removing the identical item from the matched information;
Aggregating the matched information remaining after removal of the identical items into the right-side area of the search results page corresponding to the target search word and displaying it to the user.
That is, in the above optional embodiment, the search results page contains two areas: a left-side area and a right-side area. In this embodiment, the left-side area displays the results that the search engine obtains by searching the Internet for the target search word, for example the content currently displayed on the left side of the results pages of search engines such as Baidu and Google. The right-side area displays the results found in the person-category information content database, which extends the content of the right-side area of the search results page and provides the user with more complete search results.
In an optional embodiment of the present invention, as shown in Fig. 4, the device may further include:
A statistics module 360, configured to count the user's trigger operations on the matched information displayed in the search results page to obtain a statistical result;
A determining module 370, configured to determine, according to the statistical result, whether to display the matched information in the page corresponding to a subsequent search request.
The user's trigger operations on the matched information displayed in the search results page may be measured as the CTR (click-through rate) of the displayed matched information; that is, the click-through rate observed after an item from the person-category information content database has been displayed determines whether that item is still displayed on the right side of the search page when it is matched by a later search.
Further, in the above optional embodiment, the determining module 370 determines whether to display the matched information in the page corresponding to a subsequent search request specifically in the following manner:
If the statistical result indicates that the number of trigger operations is less than a specified threshold, it is determined that the matched information is no longer displayed in the page corresponding to the subsequent search request.
According to any one of the above preferred embodiments or a combination of several of them, embodiments of the present invention can achieve the following beneficial effects:
In embodiments of the present invention, when a target search word related to persons that a user has entered in a search engine is received, it is first judged whether the target search word hits the person-category preset vocabulary. If it does, information matching the target search word is looked up in the structured person-category information content database built from data captured from UGC websites, and the information found in that database is aggregated into the search results page corresponding to the target search word and displayed to the user. Thus, in the technical scheme provided by embodiments of the present invention, person-category information from UGC websites can be aggregated in the search results page, which provides the user with more comprehensive information and widens content coverage. Moreover, because the person-category information content database is structured by person-category preset word and information attributes, it is readable and helps the user find the needed information quickly. Further, because the database draws on multiple UGC websites, data from each UGC website is brought forward and displayed in the search results page, so the user does not need to visit each website and perform multiple operations to find the relevant information, which reduces the user's retrieval cost.
Numerous specific details are set forth in the specification provided here. It is to be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules of the device in an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described herein include some features but not others that are included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device for aggregating person-category information in a search results page according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.
Thus, those skilled in the art will recognize that, although multiple exemplary embodiments of the present invention have been shown and described in detail herein, many other variations or modifications consistent with the principles of the invention may be determined or derived directly from the present disclosure without departing from the spirit and scope of the invention. Accordingly, the scope of the present invention should be understood and deemed to cover all such other variations or modifications.
According to one aspect of the present invention, there is provided A1, a method of aggregating person-category information in a search results page, including:
Receiving a target search word related to persons that a user enters in a search engine;
Judging whether the target search word hits a person-category preset vocabulary, wherein the person-category preset vocabulary records N person-category preset words, N being an integer greater than 1;
If so, looking up information matching the target search word in a structured person-category information content database while the target search word is searched on the Internet, wherein the person-category information content database is generated by the following steps: collecting a plurality of user generated content (UGC) websites concerning persons, and capturing from the plurality of UGC websites the information related to each person-category preset word in the person-category preset vocabulary; and processing the captured information, classifying it according to the person-category preset word to which each item relates, and generating the structured person-category information content database having person-category preset words and information attributes;
Aggregating the matched information into the search results page corresponding to the target search word and displaying it to the user.
A2, the method according to A1, wherein, before the target search word related to persons that the user enters in the search engine is received, the method further includes:
Obtaining from a predetermined database the N person-category preset words ranked highest by click rate and/or search rate, which together form the person-category preset vocabulary.
A3, the method according to A1, wherein processing the captured information, classifying it according to the person-category preset word to which each item relates, and generating the structured person-category information content database having person-category preset words and information attributes includes:
Classifying each captured item according to the person-category preset word to which it relates, sorting the items according to their information attributes, and generating the structured person-category information content database having person-category preset words and information attributes.
A4, the method according to A3, wherein the information attributes include: the publication time of the information and/or the number of comments on the information.
A5, the method according to any one of A1 to A4, wherein, for UGC websites of the professional information publishing platform class, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
Entering each of the N person-category preset words into the search box of the UGC website of the professional information publishing platform class, and capturing the information related to each of the N person-category preset words in order of publication time; or,
Marking person-category information among the information published on the UGC website of the professional information publishing platform class, and capturing the information related to the N person-category preset words from the marked person-category information.
A6, the method according to any one of A1 to A4, wherein, for UGC websites of the online theme community class, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
For each of the N person-category preset words, determining, in the UGC website of the theme community class, the theme communities in which users corresponding to that preset word are located, and capturing, from the largest of those theme communities, the information whose title or body text contains the preset word.
A7. The method according to any one of A1 to A4, wherein, for UGC websites of the online question-and-answer community type, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
Judging whether the category of each question published on the UGC website of the online question-and-answer community type is related to the person category; if so, judging whether the published question and its answers contain keywords corresponding to one or more of the N person-category preset words; and if so, capturing the published question and its answers as information related to the one or more person-category preset words among the N person-category preset words.
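The two-stage check in A7 can be sketched as below. The `category`, `question` and `answers` fields and the `"people"` category label are hypothetical, and the sketch assumes the keyword corresponding to a preset word is the preset word itself.

```python
def capture_qa(questions, preset_words, person_categories=frozenset({"people"})):
    """Return {preset_word: [question, ...]} for person-category questions
    whose text or answers mention that preset word."""
    captured = {}
    for q in questions:
        # Stage 1: skip questions whose category is not person-related.
        if q["category"] not in person_categories:
            continue
        # Stage 2: look for preset words in the question and its answers.
        text = " ".join([q["question"], *q["answers"]])
        for word in preset_words:
            if word in text:
                captured.setdefault(word, []).append(q)
    return captured


questions = [
    {"category": "people", "question": "Where was Alice born?", "answers": ["In Paris."]},
    {"category": "cooking", "question": "What is Alice's favourite recipe?", "answers": []},
    {"category": "people", "question": "Who won?", "answers": ["Bob did."]},
]
qa = capture_qa(questions, ["Alice", "Bob"])
```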
A8. The method according to any one of A1 to A4, wherein aggregating the matched information into the search results page corresponding to the target search word and displaying it to the user includes:
Displaying, on the left side of the search results page, the results of searching the internet for the target search word;
Judging whether the matched information contains any information identical to the results displayed on the left side of the search results page, and if so, removing the identical information from the matched information;
Aggregating the matched information remaining after removal of the identical information into the right-side area of the search results page corresponding to the target search word, and displaying it to the user.
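The de-duplication between the left and right areas in A8 can be sketched as follows. Treating two results as "identical" when their URLs are equal is an assumption; the text does not say how identity is determined.

```python
def build_results_page(web_results, matched_items):
    """Left area: ordinary web results. Right area: matched items from the
    content database, minus any already shown on the left."""
    left_urls = {r["url"] for r in web_results}  # assumed identity key
    right = [m for m in matched_items if m["url"] not in left_urls]
    return {"left": web_results, "right": right}


web = [{"url": "http://a.example/1"}, {"url": "http://a.example/2"}]
matched = [{"url": "http://a.example/2"}, {"url": "http://b.example/9"}]
page = build_results_page(web, matched)
```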
A9. The method according to any one of A1 to A4, wherein, after the matched information is aggregated into the search results page corresponding to the target search word and displayed to the user, the method further includes:
Counting the user's trigger operations on the matched information displayed in the search results page to obtain a statistical result;
Determining, according to the statistical result, whether to display the matched information in the pages corresponding to subsequent search requests.
A10. The method according to A9, wherein determining, according to the statistical result, whether to display the matched information in the pages corresponding to subsequent search requests includes:
If the statistical result is that the number of trigger operations is less than a specified threshold, determining that the matched information is no longer displayed in the pages corresponding to subsequent search requests.
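The feedback rule in A9 and A10 can be sketched as below. The event format, the use of clicks as the trigger operation, and counting per search word are assumptions for illustration only.

```python
from collections import Counter


def count_triggers(events):
    """Count trigger operations (assumed here to be clicks) per search word."""
    return Counter(e["query"] for e in events if e["action"] == "click")


def keep_showing(query, trigger_counts, threshold):
    """Stop displaying the aggregated information once the number of
    trigger operations falls below the specified threshold."""
    return trigger_counts[query] >= threshold  # Counter yields 0 if absent


events = [{"query": "Alice", "action": "click"}] * 3
events.append({"query": "Bob", "action": "view"})
counts = count_triggers(events)
```

With a threshold of 2, `keep_showing("Alice", counts, 2)` keeps the aggregated area for "Alice", while "Bob", whose items were never clicked, would no longer get it.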
According to another aspect of the present invention, the present invention also provides B11, a device for aggregating person-category information in a search results page, including:
A receiving module, configured to receive a target search word related to the person category that a user enters in a search engine;
A judging module, configured to judge whether the target search word hits a person-category preset word list, wherein N person-category preset words are recorded in the person-category preset word list, N is an integer, and N is greater than 1;
A search module, configured to, when the judging module determines that the target search word hits the person-category preset word list, search a structured person-category information content database for information matching the target search word while searching the internet for the target search word, wherein the person-category information content database is generated as follows: collecting a plurality of user-generated content (UGC) websites relating to the person category, and capturing from the plurality of UGC websites information related to each person-category preset word in the person-category preset word list; processing the captured information, classifying it according to the person-category preset word to which each item of information relates, and generating the structured person-category information content database having person-category preset words and information attributes;
A display module, configured to aggregate the matched information into the search results page corresponding to the target search word and display it to the user.
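How the four modules of B11 fit together can be sketched in a few lines. This is an illustrative sketch, not the device itself: `web_search` is a hypothetical callable, the content database is assumed to be a dict keyed by preset word, a "hit" is assumed to mean exact membership in the preset word list, and a thread pool is just one way to perform the database lookup while the internet search is running.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_query(query, preset_words, web_search, content_db):
    """Receive a query, judge whether it hits the preset word list, search
    both sources, and return the two areas of the results page."""
    if query not in preset_words:            # judging module: no hit
        return {"left": web_search(query), "right": []}
    with ThreadPoolExecutor(max_workers=2) as pool:   # search module
        web_future = pool.submit(web_search, query)
        db_future = pool.submit(content_db.get, query, [])
        # display module: web results on one side, matched items on the other
        return {"left": web_future.result(), "right": db_future.result()}


hit = handle_query("Alice", {"Alice", "Bob"},
                   lambda q: [q + " web result"], {"Alice": ["Alice news"]})
miss = handle_query("weather", {"Alice", "Bob"},
                    lambda q: [q + " web result"], {"Alice": ["Alice news"]})
```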
B12. The device according to B11, further including:
An acquisition module, configured to obtain the N person-category preset words ranked highest by click-through rate and/or search rate in a predetermined database, to form the person-category preset word list.
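Building the preset word list from the top-ranked words can be sketched as follows. The `click_rate` and `search_rate` fields and the decision to rank by click rate first and search rate second are assumptions; the text allows either metric or both.

```python
def build_preset_word_list(stats, n):
    """Return the n words ranked highest by (click_rate, search_rate)."""
    if n <= 1:
        raise ValueError("N must be an integer greater than 1")
    ranked = sorted(
        stats,
        key=lambda w: (stats[w]["click_rate"], stats[w]["search_rate"]),
        reverse=True,
    )
    return ranked[:n]


stats = {
    "Alice": {"click_rate": 0.9, "search_rate": 0.5},
    "Bob": {"click_rate": 0.2, "search_rate": 0.9},
    "Carol": {"click_rate": 0.7, "search_rate": 0.1},
}
top_two = build_preset_word_list(stats, 2)
```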
B13. The device according to B11, wherein processing the captured information, classifying it according to the person-category preset word to which each item of information relates, and generating the structured person-category information content database having person-category preset words and information attributes includes:
Classifying each captured item of information according to the person-category preset word to which it relates, ranking the items in an optimized order according to the information attributes of each item, and generating the structured person-category information content database having person-category preset words and information attributes.
B14. The device according to any one of B11 to B13, wherein, for UGC websites of the professional information publishing platform type, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
Entering each of the N person-category preset words into the search box of the UGC website of the professional information publishing platform type, and capturing, by publication time, the information related to each of the N person-category preset words; or,
Marking person-category information among the information published on the UGC website of the professional information publishing platform type, and capturing the information related to the N person-category preset words from the marked person-category information.
B15. The device according to any one of B11 to B13, wherein, for UGC websites of the online theme community type, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
For each of the N person-category preset words, determining, in the UGC website of the theme community type, the theme communities in which the users corresponding to that person-category preset word are located, and capturing, from the largest of those theme communities, the information whose title or body contains that person-category preset word.
B16. The device according to any one of B11 to B13, wherein, for UGC websites of the online question-and-answer community type, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
Judging whether the category of each question published on the UGC website of the online question-and-answer community type is related to the person category; if so, judging whether the published question and its answers contain keywords corresponding to one or more of the N person-category preset words; and if so, capturing the published question and its answers as information related to the one or more person-category preset words among the N person-category preset words.
B17. The device according to any one of B11 to B13, wherein the display module is specifically configured to aggregate the matched information into the search results page corresponding to the target search word and display it to the user in the following manner:
Displaying, on the left side of the search results page, the results of searching the internet for the target search word;
Judging whether the matched information contains any information identical to the results displayed on the left side of the search results page, and if so, removing the identical information from the matched information;
Aggregating the matched information remaining after removal of the identical information into the right-side area of the search results page corresponding to the target search word, and displaying it to the user.
B18. The device according to any one of B11 to B13, further including:
A statistics module, configured to count the user's trigger operations on the matched information displayed in the search results page to obtain a statistical result;
A determining module, configured to determine, according to the statistical result, whether to display the matched information in the pages corresponding to subsequent search requests.
B19. The device according to B18, wherein the determining module is specifically configured to determine, in the following manner, whether to display the matched information in the pages corresponding to subsequent search requests:
If the statistical result is that the number of trigger operations is less than a specified threshold, determining that the matched information is no longer displayed in the pages corresponding to subsequent search requests.

Claims (10)

1. A method for aggregating person-category information in a search results page, including:
Receiving a target search word related to the person category that a user enters in a search engine;
Judging whether the target search word hits a person-category preset word list, wherein N person-category preset words are recorded in the person-category preset word list, N is an integer, and N is greater than 1;
If so, searching a structured person-category information content database for information matching the target search word while searching the internet for the target search word, wherein the person-category information content database is generated as follows: collecting a plurality of user-generated content (UGC) websites relating to the person category, and capturing from the plurality of UGC websites information related to each person-category preset word in the person-category preset word list; processing the captured information, classifying it according to the person-category preset word to which each item of information relates, and generating the structured person-category information content database having person-category preset words and information attributes;
Aggregating the matched information into the search results page corresponding to the target search word and displaying it to the user.
2. The method according to claim 1, wherein, before receiving the target search word related to the person category that the user enters in the search engine, the method further includes:
Obtaining the N person-category preset words ranked highest by click-through rate and/or search rate in a predetermined database, to form the person-category preset word list.
3. The method according to claim 1, wherein processing the captured information, classifying it according to the person-category preset word to which each item of information relates, and generating the structured person-category information content database having person-category preset words and information attributes includes:
Classifying each captured item of information according to the person-category preset word to which it relates, ranking the items in an optimized order according to the information attributes of each item, and generating the structured person-category information content database having person-category preset words and information attributes.
4. The method according to claim 3, wherein the information attributes include: the publication time of the information and/or the number of comments on the information.
5. The method according to any one of claims 1 to 4, wherein, for UGC websites of the professional information publishing platform type, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
Entering each of the N person-category preset words into the search box of the UGC website of the professional information publishing platform type, and capturing, by publication time, the information related to each of the N person-category preset words; or,
Marking person-category information among the information published on the UGC website of the professional information publishing platform type, and capturing the information related to the N person-category preset words from the marked person-category information.
6. The method according to any one of claims 1 to 4, wherein, for UGC websites of the online theme community type, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
For each of the N person-category preset words, determining, in the UGC website of the theme community type, the theme communities in which the users corresponding to that person-category preset word are located, and capturing, from the largest of those theme communities, the information whose title or body contains that person-category preset word.
7. The method according to any one of claims 1 to 4, wherein, for UGC websites of the online question-and-answer community type, capturing the information related to the N person-category preset words from the plurality of UGC websites includes:
Judging whether the category of each question published on the UGC website of the online question-and-answer community type is related to the person category; if so, judging whether the published question and its answers contain keywords corresponding to one or more of the N person-category preset words; and if so, capturing the published question and its answers as information related to the one or more person-category preset words among the N person-category preset words.
8. The method according to any one of claims 1 to 4, wherein aggregating the matched information into the search results page corresponding to the target search word and displaying it to the user includes:
Displaying, on the left side of the search results page, the results of searching the internet for the target search word;
Judging whether the matched information contains any information identical to the results displayed on the left side of the search results page, and if so, removing the identical information from the matched information;
Aggregating the matched information remaining after removal of the identical information into the right-side area of the search results page corresponding to the target search word, and displaying it to the user.
9. The method according to any one of claims 1 to 4, wherein, after the matched information is aggregated into the search results page corresponding to the target search word and displayed to the user, the method further includes:
Counting the user's trigger operations on the matched information displayed in the search results page to obtain a statistical result;
Determining, according to the statistical result, whether to display the matched information in the pages corresponding to subsequent search requests.
10. A device for aggregating person-category information in a search results page, including:
A receiving module, configured to receive a target search word related to the person category that a user enters in a search engine;
A judging module, configured to judge whether the target search word hits a person-category preset word list, wherein N person-category preset words are recorded in the person-category preset word list, N is an integer, and N is greater than 1;
A search module, configured to, when the judging module determines that the target search word hits the person-category preset word list, search a structured person-category information content database for information matching the target search word while searching the internet for the target search word, wherein the person-category information content database is generated as follows: collecting a plurality of user-generated content (UGC) websites relating to the person category, and capturing from the plurality of UGC websites information related to each person-category preset word in the person-category preset word list; processing the captured information, classifying it according to the person-category preset word to which each item of information relates, and generating the structured person-category information content database having person-category preset words and information attributes;
A display module, configured to aggregate the matched information into the search results page corresponding to the target search word and display it to the user.
CN201611213441.5A 2016-12-23 2016-12-23 Method and device for aggregating personage information message in search engine result page Pending CN106649738A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611213441.5A CN106649738A (en) 2016-12-23 2016-12-23 Method and device for aggregating personage information message in search engine result page

Publications (1)

Publication Number Publication Date
CN106649738A true CN106649738A (en) 2017-05-10

Family

ID=58827740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611213441.5A Pending CN106649738A (en) 2016-12-23 2016-12-23 Method and device for aggregating personage information message in search engine result page

Country Status (1)

Country Link
CN (1) CN106649738A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092877A (en) * 2011-11-04 2013-05-08 百度在线网络技术(北京)有限公司 Method and device for recommending keyword
CN103164449A (en) * 2011-12-15 2013-06-19 腾讯科技(深圳)有限公司 Search result showing method and search result showing device
CN105354227A (en) * 2015-09-30 2016-02-24 北京奇虎科技有限公司 Search-based method and apparatus for providing high-quality comment information
CN105404699A (en) * 2015-12-29 2016-03-16 广州神马移动信息科技有限公司 Method, device and server for searching articles of finance and economics
CN105574176A (en) * 2015-12-21 2016-05-11 北京奇虎科技有限公司 Hot word recommending method and device with combination of multiple data sources

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301212A (en) * 2017-06-08 2017-10-27 微梦创科网络科技(中国)有限公司 One kind polymerization dynamic method and device of personage
CN107301212B (en) * 2017-06-08 2020-04-03 微梦创科网络科技(中国)有限公司 Method and device for aggregating character dynamics
CN110633406A (en) * 2018-06-06 2019-12-31 北京百度网讯科技有限公司 Event topic generation method and device, storage medium and terminal equipment
CN108920610A (en) * 2018-06-28 2018-11-30 上海连尚网络科技有限公司 A kind of novel indexing means and equipment
CN108920610B (en) * 2018-06-28 2021-07-16 上海连尚网络科技有限公司 Novel indexing method and device
CN109033286A (en) * 2018-07-12 2018-12-18 北京猫眼文化传媒有限公司 Data statistical approach and device
CN109033326A (en) * 2018-07-17 2018-12-18 深圳市嘀哒知经科技有限责任公司 A kind of splitting and reorganizing method and device of knowledge expertise
WO2020015217A1 (en) * 2018-07-17 2020-01-23 深圳市嘀哒知经科技有限责任公司 Method and device for splitting and reorganizing knowledge and skills
CN109033326B (en) * 2018-07-17 2020-05-05 深圳市嘀哒知经科技有限责任公司 Knowledge skill splitting and recombining method and device
CN110188301A (en) * 2019-04-30 2019-08-30 北京百度网讯科技有限公司 Information aggregation method and device for website

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170510