CN106649738A - Method and device for aggregating personage information message in search engine result page - Google Patents
- Publication number
- CN106649738A (CN201611213441.5A)
- Authority
- CN
- China
- Prior art keywords
- information
- word
- default
- search
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention provides a method and device for aggregating person-related information in a search engine results page. The method comprises: receiving a target search term related to a person, entered by a user in a search engine; judging whether the target search term hits a preset person word list; if it does, searching the Internet for the target search term while also looking up information matching the target search term in a structured person-information content database, wherein the database is created by crawling, from multiple UGC websites, the information related to each preset person term in the preset person word list, and processing the crawled information to generate the structured person-information content database having preset-person-term and information attributes; and aggregating the matched information into the search results page corresponding to the target search term and presenting it to the user. The method and device allow more information to be provided in the search engine results page, thereby broadening content coverage.
Description
Technical field
The present invention relates to the technical field of Internet applications, and in particular to a method and device for aggregating person-related information in a search results page.
Background
With the rapid development of information technology, society has entered an era of information explosion, and people increasingly rely on the network to find the information they need. Retrieval has therefore become an indispensable part of people's work and life.
People usually use a search engine for retrieval. A search engine is a system that collects information from the Internet with specific computer programs according to a certain strategy, organizes and processes that information, provides a retrieval service to users, and presents to the user the information related to the user's query.
The modern web holds a large amount of user-contributed content, such as forum posts, WeChat official accounts, Toutiao accounts and Interest Tribe posts. Websites of this kind are known as user-generated content (UGC) or professionally generated content (PGC) websites; in this application they are collectively referred to as UGC websites. These UGC websites contain a great deal of high-quality information, but current search engine products do not fully mine it, so search results do not adequately cover the content of these websites.
In the course of making the present invention, the inventor found that the information on some high-quality UGC websites has its own content advantages, for example: (1) the data is exclusive, because it comes from individuals; (2) it resonates: similar to mhkc, a good post attracts many comments; (3) it complements ordinary search results: for the same query (search term), UGC data can supplement the engine's results while also extending readability. For person-related queries in particular, information from some UGC websites can better meet users' needs.
At present, there is no effective solution to the problem of how to provide users with search results that include person-related information from UGC websites.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method, and a corresponding device, for aggregating person-related information in a search results page that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a method for aggregating person-related information in a search results page is provided, comprising: receiving a target search term related to a person that a user enters in a search engine; judging whether the target search term hits a preset person word list, wherein the preset person word list records N preset person terms, N being an integer greater than 1; if so, searching the Internet for the target search term and, at the same time, looking up information matching the target search term in a structured person-information content database, wherein the person-information content database is generated by the following steps: collecting multiple user-generated content (UGC) websites that carry person-related content, and crawling from the multiple UGC websites the information related to each preset person term in the preset person word list; processing the crawled information, classifying it by the preset person term to which each item relates, and generating the structured person-information content database having preset-person-term and information attributes; and aggregating the matched information into the search results page corresponding to the target search term and presenting it to the user.
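The overall flow of the method can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the patent: `build_results_page`, `fake_web_search`, the in-memory dictionary standing in for the structured database, and the sample data are all hypothetical, a "hit" is assumed to be an exact match, and the simultaneous lookup described in the text is reduced to sequential calls for brevity.

```python
from typing import Callable, Dict, List, Set


def build_results_page(
    term: str,
    person_terms: Set[str],
    person_db: Dict[str, List[str]],
    web_search: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Combine ordinary web results with aggregated items when the term hits the word list."""
    page: Dict[str, List[str]] = {"web": web_search(term), "aggregated": []}
    if term in person_terms:  # assumption: a "hit" is an exact match
        page["aggregated"] = list(person_db.get(term, []))
    return page


# Hypothetical sample data, for illustration only.
person_terms = {"Li Er", "Zhang San"}
person_db = {"Li Er": ["UGC post about Li Er"]}


def fake_web_search(query: str) -> List[str]:
    """Stand-in for the ordinary Internet search."""
    return [f"web result for {query}"]


hit = build_results_page("Li Er", person_terms, person_db, fake_web_search)
miss = build_results_page("weather", person_terms, person_db, fake_web_search)
```

A term outside the word list falls through to the ordinary search only, which mirrors the "otherwise" branch of the judging step.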
Optionally, before receiving the target search term related to a person that the user enters in the search engine, the method further comprises: obtaining, from a predetermined database, the N preset person terms ranked highest by click rate and/or search rate, to form the preset person word list.
Optionally, processing the crawled information, classifying it by the preset person term to which each item relates, and generating the structured person-information content database having preset-person-term and information attributes comprises: classifying the crawled information by the preset person term to which each item relates, and ranking it according to the information attributes of each item, to generate the structured person-information content database having preset-person-term and information attributes.
Optionally, the information attributes include the publication time of the information and/or its number of comments.
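The classification and ranking just described might look like the sketch below. The `Item` record, the `build_database` function and the sample items are hypothetical, and the particular ordering (newest first, ties broken by comment count) is an assumption; the text only says that ranking uses the publication time and/or the number of comments.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class Item:
    person_term: str     # preset person term the item relates to
    title: str
    published: datetime  # information attribute: publication time
    comments: int        # information attribute: number of comments


def build_database(items: List[Item]) -> Dict[str, List[Item]]:
    """Group items by person term, then order each group newest first, ties by comments."""
    db: Dict[str, List[Item]] = {}
    for item in items:
        db.setdefault(item.person_term, []).append(item)
    for group in db.values():
        group.sort(key=lambda i: (i.published, i.comments), reverse=True)
    return db


# Hypothetical sample items.
items = [
    Item("Li Er", "old post", datetime(2016, 1, 1), 5),
    Item("Li Er", "new post", datetime(2016, 6, 1), 1),
    Item("Zhang San", "only post", datetime(2016, 3, 1), 0),
]
db = build_database(items)
```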
Optionally, for UGC websites of the professional information publishing platform type, crawling the information related to the N preset person terms from the multiple UGC websites comprises: entering each of the N preset person terms in the search box of the UGC website of the professional information publishing platform type, and crawling, by publication time, the information related to each of the N preset person terms; or marking person-related information among the information published on the UGC website of the professional information publishing platform type, and crawling the information related to the N preset person terms from the marked person-related information.
Optionally, for UGC websites of the online theme community type, crawling the information related to the N preset person terms from the multiple UGC websites comprises: for each of the N preset person terms, determining, in the UGC website of the theme community type, the theme communities where the users corresponding to the preset person term are located, and crawling, from the largest of those theme communities, the information whose title or body contains the preset person term.
Optionally, for UGC websites of the online Q&A community type, crawling the information related to the N preset person terms from the multiple UGC websites comprises: judging whether the category of each question posted on the UGC website of the online Q&A community type is related to persons; if so, judging whether the posted question and its answers contain a keyword corresponding to one or more of the N preset person terms; and if so, crawling the posted question and its answers as information related to those one or more preset person terms.
Optionally, aggregating the matched information into the search results page corresponding to the target search term and presenting it to the user comprises: displaying, on the left side of the search results page, the results of searching the Internet for the target search term; judging whether the matched information contains any item identical to a result displayed on the left side of the search results page and, if so, removing the identical item from the matched information; and aggregating the matched information, after the identical items have been removed, into the right-hand area of the search results page corresponding to the target search term and presenting it to the user.
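The duplicate-removal step can be illustrated with a short sketch. The text does not say how two items are judged identical; comparing by URL is an assumption made here, and `remove_duplicates` and the sample URLs are hypothetical.

```python
from typing import Dict, List


def remove_duplicates(
    left_results: List[Dict[str, str]], matched: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Return the matched items that do not already appear in the left-hand results."""
    shown = {r["url"] for r in left_results}  # assumption: same URL means same item
    return [m for m in matched if m["url"] not in shown]


# Hypothetical sample data.
left = [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}]
matched = [{"url": "https://example.com/b"}, {"url": "https://example.com/c"}]
right = remove_duplicates(left, matched)  # items for the right-hand area
```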
Optionally, after the matched information is aggregated into the search results page corresponding to the target search term and presented to the user, the method further comprises: counting the user's trigger operations on the matched information displayed in the search results page to obtain a statistical result; and determining, according to the statistical result, whether to display the matched information in the pages corresponding to subsequent search requests.
Optionally, determining, according to the statistical result, whether to display the matched information in the pages corresponding to subsequent search requests comprises: if the statistical result is that the number of trigger operations is less than a specified threshold, determining that the matched information is no longer displayed in the pages corresponding to subsequent search requests.
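The threshold rule can be sketched as below. The sketch assumes that trigger operations are logged as one search term per operation and that the decision is taken per search term; both are assumptions, since the text does not specify the granularity of the statistics. The function name and the sample log are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def terms_to_stop_showing(
    trigger_log: Iterable[str], shown_terms: Set[str], threshold: int
) -> Set[str]:
    """Return the terms whose aggregated block drew fewer trigger operations than the threshold."""
    counts = Counter(trigger_log)  # a missing term counts as zero
    return {term for term in shown_terms if counts[term] < threshold}


# Hypothetical log: one entry per trigger operation on the aggregated block.
log = ["Li Er", "Li Er", "Li Er", "Zhang San"]
suppressed = terms_to_stop_showing(log, {"Li Er", "Zhang San", "Wang Wu"}, threshold=2)
```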
According to another aspect of the present invention, a device for aggregating person-related information in a search results page is provided, comprising: a receiving module, configured to receive a target search term related to a person that a user enters in a search engine; a judging module, configured to judge whether the target search term hits a preset person word list, wherein the preset person word list records N preset person terms, N being an integer greater than 1; a search module, configured to, when the judging module determines that the target search term hits the preset person word list, search the Internet for the target search term and, at the same time, look up information matching the target search term in a structured person-information content database, wherein the person-information content database is generated by the following steps: collecting multiple user-generated content (UGC) websites that carry person-related content, and crawling from the multiple UGC websites the information related to each preset person term in the preset person word list; processing the crawled information, classifying it by the preset person term to which each item relates, and generating the structured person-information content database having preset-person-term and information attributes; and a display module, configured to aggregate the matched information into the search results page corresponding to the target search term and present it to the user.
Optionally, the device further comprises: an acquisition module, configured to obtain, from a predetermined database, the N preset person terms ranked highest by click rate and/or search rate, to form the preset person word list.
Optionally, processing the crawled information, classifying it by the preset person term to which each item relates, and generating the structured person-information content database having preset-person-term and information attributes comprises: classifying the crawled information by the preset person term to which each item relates, and ranking it according to the information attributes of each item, to generate the structured person-information content database having preset-person-term and information attributes.
Optionally, for UGC websites of the professional information publishing platform type, crawling the information related to the N preset person terms from the multiple UGC websites comprises: entering each of the N preset person terms in the search box of the UGC website of the professional information publishing platform type, and crawling, by publication time, the information related to each of the N preset person terms; or marking person-related information among the information published on the UGC website of the professional information publishing platform type, and crawling the information related to the N preset person terms from the marked person-related information.
Optionally, for UGC websites of the online theme community type, crawling the information related to the N preset person terms from the multiple UGC websites comprises: for each of the N preset person terms, determining, in the UGC website of the theme community type, the theme communities where the users corresponding to the preset person term are located, and crawling, from the largest of those theme communities, the information whose title or body contains the preset person term.
Optionally, for UGC websites of the online Q&A community type, crawling the information related to the N preset person terms from the multiple UGC websites comprises: judging whether the category of each question posted on the UGC website of the online Q&A community type is related to persons; if so, judging whether the posted question and its answers contain a keyword corresponding to one or more of the N preset person terms; and if so, crawling the posted question and its answers as information related to those one or more preset person terms.
Optionally, the display module is specifically configured to aggregate the matched information into the search results page corresponding to the target search term and present it to the user as follows: displaying, on the left side of the search results page, the results of searching the Internet for the target search term; judging whether the matched information contains any item identical to a result displayed on the left side of the search results page and, if so, removing the identical item from the matched information; and aggregating the matched information, after the identical items have been removed, into the right-hand area of the search results page corresponding to the target search term and presenting it to the user.
Optionally, the device further comprises: a statistics module, configured to count the user's trigger operations on the matched information displayed in the search results page to obtain a statistical result; and a determining module, configured to determine, according to the statistical result, whether to display the matched information in the pages corresponding to subsequent search requests.
Optionally, the determining module is specifically configured to determine whether to display the matched information in the pages corresponding to subsequent search requests as follows: if the statistical result is that the number of trigger operations is less than a specified threshold, determining that the matched information is no longer displayed in the pages corresponding to subsequent search requests.
In embodiments of the present invention, when a target search term related to a person that a user enters in a search engine is received, it is first judged whether the target search term hits the preset person word list. If it does, information matching the target search term is looked up in the structured person-information content database built from data crawled from UGC websites, and the information found there is aggregated into the search results page corresponding to the target search term and presented to the user. In the technical solution provided by the embodiments, person-related information from UGC websites can thus be aggregated in the search results page, giving users more comprehensive information and broadening content coverage. Moreover, because the person-information content database is structured by preset person term and information attributes, it is readable and helps users quickly find the information they need. Furthermore, because the database is drawn from the individual UGC websites, the data from those websites is surfaced directly in the search results page, so users need not perform multiple operations on each website to find the relevant information, which reduces their retrieval cost.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
From the following detailed description of specific embodiments of the invention, taken together with the accompanying drawings, those skilled in the art will better understand the above and other objects, advantages and features of the invention.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a method for aggregating person-related information in a search results page according to an embodiment of the present invention;
Fig. 2 shows a schematic diagram of a search results page in which person-related information has been aggregated, according to another embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of a device for aggregating person-related information in a search results page according to an embodiment of the present invention; and
Fig. 4 shows a schematic structural diagram of a device for aggregating person-related information in a search results page according to another embodiment of the present invention.
Detailed description of embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
To solve the above technical problem, an embodiment of the present invention provides a method for aggregating person-related information in a search results page. The method can be applied on terminal devices such as PCs, smartphones and tablet computers. Fig. 1 shows a flowchart of the method according to an embodiment of the present invention. As shown in Fig. 1, the method may comprise at least the following steps S102 to S108.
Step S102: receive the target search term related to a person that the user enters in the search engine.
Step S104: judge whether the target search term hits the preset person word list. If so, perform step S106; otherwise, search in the normal search mode, that is, only search the Internet for the target search term. The preset person word list records N preset person terms, where N is an integer greater than 1.
In an optional implementation of this embodiment, the preset person word list may be obtained before step S102. That is, in this optional implementation, before step S102 the method may further include: obtaining, from a predetermined database, the N preset person terms ranked highest by click rate and/or search rate, to form the preset person word list. The predetermined database may be specified as the case requires. For example, it may be the 360 hot list and search logs, in which case the names of the N celebrities with the highest click rate and/or search rate are obtained from the 360 hot list and the search logs to form the preset person word list. The value of N may be determined by the specific application and is not limited in this embodiment.
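Building the word list from click and search rates could be sketched as follows. The text leaves open how the two rates are combined; adding them, and breaking ties alphabetically, are assumptions made for illustration, and `build_word_list` and the sample rates are hypothetical.

```python
from typing import Dict, List


def build_word_list(
    click_rate: Dict[str, float], search_rate: Dict[str, float], n: int
) -> List[str]:
    """Rank names by click rate plus search rate and keep the top n."""
    names = set(click_rate) | set(search_rate)

    def score(name: str) -> float:
        # Assumption: the two rates are simply added; a name missing from one source scores 0 there.
        return click_rate.get(name, 0.0) + search_rate.get(name, 0.0)

    return sorted(names, key=lambda name: (-score(name), name))[:n]


# Hypothetical rates taken from a hot list and from search logs.
clicks = {"Li Er": 0.5, "Zhang San": 0.2, "Wang Wu": 0.1}
searches = {"Zhang San": 0.6, "Wang Wu": 0.05}
word_list = build_word_list(clicks, searches, n=2)
```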
Step S106: search the Internet for the target search term and, at the same time, look up information matching the target search term in the structured person-information content database.
In this embodiment, the person-information content database is generated by the following steps:
Step 1: collect multiple UGC websites that carry person-related content, and crawl from them the information related to each preset person term in the preset person word list.
In this step, UGC (User Generated Content), also called UCC (User Created Content), can include text written by users, pictures and videos shot by users, audio recorded by users, and so on. PGC (Professional Generated Content) is a concept derived from UGC. The benefit of UGC is that users can upload content freely, which enriches a website's content; the drawback is that the quality of the content is very uneven. Compared with UGC, PGC is more specialized in its categories and more reliable in content quality, and its content curation and editing are highly professional. In fact, UGC and PGC are not in conflict: not only are they not mutually exclusive, they need to complement each other. A mature Internet content product, whether a website or community, a video platform or even an audio platform, and in whatever new media form, needs to pursue depth and breadth in parallel. In keeping with their respective characteristics, UGC is responsible for breadth of content and mainly contributes traffic and participation, while PGC maintains depth of content and mainly builds the brand and creates value; both are indispensable. Because PGC is a concept derived from UGC, the embodiments of the present invention treat PGC as a part of UGC.
In practice, the quality of UGC content is very uneven. To increase the credibility of the person-related information, when crawling person-related information from multiple UGC websites in this step, at least one high-quality UGC website may first be screened out of the multiple UGC websites, and the person-related information then crawled from the at least one high-quality UGC website.
Further, the at least one high-quality UGC website may be screened out of the multiple UGC websites by means of measurement factors. Specifically, one or more measurement factors are determined; the quality of the multiple UGC websites is measured according to the determined factors; and at least one UGC website whose quality satisfies a specified quality condition is selected as a high-quality UGC website. The measurement factors may be, for example, the credibility of the website, the number of users registered on the website, and the website's traffic.
When there are multiple measurement factors, an embodiment of the present invention provides an optional scheme for measuring the quality of the multiple UGC websites against them. In this scheme, the weight of each measurement factor is determined by a weighting algorithm, and the value of each measurement factor is obtained for each UGC website. The values of each website's measurement factors are then combined with the weights in a weighted sum to give a composite value, and the quality of the multiple UGC websites is measured according to their respective composite values.
For example, suppose the multiple UGC websites are website 1, website 2, website 3, website 4 and website 5, and the measurement factors are the website's credibility, its number of registered users and its traffic. The values of the measurement factors are p11, p12, p13 for website 1; p21, p22, p23 for website 2; p31, p32, p33 for website 3; p41, p42, p43 for website 4; and p51, p52, p53 for website 5. The weights of the measurement factors are determined to be w1, w2, w3, and each website's factor values are combined with the weights in a weighted sum to give that website's composite value. Taking website 1 and website 2 as examples, the composite value of website 1 after the weighted sum is p11 × w1 + p12 × w2 + p13 × w3, and that of website 2 is p21 × w1 + p22 × w2 + p23 × w3. Websites 3, 4 and 5 follow by analogy and are not described one by one here.
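The weighted sum in this example can be written out as a short sketch. The factor values, the weights and the quality threshold below are made-up numbers chosen for illustration (the factor values are assumed to be normalised to the range 0 to 1), and the function names are hypothetical.

```python
from typing import Dict, List, Sequence


def composite_values(
    factors: Dict[str, Sequence[float]], weights: Sequence[float]
) -> Dict[str, float]:
    """Weighted sum per site: p_i1 * w1 + p_i2 * w2 + p_i3 * w3."""
    return {
        site: sum(p * w for p, w in zip(values, weights))
        for site, values in factors.items()
    }


def select_quality_sites(scores: Dict[str, float], min_score: float) -> List[str]:
    """Keep the sites whose composite value meets the specified quality condition."""
    return [site for site, score in scores.items() if score >= min_score]


# Hypothetical values for (credibility, registered users, traffic).
factors = {
    "website 1": (0.8, 0.5, 0.5),
    "website 2": (0.2, 0.5, 0.5),
}
weights = (0.5, 0.3, 0.2)  # w1, w2, w3

scores = composite_values(factors, weights)
quality_sites = select_quality_sites(scores, min_score=0.5)
```

With these numbers website 1 scores 0.8 × 0.5 + 0.5 × 0.3 + 0.5 × 0.2 = 0.65 and website 2 scores 0.35, so only website 1 passes a threshold of 0.5.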
In addition, in this embodiment, different crawling strategies may be used for different types of UGC websites.
For example, in an optional implementation of this embodiment, for UGC websites of the professional information publishing platform type, such as Toutiao accounts, crawling the information related to the N preset person terms from the multiple UGC websites includes:
entering each of the N preset person terms in the search box of the UGC website of the professional information publishing platform type, and crawling, by publication time, the information related to each of the N preset person terms; for example, each preset person term in the preset person word list may be entered in the Toutiao account search box and searched, and the information related to each preset person term crawled by publication time; or
marking person-related information among the information published on the UGC website of the professional information publishing platform type, and crawling the information related to the N preset person terms from the marked person-related information. For example, Toutiao accounts in the celebrity gossip category may be marked manually, data crawled from those accounts, and the crawled information then classified by the names contained in its titles.
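Classifying crawled items by the names found in their titles, as in the second option above, might be sketched like this. Plain substring matching is an assumption, and `classify_by_title` and the sample titles are hypothetical; fetching the pages is outside the scope of the sketch.

```python
from typing import Dict, Iterable, List


def classify_by_title(
    titles: Iterable[str], person_terms: Iterable[str]
) -> Dict[str, List[str]]:
    """Assign each title to every preset person term it contains."""
    terms = list(person_terms)
    classified: Dict[str, List[str]] = {term: [] for term in terms}
    for title in titles:
        for term in terms:
            if term in title:  # assumption: plain substring match on the title
                classified[term].append(title)
    return classified


# Hypothetical titles crawled from manually marked accounts.
titles = [
    "Li Er spotted at airport",
    "Zhang San and Li Er to co-star",
    "Weekend box office",
]
classified = classify_by_title(titles, ["Li Er", "Zhang San"])
```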
As another example, in another optional implementation of the embodiment of the present invention, for UGC websites of the online topic community class, such as Interest Tribes or Douban, crawling the information related to the N personage preset words from the plurality of UGC websites may include: for each of the N personage preset words, determining, in the topic-class UGC website, the topic communities of the users corresponding to that personage preset word, and crawling, from the largest of those topic communities, the information whose title or body text contains that personage preset word. For example, in Interest Tribes, for each personage preset word in the preset personage word list, for example Li Er, first determine how many tribes exist for the target person, then select the largest tribe (for example, by number of followers) and crawl the information whose title or article text contains the keyword (for example, Li Er).
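A minimal sketch of the topic-community strategy above, assuming the communities for a given person have already been located and that each is represented by a hypothetical dictionary with `followers` and `posts` fields. Using the follower count as the size measure follows the "by number of followers" example in the text.

```python
def crawl_largest_community(communities, preset_word):
    """Keep posts from the most-followed community that mention the preset word."""
    if not communities:
        return []
    # Select the largest community, measured here by follower count.
    largest = max(communities, key=lambda c: c["followers"])
    return [
        post for post in largest["posts"]
        if preset_word in post["title"] or preset_word in post["body"]
    ]


# Hypothetical communities already located for the person "Li Er".
communities = [
    {"name": "Li Er fan club", "followers": 120,
     "posts": [{"title": "Li Er news", "body": "..."}]},
    {"name": "Li Er official tribe", "followers": 5000,
     "posts": [
         {"title": "Li Er's new album", "body": "Release date announced."},
         {"title": "Off-topic chat", "body": "Nothing relevant here."},
     ]},
]
posts = crawl_largest_community(communities, "Li Er")
```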
As another example, in another optional implementation of the embodiment of the present invention, for UGC websites of the online question-and-answer community class, such as Zhihu, crawling the information related to the N personage preset words from the plurality of UGC websites may include: judging whether the category of each question posted on the UGC website of the online question-and-answer community class is related to the personage category; if so, judging whether the posted question and its answers contain keywords corresponding to one or more of the N personage preset words; and if they do, crawling the posted question and its answers as information related to the one or more personage preset words. For example, if the personage preset words in the preset personage word list relate to entertainment celebrities, it can first be judged whether the category of a question posted on Zhihu is related to entertainment; if it is, it is further judged whether the question and its answers contain a personage preset word from the preset personage word list; and if they do, the question and its answers are crawled as information related to the corresponding personage preset word.
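The two-stage filter for question-and-answer sites (category first, then keywords) can be sketched as below. The question structure (`category`, `title`, `answers`) and the category label are hypothetical, and substring matching is an assumed keyword test.

```python
def match_question(question, preset_words, person_categories):
    """Return the preset words mentioned in a person-related Q&A thread, else []."""
    # Stage 1: skip questions whose category is unrelated to persons.
    if question["category"] not in person_categories:
        return []
    # Stage 2: look for preset words in the question title and all of its answers.
    text = question["title"] + " " + " ".join(question["answers"])
    return [word for word in preset_words if word in text]


# Hypothetical question posted in an entertainment category.
question = {
    "category": "entertainment",
    "title": "Which film made Li Er famous?",
    "answers": ["Probably the first one.", "Zhang San co-starred in it."],
}
matched = match_question(question, ["Li Er", "Zhang San", "Wang Wu"], {"entertainment"})
```

Checking the category first avoids scanning answer text for the large majority of questions that have nothing to do with the preset word list.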
Step 2: process the crawled information, classify it according to the personage preset word to which each information item relates, and generate the structured personage information content database having personage preset word and information attributes.
That is, in this embodiment, after the information has been crawled, it is first classified according to the personage preset word to which each item relates, and the structured personage information content database having personage preset word and information attributes is generated. In other words, the personage information content database may include three attribute columns: personage preset word, information attributes, and information content. The information attributes may include several items, for example the publication time of the information and its number of comments, and the information content may include the title of the information and its link address. Table 1 shows one example of the structure of the personage information content database in this embodiment.
Table 1
In an optional implementation of the embodiment of the present invention, when the crawled personage information is processed to generate the structured personage information content database, this step may include: classifying the crawled information according to the personage preset word to which each item relates, ordering the items according to their information attributes, and generating the structured personage information content database having personage preset word and information attributes. The information attributes may include publication time and/or number of comments; that is, the items in the personage information content database can be ordered by timeliness and/or popularity in order to improve search efficiency.
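One way to picture the three attribute columns and the ordering step is the sketch below. The record field names are hypothetical, and sorting by publication time first and comment count second is only one assumed reading of "timeliness and/or popularity".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class InfoRecord:
    preset_word: str        # personage preset word the item relates to
    published_at: datetime  # information attribute: publication time
    comment_count: int      # information attribute: number of comments
    title: str              # information content: title
    url: str                # information content: link address


def build_database(records):
    """Group records by preset word, newest and most-commented first."""
    db = {}
    for record in records:
        db.setdefault(record.preset_word, []).append(record)
    for grouped in db.values():
        # Assumed ordering: publication time, then comment count, both descending.
        grouped.sort(key=lambda r: (r.published_at, r.comment_count), reverse=True)
    return db


records = [
    InfoRecord("Li Er", datetime(2016, 12, 1), 10, "Old", "https://example.com/old"),
    InfoRecord("Li Er", datetime(2016, 12, 20), 3, "New", "https://example.com/new"),
]
db = build_database(records)
```

Because each list is pre-sorted, a lookup for a target search word can return the first few entries without sorting at query time.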
The embodiment of the present invention provides an optional scheme in which the personage preset word used for processing the crawled personage information can be determined, and the corresponding attribute content can then be extracted from the crawled personage information based on the determined personage preset word. In this embodiment, a personage preset word may be a person's name, a person's title, a collective appellation for several persons, and so on; the embodiment of the present invention is not limited thereto.
Step S108: aggregate the matched information into the search results page corresponding to the target search word and display it to the user.
In this embodiment, the results found in the personage information content database can supplement the search results that the search engine obtains from the Internet. Therefore, in an optional implementation of the embodiment of the present invention, step S108 may include the following steps:
Step 1: display, on the left side of the search results page, the results of searching the Internet for the target search word;
Step 2: judge whether the matched information contains any item identical to a result displayed on the left side of the search results page, and if so, remove the identical item from the matched information;
Step 3: aggregate the matched information remaining after the identical items have been removed into the right-side region of the search results page corresponding to the target search word, and display it to the user.
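The de-duplication in Step 2 and Step 3 can be sketched as follows. The text does not say how two items are judged identical, so comparing link addresses is an assumption made for this illustration, as are the field names and example URLs.

```python
def build_right_panel(web_results, matched_info):
    """Drop matched items already shown on the left side, keeping their order."""
    # Assumption: two items are "identical" when their link addresses are equal.
    shown_urls = {result["url"] for result in web_results}
    return [item for item in matched_info if item["url"] not in shown_urls]


web_results = [{"title": "Li Er biography", "url": "https://example.com/bio"}]
matched_info = [
    {"title": "Li Er biography", "url": "https://example.com/bio"},
    {"title": "Li Er fan discussion", "url": "https://example.com/fans"},
]
right_panel = build_right_panel(web_results, matched_info)
```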
That is, in the above optional implementation, the search results page includes two regions: a left region and a right region. In this embodiment, the left region displays the results that the search engine obtains by searching the Internet for the target search word, for example the content currently displayed on the left side of the results pages of search engines such as Baidu and Google, while the right region displays the results found in the personage information content database. The content of the right region of the search results page is thereby extended, providing the user with more complete search results.
In an optional implementation of the embodiment of the present invention, when the matched information is aggregated into the right-side region of the search results page corresponding to the target search word and displayed to the user, as shown in Fig. 2, a picture related to the matched information can be displayed at the top of the right-side region, with text links to the matched information displayed below the picture.
In the above optional implementation, to further ensure that the content displayed on the right side meets the user's needs, in an optional implementation of the embodiment of the present invention, after the matched information has been aggregated into the search results page corresponding to the target search word and displayed to the user, the method further includes: collecting statistics on the user's triggering operations on the matched information displayed in the search results page, to obtain statistical data; and determining, according to the statistical data, whether to display the matched information in the pages corresponding to subsequent search requests. The user's triggering operations on the matched information displayed in the search results page may be measured as the CTR (click-through rate) of the displayed matched information; that is, the click-through rate of an item from the personage information content database after it has been displayed determines whether that item is still displayed on the right side of the search page when it is subsequently found.
Further, in the above optional implementation, when determining according to the statistical data whether to display the matched information in the pages corresponding to subsequent search requests, it can be judged whether the statistical data indicates that the number of triggering operations is below a specified threshold; if so, it is determined that the matched information is no longer displayed in the pages corresponding to subsequent search requests.
In a specific application, the CTR of the information can be evaluated over a specified time period (for example, 1 or 2 hours), and the information handled according to the result.
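The threshold decision above can be sketched as a small function. The parameter names and the choice to keep showing an item that has had no impressions yet are assumptions for illustration; the text only states that an item is no longer displayed when its triggering operations fall below a specified threshold.

```python
def should_keep_showing(clicks, impressions, ctr_threshold):
    """Decide whether an item stays on the right side after one statistics period."""
    if impressions == 0:
        # Assumption: with no impressions there is no evidence against the item.
        return True
    return clicks / impressions >= ctr_threshold


# Hypothetical figures collected over one period (for example, 1 hour).
keep_popular = should_keep_showing(clicks=50, impressions=1000, ctr_threshold=0.01)
keep_ignored = should_keep_showing(clicks=1, impressions=1000, ctr_threshold=0.01)
```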
In addition, in this embodiment, the crawled UGC websites can be checked periodically to detect whether the information related to each personage preset word in the preset personage word list has been updated. If it has, the new information is crawled and the personage information content database is updated. After the update, the click-through rate (CTR) of every information item in the personage information content database is cleared. That is, after the update, when an information item in the personage information content database is hit again, it is displayed in the search results page regardless of whether its previous CTR was high or low; the CTR of each item is counted afresh, and once the specified time period has elapsed the CTR of the item is compared with the threshold to decide whether the item is displayed in subsequent search results.
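The clearing of CTR statistics after a database update can be sketched with a small tracker. The class and method names are hypothetical, and restarting the counters at the end of each period is an assumption about how successive periods are handled.

```python
class CtrTracker:
    """Tracks per-item clicks and impressions for one statistics period."""

    def __init__(self):
        self.stats = {}          # url -> [clicks, impressions]
        self.suppressed = set()  # urls no longer shown on the right side

    def record_impression(self, url, clicked):
        clicks, impressions = self.stats.get(url, [0, 0])
        self.stats[url] = [clicks + int(clicked), impressions + 1]

    def end_of_period(self, ctr_threshold):
        """Suppress items whose CTR in the finished period is below the threshold."""
        for url, (clicks, impressions) in self.stats.items():
            if impressions and clicks / impressions < ctr_threshold:
                self.suppressed.add(url)
        self.stats = {}  # assumption: counting restarts for the next period

    def on_database_update(self):
        """After new information is crawled, clear all CTR history."""
        self.stats = {}
        self.suppressed = set()

    def is_shown(self, url):
        return url not in self.suppressed


tracker = CtrTracker()
for _ in range(100):
    tracker.record_impression("https://example.com/fans", clicked=False)
tracker.end_of_period(ctr_threshold=0.01)
hidden_before_update = not tracker.is_shown("https://example.com/fans")
tracker.on_database_update()
shown_after_update = tracker.is_shown("https://example.com/fans")
```

Clearing the suppression set on update gives previously low-CTR items another chance, which mirrors the statement that items are displayed again regardless of their earlier CTR.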
In the embodiment of the present invention, when a target search word related to the personage category and entered by a user in a search engine is received, it is first judged whether the target search word hits the preset personage word list. If it does, information matching the target search word is looked up in the structured personage information content database built from data crawled from UGC websites, and the information found there is aggregated into the search results page corresponding to the target search word and displayed to the user. The technical solution provided by the embodiment of the present invention can therefore aggregate personage information from UGC websites in the search results page, providing the user with more comprehensive information and widening content coverage. Moreover, because the personage information content database is structured by personage preset word and information attributes, it is readable and helps the user find the required information quickly. Furthermore, because the personage information content database draws on the individual UGC websites, the data from those websites is brought forward and displayed in the search results page, so the user does not need to perform multiple operations on each website to find the relevant information, which reduces the user's retrieval cost.
It should be noted that, in practical applications, all of the above optional implementations can be combined in any manner to form optional embodiments of the present invention, which are not described again here.
Based on the same inventive concept as the method for aggregating personage information in a search results page provided by the above embodiments, the embodiment of the present invention further provides a device for aggregating personage information in a search results page.
Fig. 3 is a schematic structural diagram of a device for aggregating personage information in a search results page according to an embodiment of the present invention. As shown in Fig. 3, the device may include at least a receiving module 310, a judging module 320, a search module 330, and a display module 340.
The functions of the components of the device for aggregating personage information in a search results page according to the embodiment of the present invention, and the connections between them, are now described:
The receiving module 310 is configured to receive a target search word related to the personage category and entered by a user in a search engine.
The judging module 320 is configured to judge whether the target search word hits the preset personage word list, wherein N personage preset words are recorded in the preset personage word list, N being an integer greater than 1.
The value of N can be determined according to the specific application and is not limited in this embodiment.
The search module 330 is configured to, when the judging module determines that the target search word hits the preset personage word list, search the Internet for the target search word and at the same time look up information matching the target search word in the structured personage information content database, wherein the personage information content database is generated by the following steps: collecting a plurality of user-generated content (UGC) websites for the personage category, and crawling from the plurality of UGC websites the information related to each personage preset word in the preset personage word list; and processing the crawled information, classifying it according to the personage preset word to which each information item relates, and generating the structured personage information content database having personage preset word and information attributes.
The display module 340 is configured to aggregate the matched information into the search results page corresponding to the target search word and display it to the user.
In an optional implementation of the embodiment of the present invention, as shown in Fig. 4, the device may further include: an acquisition module 350, configured to obtain the N personage preset words ranked highest by click rate and/or search rate in a predetermined database, to form the preset personage word list. In a specific application, the acquisition module 350 can combine the 360 hot list and search logs to obtain the N celebrity names ranked highest by click rate and/or search rate, which form the preset personage word list.
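Building the preset personage word list from the top-ranked names can be sketched as below. The statistics dictionary is hypothetical, and adding the click rate and search rate into a single score is an assumed way of combining the two rankings; the text only says the names are ranked by click rate and/or search rate.

```python
def build_preset_word_list(name_stats, n):
    """Return the n names with the highest combined click and search rate."""
    ranked = sorted(
        name_stats,
        # Assumption: the two rates are combined by simple addition.
        key=lambda name: name_stats[name]["click_rate"] + name_stats[name]["search_rate"],
        reverse=True,
    )
    return ranked[:n]


# Hypothetical statistics derived from a hot list and search logs.
name_stats = {
    "Li Er": {"click_rate": 0.30, "search_rate": 0.20},
    "Zhang San": {"click_rate": 0.10, "search_rate": 0.05},
    "Wang Wu": {"click_rate": 0.25, "search_rate": 0.40},
}
preset_words = build_preset_word_list(name_stats, 2)
```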
In an optional implementation of the embodiment of the present invention, processing the crawled information, classifying it according to the personage preset word to which each information item relates, and generating the structured personage information content database having personage preset word and information attributes includes: classifying the crawled information according to the personage preset word to which each item relates, ordering the items according to their information attributes, and generating the structured personage information content database having personage preset word and information attributes. The information attributes may include publication time and/or number of comments; that is, the items in the personage information content database can be ordered by timeliness and/or popularity in order to improve search efficiency.
In addition, in this embodiment, different crawling strategies can be used for different types of UGC websites.
In an optional implementation of the embodiment of the present invention, for UGC websites of the professional information publishing platform class, crawling the information related to the N personage preset words from the plurality of UGC websites includes:
Entering each of the N personage preset words in turn into the search box of the UGC website of the professional information publishing platform class, and crawling, by publication time, the information related to each of the N personage preset words; for example, each personage preset word in the preset personage word list can be entered in turn into the search box of Toutiao accounts, and the information related to each personage preset word crawled by publication time; or,
Tagging personage information among the information published on the UGC website of the professional information publishing platform class, and crawling the information related to the N personage preset words from the tagged personage information. For example, Toutiao accounts devoted to celebrity gossip can be tagged manually, data can be crawled from those accounts, and the crawled information can then be classified according to the person names contained in its titles.
In an optional implementation of the embodiment of the present invention, for UGC websites of the online topic community class, crawling the information related to the N personage preset words from the plurality of UGC websites includes: for each of the N personage preset words, determining, in the topic-class UGC website, the topic communities of the users corresponding to that personage preset word, and crawling, from the largest of those topic communities, the information whose title or body text contains that personage preset word. For example, in Interest Tribes, for each personage preset word in the preset personage word list, for example Li Er, first determine how many tribes exist for the target person, then select the largest tribe (for example, by number of followers) and crawl the information whose title or article text contains the keyword (for example, Li Er).
In an optional implementation of the embodiment of the present invention, for UGC websites of the online question-and-answer community class, crawling the information related to the N personage preset words from the plurality of UGC websites includes: judging whether the category of each question posted on the UGC website of the online question-and-answer community class is related to the personage category; if so, judging whether the posted question and its answers contain keywords corresponding to one or more of the N personage preset words; and if they do, crawling the posted question and its answers as information related to the one or more personage preset words. For example, if the personage preset words in the preset personage word list relate to entertainment celebrities, it can first be judged whether the category of a question posted on Zhihu is related to entertainment; if it is, it is further judged whether the question and its answers contain a personage preset word from the preset personage word list; and if they do, the question and its answers are crawled as information related to the corresponding personage preset word.
In an optional implementation of the embodiment of the present invention, the display module 340 is specifically configured to aggregate the matched information into the search results page corresponding to the target search word and display it to the user in the following manner:
Displaying, on the left side of the search results page, the results of searching the Internet for the target search word;
Judging whether the matched information contains any item identical to a result displayed on the left side of the search results page, and if so, removing the identical item from the matched information;
Aggregating the matched information remaining after the identical items have been removed into the right-side region of the search results page corresponding to the target search word, and displaying it to the user.
That is, in the above optional implementation, the search results page includes two regions: a left region and a right region. In this embodiment, the left region displays the results that the search engine obtains by searching the Internet for the target search word, for example the content currently displayed on the left side of the results pages of search engines such as Baidu and Google, while the right region displays the results found in the personage information content database. The content of the right region of the search results page is thereby extended, providing the user with more complete search results.
In an optional implementation of the embodiment of the present invention, as shown in Fig. 4, the device may further include:
A statistics module 360, configured to collect statistics on the user's triggering operations on the matched information displayed in the search results page, to obtain statistical data;
A determining module 370, configured to determine, according to the statistical data, whether to display the matched information in the pages corresponding to subsequent search requests.
The user's triggering operations on the matched information displayed in the search results page may be measured as the CTR (click-through rate) of the displayed matched information; that is, the click-through rate of an item from the personage information content database after it has been displayed determines whether that item is still displayed on the right side of the search page when it is subsequently found.
Further, in the above optional implementation, the determining module 370 is specifically configured to determine whether to display the matched information in the pages corresponding to subsequent search requests in the following manner:
If the statistical data indicates that the number of triggering operations is below a specified threshold, determining that the matched information is no longer displayed in the pages corresponding to subsequent search requests.
According to any one of the above preferred embodiments, or a combination of several of them, the embodiment of the present invention can achieve the following beneficial effects:
In the embodiment of the present invention, when a target search word related to the personage category and entered by a user in a search engine is received, it is first judged whether the target search word hits the preset personage word list. If it does, information matching the target search word is looked up in the structured personage information content database built from data crawled from UGC websites, and the information found there is aggregated into the search results page corresponding to the target search word and displayed to the user. The technical solution provided by the embodiment of the present invention can therefore aggregate personage information from UGC websites in the search results page, providing the user with more comprehensive information and widening content coverage. Moreover, because the personage information content database is structured by personage preset word and information attributes, it is readable and helps the user find the required information quickly. Furthermore, because the personage information content database draws on the individual UGC websites, the data from those websites is brought forward and displayed in the search results page, so the user does not need to perform multiple operations on each website to find the relevant information, which reduces the user's retrieval cost.
Numerous specific details are set forth in the description provided herein. It is to be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be appreciated that, in the above description of exemplary embodiments of the present invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all the features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components in the embodiments can be combined into a single module, unit, or component, and can furthermore be divided into a plurality of sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all of the features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all of the processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the claims, any of the claimed embodiments can be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device for aggregating personage information in a search results page according to the embodiment of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.
Thus, those skilled in the art will appreciate that, although a number of exemplary embodiments of the present invention have been shown and described in detail herein, many other variations or modifications consistent with the principles of the invention can be determined or derived directly from this disclosure without departing from the spirit and scope of the invention. The scope of the present invention should therefore be understood and deemed to cover all such other variations or modifications.
According to one aspect of the present invention, there is provided A1. A method for aggregating personage information in a search results page, including:
Receiving a target search word related to the personage category and entered by a user in a search engine;
Judging whether the target search word hits the preset personage word list, wherein N personage preset words are recorded in the preset personage word list, N being an integer greater than 1;
If so, searching the Internet for the target search word and at the same time looking up information matching the target search word in the structured personage information content database, wherein the personage information content database is generated by the following steps: collecting a plurality of user-generated content (UGC) websites for the personage category, and crawling from the plurality of UGC websites the information related to each personage preset word in the preset personage word list; and processing the crawled information, classifying it according to the personage preset word to which each information item relates, and generating the structured personage information content database having personage preset word and information attributes;
Aggregating the matched information into the search results page corresponding to the target search word and displaying it to the user.
A2. The method according to A1, wherein, before the target search word related to the personage category and entered by the user in the search engine is received, the method further includes:
Obtaining the N personage preset words ranked highest by click rate and/or search rate in a predetermined database, to form the preset personage word list.
A3. The method according to A1, wherein processing the crawled information, classifying it according to the personage preset word to which each information item relates, and generating the structured personage information content database having personage preset word and information attributes includes:
Classifying the crawled information according to the personage preset word to which each item relates, ordering the items according to their information attributes, and generating the structured personage information content database having personage preset word and information attributes.
A4. The method according to A3, wherein the information attributes include: the publication time of the information and/or the number of comments on the information.
A5. The method according to any one of A1 to A4, wherein, for UGC websites of the professional information publishing platform class, crawling the information related to the N personage preset words from the plurality of UGC websites includes:
Entering each of the N personage preset words in turn into the search box of the UGC website of the professional information publishing platform class, and crawling, by publication time, the information related to each of the N personage preset words; or,
Tagging personage information among the information published on the UGC website of the professional information publishing platform class, and crawling the information related to the N personage preset words from the tagged personage information.
A6. The method according to any one of A1 to A4, wherein, for UGC websites of the online topic community type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
for each of the N personage preset words, determining, in the UGC website of the topic community type, the topic communities in which the users corresponding to that personage preset word are located, and capturing, from the largest of those topic communities, the information items whose title or body text contains that personage preset word.
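A minimal sketch of the selection step in A6 is given below. The source does not say how a community is tied to a preset word or how "largest" is measured, so this sketch assumes that a community is related when its name contains the preset word and that size means member count. The function name and the dictionary fields (`name`, `members`, `posts`, `title`, `body`) are hypothetical.

```python
def crawl_largest_community(preset_word, communities):
    """Return posts mentioning preset_word from the largest related community.

    communities: list of dicts with keys 'name', 'members', 'posts';
    each post is a dict with keys 'title' and 'body' (assumed layout).
    """
    # Assumption: a community is related if its name contains the preset word.
    related = [c for c in communities if preset_word in c["name"]]
    if not related:
        return []
    # Assumption: "largest" means the highest member count.
    largest = max(related, key=lambda c: c["members"])
    return [
        post for post in largest["posts"]
        if preset_word in post["title"] or preset_word in post["body"]
    ]
```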
A7. The method according to any one of A1 to A4, wherein, for UGC websites of the online Q&A community type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
judging whether the category of each question posted on the UGC website of the online Q&A community type is related to personages; if so, judging whether the posted question and its answers contain keywords corresponding to one or more of the N personage preset words; and if so, capturing the posted question and its answers as information items related to those one or more personage preset words.
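The two-stage check in A7 (category first, then keywords) might look like the sketch below. It assumes that the keyword corresponding to a preset word is the preset word itself and that personage-related categories are known in advance; the category names, the `match_qa` function, and the question fields are hypothetical.

```python
def match_qa(question, preset_words,
             person_categories=frozenset({"celebrity", "people"})):
    """Return the preset words that a posted question and its answers mention.

    question: dict with keys 'category', 'title', 'answers' (assumed layout).
    An empty list means the question should not be captured.
    """
    # Stage 1: skip questions whose category is unrelated to personages.
    if question["category"] not in person_categories:
        return []
    # Stage 2: look for preset words in the question and all of its answers.
    text = " ".join([question["title"], *question["answers"]])
    return [word for word in preset_words if word in text]
```

A non-empty result indicates that the question and its answers would be stored as information items for each returned preset word.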
A8. The method according to any one of A1 to A4, wherein aggregating the matched information items into the search results page corresponding to the target search word and presenting them to the user comprises:
displaying, on the left side of the search results page, the results of searching the internet for the target search word;
judging whether the matched information items include any item identical to a result displayed on the left side of the search results page, and if so, removing the identical items from the matched information items;
aggregating the matched information items that remain after the identical items are removed into the right-side area of the search results page corresponding to the target search word, and presenting them to the user.
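The de-duplication in A8 can be illustrated with the sketch below. The source does not define when two items count as identical; this sketch assumes identity by URL. The function name and the `url` field are hypothetical.

```python
def aggregate_right_panel(web_results, matched_items):
    """Drop matched items that already appear among the left-side web results.

    Both arguments are lists of dicts with a 'url' key (assumed layout).
    The original order of matched_items is preserved.
    """
    shown_urls = {result["url"] for result in web_results}
    return [item for item in matched_items if item["url"] not in shown_urls]
```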
A9. The method according to any one of A1 to A4, wherein, after the matched information items are aggregated into the search results page corresponding to the target search word and presented to the user, the method further comprises:
counting the user's trigger operations on the matched information items displayed in the search results page, to obtain a statistical result;
determining, according to the statistical result, whether to display the matched information items in the page corresponding to a subsequent search request.
A10. The method according to A9, wherein determining, according to the statistical result, whether to display the matched information items in the page corresponding to a subsequent search request comprises:
if the statistical result is that the number of trigger operations is less than a specified threshold, determining that the matched information items are no longer displayed in the page corresponding to a subsequent search request.
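The feedback rule in A9 and A10 can be sketched as below. It assumes that a trigger operation is a click and that operations are logged per search word; the event layout and the `should_display` function are hypothetical.

```python
from collections import Counter


def should_display(events, search_word, threshold):
    """Decide whether matched items are shown for later requests.

    events: list of dicts with keys 'type' and 'query' (assumed log layout).
    Returns False when the click count is below the specified threshold.
    """
    clicks = Counter(e["query"] for e in events if e["type"] == "click")
    return clicks[search_word] >= threshold
```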
According to another aspect of the present invention, there is further provided B11, a device for aggregating personage information in a search results page, comprising:
a receiving module, configured to receive a target search word related to personages that a user enters on a search engine;
a judging module, configured to judge whether the target search word hits a preset personage word list, wherein N personage preset words are recorded in the preset personage word list, N being an integer greater than 1;
a search module, configured to, when the judging module determines that the target search word hits the preset personage word list, search the internet for the target search word and at the same time look up, in a structured personage information content database, the information items matching the target search word, wherein the personage information content database is generated as follows: collecting a plurality of user-generated content (UGC) websites relating to personages, and capturing from the plurality of UGC websites the information items related to each personage preset word in the preset personage word list; processing the captured information items, classifying them according to the personage preset word to which each item relates, and generating the structured personage information content database having personage preset word and information attributes;
a display module, configured to aggregate the matched information items into the search results page corresponding to the target search word and present them to the user.
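The cooperation of the judging, search, and display modules in B11 can be outlined as below. This is a sketch rather than the device itself: it assumes that a hit means exact membership in the word list, that the content database is a dictionary keyed by preset word, and that `web_search` is a caller-supplied function. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_query(query, preset_words, content_db, web_search):
    """Return the page content for a query as left and right regions."""
    # Judging module: on a miss, only the ordinary web search is performed.
    if query not in preset_words:
        return {"left": web_search(query), "right": []}
    # Search module: web search and database lookup run at the same time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        web = pool.submit(web_search, query)
        matched = pool.submit(content_db.get, query, [])
        # Display module input: web results on the left, matched items on the right.
        return {"left": web.result(), "right": matched.result()}
```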
B12. The device according to B11, further comprising:
an acquisition module, configured to obtain, from a predetermined database, the N personage preset words ranked highest by click rate and/or search rate, to form the preset personage word list.
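The selection performed by the acquisition module in B12 can be sketched as a top-N query. The sketch assumes that the predetermined database has already been reduced to a mapping from candidate word to a single score (for example, click rate); the function name and that mapping are hypothetical.

```python
import heapq


def build_preset_word_list(scores, n):
    """Return the n highest-scoring words, best first.

    scores: dict mapping candidate word -> click rate or search rate (assumed).
    The source requires N to be an integer greater than 1.
    """
    if n <= 1:
        raise ValueError("N must be an integer greater than 1")
    return heapq.nlargest(n, scores, key=scores.get)
```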
B13. The device according to B11, wherein processing the captured information items, classifying them according to the personage preset word to which each information item relates, and generating the structured personage information content database having personage preset word and information attributes, comprises:
classifying the captured information items according to the personage preset word to which each item relates, and ranking them according to the information attribute of each item, to generate the structured personage information content database having personage preset word and information attributes.
B14. The device according to any one of B11 to B13, wherein, for UGC websites of the professional information publishing platform type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
entering each of the N personage preset words into the search box of the UGC website of the professional information publishing platform type, and capturing, by publication time, the information items related to each of the N personage preset words; or,
marking personage information items among the information items published on the UGC website of the professional information publishing platform type, and capturing, from the marked personage information items, the information items related to the N personage preset words.
B15. The device according to any one of B11 to B13, wherein, for UGC websites of the online topic community type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
for each of the N personage preset words, determining, in the UGC website of the topic community type, the topic communities in which the users corresponding to that personage preset word are located, and capturing, from the largest of those topic communities, the information items whose title or body text contains that personage preset word.
B16. The device according to any one of B11 to B13, wherein, for UGC websites of the online Q&A community type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
judging whether the category of each question posted on the UGC website of the online Q&A community type is related to personages; if so, judging whether the posted question and its answers contain keywords corresponding to one or more of the N personage preset words; and if so, capturing the posted question and its answers as information items related to those one or more personage preset words.
B17. The device according to any one of B11 to B13, wherein the display module is specifically configured to aggregate the matched information items into the search results page corresponding to the target search word and present them to the user in the following manner:
displaying, on the left side of the search results page, the results of searching the internet for the target search word;
judging whether the matched information items include any item identical to a result displayed on the left side of the search results page, and if so, removing the identical items from the matched information items;
aggregating the matched information items that remain after the identical items are removed into the right-side area of the search results page corresponding to the target search word, and presenting them to the user.
B18. The device according to any one of B11 to B13, further comprising:
a statistics module, configured to count the user's trigger operations on the matched information items displayed in the search results page, to obtain a statistical result;
a determining module, configured to determine, according to the statistical result, whether to display the matched information items in the page corresponding to a subsequent search request.
B19. The device according to B18, wherein the determining module is specifically configured to determine whether to display the matched information items in the page corresponding to a subsequent search request in the following manner:
if the statistical result is that the number of trigger operations is less than a specified threshold, determining that the matched information items are no longer displayed in the page corresponding to a subsequent search request.
Claims (10)
1. A method for aggregating personage information in a search results page, comprising:
receiving a target search word related to personages that a user enters on a search engine;
judging whether the target search word hits a preset personage word list, wherein N personage preset words are recorded in the preset personage word list, N being an integer greater than 1;
if so, searching the internet for the target search word and at the same time looking up, in a structured personage information content database, the information items matching the target search word, wherein the personage information content database is generated as follows: collecting a plurality of user-generated content (UGC) websites relating to personages, and capturing from the plurality of UGC websites the information items related to each personage preset word in the preset personage word list; processing the captured information items, classifying them according to the personage preset word to which each item relates, and generating the structured personage information content database having personage preset word and information attributes;
aggregating the matched information items into the search results page corresponding to the target search word and presenting them to the user.
2. The method according to claim 1, wherein, before receiving the target search word related to personages that the user enters on the search engine, the method further comprises:
obtaining, from a predetermined database, the N personage preset words ranked highest by click rate and/or search rate, to form the preset personage word list.
3. The method according to claim 1, wherein processing the captured information items, classifying them according to the personage preset word to which each information item relates, and generating the structured personage information content database having personage preset word and information attributes, comprises:
classifying the captured information items according to the personage preset word to which each item relates, and ranking them according to the information attribute of each item, to generate the structured personage information content database having personage preset word and information attributes.
4. The method according to claim 3, wherein the information attribute comprises: the publication time of the information item and/or the number of comments on the information item.
5. The method according to any one of claims 1 to 4, wherein, for UGC websites of the professional information publishing platform type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
entering each of the N personage preset words into the search box of the UGC website of the professional information publishing platform type, and capturing, by publication time, the information items related to each of the N personage preset words; or,
marking personage information items among the information items published on the UGC website of the professional information publishing platform type, and capturing, from the marked personage information items, the information items related to the N personage preset words.
6. The method according to any one of claims 1 to 4, wherein, for UGC websites of the online topic community type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
for each of the N personage preset words, determining, in the UGC website of the topic community type, the topic communities in which the users corresponding to that personage preset word are located, and capturing, from the largest of those topic communities, the information items whose title or body text contains that personage preset word.
7. The method according to any one of claims 1 to 4, wherein, for UGC websites of the online Q&A community type, capturing information items related to the N personage preset words from the plurality of UGC websites comprises:
judging whether the category of each question posted on the UGC website of the online Q&A community type is related to personages; if so, judging whether the posted question and its answers contain keywords corresponding to one or more of the N personage preset words; and if so, capturing the posted question and its answers as information items related to those one or more personage preset words.
8. The method according to any one of claims 1 to 4, wherein aggregating the matched information items into the search results page corresponding to the target search word and presenting them to the user comprises:
displaying, on the left side of the search results page, the results of searching the internet for the target search word;
judging whether the matched information items include any item identical to a result displayed on the left side of the search results page, and if so, removing the identical items from the matched information items;
aggregating the matched information items that remain after the identical items are removed into the right-side area of the search results page corresponding to the target search word, and presenting them to the user.
9. The method according to any one of claims 1 to 4, wherein, after the matched information items are aggregated into the search results page corresponding to the target search word and presented to the user, the method further comprises:
counting the user's trigger operations on the matched information items displayed in the search results page, to obtain a statistical result;
determining, according to the statistical result, whether to display the matched information items in the page corresponding to a subsequent search request.
10. A device for aggregating personage information in a search results page, comprising:
a receiving module, configured to receive a target search word related to personages that a user enters on a search engine;
a judging module, configured to judge whether the target search word hits a preset personage word list, wherein N personage preset words are recorded in the preset personage word list, N being an integer greater than 1;
a search module, configured to, when the judging module determines that the target search word hits the preset personage word list, search the internet for the target search word and at the same time look up, in a structured personage information content database, the information items matching the target search word, wherein the personage information content database is generated as follows: collecting a plurality of user-generated content (UGC) websites relating to personages, and capturing from the plurality of UGC websites the information items related to each personage preset word in the preset personage word list; processing the captured information items, classifying them according to the personage preset word to which each item relates, and generating the structured personage information content database having personage preset word and information attributes;
a display module, configured to aggregate the matched information items into the search results page corresponding to the target search word and present them to the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611213441.5A CN106649738A (en) | 2016-12-23 | 2016-12-23 | Method and device for aggregating personage information message in search engine result page |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611213441.5A CN106649738A (en) | 2016-12-23 | 2016-12-23 | Method and device for aggregating personage information message in search engine result page |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106649738A (en) | 2017-05-10 |
Family
ID=58827740
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611213441.5A Pending CN106649738A (en) | 2016-12-23 | 2016-12-23 | Method and device for aggregating personage information message in search engine result page |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106649738A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107301212A (en) * | 2017-06-08 | 2017-10-27 | 微梦创科网络科技(中国)有限公司 | One kind polymerization dynamic method and device of personage |
CN108920610A (en) * | 2018-06-28 | 2018-11-30 | 上海连尚网络科技有限公司 | A kind of novel indexing means and equipment |
CN109033286A (en) * | 2018-07-12 | 2018-12-18 | 北京猫眼文化传媒有限公司 | Data statistical approach and device |
CN109033326A (en) * | 2018-07-17 | 2018-12-18 | 深圳市嘀哒知经科技有限责任公司 | A kind of splitting and reorganizing method and device of knowledge expertise |
CN110188301A (en) * | 2019-04-30 | 2019-08-30 | 北京百度网讯科技有限公司 | Information aggregation method and device for website |
CN110633406A (en) * | 2018-06-06 | 2019-12-31 | 北京百度网讯科技有限公司 | Event topic generation method and device, storage medium and terminal equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103092877A (en) * | 2011-11-04 | 2013-05-08 | 百度在线网络技术(北京)有限公司 | Method and device for recommending keyword |
CN103164449A (en) * | 2011-12-15 | 2013-06-19 | 腾讯科技(深圳)有限公司 | Search result showing method and search result showing device |
CN105354227A (en) * | 2015-09-30 | 2016-02-24 | 北京奇虎科技有限公司 | Search-based method and apparatus for providing high-quality comment information |
CN105404699A (en) * | 2015-12-29 | 2016-03-16 | 广州神马移动信息科技有限公司 | Method, device and server for searching articles of finance and economics |
CN105574176A (en) * | 2015-12-21 | 2016-05-11 | 北京奇虎科技有限公司 | Hot word recommending method and device with combination of multiple data sources |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107301212A (en) * | 2017-06-08 | 2017-10-27 | 微梦创科网络科技(中国)有限公司 | One kind polymerization dynamic method and device of personage |
CN107301212B (en) * | 2017-06-08 | 2020-04-03 | 微梦创科网络科技(中国)有限公司 | Method and device for aggregating character dynamics |
CN110633406A (en) * | 2018-06-06 | 2019-12-31 | 北京百度网讯科技有限公司 | Event topic generation method and device, storage medium and terminal equipment |
CN108920610A (en) * | 2018-06-28 | 2018-11-30 | 上海连尚网络科技有限公司 | A kind of novel indexing means and equipment |
CN108920610B (en) * | 2018-06-28 | 2021-07-16 | 上海连尚网络科技有限公司 | Novel indexing method and device |
CN109033286A (en) * | 2018-07-12 | 2018-12-18 | 北京猫眼文化传媒有限公司 | Data statistical approach and device |
CN109033326A (en) * | 2018-07-17 | 2018-12-18 | 深圳市嘀哒知经科技有限责任公司 | A kind of splitting and reorganizing method and device of knowledge expertise |
WO2020015217A1 (en) * | 2018-07-17 | 2020-01-23 | 深圳市嘀哒知经科技有限责任公司 | Method and device for splitting and reorganizing knowledge and skills |
CN109033326B (en) * | 2018-07-17 | 2020-05-05 | 深圳市嘀哒知经科技有限责任公司 | Knowledge skill splitting and recombining method and device |
CN110188301A (en) * | 2019-04-30 | 2019-08-30 | 北京百度网讯科技有限公司 | Information aggregation method and device for website |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106649738A (en) | Method and device for aggregating personage information message in search engine result page | |
CN106777206A (en) | Movie and television play class keywords search for exhibiting method and device | |
CN105701216B (en) | A kind of information-pushing method and device | |
CN108197330B (en) | Data digging method and device based on social platform | |
CN102968413B (en) | A kind of method and apparatus for being used to provide search result | |
CN110321424B (en) | AIDS (acquired immune deficiency syndrome) personnel behavior analysis method based on deep learning | |
CN107437038A (en) | A kind of detection method and device of webpage tamper | |
CN105468598A (en) | Friend recommendation method and device | |
CN101479728A (en) | Visual and multi-dimensional search | |
CN106682212A (en) | Social relations classification method based on user movement behavior and device | |
CN103744887B (en) | It is a kind of for the method for people search, device and computer equipment | |
CN107229645A (en) | Information processing method, service platform and client | |
WO2018113673A1 (en) | Method and apparatus for pushing search result of variety show query | |
CN106033445A (en) | Method and device for obtaining article association degree data | |
CN103942275A (en) | Video identification method and device | |
CN106649737A (en) | Pushing method and pushing device for search result of variety query | |
CN102053960B (en) | Method and system for constructing quick and accurate Internet of things and Internet search engine according to group requirement characteristics | |
CN110347701A (en) | A kind of target type identification method of entity-oriented retrieval and inquisition | |
CN103955480B (en) | A kind of method and apparatus for determining the target object information corresponding to user | |
CN109685090A (en) | Training method, temperature evaluating method and the relevant device of temperature evaluation and test model | |
CN109561162A (en) | Excavate the method and device that user accesses hobby | |
CN107291930A (en) | The computational methods of weight number | |
CN106777205A (en) | The searching method and device of game class search word | |
CN106919587A (en) | Application program search system and method | |
CN106844488A (en) | With reference to the stock class UGC data recommendation methods and device of search |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170510 |