CN103761232B - Method and apparatus for providing web page media content information - Google Patents

Method and apparatus for providing web page media content information

Info

Publication number
CN103761232B
Authority
CN
China
Prior art keywords
media content
information
webpage
thumbnail
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310487602.XA
Other languages
Chinese (zh)
Other versions
CN103761232A (en
Inventor
侯小虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201310487602.XA priority Critical patent/CN103761232B/en
Publication of CN103761232A publication Critical patent/CN103761232A/en
Application granted granted Critical
Publication of CN103761232B publication Critical patent/CN103761232B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/951 - Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method for providing web page media content information, comprising the steps of: receiving a search request; detecting whether the search request is associated with media content; if the search request is associated with media content, looking up web pages that match the search request in a preset text index library and a preset media content index library; and extracting the text information and the media content information of the web pages from the text index library and the media content index library, respectively, as the search results for the search request.

Description

Method and apparatus for providing web page media content information
Technical field
The present invention relates to the field of computer technology, and more particularly to a method and apparatus for providing web page media content information.
Background
With the development and spread of computer technology, the demand for obtaining various kinds of media information through search engines keeps growing. At present, almost all media content, for example pictures, animations, audio, and video, is carried in the form of web pages. Related media content information is therefore obtained mainly by entering keywords that hit related web pages, which are then included in the search results. Search results are mainly presented as text, for example displayed in a web page with the matching keywords highlighted in red, as shown in Fig. 1, without any indication of whether a web page contains media content or of information about that media content. This approach has the following problems. From the text in the search results alone, the user cannot tell how much of the desired media content each web page actually contains, how relevant it is, or whether the page is suspected of cheating to gain clicks. To find media content, the user has to open each web page based on the highlighted keywords in each search result and then screen the pages, which is inefficient. Because the media content behind each web page is unknown, many highly ranked web pages receive many clicks even though their actual content does not meet the user's needs. Moreover, mainstream search engines currently use a click feedback mechanism, so web pages that do not meet user needs end up ranking persistently high, which deviates from what users actually want and makes information search inefficient.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method and apparatus for a search engine to provide web page media content information that overcome the above problems or at least partially solve them.
According to a first aspect of the present invention, a method for providing web page media content information is provided, comprising the steps of: receiving a search request; detecting whether the search request is associated with media content; if the search request is associated with media content, looking up web pages that match the search request in a preset text index library and a preset media content index library; and extracting the text information and the media content information of the web pages from the text index library and the media content index library, respectively, as the search results for the search request.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the media content includes at least one of the following: pictures, animations, audio, and video.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the step of extracting the text information and the media content information of the web page from the text index library and the media content index library, respectively, as the search results for the search request includes: extracting from the text index library at least one of the following items of text information of the web page as the search results for the search request: title, summary, and body text.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the step of extracting the text information and the media content information of the web page from the text index library and the media content index library, respectively, as the search results for the search request includes: extracting from the media content index library at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the step of extracting the text information and the media content information of the web page from the text index library and the media content index library, respectively, as the search results for the search request further includes: assigning a second URL address to one or more media content items in the web page, where the second URL address points to a page that displays second thumbnails of the one or more media content items.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the step of extracting the text information and the media content information of the web page from the text index library and the media content index library, respectively, as the search results for the search request includes: extracting the text information and the media content information of the web page from the text index library and the media content index library, respectively; and combining the text information and the media content information of the web page in a predetermined manner as the search results for the search request.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the step of combining the text information and the media content information of the web page in a predetermined manner as the search results for the search request includes: selecting the first thumbnail of one media content item from the media content information of the web page; and displaying the first thumbnail of that media content item in the search results.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the step of combining the text information and the media content information of the web page in a predetermined manner as the search results for the search request includes: selecting the first thumbnails of multiple media content items from the media content information of the web page; and displaying the first thumbnails of the multiple media content items in the search results.
Optionally, in the method for providing web page media content information according to an embodiment of the invention, the media content information includes a text part and a thumbnail part, and the text part points to the page that displays the second thumbnails of the one or more media content items.
According to a second aspect of the present invention, an apparatus for providing web page media content information is provided, comprising: a request receiving module, adapted to receive a search request; a request detection module, adapted to detect whether the search request is associated with media content; a web page lookup module, adapted to look up, if the search request is associated with media content, web pages that match the search request in a preset text index library and a preset media content index library; and an information extraction module, adapted to extract the text information and the media content information of the web pages from the text index library and the media content index library, respectively, as the search results for the search request.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the media content includes at least one of the following: pictures, animations, audio, and video.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the information extraction module is adapted to extract from the text index library at least one of the following items of text information of the web page as the search results for the search request: title, summary, and body text.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the information extraction module is adapted to extract from the media content index library at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the information extraction module is adapted to assign a second URL address to one or more media content items in the web page, where the second URL address points to a page that displays second thumbnails of the one or more media content items.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the information extraction module includes: a text information extraction unit and a media content information extraction unit, adapted to extract the text information and the media content information of the web page from the text index library and the media content index library, respectively; and an information combination unit, adapted to combine the text information and the media content information of the web page in a predetermined manner as the search results for the search request.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the information combination unit is adapted to select the first thumbnail of one media content item from the media content information of the web page, and to display the first thumbnail of that media content item in the search results.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the information combination unit is adapted to select the first thumbnails of multiple media content items from the media content information of the web page, and to display the first thumbnails of the multiple media content items in the search results.
Optionally, in the apparatus for providing web page media content information according to an embodiment of the invention, the media content information includes a text part and a thumbnail part, and the text part points to the page that displays the second thumbnails of the one or more media content items.
The invention provides the above method and apparatus for a search engine to obtain web page media content information. According to embodiments of the invention, the method and apparatus give users a more intuitive and more easily understood way to search for media content information, let users get a general picture of the media content in a web page, and help users judge how relevant a search result is, thereby improving search efficiency.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and practiced according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a schematic diagram of a web page showing search results in the prior art;
Fig. 2 is a flowchart of a method for a search engine to collect web page media content information according to an embodiment of the invention;
Fig. 3 is an example view of web page picture information search results provided according to an embodiment of the invention;
Fig. 4 is an example view of web page audio information search results provided according to an embodiment of the invention;
Fig. 5 is a flowchart of a method for providing web page media content information according to an embodiment of the invention;
Fig. 6 is an example view of web page picture information search results provided according to another embodiment of the invention;
Fig. 7 is a flowchart of a method for providing web page media content information according to another embodiment of the invention;
Fig. 8 is a schematic structural diagram of an apparatus for a search engine to collect web page media content information according to an embodiment of the invention;
Fig. 9 is a schematic structural diagram of an apparatus for a search engine to provide web page media content information according to an embodiment of the invention;
Fig. 10 is a schematic structural diagram of an apparatus for providing web page media content information according to another embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
In embodiments of the invention, a search engine may be an information system that collects information from the Internet with a specific computer program according to a certain strategy, organizes and processes that information, provides a search service to users, and presents the information relevant to a user's search to that user.
Embodiment one
The method by which a search engine collects web page media content information is introduced first. It specifically includes:
crawling web page information; detecting whether the web page information includes a marker of preset media content information; if the marker is detected in the web page information, extracting the text information and the media content information from the web page information; and building a text index library and a media content index library from the text information and the media content information, respectively.
Fig. 2 shows a flowchart of a method 100 for a search engine to collect web page media content information according to an embodiment of the invention. In embodiments of the invention, the media content may include at least one of the following: pictures, animations, audio, and video. Of course, it should be understood that the media content may also include other content.
As shown in Fig. 2, in step S101, web page information is crawled. For example, web page information may be crawled from one or more website servers.
In an exemplary embodiment of the invention, the web page information may include text information and media content information. Optionally, the text information may include at least one of the following: title, summary, and body text. Optionally, the media content information may include at least one of the following: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In an exemplary embodiment of the invention, for a web page that carries pictures, the web page information may include text information and picture information. Optionally, the text information may include: a title (indicated by arrow 3A), a summary (indicated by arrow 3B), and/or the URL of the web page (indicated by arrow 3C). Optionally, the picture information may include the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of the picture (not shown). Of course, it should be understood that the text information and the picture information may also include other content.
In an exemplary embodiment of the invention, for a web page that carries audio, the web page information may include text information and audio information. Optionally, the text information may include: a title (indicated by arrow 4A), a summary (indicated by arrow 4B), and/or the URL of the web page (indicated by arrow 4C). Optionally, the audio information may include the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and/or the URL address of the audio (not shown). Of course, it should be understood that the text information and the audio information may also include other content.
In step S103, it is detected whether the web page information includes a marker of preset media content information.
In an exemplary embodiment of the invention, the marker of the preset media content information is used to determine whether the crawled web page information includes specific media content. Optionally, when the search keyword entered by a user matches that specific media content, the search engine can provide and display search results that include the web page. Of course, it should be understood that embodiments of the invention do not limit the specific form of the marker of the preset media content information.
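The patent leaves the form of the marker open. As one possible illustration only, the sketch below assumes that the marker is simply the presence of an HTML media tag (`img`, `audio`, `video`) in the crawled page; the tag-to-type mapping and the function name `detect_media_markers` are hypothetical and not taken from the patent.

```python
from html.parser import HTMLParser

# Assumed mapping from HTML tag to media type (hypothetical marker definition).
MEDIA_TAGS = {"img": "picture", "audio": "audio", "video": "video"}


class MediaMarkerDetector(HTMLParser):
    """Records which media types occur in a crawled page."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        media_type = MEDIA_TAGS.get(tag)
        if media_type is not None:
            self.found.add(media_type)


def detect_media_markers(html: str) -> set:
    """Return the set of media types whose marker appears in the page."""
    detector = MediaMarkerDetector()
    detector.feed(html)
    detector.close()
    return detector.found
```

An empty result would mean that step S105 is skipped for the page; a non-empty result tells the extractor which kinds of media content information to look for.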
In step S105, if the marker is detected in the web page information, the text information and the media content information in the web page information are extracted.
In an exemplary embodiment of the invention, step S105 may include: if the marker is detected in the web page information, extracting at least one of the following items of text information of the web page: title, summary, and body text; and extracting at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In the exemplary embodiment of the invention shown in Fig. 3, for a web page that carries pictures, if a marker of preset picture information is detected in the web page information, optionally at least one of the following items of text information of the web page is extracted: the title (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, at least one of the following items of picture information of the web page is extracted: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the invention shown in Fig. 4, for a web page that carries audio, if a marker of preset audio information is detected in the web page information, optionally at least one of the following items of text information of the web page is extracted: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, at least one of the following items of audio information of the web page is extracted: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
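The extraction in step S105 could be sketched as follows for a page that carries pictures. This is a minimal illustration under assumptions the patent does not state: the page title comes from the `<title>` element, each picture's title from the `alt` attribute, and its first URL address from the `src` attribute. The names `PageInfo` and `extract_page_info` are hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class PageInfo:
    title: str = ""                                # text information: page title
    pictures: list = field(default_factory=list)   # one record per picture


class PageInfoExtractor(HTMLParser):
    """Collects the page title and one record per <img> tag."""

    def __init__(self):
        super().__init__()
        self.info = PageInfo()
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            attributes = dict(attrs)
            if attributes.get("src"):
                self.info.pictures.append({
                    "title": attributes.get("alt") or "",  # assumed picture title
                    "first_url": attributes["src"],        # original picture address
                })

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.info.title += data.strip()


def extract_page_info(html: str) -> PageInfo:
    extractor = PageInfoExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.info
```

The picture count described in the text is then just `len(info.pictures)`; author, size, and format would need additional sources such as file metadata, which this sketch does not attempt.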
In an exemplary embodiment of the invention, step S105 further includes: assigning a second URL address to the web page, where the second URL address points to a page that displays second thumbnails of one or more media content items in the web page.
In the exemplary embodiment of the invention shown in Fig. 3, for a web page that carries pictures, if the marker is detected in the web page information, the original URL address of each picture in the web page is extracted and a new URL address is assigned to the web page, where the new URL address points to a page that displays thumbnails of one or more pictures in the web page. Optionally, the new URL address points to a page that displays thumbnails of all pictures in the web page. Optionally, when the user selects the option in the search results that corresponds to the new URL address, such as the picture title in Fig. 3 (indicated by arrow 3D), the display jumps to the page corresponding to the new URL address (indicated by arrow 3G) to show the user thumbnails of all pictures in the web page. Optionally, when the user selects the thumbnail of a picture on that page (indicated by arrow 3H), the display jumps to the original URL of that picture to provide its details.
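How the new (second) URL address is generated is not specified. The sketch below assumes it is derived from a hash of the web page's own URL under a hypothetical gallery base address, and that each thumbnail on the gallery page links back to the picture's original (first) URL. `GALLERY_BASE`, `assign_second_url`, and `build_gallery` are illustrative names only.

```python
import hashlib

# Hypothetical base address for thumbnail gallery pages served by the search engine.
GALLERY_BASE = "https://search.example/gallery/"


def assign_second_url(page_url: str) -> str:
    """Derive a stable second URL address for the gallery page of a web page."""
    digest = hashlib.sha1(page_url.encode("utf-8")).hexdigest()[:16]
    return GALLERY_BASE + digest


def build_gallery(page_url: str, pictures: list) -> dict:
    """Describe the gallery page: every thumbnail links to the picture's first URL."""
    return {
        "second_url": assign_second_url(page_url),
        "items": [
            {"thumbnail": picture["thumbnail"], "original_url": picture["first_url"]}
            for picture in pictures
        ],
    }
```

Deriving the address from the page URL keeps it stable across re-crawls; a sequential identifier stored in the index would serve equally well.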
In step S107, a text index library and a media content index library are built from the text information and the media content information, respectively.
In an exemplary embodiment of the invention, step S107 includes: associating the text information in the text index library with the media content information about the same web page in the media content index library.
In the exemplary embodiment of the invention shown in Fig. 3, for a web page that carries pictures, a text index library is built from at least one of the following extracted items of text information: the title (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, a picture index library is built from at least one of the following extracted items of picture information: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown). Optionally, the text information in the text index library is associated with the picture information about the same web page in the picture index library.
In the exemplary embodiment of the invention shown in Fig. 4, for a web page that carries audio, a text index library is built from at least one of the following extracted items of text information: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, an audio index library is built from at least one of the following extracted items of audio information: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown). Optionally, the text information in the text index library is associated with the audio information about the same web page in the audio index library.
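One simple way to realize the association described for step S107 is to key both index libraries by the same web page URL. The sketch below assumes in-memory dictionaries; the class name `IndexLibraries` and its methods are hypothetical, and a production index would of course use a persistent, inverted structure.

```python
from collections import defaultdict


class IndexLibraries:
    """Two separate indexes that share the web page URL as their association key."""

    def __init__(self):
        self.text_index = {}                   # page_url -> text information
        self.media_index = defaultdict(list)   # page_url -> media content records

    def add_page(self, page_url, text_info, media_items):
        """Store the text and media information of one crawled web page."""
        self.text_index[page_url] = text_info
        self.media_index[page_url].extend(media_items)

    def media_for(self, page_url):
        """Follow the association from a text index entry to its media records."""
        return list(self.media_index.get(page_url, []))
```

With this layout, a hit in the text index can be joined to its media records by a single dictionary lookup.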
In embodiments of the invention, the text information and the media content information in the web page information are extracted, and a text index library and a media content index library are built from them, respectively. This gives users a more intuitive and more easily understood way to search for media content information, lets them get a general picture of the media content in a web page, and helps them judge how relevant a search result is, thereby improving search efficiency.
It should be noted that the method shown in Fig. 2 is not limited to the order of the steps shown; the order of the steps can be adjusted as needed. In addition, the steps are not limited to the division above: they may be split further into more steps or merged into fewer steps.
Embodiment two
After the search engine has collected web page media content information, search results can be obtained based on a user's search request. The method for providing web page media content information is described below. It may specifically include: receiving a search request; detecting whether the search request is associated with media content; if the search request is associated with media content, looking up web pages that match the search request in a preset text index library and a preset media content index library; and extracting the text information and the media content information of the web pages from the text index library and the media content index library, respectively, as the search results for the search request.
Fig. 5 shows a flowchart of a method 200 for providing web page media content information according to an embodiment of the invention. In embodiments of the invention, the media content may include at least one of the following: pictures, animations, audio, and video. Of course, it should be understood that the media content may also include other content.
As shown in Fig. 5, in step S201, a search request is received. For example, search requests may be received from one or more client devices. Optionally, the search request may be a search keyword entered by a user. Of course, it should be understood that embodiments of the invention do not limit the specific form of the search request.
In step S203, it is detected whether the search request is associated with media content. Optionally, when a user enters a search keyword, it is determined whether the user's search request contains a demand for media content, for example a demand for pictures, animations, video, or audio.
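The patent does not say how the demand for media content is recognized. A minimal sketch, assuming a small hand-written table of hint words per media type (the table `MEDIA_HINTS` and the function `detect_media_demand` are hypothetical), might look like this:

```python
# Assumed hint words per media type; a real system would likely use a
# learned query classifier rather than a fixed word list.
MEDIA_HINTS = {
    "picture": {"picture", "pictures", "photo", "photos", "image", "wallpaper"},
    "animation": {"animation", "gif"},
    "audio": {"audio", "song", "music", "mp3"},
    "video": {"video", "videos", "movie", "clip"},
}


def detect_media_demand(query: str) -> set:
    """Return the media types that the search request appears to ask for."""
    words = set(query.lower().split())
    return {media for media, hints in MEDIA_HINTS.items() if words & hints}
```

An empty result corresponds to a search request that is not associated with media content, in which case the media content index library need not be consulted.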
In step S205, if the search request is associated with media content, web pages that match the search request are looked up in the preset text index library and the preset media content index library.
In an exemplary embodiment of the invention, the preset text index library may include the text information of web pages, for example the title, summary, and/or body text of a web page. The preset media content index library may include media content information, for example the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and/or the first URL address of each media content item.
In the exemplary embodiment of the invention shown in Fig. 3, if the search request is associated with pictures, web pages that match the search request are looked up in the preset text index library and the preset picture index library. Optionally, the preset text index library may include at least one of the following items of text information: the title of the web page (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, the preset picture index library may include at least one of the following items of picture information: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the invention shown in Fig. 4, if the search request is associated with audio, web pages that match the search request are looked up in the preset text index library and the preset audio index library. Optionally, the preset text index library may include at least one of the following items of text information: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, the preset audio index library may include at least one of the following items of audio information: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
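The matching rule used in step S205 is not spelled out. The sketch below assumes one plausible rule: a web page matches if every keyword appears in its title, its summary, or the title of one of its media items, and the page has at least one media item of the requested type. The function `find_matching_pages` and the dictionary layout of the two indexes are illustrative assumptions.

```python
def find_matching_pages(keywords, media_type, text_index, media_index):
    """Look up pages in both indexes (hypothetical matching rule).

    keywords    -- list of lower-case search terms
    media_type  -- e.g. "picture" or "audio"
    text_index  -- page_url -> {"title": ..., "summary": ...}
    media_index -- page_url -> list of {"type": ..., "title": ...}
    """
    matches = []
    for page_url, text_info in text_index.items():
        media_items = media_index.get(page_url, [])
        # Only pages that carry the requested kind of media qualify.
        if not any(item["type"] == media_type for item in media_items):
            continue
        searchable = " ".join(
            [text_info.get("title", ""), text_info.get("summary", "")]
            + [item.get("title", "") for item in media_items]
        ).lower()
        if all(keyword in searchable for keyword in keywords):
            matches.append(page_url)
    return matches
```

Substring matching keeps the sketch short; a real index would tokenize the text and rank the hits rather than return them in insertion order.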
In step S207, the text information and the media content information of the webpage are extracted from the text index library and the media content index library respectively, as the search result of the search request. Optionally, the search result may be displayed on one or more client devices.
In an exemplary embodiment of the present invention, step S207 may include: extracting from the text index library at least one of the following items of text information of the webpage: the title, the summary and the body text, as the search result of the search request.
In an exemplary embodiment of the present invention, step S207 may include: extracting from the media content index library at least one of the following items of media content information of the webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In an exemplary embodiment of the present invention, step S207 may include: assigning a second URL address to one or more media content items in the webpage, wherein the second URL address points to a page displaying second thumbnails of the one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, where the search request is associated with pictures, the text information and the picture information of the webpage are extracted from the text index library and the picture index library respectively, and a new URL address is assigned to one or more pictures in the webpage, wherein the new URL address points to a page displaying thumbnails of the one or more pictures in the webpage (indicated by arrow 3G). Optionally, a new URL address is assigned to all pictures in the webpage, wherein the new URL address points to a page displaying thumbnails of all pictures in the webpage. Optionally, when the user selects an option in the search result corresponding to the new URL address, for example the picture title in Fig. 3 (indicated by arrow 3D), a jump is made to the page corresponding to the new URL address (indicated by arrow 3G), so as to show the user the thumbnails of all pictures in the webpage.
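As a rough illustration of assigning a second URL address, the sketch below derives the address from the URL of the webpage and builds the entries of the thumbnail page. The base address `https://search.example/thumbs`, the query parameter name, and the dictionary keys are hypothetical; the description does not state how the new URL address is formed.

```python
from urllib.parse import quote


def assign_second_url(page_url, base="https://search.example/thumbs"):
    """Build a hypothetical second URL address for a webpage.

    The address points to a page that lists thumbnails of the pictures
    found in the webpage identified by page_url.
    """
    return f"{base}?page={quote(page_url, safe='')}"


def build_thumbnail_page(pictures):
    """Return the entries shown on the page behind the second URL address.

    Each entry pairs a second thumbnail with the first URL address of the
    picture, so that selecting the thumbnail can lead to the picture itself.
    """
    return [
        {"second_thumbnail": p["thumbnail"], "first_url": p["url"]}
        for p in pictures
    ]
```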
In an exemplary embodiment of the present invention, step S207 may include: extracting the text information and the media content information of the webpage from the text index library and the media content index library respectively; and combining the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request.
In an exemplary embodiment of the present invention, the step of combining the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request, includes: selecting the first thumbnail of one media content item from the media content information of the webpage; and displaying the first thumbnail of that media content item in the search result.
In the exemplary embodiment of the present invention shown in Fig. 3, where the search request is associated with pictures, the following text information of the webpage is extracted from the text index library: the title of the webpage (indicated by arrow 3A), the summary (indicated by arrow 3B) and/or the URL of the webpage (indicated by arrow 3C); and the following picture information of the webpage is extracted from the picture index library: the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and/or the URL address of the picture (not shown). Optionally, one first thumbnail (indicated by arrow 3F) is selected from the extracted first thumbnails of the pictures to be displayed in the search result. As shown in Fig. 3, each search result includes the title, summary and/or URL of the webpage, and the picture title, the picture quantity and/or one first thumbnail of a picture (indicated by arrow 3F).
In an exemplary embodiment of the present invention, the step of combining the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request, includes: selecting the first thumbnails of a plurality of media content items from the media content information of the webpage; and displaying the first thumbnails of the plurality of media content items in the search result.
In the exemplary embodiment of the present invention shown in Fig. 6, four first thumbnails (indicated by arrow 6E) are selected from the extracted first thumbnails of the pictures to be displayed in the search result. Of course, it will be understood that the number of first thumbnails selected is not limited to the number described in this embodiment. In the search results shown in Fig. 6, each search result includes the title, summary and URL of the webpage, and the picture title, the picture quantity and four first thumbnails of pictures (indicated by arrow 6E).
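The combination of text information and media content information into one search result entry might look like the following sketch. The dictionary keys and the `max_thumbnails` parameter are illustrative assumptions; a value of 1 corresponds to the Fig. 3 layout and a value of 4 to the Fig. 6 layout.

```python
def combine_result(text_info, media_info, max_thumbnails=4):
    """Combine text and media content information into one search result.

    At most max_thumbnails first thumbnails are kept, taken in the order
    in which they were extracted (the selection rule is an assumption).
    """
    thumbnails = media_info.get("first_thumbnails", [])[:max_thumbnails]
    return {
        "title": text_info.get("title"),
        "summary": text_info.get("summary"),
        "url": text_info.get("url"),
        "picture_title": media_info.get("title"),
        "picture_count": media_info.get("count"),
        "first_thumbnails": thumbnails,
    }
```

Keeping the number of thumbnails as a parameter reflects the statement that the number of selected first thumbnails is not limited to the one shown in the figures.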
In an exemplary embodiment of the present invention, the media content information includes a text portion and a thumbnail portion, and the text portion points to a page displaying second thumbnails of one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, where the search request is associated with pictures, the picture information includes a text portion and a thumbnail portion. In the search result, the text portion may include the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and/or the URL address of the picture (not shown); the thumbnail portion may include the first thumbnail of the picture (indicated by arrow 3F). When the user selects the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E) or another part of the text portion, a jump is made to a new page (indicated by arrow 3G), which displays second thumbnails of one or more pictures (indicated by arrow 3H). Optionally, the page displays second thumbnails of all pictures in the webpage.
In the exemplary embodiment of the present invention shown in Fig. 6, where the search request is associated with pictures, the picture information includes a text portion and a thumbnail portion. In the search result, the text portion may include the picture title (not shown), the picture quantity (indicated by arrow 6D), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), the URL address of the picture (not shown), and/or other text (such as the "》" indicated by arrow 6G); the thumbnail portion may include the first thumbnails of the pictures (indicated by arrow 6E). When the user selects the picture quantity (indicated by arrow 6D) or other text (such as the "》" indicated by arrow 6G), a jump is made to a new page (indicated by arrow 6H), which displays second thumbnails of one or more pictures (indicated by arrow 6I). Optionally, the page displays second thumbnails of all pictures in the webpage.
In embodiments of the present invention, where the search request is associated with media content, a webpage matching the search request is looked up in the preset text index library and media content index library, and the text information and the media content information of the webpage are extracted from the text index library and the media content index library respectively as the search result of the search request. This gives the client a more intuitive and more easily understood way of searching media content information, lets the user gain a rough understanding of the media content in a webpage, and helps the user judge the relevance of a search result, thereby improving search efficiency.
It should be noted that the method shown in Fig. 5 is not limited to being performed in the order of the steps shown; the order of the steps may be adjusted as needed. In addition, the steps are not limited to the division described above: they may be further split into more steps or merged into fewer steps.
Embodiment three
After the search engine has acquired web page media content information, it can provide search results to the user based on the user's search request. A method of providing web page media content information is described below, which may specifically include: upon receiving a search request that matches the identifier of preset media content information in a webpage, extracting the text information and the media content information of the webpage as the search result of the search request; and providing the search result in response to a selection of the text information and the media content information of the webpage.
Fig. 7 shows a flowchart of a method 300 of providing web page media content information according to an embodiment of the present invention. In embodiments of the present invention, the media content may include at least one of the following: pictures, animation, audio and video. Of course, it will be understood that the media content may also include other content.
As shown in Fig. 7, in step S301, upon receiving a search request that matches the identifier of preset media content information in a webpage, the text information and the media content information of the webpage are extracted as the search result of the search request. For example, the search request may be received from one or more client devices. Optionally, the search request may be a search keyword entered by the user. Of course, it will be understood that embodiments of the present invention do not limit the specific form of the search request.
In an exemplary embodiment of the present invention, step S301 may include: upon receiving a search request that matches the identifier of preset media content information in a webpage, extracting at least one of the following items of text information of the webpage as the search result of the search request: the title, the summary and the body text.
In an exemplary embodiment of the present invention, step S301 may include: upon receiving a search request that matches the identifier of preset media content information in a webpage, extracting at least one of the following items of media content information of the webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, upon receiving a search request that matches the identifier of preset picture information in a webpage, at least one of the following items of text information of the webpage is extracted as the search result of the search request: the title of the webpage (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the webpage (indicated by arrow 3C). Optionally, at least one of the following items of picture information of the webpage may be extracted: the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, upon receiving a search request that matches the identifier of preset audio information in a webpage, at least one of the following items of text information of the webpage is extracted as the search result of the search request: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the webpage (indicated by arrow 4C). Optionally, at least one of the following items of audio information of the webpage may be extracted: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
In an exemplary embodiment of the present invention, step S301 may include extracting the second URL address pre-assigned to each media content item, wherein the second URL address points to a page displaying second thumbnails of the one or more media content items.
In the exemplary embodiment of the present invention shown in Fig. 3, upon receiving a search request that matches the identifier of preset picture information in a webpage, the second URL address pre-assigned to each picture is extracted, wherein the second URL address points to a page displaying second thumbnails of one or more pictures. Optionally, the new URL address assigned to all pictures in the webpage may be extracted, wherein the new URL address points to a page displaying thumbnails of all pictures in the webpage (indicated by arrow 3G). Optionally, when the user selects an option in the search result corresponding to the new URL address, for example the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E) or another part of the text portion, a jump is made to the page corresponding to the new URL address (indicated by arrow 3G), so as to show the user the thumbnails of all pictures in the webpage.
In an exemplary embodiment of the present invention, step S301 may include: upon receiving a search request that matches the identifier of preset media content information in a webpage, extracting the text information and the media content information of the webpage; and combining the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request.
In step S303, the search result is provided in response to a selection of the text information and the media content information of the webpage. For example, the search result may be displayed on one or more client devices.
In an exemplary embodiment of the present invention, in response to a selection of the text information of the webpage, a jump is made to the first URL address to provide the search result. For example, as shown in Fig. 4, in response to a selection of the text information of the webpage (the webpage title indicated by arrow 4A), a jump is made to the first URL address to provide the details of the media content (indicated by arrow 4H).
In an exemplary embodiment of the present invention, in step S301, upon receiving a search request that matches the identifier of preset media content information in a webpage, the text information and the media content information of the webpage are extracted, and are combined in the following predetermined manner as the search result of the search request: the first thumbnail of one media content item is selected from the media content information of the webpage, and that first thumbnail is displayed in the search result. Optionally, in step S303, in response to a selection of the first thumbnail of the media content item, a jump is made to the second URL address to obtain a page displaying second thumbnails of one or more media content items. Optionally, in response to a selection of the second thumbnail of any media content item displayed at the second URL address, a jump is made to the first URL address of that media content item to provide information on the media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, in step S301, upon receiving a search request that matches the identifier of preset picture information in a webpage, the following text information of the webpage is extracted: the title of the webpage (indicated by arrow 3A), the summary (indicated by arrow 3B) and/or the URL of the webpage (indicated by arrow 3C); and the following picture information of the webpage is extracted: the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and/or the URL address of the picture (not shown). Optionally, the text information and the picture information of the webpage are combined in the following predetermined manner as the search result of the search request: the first thumbnail of one picture is selected from the picture information of the webpage, and that first thumbnail is displayed in the result (indicated by arrow 3F). Optionally, in step S303, when the user selects the thumbnail of the picture (indicated by arrow 3F), a jump is made to a new page (indicated by arrow 3G), which displays second thumbnails of one or more pictures (indicated by arrow 3H). Optionally, the page displays second thumbnails of all pictures in the webpage. Optionally, when the user selects the second thumbnail of any picture (indicated by arrow 3H) on the new page (indicated by arrow 3G), a jump is made to the first URL address of that picture to provide the details of the picture.
In another exemplary embodiment of the present invention, in step S301, upon receiving a search request that matches the identifier of preset media content information in a webpage, the text information and the media content information of the webpage are extracted, and are combined in the following predetermined manner as the search result of the search request: the first thumbnails of a plurality of media content items are selected from the media content information of the webpage, and the first thumbnails of the plurality of media content items are displayed in the search result. Optionally, in step S303, in response to a selection of the first thumbnail of any media content item, a jump is made to the first URL address of that media content item to provide information on the media content item.
In the exemplary embodiment of the present invention shown in Fig. 6, in step S301, upon receiving a search request that matches the identifier of preset picture information in a webpage, the following text information of the webpage is extracted: the title (indicated by arrow 6A), the summary (indicated by arrow 6B) and/or the URL of the webpage (indicated by arrow 6C); and the following picture information of the webpage is extracted: the picture title (not shown), the picture quantity (indicated by arrow 6D), the first thumbnails of the pictures (indicated by arrow 6E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and/or the URL address of the picture (not shown). Optionally, the text information and the picture information of the webpage are combined in the following predetermined manner as the search result of the search request: four first thumbnails are selected from the picture information of the webpage, and the four first thumbnails are displayed in the result. Of course, it will be understood that embodiments of the present invention do not limit the number of pictures selected and displayed. Optionally, in step S303, when the user selects any one of the four picture thumbnails, a jump is made to a new page (indicated by arrow 6F), which provides the details of that picture.
In another exemplary embodiment of the present invention, in step S301, upon receiving a search request that matches the identifier of preset media content information in a webpage, the text information and the media content information of the webpage are extracted, wherein the media content information includes a text portion and a thumbnail portion; and the text information and the media content information of the webpage are combined in a predetermined manner as the search result of the search request. Optionally, in step S303, in response to a selection of the text portion, a jump is made to the second URL address to obtain a page displaying second thumbnails of one or more media content items. Optionally, in response to a selection of the second thumbnail of any media content item displayed at the second URL address, a jump is made to the first URL address of that media content item to provide information on the media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, the picture information in the search result includes a text portion and a thumbnail portion. The text portion may include the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E) and/or other text; the thumbnail portion includes the first thumbnail of the picture (indicated by arrow 3F). Optionally, when the user selects the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E) or other text, a jump is made to a new page (indicated by arrow 3G), which displays second thumbnails of one or more pictures (indicated by arrow 3H). Optionally, the page displays second thumbnails of all pictures in the webpage. Optionally, in response to a selection of the second thumbnail of any picture (indicated by arrow 3H), a jump is made to the first URL address of that picture to provide the details of the picture.
In the exemplary embodiment of the present invention shown in Fig. 6, the picture information in the search result includes a text portion and a thumbnail portion. The text portion may include the picture title (not shown), the picture quantity (indicated by arrow 6D) and/or other text (such as the "》" indicated by arrow 6G in Fig. 6); the thumbnail portion includes the first thumbnails of the pictures (indicated by arrow 6E). Optionally, when the user selects the picture title (not shown), the picture quantity (indicated by arrow 6D) or other text (such as the "》" indicated by arrow 6G in Fig. 6), a jump is made to a new page (indicated by arrow 6H), which displays second thumbnails of one or more pictures (indicated by arrow 6I). Optionally, the page displays second thumbnails of all pictures in the webpage. Optionally, in response to a selection of the second thumbnail of any picture (indicated by arrow 6I), a jump is made to the first URL address of that picture to provide the details of the picture.
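The two-level navigation of the text-portion embodiment (text portion to the thumbnail page, second thumbnail to the picture itself) can be sketched as a small dispatch function. The selection labels `"media_text"` and `"second_thumbnail"` and the shape of the `result` dictionary are hypothetical names introduced only for this illustration.

```python
def resolve_selection(result, kind, index=0):
    """Return the URL address to jump to for a user selection.

    kind is one of two hypothetical labels:
      "media_text"        the text portion (picture title, quantity, etc.)
      "second_thumbnail"  a thumbnail on the page behind the second URL address
    index identifies which second thumbnail was selected.
    """
    if kind == "media_text":
        # The text portion points to the page of second thumbnails.
        return result["second_url"]
    if kind == "second_thumbnail":
        # A second thumbnail leads to the first URL address of that item.
        return result["media"][index]["first_url"]
    raise ValueError(f"unknown selection kind: {kind}")
```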
In embodiments of the present invention, upon receiving a search request that matches the identifier of preset media content information in a webpage, the text information and the media content information of the webpage are extracted as the search result of the search request, and the search result is provided in response to a selection of the text information and the media content information of the webpage. Both text information and media content information can thus be provided in the search result, which gives the client a more intuitive and more easily understood way of searching media content information, lets the user gain a rough understanding of the media content in a webpage, and helps the user judge the relevance of a search result, thereby improving search efficiency.
It should be noted that the method shown in Fig. 7 is not limited to being performed in the order of the steps shown; the order of the steps may be adjusted as needed. In addition, the steps are not limited to the division described above: they may be further split into more steps or merged into fewer steps.
Embodiment four
A device for a search engine to collect web page media content information according to an exemplary embodiment of the present invention is described below.
Optionally, the device is adapted to perform the method 100 described above.
Fig. 8 shows a structural diagram of a device 400 for a search engine to collect web page media content information according to the present invention. In an embodiment of the present invention, the device 400 includes: an information crawling module 401, adapted to crawl webpage information; an identifier detection module 403, adapted to detect whether the webpage information contains the identifier of preset media content information; an information extraction module 405, adapted to extract the text information and the media content information in the webpage information when the identifier is detected in the webpage information; and an index library building module 407, adapted to build a text index library and a media content index library respectively, based on the text information and the media content information.
In an embodiment of the present invention, the media content may include at least one of the following: pictures, animation, audio and video. Of course, it will be understood that the media content may also include other content.
As shown in Fig. 8, the device 400 includes the information crawling module 401, adapted to crawl webpage information. For example, as shown in Fig. 8, the information crawling module 401 may crawl webpage information from one or more website servers.
In an exemplary embodiment of the present invention, the webpage information may include text information and media content information. Optionally, the text information may include at least one of the following: the title, the summary and the body text. Optionally, the media content information may include at least one of the following: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In an exemplary embodiment of the present invention, for a webpage carrying pictures, the webpage information may include text information and picture information. Optionally, the text information may include: the title (indicated by arrow 3A), the summary (indicated by arrow 3B) and/or the URL of the webpage (indicated by arrow 3C). Optionally, the picture information may include the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown) and/or the URL address of the picture (not shown). Of course, it will be understood that the text information and the picture information may also include other content.
In an exemplary embodiment of the present invention, for a webpage carrying audio, the webpage information may include text information and audio information. Optionally, the text information may include: the title (indicated by arrow 4A), the summary (indicated by arrow 4B) and/or the URL of the webpage (indicated by arrow 4C). Optionally, the audio information may include the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown) and/or the URL address of the audio (not shown). Of course, it will be understood that the text information and the audio information may also include other content.
As shown in Fig. 8, the device 400 includes the identifier detection module 403, adapted to detect whether the webpage information contains the identifier of preset media content information.
In an exemplary embodiment of the present invention, the identifier detection module 403 uses the identifier of preset media content information to judge whether the crawled webpage information contains specific media content. Optionally, when a search keyword entered by the user matches the specific media content, the search engine can provide a search result containing the webpage. Of course, it will be understood that embodiments of the present invention do not limit the specific form of the identifier of preset media content information.
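One possible form of identifier detection is sketched below. Because the description leaves the form of the identifier open, the sketch simply assumes that the identifier is the presence of an `img`, `audio` or `video` tag in the crawled HTML; this is a stand-in chosen for illustration, not the form used by the described device. It relies only on the standard-library `html.parser` module.

```python
from html.parser import HTMLParser

# Assumed identifiers: HTML tags mapped to the media type they indicate.
MEDIA_TAGS = {"img": "picture", "audio": "audio", "video": "video"}


class MediaIdentifierDetector(HTMLParser):
    """Collects the media types whose assumed identifiers occur in a page."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        if tag in MEDIA_TAGS:
            self.found.add(MEDIA_TAGS[tag])


def detect_media_identifiers(html):
    """Return the set of media types detected in the crawled webpage."""
    detector = MediaIdentifierDetector()
    detector.feed(html)
    detector.close()
    return detector.found
```

An empty result would mean that the webpage is not passed on to the information extraction step.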
As shown in Fig. 8, the device 400 includes the information extraction module 405, adapted to extract the text information and the media content information in the webpage information when the identifier is detected in the webpage information.
In an exemplary embodiment of the present invention, the information extraction module 405 is adapted, when the identifier is detected in the webpage information, to extract at least one of the following items of text information of the webpage: the title, the summary and the body text; and to extract at least one of the following items of media content information of the webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content item.
In the exemplary embodiment of the present invention shown in Fig. 3, for a webpage carrying pictures, when the identifier detection module 403 detects that the webpage information contains the identifier of preset picture information, the information extraction module 405 may optionally extract at least one of the following items of text information of the webpage: the title (indicated by arrow 3A), the summary (indicated by arrow 3B), and the URL of the webpage (indicated by arrow 3C). Optionally, the information extraction module 405 may extract at least one of the following items of picture information of the webpage: the picture title (indicated by arrow 3D), the picture quantity (indicated by arrow 3E), the first thumbnail of the picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of the picture (not shown).
In the exemplary embodiment of the present invention shown in Fig. 4, for a webpage carrying audio, when the identifier detection module 403 detects that the webpage information contains the identifier of preset audio information, the information extraction module 405 may optionally extract at least one of the following items of text information of the webpage: the title (indicated by arrow 4A), the summary (indicated by arrow 4B), and the URL of the webpage (indicated by arrow 4C). Optionally, the information extraction module 405 may extract at least one of the following items of audio information of the webpage: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
In an exemplary embodiment of the present invention, the information extraction module 405 is further adapted to assign a second URL address to the webpage, wherein the second URL address points to a page displaying second thumbnails of one or more media content items in the webpage.
In the exemplary embodiment of the present invention shown in Fig. 3, for a webpage carrying pictures, when the identifier detection module 403 detects that the webpage information contains the above identifier, the information extraction module 405 may extract the original URL addresses of the pictures in the webpage and assign a new URL address to the webpage, wherein the new URL address points to a page displaying thumbnails of one or more pictures in the webpage. Optionally, the new URL address points to a page displaying thumbnails of all pictures in the webpage. Optionally, when the user selects an option in the search result corresponding to the new URL address, for example the picture title in Fig. 3 (indicated by arrow 3D), a jump is made to the page corresponding to the new URL address (indicated by arrow 3G), so as to show the user the thumbnails of all pictures in the webpage. Optionally, when the user selects the thumbnail of any picture on that page, a jump is made to the original URL of that picture to provide the details of the picture.
As shown in Fig. 8, the device 400 includes the index library building module 407, adapted to build a text index library and a media content index library respectively, based on the text information and the media content information.
In an exemplary embodiment of the present invention, the index library building module 407 is adapted to associate the text information in the text index library with the media content information in the media content index library that relates to the same webpage.
In the exemplary embodiment of the invention shown in Fig. 3, for the webpage for carrying picture, index database sets up module 407 can be based at least one of the following text information that information extraction modules 405 are extracted:Title(As arrow 3A is signified Show), summary(As indicated by arrow 3B), and webpage URL(As indicated by arrow 3C), set up text index storehouse.Alternatively, Index database sets up module 407 and can be based at least one of the following pictorial information that information extraction modules 405 are extracted:Picture mark Topic(As indicated by arrow 3D), picture number(As indicated by arrow 3E), picture the first thumbnail(As indicated by arrow 3F)、 Picture author(It is not shown), picture size or resolution ratio(It is not shown), picture format(It is not shown)With the URL addresses of picture(Not Show), set up picture indices storehouse.Alternatively, index database sets up module 407 and may be adapted to make the above-mentioned word in text index storehouse to believe Breath is associated with the above-mentioned pictorial information on same webpage in picture indices storehouse.
In the exemplary embodiment of the invention shown in Fig. 4, for a web page carrying audio, the index library establishing module 407 is adapted to establish the text index library based on at least one of the following items of text information extracted by the information extraction module 405: the title (indicated by arrow 4A), the abstract (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, the index library establishing module 407 is adapted to establish an audio index library based on at least one of the following items of audio information extracted by the information extraction module 405: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown). Optionally, the index library establishing module 407 is adapted to associate the above text information in the text index library with the above audio information about the same web page in the audio index library.
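The association between the two index libraries described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `TextEntry`, `MediaEntry`, and `build_indexes`, the dictionary layout of a crawled page, and the choice of the page URL as the shared key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TextEntry:
    """Text information of one web page (title, abstract, URL)."""
    title: str
    abstract: str
    url: str


@dataclass
class MediaEntry:
    """Media content information of one web page (hypothetical layout)."""
    media_type: str                      # e.g. "picture" or "audio"
    title: str
    count: int
    first_thumbnails: list = field(default_factory=list)


def build_indexes(pages):
    """Build a text index and a media content index keyed by the same page URL.

    Sharing the key is one simple way to associate the text information
    with the media content information of the same web page.
    """
    text_index, media_index = {}, {}
    for page in pages:
        url = page["url"]
        text_index[url] = TextEntry(page["title"], page["abstract"], url)
        media = page.get("media")
        if media:  # only pages that carry media get a media entry
            media_index[url] = MediaEntry(
                media["type"],
                media["title"],
                len(media["items"]),
                [item["thumbnail"] for item in media["items"]],
            )
    return text_index, media_index


pages = [
    {
        "url": "http://example.com/a",
        "title": "Spring flowers",
        "abstract": "A photo set of spring flowers.",
        "media": {
            "type": "picture",
            "title": "Flower gallery",
            "items": [{"thumbnail": "t1.jpg"}, {"thumbnail": "t2.jpg"}],
        },
    },
    {"url": "http://example.com/b", "title": "News", "abstract": "Plain text."},
]
text_index, media_index = build_indexes(pages)
```

In this sketch a page without media still appears in the text index, while only media-carrying pages appear in the media content index, so a lookup by URL in both dictionaries recovers the associated pair.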
In embodiments of the invention, the device 400, with which a search engine collects web page media content information, can provide the client with a more intuitive and more easily understood way of searching media content information. It allows the user to get a rough idea of the media content in a web page and helps the user judge the relevance of a search result, thereby improving search efficiency.
Embodiment five
A device for providing web page media content information according to an exemplary embodiment of the invention is described below.
Optionally, the device is adapted to perform the previously described method 200.
Fig. 9 shows a schematic structural diagram of a device 500 for providing web page media content information according to an exemplary embodiment of the invention. In an embodiment of the present invention, the device 500 includes:
A request receiving module 501, adapted to receive a search request;
A request detection module 503, adapted to detect whether the search request is associated with media content;
A web page search module 505, adapted to, when the search request is associated with media content, look up web pages matching the search request in a preset text index library and a preset media content index library; and
An information extraction module 507, adapted to extract the text information and media content information of the web page from the text index library and the media content index library, respectively, as the search results of the search request.
In an embodiment of the present invention, the media content may include at least one of the following: pictures, animation, audio, and video. Of course, it should be understood that the media content may also include other content.
As shown in Fig. 9, the device 500 includes the request receiving module 501, adapted to receive a search request. For example, as shown in Fig. 9, the information extraction module 507 can receive search requests from one or more client devices. Optionally, the search request can be a search keyword entered by the user. Of course, it should be understood that embodiments of the invention do not limit the specific form of the search request.
As shown in Fig. 9, the device 500 includes the request detection module 503, adapted to detect whether the search request is associated with media content. Optionally, when the user enters a search keyword, the request detection module 503 determines whether the user's search request contains a demand for media content, for example a picture demand, an animation demand, a video demand, or an audio demand.
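This passage does not state how the request detection module 503 decides that a request carries a media demand. The following Python sketch assumes a simple keyword heuristic purely for illustration; the hint table `MEDIA_HINTS` and the function `detect_media_demand` are hypothetical and not part of the patent.

```python
# Hypothetical hint words per media type; a real system would likely use
# richer signals than plain keyword matching.
MEDIA_HINTS = {
    "picture": ("picture", "photo", "image", "wallpaper"),
    "animation": ("animation", "gif"),
    "audio": ("audio", "song", "music", "mp3"),
    "video": ("video", "movie", "clip"),
}


def detect_media_demand(query):
    """Return the media types that the search keywords appear to ask for."""
    words = query.lower().split()
    return [
        media_type
        for media_type, hints in MEDIA_HINTS.items()
        if any(hint in words for hint in hints)
    ]


print(detect_media_demand("spring flower photo"))
print(detect_media_demand("weather forecast"))
```

An empty list means the request is treated as an ordinary text search; a non-empty list names the media content index libraries worth consulting.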
As shown in Fig. 9, the device 500 includes the web page search module 505, adapted to, when the search request is associated with media content, look up web pages matching the search request in the preset text index library and the preset media content index library.
In one exemplary embodiment of the present invention, the preset text index library may include text information of web pages, for example the title, abstract, and/or body text of a web page. The preset media content index library may include media content information, for example the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and/or the first URL address of each media content.
In the exemplary embodiment of the invention shown in Fig. 3, when the request detection module 503 detects that the search request is associated with pictures, the web page search module 505 can look up web pages matching the search request in the preset text index library and a preset picture index library. Optionally, the preset text index library may include at least one of the following items of text information: the title of the web page (indicated by arrow 3A), the abstract (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, the preset picture index library may include at least one of the following items of picture information: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of a picture (not shown).
In the exemplary embodiment of the invention shown in Fig. 4, when the request detection module 503 detects that the search request is associated with audio, the web page search module 505 can look up web pages matching the search request in the preset text index library and a preset audio index library. Optionally, the preset text index library may include at least one of the following items of text information: the title (indicated by arrow 4A), the abstract (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, the preset audio index library may include at least one of the following items of audio information: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
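The lookup performed by the web page search module 505 can be sketched as follows in Python. The matching rule (every query term must occur in the page title, abstract, or media title) and the dictionary layout of both index libraries are assumptions chosen for brevity, not the patent's matching method.

```python
def find_matching_pages(query, media_type, text_index, media_index):
    """Return URLs of pages that match the query and carry the requested media type."""
    terms = query.lower().split()
    matches = []
    for url, text in text_index.items():
        media = media_index.get(url)
        if media is None or media["type"] != media_type:
            continue  # page has no media content of the requested type
        searchable = " ".join(
            (text["title"], text["abstract"], media["title"])
        ).lower()
        if all(term in searchable for term in terms):
            matches.append(url)
    return matches


# Hypothetical preset index libraries, both keyed by page URL.
text_index = {
    "http://example.com/a": {"title": "Spring flowers", "abstract": "Photos of tulips."},
    "http://example.com/b": {"title": "Spring songs", "abstract": "A playlist."},
}
media_index = {
    "http://example.com/a": {"type": "picture", "title": "Tulip gallery"},
    "http://example.com/b": {"type": "audio", "title": "Spring playlist"},
}

print(find_matching_pages("spring", "picture", text_index, media_index))
print(find_matching_pages("spring", "audio", text_index, media_index))
```

Both pages mention "spring", but each call returns only the page whose media content index entry has the requested type.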
As shown in Fig. 9, the device 500 includes the information extraction module 507, adapted to extract the text information and media content information of the web page from the text index library and the media content index library, respectively, as the search results of the search request. Optionally, the search results can be displayed on one or more client devices.
In one exemplary embodiment of the present invention, the information extraction module 507 is adapted to extract from the text index library at least one of the following items of text information of the web page as the search results of the search request: the title, the abstract, and the body text.
In one exemplary embodiment of the present invention, the information extraction module 507 is adapted to extract from the media content index library at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content.
In one exemplary embodiment of the present invention, the information extraction module 507 is adapted to assign a second URL address to one or more media contents in the web page, where the second URL address points to a page displaying second thumbnails of the one or more media contents.
In the exemplary embodiment of the invention shown in Fig. 3, when the request detection module 503 detects that the search request is associated with pictures, the information extraction module 507 can extract the text information and picture information of the web page from the text index library and the picture index library, respectively, and can assign a new URL address to one or more pictures in the web page, where the new URL address points to a page displaying thumbnails of the one or more pictures in the web page (indicated by arrow 3G). Optionally, the information extraction module 507 can assign a new URL address to all pictures in the web page, where the new URL address points to a page displaying thumbnails of all pictures in the web page. Optionally, when the user selects an option in the search results that corresponds to the new URL address, such as the picture title in Fig. 3 (indicated by arrow 3D), the display jumps to the page corresponding to the new URL address (indicated by arrow 3G) to show the user thumbnails of all pictures in the web page.
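The assignment of a second (new) URL address can be pictured with the short Python sketch below. The gallery base address `GALLERY_BASE`, the URL scheme, and the helper names are hypothetical; the patent only requires that the new address point to a page of thumbnails, each of which leads back to the picture's original (first) URL.

```python
from urllib.parse import quote

# Hypothetical address of the search engine's thumbnail page.
GALLERY_BASE = "http://search.example.com/gallery?page="


def assign_second_url(page_url):
    """Derive a second URL address for the thumbnail page of a web page."""
    return GALLERY_BASE + quote(page_url, safe="")


def build_gallery(page_url, pictures):
    """Describe the thumbnail page: each second thumbnail keeps its first URL."""
    return {
        "second_url": assign_second_url(page_url),
        "items": [
            {"second_thumbnail": pic["thumbnail"], "first_url": pic["url"]}
            for pic in pictures
        ],
    }


gallery = build_gallery(
    "http://example.com/a",
    [
        {"thumbnail": "t1.jpg", "url": "http://example.com/a/1.jpg"},
        {"thumbnail": "t2.jpg", "url": "http://example.com/a/2.jpg"},
    ],
)
print(gallery["second_url"])
```

Encoding the source page URL into the new address is only one possible design; a lookup table from an opaque identifier to the page would serve equally well.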
In one exemplary embodiment of the present invention, the information extraction module 507 includes: a text information extraction unit and a media content information extraction unit, adapted to extract the text information and media content information of the web page from the text index library and the media content index library, respectively; and an information combination unit, adapted to combine the text information and media content information of the web page in a predetermined manner as the search results of the search request.
In one exemplary embodiment of the present invention, the information combination unit is adapted to select the first thumbnail of one media content from the media content information of the web page, and to display the first thumbnail of that media content in the search results.
In the exemplary embodiment of the invention shown in Fig. 3, when the request detection module 503 detects that the search request is associated with pictures, the text information extraction unit and the media content information extraction unit respectively extract from the text index library the following text information of the web page: the title of the web page (indicated by arrow 3A), the abstract (indicated by arrow 3B), and/or the URL of the web page (indicated by arrow 3C); and extract from the picture index library the following picture information of the web page: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of a picture (not shown). Optionally, the information combination unit selects one first thumbnail from the extracted first thumbnails of the pictures (indicated by arrow 3F) for display in the search results. As shown in Fig. 3, each search result includes the title, abstract, and URL of the web page, together with the picture title, the number of pictures, and one first thumbnail of a picture (indicated by arrow 3F).
In one exemplary embodiment of the present invention, the information combination unit is adapted to select the first thumbnails of multiple media contents from the media content information of the web page, and to display the first thumbnails of the multiple media contents in the search results.
In the exemplary embodiment of the invention shown in Fig. 6, the information combination unit can select the first thumbnails of four pictures from the first thumbnails of the pictures (indicated by arrow 6E) for display in the search results. Of course, it should be understood that the number of first thumbnails selected by the information combination unit is not limited to the number described in this embodiment. In the search results shown in Fig. 6, each search result includes the title, abstract, and URL of the web page, together with the picture title, the number of pictures, and four first thumbnails of pictures (indicated by arrow 6E).
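The behaviour of the information combination unit can be sketched in Python as follows. The function name `combine_result`, the field names, and the `max_thumbnails` parameter are assumptions; they merely cover both the single-thumbnail layout of Fig. 3 and the four-thumbnail layout of Fig. 6 with one illustrative routine.

```python
def combine_result(text_info, media_info, max_thumbnails=1):
    """Combine text and media content information into one search result.

    max_thumbnails=1 mirrors the layout of Fig. 3; max_thumbnails=4 mirrors
    the layout of Fig. 6. Any other count is equally possible.
    """
    return {
        "title": text_info["title"],
        "abstract": text_info["abstract"],
        "url": text_info["url"],
        "media_title": media_info["title"],
        "media_count": media_info["count"],
        "first_thumbnails": media_info["first_thumbnails"][:max_thumbnails],
    }


text_info = {
    "title": "Spring flowers",
    "abstract": "A photo set of spring flowers.",
    "url": "http://example.com/a",
}
media_info = {
    "title": "Flower gallery",
    "count": 6,
    "first_thumbnails": ["t1.jpg", "t2.jpg", "t3.jpg", "t4.jpg", "t5.jpg", "t6.jpg"],
}

single = combine_result(text_info, media_info)
multiple = combine_result(text_info, media_info, max_thumbnails=4)
```

The picture count stays at the full number of pictures on the page even when only some thumbnails are shown, which matches the separate "number of pictures" field in the figures.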
In one exemplary embodiment of the present invention, the media content information includes a text portion and a thumbnail portion, and the text portion points to a page displaying second thumbnails of one or more media contents.
In the exemplary embodiment of the invention shown in Fig. 3, when the request detection module 503 detects that the search request is associated with pictures, the picture information includes a text portion and a thumbnail portion. In the search results, the text portion may include the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of a picture (not shown); the thumbnail portion may include the first thumbnail of a picture (indicated by arrow 3F). When the user selects the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), or another text portion, the display jumps to a new page (indicated by arrow 3G), which displays second thumbnails of one or more pictures (indicated by arrow 3H). Optionally, the page displays second thumbnails of all pictures in the web page.
In the exemplary embodiment of the invention shown in Fig. 6, when the request detection module 503 detects that the search request is associated with pictures, the picture information includes a text portion and a thumbnail portion. In the search results, the text portion may include the picture title (not shown), the number of pictures (indicated by arrow 6D), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), the URL address of a picture (not shown), and/or another text portion (such as the "》" indicated by arrow 6G); the thumbnail portion may include the first thumbnail of a picture (indicated by arrow 6E). When the user selects the number of pictures (indicated by arrow 6D) or another text portion (such as the "》" indicated by arrow 6G), the display jumps to a new page (indicated by arrow 6H), which displays second thumbnails of one or more pictures (indicated by arrow 6I). Optionally, the page displays second thumbnails of all pictures in the web page.
In embodiments of the invention, the device 500 for providing web page media content information can provide the client with a more intuitive and more easily understood way of searching media content information. It allows the user to get a rough idea of the media content in a web page and helps the user judge the relevance of a search result, thereby improving search efficiency.
Embodiment six
A device for providing web page media content information according to an exemplary embodiment of the invention is described below.
Optionally, the device is adapted to perform the previously described method 300.
Fig. 10 shows a schematic structural diagram of a device 600 for providing web page media content information according to an exemplary embodiment of the invention.
In an embodiment of the present invention, the device 600 includes:
An information extraction module 601, adapted to, upon receiving a search request that matches the mark of media content information preset in a web page, extract the text information and media content information of the web page as the search results of the search request; and
A search result providing module 603, adapted to provide the search results in response to a selection of the text information and media content information of the web page.
In an embodiment of the present invention, the media content may include at least one of the following: pictures, animation, audio, and video. Of course, it should be understood that the media content may also include other content.
As shown in Fig. 10, the device 600 includes the information extraction module 601, adapted to, upon receiving a search request that matches the mark of media content information preset in a web page, extract the text information and media content information of the web page as the search results of the search request. For example, as shown in Fig. 10, upon receiving from one or more client devices a search request that matches the mark of media content information preset in a web page, the information extraction module 601 can extract the text information and media content information of the web page as the search results of the search request.
Optionally, the search request can be a search keyword entered by the user. Of course, it should be understood that embodiments of the invention do not limit the specific form of the search request.
In an exemplary embodiment of the invention, the information extraction module 601 is adapted to, upon receiving a search request that matches the mark of media content information preset in a web page, extract at least one of the following items of text information of the web page as the search results of the search request: the title, the abstract, and the body text.
In an exemplary embodiment of the invention, the information extraction module 601 is adapted to, upon receiving a search request that matches the mark of media content information preset in a web page, extract at least one of the following items of media content information of the web page: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content.
In the exemplary embodiment of the invention shown in Fig. 3, upon receiving a search request that matches the mark of picture information preset in a web page, the information extraction module 601 can extract at least one of the following items of text information of the web page as the search results of the search request: the title of the web page (indicated by arrow 3A), the abstract (indicated by arrow 3B), and the URL of the web page (indicated by arrow 3C). Optionally, the information extraction module 601 can extract at least one of the following items of picture information of the web page as the search results of the search request: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and the URL address of a picture (not shown).
In the exemplary embodiment of the invention shown in Fig. 4, upon receiving a search request that matches the mark of audio information preset in a web page, the information extraction module 601 can extract at least one of the following items of text information of the web page as the search results of the search request: the title (indicated by arrow 4A), the abstract (indicated by arrow 4B), and the URL of the web page (indicated by arrow 4C). Optionally, the information extraction module 601 can extract at least one of the following items of audio information of the web page as the search results of the search request: the audio title (indicated by arrow 4D), the audio thumbnail (indicated by arrow 4E), the audio author (indicated by arrow 4F), the audio size (indicated by arrow 4G), the audio format (not shown), and the URL address of the audio (not shown).
In an exemplary embodiment of the invention, the information extraction module 601 can extract a second URL address pre-assigned to each media content, where the second URL address points to a page displaying second thumbnails of the one or more media contents.
In the exemplary embodiment of the invention shown in Fig. 3, upon receiving a search request that matches the mark of picture information preset in a web page, the information extraction module 601 can extract the second URL address pre-assigned to each picture, where the second URL address points to a page displaying second thumbnails of one or more pictures. Optionally, the information extraction module 601 can extract a new URL address assigned to all pictures in the web page, where the new URL address points to a page displaying thumbnails of all pictures in the web page (indicated by arrow 3G). Optionally, when the user selects an option in the search results that corresponds to the new URL address, such as the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), or another text portion, the display jumps to the page corresponding to the new URL address (indicated by arrow 3G) to show the user thumbnails of all pictures in the web page.
In one exemplary embodiment of the present invention, the information extraction module 601 includes: a text information extraction unit and a media content information extraction unit, adapted to extract the text information and media content information of the web page upon receiving a search request that matches the mark of media content information preset in the web page; and an information combination unit, adapted to combine the text information and media content information of the web page in a predetermined manner as the search results of the search request.
As shown in Fig. 10, the device 600 includes the search result providing module 603, adapted to provide the search results in response to a selection of the text information and media content information of the web page. For example, as shown in Fig. 10, the search results can be displayed on one or more client devices.
In one exemplary embodiment of the present invention, the search result providing module 603 may, in response to a selection of the web page text information, jump to the first URL address to provide the search results. For example, as shown in Fig. 4, the search result providing module 603 may, in response to a selection of the web page text information (the web page title indicated by arrow 4A), perform the step of jumping to the first URL address to provide the details of the video (indicated by arrow 4H).
In another exemplary embodiment of the invention, the text information extraction unit and the media content information extraction unit can extract the text information and media content information of the web page upon receiving a search request that matches the mark of media content information preset in the web page; and the information combination unit can combine the text information and media content information of the web page in the following predetermined manner as the search results of the search request: selecting the first thumbnail of one media content from the media content information of the web page, and displaying the first thumbnail of that media content in the search results. Optionally, the search result providing module 603 may, in response to a selection of the first thumbnail of that media content, perform the step of jumping to the second URL address to obtain the page displaying second thumbnails of one or more media contents. Optionally, the search result providing module 603 may, in response to a selection of the second thumbnail of any media content displayed at the second URL address, perform the step of jumping to the first URL address of that media content to provide the information of that media content.
In the exemplary embodiment of the invention shown in Fig. 3, the text information extraction unit and the media content information extraction unit can, upon receiving a search request that matches the mark of picture information preset in the web page, extract the following text information of the web page: the title of the web page (indicated by arrow 3A), the abstract (indicated by arrow 3B), and/or the URL of the web page (indicated by arrow 3C); and extract the following picture information of the web page: the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), the first thumbnail of a picture (indicated by arrow 3F), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of a picture (not shown). Optionally, the information combination unit can combine the text information and picture information of the web page in the following predetermined manner as the search results of the search request: selecting the first thumbnail of one picture from the picture information of the web page, and displaying the first thumbnail of that picture in the results (indicated by arrow 3F). Optionally, when the user selects the thumbnail of that picture (indicated by arrow 3F), the search result providing module 603 can perform the step of jumping to a new page (indicated by arrow 3G), which displays second thumbnails of one or more pictures (indicated by arrow 3H). Optionally, the page displays second thumbnails of all pictures in the web page. Optionally, when the user selects the second thumbnail of any picture (indicated by arrow 3H) on the new page (indicated by arrow 3G), the search result providing module 603 can perform the step of jumping to the first URL address of that picture to display its details.
In another exemplary embodiment of the invention, the text information extraction unit and the media content information extraction unit can extract the text information and media content information of the web page upon receiving a search request that matches the mark of media content information preset in the web page; and the information combination unit can combine the text information and media content information of the web page in the following predetermined manner as the search results of the search request: selecting the first thumbnails of multiple media contents from the media content information of the web page, and displaying the first thumbnails of the multiple media contents in the search results. Optionally, the search result providing module 603 may, in response to a selection of the first thumbnail of any media content, perform the step of jumping to the first URL address of that media content to provide the information of that media content.
In the exemplary embodiment of the invention shown in Fig. 6, the text information extraction unit and the media content information extraction unit can, upon receiving a search request that matches the mark of picture information preset in the web page, extract the following text information of the web page: the title (indicated by arrow 6A), the abstract (indicated by arrow 6B), and/or the URL of the web page (indicated by arrow 6C); and extract the following picture information of the web page: the picture title (not shown), the number of pictures (indicated by arrow 6D), the first thumbnail of a picture (indicated by arrow 6E), the picture author (not shown), the picture size or resolution (not shown), the picture format (not shown), and/or the URL address of a picture (not shown). Optionally, the information combination unit can combine the text information and picture information of the web page in the following predetermined manner as the search results of the search request: selecting four first thumbnails from the picture information of the web page, and displaying the four first thumbnails in the results. Of course, it should be understood that embodiments of the invention do not limit the number of pictures selected and displayed. Optionally, when the user selects any one of the four picture thumbnails, the search result providing module 603 can perform the step of jumping to a new page (indicated by arrow 6F), which displays the details of that picture.
In another exemplary embodiment of the invention, the text information extraction unit and the media content information extraction unit can extract the text information and media content information of the web page upon receiving a search request that matches the mark of media content information preset in the web page, where the media content information includes a text portion and a thumbnail portion; and the information combination unit can combine the text information and media content information of the web page in a predetermined manner as the search results of the search request. Optionally, the search result providing module 603 may, in response to a selection of the text portion, perform the step of jumping to the second URL address to obtain the page displaying second thumbnails of one or more media contents. Optionally, the search result providing module 603 may, in response to a selection of the second thumbnail of any media content displayed at the second URL address, perform the step of jumping to the first URL address of that media content to provide the information of that media content.
In the exemplary embodiment of the invention shown in Fig. 3, the picture information in the search results includes a text portion and a thumbnail portion. The text portion may include the picture title (indicated by arrow 3D), the number of pictures (indicated by arrow 3E), and/or other text; the thumbnail portion includes the first thumbnail of a picture (indicated by arrow 3F). Optionally, when the user selects the picture title (indicated by arrow 3D) or the number of pictures (indicated by arrow 3E), the search result providing module 603 can perform the step of jumping to a new page (indicated by arrow 3G), which displays second thumbnails of one or more pictures (indicated by arrow 3H). Optionally, the page displays second thumbnails of all pictures in the web page. Optionally, in response to a selection of the second thumbnail of any picture (indicated by arrow 3H), the search result providing module 603 can perform the step of jumping to the first URL address of that picture to provide its details.
In the exemplary embodiment of the invention shown in Fig. 6, the picture information in the search results includes a text portion and a thumbnail portion. The text portion may include the picture title (not shown), the number of pictures (indicated by arrow 6D), and/or other text (such as the "》" indicated by arrow 6G in Fig. 6); the thumbnail portion includes the first thumbnail of a picture (indicated by arrow 6E). Optionally, when the user selects the picture title (not shown), the number of pictures (indicated by arrow 6D), or other text (such as the "》" indicated by arrow 6G in Fig. 6), the search result providing module 603 can perform the step of jumping to a new page (indicated by arrow 6H), which displays second thumbnails of one or more pictures (indicated by arrow 6I). Optionally, the page displays second thumbnails of all pictures in the web page. Optionally, in response to a selection of the second thumbnail of any picture (indicated by arrow 6I), the search result providing module 603 can perform the step of jumping to the first URL address of that picture to provide its details.
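The jump behaviour of the search result providing module 603 described in the preceding embodiments can be summarised in a small Python sketch. The function `resolve_jump`, the selection labels, and the result layout are hypothetical; the rules encoded are only those stated above: a text portion leads to the second URL address, a second thumbnail leads to the first URL address of its media content, and a first thumbnail leads to the second URL address when a single thumbnail is shown or to the first URL address of its media content when several are shown.

```python
def resolve_jump(result, selection, position=0):
    """Return the URL address to jump to for a user selection (illustrative)."""
    if selection == "text_part":
        # Picture title, picture count, or other text: open the thumbnail page.
        return result["second_url"]
    if selection == "second_thumbnail":
        # A thumbnail on the thumbnail page: open that media content's details.
        return result["media"][position]["first_url"]
    if selection == "first_thumbnail":
        if len(result["first_thumbnails"]) == 1:
            return result["second_url"]
        return result["media"][position]["first_url"]
    raise ValueError(f"unknown selection: {selection}")


result = {
    "second_url": "http://search.example.com/gallery?id=1",  # hypothetical
    "first_thumbnails": ["t1.jpg", "t2.jpg", "t3.jpg", "t4.jpg"],
    "media": [
        {"first_url": "http://example.com/a/1.jpg"},
        {"first_url": "http://example.com/a/2.jpg"},
        {"first_url": "http://example.com/a/3.jpg"},
        {"first_url": "http://example.com/a/4.jpg"},
    ],
}

print(resolve_jump(result, "text_part"))
print(resolve_jump(result, "first_thumbnail", position=2))
```

Keeping the rules in one function makes the difference between the single-thumbnail and multi-thumbnail layouts explicit, although an actual implementation could just as well embed the target address in each rendered link.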
In embodiments of the invention, the device 600 for providing web page media content information can provide both text information and media content information in the search results, thereby providing the client with a more intuitive and more easily understood way of searching media content information. It allows the user to get a rough idea of the media content in a web page and helps the user judge the relevance of a search result, thereby improving search efficiency.
The methods and devices provided herein are not inherently related to any particular computer, virtual system, or other equipment. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a device is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. However, it should be understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be appreciated that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments of the invention. However, the method of the disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Thus, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. Several modules in an embodiment may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or modules are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various device embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that, in practice, a microprocessor or a digital signal processor (DSP) may be used to implement some or all of the functions of some or all of the modules in a device according to embodiments of the invention. The invention may also be implemented as a device program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order; these words may be interpreted as names.

Claims (16)

1. A method for providing web page media content information, comprising the steps of:
receiving a search request;
detecting whether the search request is associated with media content;
in the case where the search request is associated with media content, searching a preset text index database and a preset media content index database for a webpage matching the search request; and
extracting text information and media content information of the webpage from the text index database and the media content index database, respectively, as the search result of the search request;
wherein the step of extracting the text information and the media content information of the webpage from the text index database and the media content index database, respectively, as the search result of the search request further comprises:
allocating a second URL address to one or more media contents in the webpage, wherein the second URL address points to a page displaying second thumbnails of the one or more media contents.
2. The method of claim 1, wherein the media content includes at least one of the following: pictures, animation, audio and video.
3. The method of claim 1, wherein the step of extracting the text information and the media content information of the webpage from the text index database and the media content index database, respectively, as the search result of the search request comprises:
extracting from the text index database at least one of the following items of text information of the webpage: title, summary and body text, as the search result of the search request.
4. The method of any one of claims 1-3, wherein the step of extracting the text information and the media content information of the webpage from the text index database and the media content index database, respectively, as the search result of the search request comprises:
extracting from the media content index database at least one of the following items of media content information of the webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content.
5. The method of any one of claims 1-3, wherein the step of extracting the text information and the media content information of the webpage from the text index database and the media content index database, respectively, as the search result of the search request comprises:
extracting the text information and the media content information of the webpage from the text index database and the media content index database, respectively; and
combining the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request.
6. The method of claim 5, wherein the step of combining the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request, comprises:
selecting a first thumbnail of one media content from the media content information of the webpage; and
displaying the first thumbnail of the one media content in the search result.
7. The method of claim 5, wherein the step of combining the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request, comprises:
selecting first thumbnails of multiple media contents from the media content information of the webpage; and
displaying the first thumbnails of the multiple media contents in the search result.
8. The method of any one of claims 1-3, wherein the media content information includes a text part and a thumbnail part, and the text part points to the page displaying the second thumbnails of the one or more media contents.
9. A device for providing web page media content information, comprising:
a request receiving module, adapted to receive a search request;
a request detection module, adapted to detect whether the search request is associated with media content;
a webpage searching module, adapted to search, in the case where the search request is associated with media content, a preset text index database and a preset media content index database for a webpage matching the search request; and
an information extraction module, adapted to extract text information and media content information of the webpage from the text index database and the media content index database, respectively, as the search result of the search request;
wherein the information extraction module is adapted to:
allocate a second URL address to one or more media contents in the webpage, wherein the second URL address points to a page displaying second thumbnails of the one or more media contents.
10. The device of claim 9, wherein the media content includes at least one of the following: pictures, animation, audio and video.
11. The device of claim 9, wherein the information extraction module is adapted to:
extract from the text index database at least one of the following items of text information of the webpage: title, summary and body text, as the search result of the search request.
12. The device of any one of claims 9-11, wherein the information extraction module is adapted to:
extract from the media content index database at least one of the following items of media content information of the webpage: the title, quantity, first thumbnail, author, length and/or size, and format of the media content, and the first URL address of each media content.
13. The device of any one of claims 9-11, wherein the information extraction module comprises:
a text information extraction unit and a media content information extraction unit, adapted to extract the text information and the media content information of the webpage from the text index database and the media content index database, respectively; and
an information combination unit, adapted to combine the text information and the media content information of the webpage in a predetermined manner, as the search result of the search request.
14. The device of claim 13, wherein the information combination unit is adapted to:
select a first thumbnail of one media content from the media content information of the webpage; and
display the first thumbnail of the one media content in the search result.
15. The device of claim 13, wherein the information combination unit is adapted to:
select first thumbnails of multiple media contents from the media content information of the webpage; and
display the first thumbnails of the multiple media contents in the search result.
16. The device of any one of claims 9-11, wherein the media content information includes a text part and a thumbnail part, and the text part points to the page displaying the second thumbnails of the one or more media contents.
CN201310487602.XA 2013-10-17 2013-10-17 A kind of method and apparatus that web page media content information is provided Active CN103761232B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310487602.XA CN103761232B (en) 2013-10-17 2013-10-17 A kind of method and apparatus that web page media content information is provided

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310487602.XA CN103761232B (en) 2013-10-17 2013-10-17 A kind of method and apparatus that web page media content information is provided

Publications (2)

Publication Number Publication Date
CN103761232A CN103761232A (en) 2014-04-30
CN103761232B true CN103761232B (en) 2017-07-11

Family

ID=50528472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310487602.XA Active CN103761232B (en) 2013-10-17 2013-10-17 A kind of method and apparatus that web page media content information is provided

Country Status (1)

Country Link
CN (1) CN103761232B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10289700B2 (en) * 2016-03-01 2019-05-14 Baidu Usa Llc Method for dynamically matching images with content items based on keywords in response to search queries
US10496698B2 (en) * 2016-08-24 2019-12-03 Baidu Usa Llc Method and system for determining image-based content styles
CN108268486B (en) * 2016-12-30 2022-04-19 中兴通讯股份有限公司 Multimedia content association and playing method and device, and terminal
CN108595640A (en) * 2018-04-25 2018-09-28 河南职业技术学院 Computer based picture display method and picture display device
CN109547862A (en) * 2018-12-06 2019-03-29 武汉微梦文化科技有限公司 A kind of new media live broadcast system
CN110377797A (en) * 2019-07-31 2019-10-25 重庆大司空信息科技有限公司 A kind of occupational qualification search method and system

Also Published As

Publication number Publication date
CN103761232A (en) 2014-04-30

Similar Documents

Publication Publication Date Title
CN103761232B (en) A kind of method and apparatus that web page media content information is provided
US10140378B2 (en) Providing search results based on execution of applications
US10353947B2 (en) Relevancy evaluation for image search results
US11847124B2 (en) Contextual search on multimedia content
US9569541B2 (en) Evaluating preferences of content on a webpage
US9659278B2 (en) Methods, systems, and computer program products for displaying tag words for selection by users engaged in social tagging of content
US9218414B2 (en) System, method, and user interface for a search engine based on multi-document summarization
KR101132509B1 (en) Mobile system, search system and search result providing method for mobile search
CN103559286B (en) Processing method and device for video searching results
CN104991962B (en) A kind of method and device generating recommendation information
CN104008180B (en) Association method of structural data with picture, association device thereof
CN104462575B (en) The implementation method and device of music synthesis search
CN108959586A (en) Text vocabulary is identified in response to visual query
CN105786875B (en) Question and answer are provided to the method and apparatus of data search result
JPWO2014178219A1 (en) Information processing apparatus and information processing method
WO2015003664A1 (en) Method, device, server, and client device for download processing
CN106599285A (en) News searching-based searching result providing method and apparatus
CN103761231A (en) Method and device for providing media content information of page by search engine
CN105808623B (en) A kind of page access event correlation methodology and device based on search
CN105786871B (en) Question and answer class search result rendering method and device based on search term
CN107562954A (en) Recommendation searching method, device and mobile terminal based on mobile terminal
CN103761230A (en) Method and device for capturing media content information of webpage by search engine
WO2015143911A1 (en) Method and device for pushing webpages containing time-relevant information
CN104536968B (en) A kind of method and apparatus for Optimizing Search result
CN103473357B (en) A kind of method and device of search engine offer open type summary information of webpage

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220725

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.