CN1845100A - Image extraction feedback method in web search - Google Patents

Image extraction feedback method in web search

Info

Publication number
CN1845100A
Authority
CN
China
Prior art keywords
user
webpage
feedback
image extraction
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200610040316.9A
Other languages
Chinese (zh)
Other versions
CN100481079C (en)
Inventor
周志华
薛晓冰
张仲非
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CNB2006100403169A
Publication of CN1845100A
Application granted
Publication of CN100481079C
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image extraction (image summary) feedback method for Web search, comprising: (1) a page processor receives a query composed of keywords submitted by the user and searches the Web with a keyword-based Web search method; (2) the user query is refined through feedback based on image extractions; (3) search results that satisfy the user are returned. The invention uses both the text information and the image information in webpages to capture the user's information need effectively and thereby improve Web search performance.

Description

Image extraction feedback method in Web search
I. Technical field
The present invention relates to page processors, and in particular to an image extraction feedback method applied to Web search.
II. Background
With the rapid development of multimedia technology and the Internet, the Web has gradually become people's most important information source and their most frequently used, efficient platform for information exchange. Because the amount of information on the Web is huge and its content is complex, it is very difficult for users to find information on the Internet, so designing effective Web search techniques has become an important task. In a Web search, after the search engine returns results to the user, a user who is dissatisfied with them usually provides feedback on the results, for example by indicating which results are relevant to the search target; the search engine then uses this feedback to search again and produce better results. Current user-feedback techniques in Web search use only the text in webpages and make no use of the large amount of image information that webpages contain.
III. Summary of the invention
1. Purpose of the invention: the main purpose of the present invention is to address the problem that current user-feedback techniques in Web search make poor use of the image content in webpages, by providing a user-feedback method that uses text information and image information together and that helps improve Web search performance by capturing the user's information need effectively.
2. Technical solution: to achieve this purpose, the image extraction feedback method for Web search according to the present invention comprises the following steps: (1) the page processor accepts a query composed of keywords submitted by the user and searches the Web with a keyword-based Web search method; (2) the user query is refined through feedback based on image extractions; (3) search results that satisfy the user are finally returned.
The method of refining the user query through feedback based on image extractions comprises the following steps: (11) obtain the user's query keywords; (12) retrieve relevant webpages with a keyword-based Web search method; (13) generate a text summary and image extractions of the retrieved webpages and present them to the user to help the user understand the search results; (14) the user judges the retrieved webpages; if they satisfy the user's need, go to (16) and the whole process ends; otherwise, go to (15); (15) obtain the user's feedback information, generate new query keywords, and go to (12); (16) end.
3. Beneficial effects: the method provided by the present invention uses both the text information and the large amount of image information present in webpages, and by capturing the user's information need effectively it substantially helps improve Web search performance.
The preferred embodiment is described in detail below with reference to the accompanying drawings.
IV. Brief description of the drawings
Fig. 1 is a flowchart of the page processor's workflow.
Fig. 2 is a flowchart of the method of the present invention.
Fig. 3 is a flowchart of obtaining the text summary and the image extractions.
Fig. 4 is a flowchart of obtaining the first-class image extraction.
Fig. 5 is a flowchart of obtaining the second-class image extractions.
Fig. 6 is a flowchart of obtaining the user's feedback information and generating new query keywords.
Fig. 7 is a flowchart of the first-class feedback chosen by the user.
Fig. 8 is a flowchart of the second-class feedback chosen by the user.
V. Detailed description of the embodiment
As shown in Fig. 1, the page processor accepts a query composed of keywords submitted by the user and searches the Web using a classical keyword-based Web search technique. It then applies the technique described in Fig. 2, which refines the user query through feedback based on image extractions, and finally returns search results that satisfy the user.
The technique of the present invention is shown in Fig. 2. Step 10 is the start. Step 11 obtains the user's query keywords. Step 12 retrieves relevant webpages using a classical keyword-based Web search technique. Step 13 generates a text summary and image extractions of the retrieved webpages and presents them to the user to help the user understand the search results. A text summary is text obtained by condensing the textual content of a webpage so that it reflects the page's content to some extent; classical Web search techniques use text summaries to present search results to the user. Image extraction is the technique proposed in this patent: it obtains from the webpages the images most relevant to the user's query, and presenting these images helps the user understand the search results faster and more accurately. In step 14 the user judges the retrieved webpages: if they satisfy the user's need, the method goes to step 16 and the whole process ends; otherwise it goes to step 15, which obtains the user's feedback information, uses it to generate new query keywords, and submits them to the system; the method then returns to step 12 and repeats the above process.
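The loop of Fig. 2 can be sketched in a few lines of Python. This is only an illustrative sketch: the callables `search`, `summarize`, `is_satisfied` and `new_keywords` are hypothetical stand-ins for steps 12 to 15, the `max_rounds` cap is an added safety limit that the patent does not mention, and appending the new keywords to the old query is an assumption based on the description of the keywords as newly added.

```python
from typing import Callable, List, Sequence


def refine_by_feedback(
    keywords: List[str],
    search: Callable[[Sequence[str]], list],
    summarize: Callable[[list, Sequence[str]], list],
    is_satisfied: Callable[[list], bool],
    new_keywords: Callable[[list, Sequence[str]], List[str]],
    max_rounds: int = 10,
) -> list:
    """Steps 11-16 of Fig. 2 as a loop; every callable is a hypothetical stand-in."""
    pages: list = []
    for _ in range(max_rounds):                 # safety cap, not part of the patent
        pages = search(keywords)                # step 12: keyword-based Web search
        summaries = summarize(pages, keywords)  # step 13: text summary + image extractions
        if is_satisfied(summaries):             # step 14: the user judges the results
            break                               # step 16: end
        # step 15: add the keywords derived from the user's feedback
        keywords = keywords + new_keywords(summaries, keywords)
    return pages
```

Passing the user interaction in as callables keeps the control flow of Fig. 2 separate from how summaries are displayed or how feedback is collected.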
Fig. 3 details step 13, that is, obtaining the text summary and the image extractions of the webpages. Step 130 is the start. Step 131 generates the text summary of each webpage using a classical text summarization method. Step 132 generates the first-class image extractions; these correspond one-to-one with the webpages and are used both to help the text summary present the webpage content and for later user feedback. Step 133 generates the second-class image extractions, which are used for later user feedback. In the Web search interface, a first-class image extraction should be shown to the user together with the text summary of its webpage, whereas the second-class image extractions need not be placed together with the text summaries. Step 134 is the end.
Fig. 4 details step 132, that is, obtaining the first-class image extraction. Step 1320 is the start. Step 1321 sets counter i to 1. Step 1322 tests whether i is greater than M, the number of retrieved webpages: if i is greater than M, go to step 1328 and end; otherwise go to step 1323. Step 1323 partitions the i-th webpage into blocks using a classical webpage segmentation algorithm. Step 1324 judges whether each block is important, using the importance judgment method of classical Web search techniques. Step 1325 selects, from the important blocks that contain at least one image, the block most similar to the query keywords; the similarity between the query keywords and a block's text description is measured with a classical text similarity measure. Step 1326 selects the first image that appears in the most similar block as the image extraction of the i-th webpage. Step 1327 increments i by 1 and returns to step 1322.
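A minimal sketch of steps 1321 to 1327 follows. It assumes that page segmentation and the importance judgment have already been done and are available as `Block` objects (a hypothetical structure), and it substitutes a simple Jaccard word overlap for the "classical text similarity measure", which the patent leaves unspecified. Returning `None` for a page with no usable image is also an assumption, since the patent does not cover that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Block:
    text: str                                        # text description of the block
    images: List[str] = field(default_factory=list)  # image URLs in document order
    important: bool = True                           # result of the importance judgment (step 1324)


def similarity(query: Sequence[str], text: str) -> float:
    """Jaccard overlap of word sets; a stand-in for the unspecified similarity measure."""
    q = {w.lower() for w in query}
    t = set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def first_class_extraction(blocks: Sequence[Block], query: Sequence[str]) -> Optional[str]:
    """Steps 1325-1326 for one page: first image of the most query-similar important block."""
    candidates = [b for b in blocks if b.important and b.images]
    if not candidates:
        return None  # no important block with an image; case not covered by the patent
    best = max(candidates, key=lambda b: similarity(query, b.text))
    return best.images[0]


def first_class_extractions(
    pages: Sequence[Sequence[Block]], query: Sequence[str]
) -> List[Optional[str]]:
    """Steps 1321-1327: one image extraction per retrieved page."""
    return [first_class_extraction(blocks, query) for blocks in pages]
```

The list comprehension replaces the explicit counter i and the comparison with M; the result keeps the one-to-one correspondence between pages and extractions.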
Fig. 5 details step 133, that is, obtaining the second-class image extractions. Step 1330 is the start. Step 1331 sets the set IMG to empty. Step 1332 sets counter i to 1. Step 1333 tests whether i is greater than M, the number of retrieved webpages: if i is greater than M, go to step 1338; otherwise go to step 1334. Step 1334 partitions the i-th webpage into blocks using the webpage segmentation method of classical Web search techniques. Step 1335 judges whether each block is important, using the importance judgment method of classical Web search techniques. Step 1336 adds the images in the important blocks to the set IMG. Step 1337 increments i by 1 and returns to step 1333. Step 1338 ranks the images in IMG by the similarity between each image's text description and the query keywords, and selects the j most similar images as the second-class image extractions; the similarity is again measured with a classical text similarity measure. The text description of an image consists of three parts: the image's ALT field in the HTML source code, the title of the webpage containing the image, and the text description of the block containing the image. Step 1339 is the end.
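The ranking in step 1338 can be sketched as follows, assuming the set IMG has already been collected from the important blocks (steps 1331 to 1337). `PageImage` is a hypothetical record holding the three parts of the text description, and the Jaccard word overlap again stands in for the unspecified similarity measure.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PageImage:
    url: str
    alt: str         # ALT field of the image in the HTML source code
    page_title: str  # title of the webpage containing the image
    block_text: str  # text description of the block containing the image

    def description(self) -> str:
        # The three-part text description named in the patent.
        return " ".join((self.alt, self.page_title, self.block_text))


def similarity(query: Sequence[str], text: str) -> float:
    """Jaccard overlap of word sets; a stand-in for the unspecified similarity measure."""
    q = {w.lower() for w in query}
    t = set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def second_class_extractions(
    img: Sequence[PageImage], query: Sequence[str], j: int
) -> List[str]:
    """Step 1338: rank the images in IMG by similarity and keep the top j."""
    ranked = sorted(img, key=lambda im: similarity(query, im.description()), reverse=True)
    return [im.url for im in ranked[:j]]
```

Unlike the first-class extractions, the result here is a single pool for the whole result list, so several images may come from the same page and some pages may contribute none.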
Fig. 6 details step 15, that is, obtaining the user's feedback information and generating new query keywords to refine the query. Step 150 is the start. In step 151 the user chooses which kind of feedback to give. In step 152 the user has chosen first-class feedback, and new query keywords are generated from the user's feedback information. First-class feedback is based on the first-class image extractions: the user judges whether each retrieved webpage is relevant by considering its text summary together with its image extraction. In step 153 the user has chosen second-class feedback, and new query keywords are generated from the user's feedback information. Second-class feedback is based on the second-class image extractions: the user directly judges whether each presented image extraction is relevant. Step 154 is the end.
Fig. 7 details step 152, that is, generating new query keywords in first-class feedback. Step 1520 is the start. Step 1521 sets the set TERM to empty. Step 1522 sets counter i to 1. Step 1523 tests whether i is greater than F, the number of webpages on which the user gave feedback: if i is greater than F, go to step 1526; otherwise go to step 1524. Step 1524 adds the words that occur in the i-th webpage to the set TERM, except words that already appear in the query keywords. Step 1525 increments counter i by 1 and returns to step 1523. Step 1526 computes the score of each word in TERM according to formula (1) and selects the k highest-scoring words as additional query keywords. In formula (1), Score(t) is the score of word t, r_t is the number of webpages marked relevant by the user that contain t, n_t is the number of retrieved webpages that contain t, R is the number of webpages marked relevant by the user, and N is the number of retrieved webpages. Step 1527 is the end.
$$\mathrm{Score}(t) = \log\frac{r_t/(R - r_t)}{(n_t - r_t)/(N - n_t - R + r_t)} \times \left(\frac{r_t}{R} - \frac{n_t - r_t}{N - R}\right) \qquad (1)$$
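Formula (1) translates almost directly into code. The sketch below takes the four counts r_t, n_t, R and N as defined in the text. The natural logarithm is an assumption, since the patent does not name a base (the ranking of words does not depend on it). The optional `smooth` parameter is also an assumption that is not in the patent: with the default of 0 the function is exactly formula (1), which is undefined when r_t = 0, r_t = R or n_t = r_t, while a small positive value added to each count inside the logarithm avoids those cases.

```python
import math


def score(r_t: int, n_t: int, R: int, N: int, smooth: float = 0.0) -> float:
    """Formula (1). `smooth` (not in the patent) is added to each count in the log term."""
    a = r_t + smooth                # relevant items that contain t
    b = R - r_t + smooth            # relevant items that do not contain t
    c = n_t - r_t + smooth          # remaining items that contain t
    d = N - n_t - R + r_t + smooth  # remaining items that do not contain t
    log_odds = math.log((a / b) / (c / d))
    # Second factor: share of relevant items with t minus share of the rest with t.
    return log_odds * (r_t / R - (n_t - r_t) / (N - R))
```

A word that is frequent in the relevant items and rare elsewhere gets both a large log odds ratio and a large difference of proportions, so it ranks near the top. The function still requires N > R, because the second factor divides by N - R.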
Fig. 8 details step 153, that is, generating new query keywords in second-class feedback. Step 1530 is the start. Step 1531 sets the set TERM to empty. Step 1532 sets counter i to 1. Step 1533 tests whether i is greater than F, the number of images on which the user gave feedback: if i is greater than F, go to step 1536; otherwise go to step 1534. Step 1534 adds the words that occur in the text description of the i-th image to the set TERM, except words that already appear in the query keywords. Step 1535 increments counter i by 1 and returns to step 1533. Step 1536 computes the score of each word in TERM according to formula (1) and selects the k highest-scoring words as additional query keywords. Note that in formula (1), r_t is now the number of images marked relevant by the user whose text description contains t, n_t is the number of images in the set IMG of Fig. 5 whose text description contains t, R is the number of images marked relevant by the user, and N is the number of images in the set IMG of Fig. 5.
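Figs. 7 and 8 share the same procedure and differ only in what is counted, so one function can serve both kinds of feedback. In the sketch below each item is represented by its set of words: the words of a webpage for first-class feedback, or the words of an image's text description for second-class feedback. Several points are assumptions rather than statements of the patent: the items the user gave feedback on are taken to be exactly the ones marked relevant, `pool` must contain all items including the relevant ones, a smoothing value of 0.5 is added inside the logarithm to avoid division by zero, and ties are broken alphabetically.

```python
import math
from typing import Iterable, List, Sequence, Set


def score(r_t: int, n_t: int, R: int, N: int, smooth: float) -> float:
    # Formula (1); `smooth` is added to each count inside the log (an assumption,
    # not in the patent) so that r_t == R or n_t == r_t does not divide by zero.
    log_odds = math.log(
        ((r_t + smooth) / (R - r_t + smooth))
        / ((n_t - r_t + smooth) / (N - n_t - R + r_t + smooth))
    )
    return log_odds * (r_t / R - (n_t - r_t) / (N - R))


def expansion_terms(
    relevant: Sequence[Set[str]],
    pool: Sequence[Set[str]],
    query: Iterable[str],
    k: int,
    smooth: float = 0.5,
) -> List[str]:
    """Build TERM from the relevant items, score each word, return the top k."""
    R, N = len(relevant), len(pool)
    if R == 0 or N <= R:
        raise ValueError("need at least one relevant and one non-relevant item")
    query_words = {w.lower() for w in query}
    term = set().union(*relevant) - query_words  # set TERM, query words excluded

    def term_score(t: str) -> float:
        r_t = sum(t in words for words in relevant)  # relevant items containing t
        n_t = sum(t in words for words in pool)      # all items containing t
        return score(r_t, n_t, R, N, smooth)

    ranked = sorted(term, key=lambda t: (-term_score(t), t))  # ties: alphabetical
    return ranked[:k]
```

For second-class feedback, `pool` would hold the word sets of all images in the set IMG of Fig. 5 and `relevant` those of the images the user marked relevant.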

Claims (7)

1. An image extraction feedback method in Web search, characterized in that the method comprises the following steps:
(1) a page processor accepts a query composed of keywords submitted by the user and searches the Web with a keyword-based Web search method;
(2) the user query is refined through feedback based on image extractions;
(3) search results that satisfy the user are finally returned;
wherein the method of refining the user query through feedback based on image extractions comprises the following steps:
(11) obtain the user's query keywords;
(12) retrieve relevant webpages with a keyword-based Web search method;
(13) generate a text summary and image extractions of the retrieved webpages and present this information to the user to help the user understand the search results;
(14) the user judges the retrieved webpages; if the webpages satisfy the user's need, go to (16) and the whole process ends; otherwise, go to (15);
(15) obtain the user's feedback information, generate new query keywords, and go to (12);
(16) end.
2. The image extraction feedback method in Web search according to claim 1, characterized in that obtaining the text summary and the image extractions of the webpages in (13) comprises the following steps:
(131) generate the text summary of each webpage using a known text summarization method;
(132) generate the first-class image extractions, which correspond one-to-one with the webpages and are used to help the text summary present the webpage content and for later user feedback;
(133) generate the second-class image extractions, which are used for later user feedback;
(134) end.
3. The image extraction feedback method in Web search according to claim 2, characterized in that obtaining the first-class image extraction in (132) comprises the following steps:
(1321) set counter i to 1;
(1322) test whether i is greater than M, the number of retrieved webpages; if i is greater than M, go to (1328); otherwise go to (1323);
(1323) partition the i-th webpage into blocks using a known webpage segmentation algorithm;
(1324) judge whether each block is important using the importance judgment method of the Web search method;
(1325) from the important blocks that contain at least one image, select the block most similar to the query keywords;
(1326) from the most similar block, select the first image that appears as the image extraction of the i-th webpage;
(1327) increment i by 1 and go to (1322);
(1328) end.
4. The image extraction feedback method in Web search according to claim 2, characterized in that obtaining the second-class image extractions in (133) comprises the following steps:
(1331) set the set IMG to empty;
(1332) set counter i to 1;
(1333) test whether i is greater than M, the number of retrieved webpages; if i is greater than M, go to (1338); otherwise go to (1334);
(1334) partition the i-th webpage into blocks using the webpage segmentation method of the Web search method;
(1335) judge whether each block is important using the importance judgment method of the Web search method;
(1336) add the images in the important blocks to the set IMG;
(1337) increment i by 1 and go to (1333);
(1338) rank the images in IMG by the similarity between each image's text description and the query keywords, and select the j most similar images as the second-class image extractions;
(1339) end.
5. The image extraction feedback method in Web search according to claim 1, characterized in that obtaining the user's feedback information and generating new query keywords to refine the query in (15) comprises the following steps:
(151) the user chooses which kind of feedback to give;
(152) the user chooses first-class feedback, and new query keywords are generated from the user's feedback information;
(153) the user chooses second-class feedback;
(154) end.
6. The image extraction feedback method in Web search according to claim 5, characterized in that generating new query keywords in first-class feedback in (152) comprises the following steps:
(1521) set the set TERM to empty;
(1522) set counter i to 1;
(1523) test whether i is greater than F, the number of webpages on which the user gave feedback; if i is greater than F, go to (1526); otherwise go to (1524);
(1524) add the words that occur in the i-th webpage to the set TERM, except words that already appear in the query keywords;
(1525) increment counter i by 1 and go to (1523);
(1526) compute the score of each word in TERM according to the following formula and select the k highest-scoring words as additional query keywords:
$$\mathrm{Score}(t) = \log\frac{r_t/(R - r_t)}{(n_t - r_t)/(N - n_t - R + r_t)} \times \left(\frac{r_t}{R} - \frac{n_t - r_t}{N - R}\right)$$
where Score(t) is the score of word t, r_t is the number of webpages marked relevant by the user that contain t, n_t is the number of retrieved webpages that contain t, R is the number of webpages marked relevant by the user, and N is the number of retrieved webpages;
(1527) end.
7. The image extraction feedback method in Web search according to claim 5, characterized in that generating new query keywords in second-class feedback in (153) comprises the following steps:
(1531) set the set TERM to empty;
(1532) set counter i to 1;
(1533) test whether i is greater than F, the number of images on which the user gave feedback; if i is greater than F, go to (1536); otherwise go to (1534);
(1534) add the words that occur in the text description of the i-th image to the set TERM, except words that already appear in the query keywords;
(1535) increment counter i by 1 and go to (1533);
(1536) compute the score of each word in TERM according to the following formula and select the k highest-scoring words as additional query keywords:
$$\mathrm{Score}(t) = \log\frac{r_t/(R - r_t)}{(n_t - r_t)/(N - n_t - R + r_t)} \times \left(\frac{r_t}{R} - \frac{n_t - r_t}{N - R}\right)$$
(1537) end.
CNB2006100403169A 2006-05-15 2006-05-15 Image extraction feedback method in web search Expired - Fee Related CN100481079C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100403169A CN100481079C (en) 2006-05-15 2006-05-15 Image extraction feedback method in web search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006100403169A CN100481079C (en) 2006-05-15 2006-05-15 Image extraction feedback method in web search

Publications (2)

Publication Number Publication Date
CN1845100A (en) 2006-10-11
CN100481079C (en) 2009-04-22

Family

ID=37064028

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100403169A Expired - Fee Related CN100481079C (en) 2006-05-15 2006-05-15 Image extraction feedback method in web search

Country Status (1)

Country Link
CN (1) CN100481079C (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102214185A (en) * 2010-04-07 2011-10-12 腾讯科技(深圳)有限公司 Webpage searching method and webpage searching system
CN102855313A (en) * 2012-08-24 2013-01-02 北京壹人壹本信息科技有限公司 Webpage browsing equipment, webpage abstract generating method and webpage opening method
CN103761231A (en) * 2013-10-17 2014-04-30 北京奇虎科技有限公司 Method and device for providing media content information of page by search engine
CN104376114A (en) * 2014-12-01 2015-02-25 百度在线网络技术(北京)有限公司 Search result displaying method and device
CN105512159A (en) * 2014-12-22 2016-04-20 哈尔滨安天科技股份有限公司 Focus event search customizing method and apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7599987B2 (en) * 2000-12-06 2009-10-06 Sony Corporation Information processing device for obtaining high-quality content

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102214185A (en) * 2010-04-07 2011-10-12 腾讯科技(深圳)有限公司 Webpage searching method and webpage searching system
CN102214185B (en) * 2010-04-07 2013-10-23 腾讯科技(深圳)有限公司 Webpage searching method and webpage searching system
CN102855313A (en) * 2012-08-24 2013-01-02 北京壹人壹本信息科技有限公司 Webpage browsing equipment, webpage abstract generating method and webpage opening method
CN103761231A (en) * 2013-10-17 2014-04-30 北京奇虎科技有限公司 Method and device for providing media content information of page by search engine
CN104376114A (en) * 2014-12-01 2015-02-25 百度在线网络技术(北京)有限公司 Search result displaying method and device
CN104376114B (en) * 2014-12-01 2018-01-30 百度在线网络技术(北京)有限公司 A kind of search result methods of exhibiting and device
CN105512159A (en) * 2014-12-22 2016-04-20 哈尔滨安天科技股份有限公司 Focus event search customizing method and apparatus

Also Published As

Publication number Publication date
CN100481079C (en) 2009-04-22

Similar Documents

Publication Publication Date Title
US7617205B2 (en) Estimating confidence for query revision models
US7647314B2 (en) System and method for indexing web content using click-through features
AU2011201646B2 (en) Integration of multiple query revision models
CN100481077C (en) Visual method and device for strengthening search result guide
CA2572588C (en) Enhanced document browsing with automatically generated links based on user information and context
US9940398B1 (en) Customization of search results for search queries received from third party sites
US7844599B2 (en) Biasing queries to determine suggested queries
US8645395B2 (en) System and methods for evaluating feature opinions for products, services, and entities
CN102122295B (en) Document search engine including highlighting of confident results
US20130254189A1 (en) Using Anchor Text to Provide Context
CN102722499B (en) Search engine and implementation method thereof
CN102722501B (en) Search engine and realization method thereof
US20070239676A1 (en) Method and system for providing focused search results
US9251206B2 (en) Generalized edit distance for queries
CN101639857B (en) Method, device and system for establishing knowledge questioning and answering sharing platform
CN1902627A (en) Systems and methods for direct navigation to specific portion of target document
CN102200975A (en) Vertical search engine system and method using semantic analysis
Gregurec et al. Search Engine Optimization (SEO): Website analysis of selected faculties in Croatia
CN1845100A (en) Image extraction feedback method in web search
EP1160686A2 (en) A method of searching the internet and an internet search engine
US8190602B1 (en) Searching a database of selected and associated resources
AU2011247862B2 (en) Integration of multiple query revision models
Arbelaitz et al. SAHN with SEP/COP and SPADE, to build a general web navigation adaptation system using server log information
Huang et al. Semantic focused crawling for retrieving E-commerce information
Yoon Study on the improvement of Search Engine Optimization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090422

Termination date: 20120515