Detailed Description of the Embodiments
To make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in more detail below with reference to the accompanying drawings and embodiments.
As shown in Fig. 1, the device 10 for establishing a forum thread database provided by an embodiment of the present invention comprises:
An original web page acquiring unit 101, configured to obtain an unprocessed original web page.
An original web page is a page that has been fetched from the network but not yet processed. Original web pages are acquired as in the prior art: a web page collector 11 traverses the web space with a crawling program such as a Web Spider and stores the fetched pages in an original web page database 13, and the crawling process of the web page collector is controlled by a collection controller 12.
An original web page can therefore be read directly from the original web page database whenever it is needed.
A forum thread template recognition unit 102, configured to identify the forum thread template corresponding to the original web page by using a preset forum thread template library 14.
This embodiment describes only the case in which the forum thread template corresponding to the original web page can be identified. In practice, identification may fail, in which case the original web page needs to be handled accordingly. For example, it may simply be discarded, or it may be analyzed to obtain its forum thread template, which is then saved to the forum thread template library 14. Because every original web page has its own structural features, each has a uniquely corresponding forum thread template.
The forum thread template library stores predefined forum thread templates. One possible entry format of a forum thread template is shown in Table 1:
Table 1. Forum thread template table
Forum identifier | URL | Original thread identifier extraction pattern | Thread pagination extraction pattern | Post content extraction pattern | ......
Forum1 | http://bbs.test01.com/ | read.php?tid=??&fpage=0&toread=&page=×× | read.php?tid=××&fpage=0&toread=&page=?? | ××× | ......
Forum2 | http://bbs.test02.com/ | ??/ShowPost.aspx?PageIndex=×× | ××/ShowPost.aspx?PageIndex=?? | ××× | ......
...... | ...... | ...... | ...... | ...... | ......
As shown in Table 1, the forum thread template table stores information such as the forum identifier, the URL, the original thread identifier extraction pattern, the thread pagination extraction pattern, and the post content extraction pattern. These extraction patterns are used to extract the corresponding information from the original web page. The original forum thread identifier is the identifier that each network forum assigns to its own threads, and it is never repeated within the same forum.
During recognition, the information described in the forum thread template table, for example the URL of the original web page, is first extracted from the original web page and then matched against the information stored in the template table. Different forums use different parameters to express their structure and different formats to delimit page content, so separate pattern-matching information has to be set up for each forum so that the system can obtain the relevant content according to the predefined pattern parameters. One feasible implementation is to analyze the URL of the original web page to determine whether a matching forum thread template exists. Suppose the URL is http://bbs.test01.com/read.php?tid=48395&fpage=0&toread=&page=2. Extracting bbs.test01.com from it matches the forum whose identifier is Forum1 among the predefined patterns, so the corresponding forum thread template is identified as the one represented by Forum1.
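The URL-based recognition described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the dictionary TEMPLATE_LIBRARY, its keys, and the function recognize_template are hypothetical names, and the template rows are reduced to the forum identifier only.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory template library keyed by host name. The rows mirror
# Table 1, with the extraction patterns left out for brevity.
TEMPLATE_LIBRARY = {
    "bbs.test01.com": {"forum_id": "Forum1"},
    "bbs.test02.com": {"forum_id": "Forum2"},
}


def recognize_template(url):
    """Return the template whose host matches the URL, or None if unrecognized."""
    host = urlsplit(url).hostname
    return TEMPLATE_LIBRARY.get(host)


url = "http://bbs.test01.com/read.php?tid=48395&fpage=0&toread=&page=2"
template = recognize_template(url)
print(template["forum_id"] if template else "no matching template")
```

A page whose host is not in the library yields None, which corresponds to the unrecognized case in which the page is discarded or analyzed to produce a new template.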
An information extraction unit 103, configured to extract from the original web page the information identified by the forum thread template, the information including the forum identifier.
After the forum thread template corresponding to the original web page has been identified, the forum thread contained in the forum web page and the related data of its posts are extracted according to the matched template. The extracted information is the information identified in the forum thread template: only such information has corresponding entries in the database, so extracting only the information identified by the template ensures that everything extracted can be stored. Concretely, extraction is based on an analysis of the original forum web page: an information identification structure is constructed, and the corresponding data are extracted according to the different structures. This structure depends on the language in which the page is implemented. For example, an HTML tag tree structure can be used for pages written in HTML, and an XML tag structure for pages written in XML. One possible form of the HTML tag tree structure provided by the embodiment is described below.
One possible tag tree for extracting post content is as follows:
<DIV id=main>
  <FORM name=delatc action=masingle.php?action=delatc method=post>
    <DIV class="tt2">
      <TR class=trl>
        <TH class=r_one>
          <DIV class=tpc_content>......</DIV>
        </TH>
      </TR>
    </DIV>
  </FORM>
</DIV>
Here, the content inside <DIV class=tpc_content>......</DIV> is the post content.
One possible tag tree for determining whether a post is a topic post is as follows:
<DIV id=main>
  <FORM name=delatc action=masingle.php?action=delatc method=post>
    <A name=tpc></A>
    <DIV class="t t2">
      ......
    </DIV>
  </FORM>
</DIV>
If the value of name in <A name=?></A> is tpc, the post content represented by <DIV class="t t2">......</DIV> is the content of the topic post; otherwise it is a reply post.
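The two tag trees above could be applied with a small parser along the following lines. This is only a sketch using the HTML parser of the Python standard library: the class name PostExtractor and the sample page are assumptions, and the topic-post check is simplified to a page-level flag.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect text inside <div class=tpc_content> and detect <a name=tpc>."""

    def __init__(self):
        super().__init__()
        self.is_topic_post = False   # set when <a name=tpc> is seen
        self.contents = []           # one string per tpc_content block
        self._depth = 0              # div nesting depth inside a content block

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("name") == "tpc":
            self.is_topic_post = True
        if tag == "div":
            if self._depth > 0:
                self._depth += 1                 # nested div inside the content
            elif attrs.get("class") == "tpc_content":
                self._depth = 1                  # entering a content block
                self.contents.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.contents[-1] += data


# Made-up page that follows the tag trees shown above.
page = (
    '<DIV id=main><FORM name=delatc method=post><A name=tpc></A>'
    '<DIV class="t t2"><TR class=trl><TH class=r_one>'
    '<DIV class=tpc_content>Hello <b>world</b></DIV>'
    '</TH></TR></DIV></FORM></DIV>'
)
parser = PostExtractor()
parser.feed(page)
parser.close()
print(parser.is_topic_post, parser.contents)
```

The standard parser lowercases tag and attribute names, which is why the handlers compare against "div" and "a" even though the sample markup is upper case.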
After the information is extracted, it is processed. For example, reply posts whose content is shorter than a preset value are filtered out, and blocked posts are filtered out. A post attribute object is then created for each post, producing a set of post attribute objects that covers the post content of the forum web page. The related data held by a post attribute object include, but are not limited to: the post identifier; the identifier of the forum thread to which the post belongs; the post content; the post form (whether the post is a topic post or a reply post); the topic post type (for example featured, original, repost, comment, recommendation, announcement, knowledge, poll, other, or activity); the topic post title; the poster's user information (for example user ID and user level); the floor (the position of the reply within the forum thread, with the topic post at floor 0); and other additional attributes (for example whether the post is pinned or marked as featured). One possible way to obtain the original forum thread identifier is to analyze the URL of the original web page. Suppose the URL of the original web page is http://bbs.test01.com/read.php?tid=48395&fpage=0&toread=&page=2; the original forum thread identifier extracted from it is 48395.
Of course, which information is obtained can be configured by the system as needed. The selected information includes the forum identifier, which is unique within the forum thread database, so the forum identifier determines where the information for that forum is located in the forum thread database.
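The identifier extraction and post filtering described above can be sketched as follows. The Post class, its field names, the parameter name tid passed as a default, and the threshold of 5 characters are assumptions chosen for illustration; the source fixes only the idea, not these details.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit


@dataclass
class Post:
    """Reduced post attribute object; the field names are hypothetical."""
    post_id: int
    original_thread_id: str
    content: str
    is_topic_post: bool
    floor: int
    blocked: bool = False


def original_thread_id(url, param="tid"):
    """Read the original thread identifier from the query string of the URL."""
    values = parse_qs(urlsplit(url).query).get(param)
    return values[0] if values else None


def keep_post(post, min_reply_length=5):
    """Drop blocked posts and reply posts shorter than the preset length."""
    if post.blocked:
        return False
    if not post.is_topic_post and len(post.content) < min_reply_length:
        return False
    return True


url = "http://bbs.test01.com/read.php?tid=48395&fpage=0&toread=&page=2"
tid = original_thread_id(url)
posts = [
    Post(1, tid, "How do I reset this phone?", True, 0),
    Post(2, tid, "up", False, 1),                       # too short, filtered
    Post(3, tid, "Hold the power key.", False, 2, blocked=True),
    Post(4, tid, "Try the settings menu.", False, 3),
]
kept = [p for p in posts if keep_post(p)]
print(tid, [p.post_id for p in kept])
```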
An information saving unit 104, configured to save the information in the entry of the forum thread database 15 that corresponds to the forum identifier.
After the information identified by the forum thread template has been obtained, it is saved to the entry in the forum thread database that corresponds to the forum identifier.
In practice, because a forum is large, one forum identifier may correspond to multiple entries. To ensure that the information is saved to one definite entry, the original forum thread identifier corresponding to the original web page is also obtained, so that the entry corresponding to the original web page can be located directly. This works because the original forum thread identifier is assigned by each forum to its own threads and is never repeated within the same forum. After the entries corresponding to the forum identifier are found, they are searched for the entry corresponding to the original forum thread identifier. If it is found, the information is updated and saved in that existing entry. If it is not found, a new entry corresponding to the original forum thread identifier is created in the forum thread database and the information is saved in the new entry.
In practice, a forum thread identifier can also be assigned to each original forum thread identifier. The forum thread identifier is assigned automatically by the system and uniquely identifies, within the system, a particular original forum thread identifier under a particular forum. The corresponding information can then be looked up by the forum thread identifier alone, rather than by the pair of forum identifier and original forum thread identifier, which improves the processing efficiency of the forum thread database.
One possible arrangement of the forum thread database comprises a forum thread table and a post attribute table (in practice the two tables can of course be merged). One possible form of the forum thread table is shown in Table 2:
Table 2. Forum thread table
Forum thread identifier | Forum identifier | Original forum thread identifier | ......
Thread1 | Forum1 | 48395 | ......
Thread2 | Forum2 | 2766592 | ......
...... | ...... | ...... | ......
With the forum thread table of Table 2, the forum thread identifier can be found from the forum identifier and the original forum thread identifier, and conversely the forum identifier and the original forum thread identifier can be found from the forum thread identifier.
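The update-or-create behavior and the two-way lookup of Table 2 can be sketched with an in-memory stand-in for the database. The class ThreadTable, its method names, and the "ThreadN" numbering scheme are assumptions for illustration; a real system would use database tables.

```python
from itertools import count


class ThreadTable:
    """In-memory stand-in for the forum thread table of Table 2."""

    def __init__(self):
        self._next_id = count(1)
        self._by_key = {}      # (forum_id, original_thread_id) -> thread_id
        self._by_thread = {}   # thread_id -> (forum_id, original_thread_id)
        self.info = {}         # thread_id -> saved information

    def save(self, forum_id, original_thread_id, info):
        """Update the existing entry, or create a new one if none exists."""
        key = (forum_id, original_thread_id)
        thread_id = self._by_key.get(key)
        if thread_id is None:
            # System-assigned identifier, unique across all forums.
            thread_id = f"Thread{next(self._next_id)}"
            self._by_key[key] = thread_id
            self._by_thread[thread_id] = key
            self.info[thread_id] = {}
        self.info[thread_id].update(info)
        return thread_id

    def lookup_key(self, thread_id):
        """Return (forum_id, original_thread_id) for a thread identifier."""
        return self._by_thread.get(thread_id)


table = ThreadTable()
first = table.save("Forum1", "48395", {"pages_seen": 1})
second = table.save("Forum2", "2766592", {"pages_seen": 1})
again = table.save("Forum1", "48395", {"pages_seen": 2})   # updates, no new entry
print(first, second, again, table.lookup_key(second))
```

Saving a second page of the same thread lands in the existing entry, which is what lets several web pages be handled as one forum thread.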
One possible form of the post attribute table is shown in Table 3:
Table 3. Post attribute table
Post identifier | Forum thread identifier | Post content | Post form | Topic post type | Topic post title | Poster user ID | Floor | ......
1 | Thread1 | ×× | Topic post | Original | ×× | User01 | 0 | ......
2 | Thread1 | ×× | Reply post | None | None | User02 | 1 | ......
...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ......
With the post attribute table of Table 3, information about the posts of a thread can be looked up by the forum thread identifier.
On today's network forums, a popular thread has many reply posts, and these replies are likely to be spread over several web pages of the thread. However many web pages a thread has, it corresponds to only one forum thread. This embodiment uses the forum thread as the object of processing, so the multiple web pages that belong to the same forum thread need not be processed separately, which makes search results more accurate when the forum thread is the search object.
The present invention further provides a second device embodiment for establishing a forum thread database. As shown in Fig. 2, the device 20 for establishing a forum thread database comprises:
An original web page acquiring unit 201, configured to obtain an unprocessed original web page;
A forum thread template recognition unit 202, configured to identify the forum thread template corresponding to the original web page by using the preset forum thread template library 14;
An information extraction unit 203, configured to extract from the original web page the information identified by the forum thread template, the information including the forum identifier;
An original forum thread identifier acquiring unit 204, configured to extract from the original web page the original forum thread identifier corresponding to the original web page;
An entry lookup unit 205, configured to look up, in the forum thread database 15, the entry corresponding to the forum identifier and the original forum thread identifier;
An information saving unit 206, configured to save the information in the entry corresponding to the forum identifier and the original forum thread identifier.
In this embodiment, the added original forum thread identifier acquiring unit obtains the original forum thread identifier corresponding to the original web page. Using this identifier, the information extracted from the original web page can be saved to the entry of its forum thread, so that when one forum has multiple threads each thread can be handled separately. At query time only the information corresponding to the forum thread identifier is retrieved, which improves system processing efficiency.
In practice, the entry corresponding to a given original forum thread identifier may not exist. In that case an entry creation unit is added to the device embodiment for establishing a forum thread database, configured to create in the forum thread database a new entry corresponding to the original forum thread identifier. Further, if the forum thread database has no entry corresponding to a given forum identifier, a new entry corresponding to that forum identifier can likewise be created in the forum thread database.
As shown in Fig. 3, the device 31 for establishing an index database provided by an embodiment of the present invention comprises:
A forum thread acquiring unit 311, configured to obtain from the forum thread database 15 the forum thread corresponding to a forum thread identifier;
The forum thread acquiring unit sends a message requesting forum threads to the forum thread database. After receiving the message, the forum thread database returns to the forum thread acquiring unit the forum threads that have not yet been indexed, or that have been indexed but updated since. The number of forum threads returned can be set as needed.
The forum thread database can be established by the device for establishing a forum thread database shown in Fig. 1.
A keyword set acquiring unit 312, configured to preprocess the forum thread to obtain a keyword set representing the forum thread that corresponds to the forum thread identifier;
Preprocessing includes, but is not limited to, word segmentation and/or filtering. Word segmentation is performed to remove meaningless words, such as " ". Certain sensitive words are not permitted by law or other regulations, so filtering is also needed. The result is a set of keywords that can represent the forum thread. The post content is the most important target of these operations.
An information saving unit 313, configured to save the forum thread and the keyword set to the index database 32.
By performing word segmentation and filtering on the information of the original web page, keywords that identify the content of the forum thread are obtained. When web page search is provided to users, the corresponding forum thread can then be found by keyword, so the multiple web pages of one thread need not be processed separately, which makes search results more accurate when the forum thread is the search object.
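The preprocessing performed by the keyword set acquiring unit can be sketched as follows. The word lists and the function keyword_set are hypothetical, and splitting on non-word characters merely stands in for a real word segmenter; text in a language written without spaces would need a dedicated segmentation step.

```python
import re

# Hypothetical word lists; a real system would load them from configuration.
STOP_WORDS = {"the", "a", "of", "is"}
SENSITIVE_WORDS = {"forbiddenword"}


def keyword_set(text):
    """Segment text into words, then drop stop words and sensitive words."""
    words = re.findall(r"\w+", text.lower())
    return {w for w in words if w not in STOP_WORDS and w not in SENSITIVE_WORDS}


print(sorted(keyword_set("The price of a forbiddenword phone is high")))
```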
In practice, to make the information stored in the index database more complete, and thus to provide more information for web page search, the device for establishing an index database may further include:
A co-occurrence frequency statistics unit, configured to count the co-occurrence frequency of the keywords in the keyword set, and/or a term frequency statistics unit, configured to count the term frequency of the keywords in the keyword set; the information saving unit correspondingly saves the co-occurrence frequency and/or the term frequency in the index database.
The co-occurrence frequency concerns how a keyword is distributed within a forum thread: it counts the keyword's occurrence across the posts of the thread. A simple way to count it is as follows: for each post, if the keyword appears in it, the post counts as 1 no matter how many times the keyword appears. If a keyword appears in five posts of the thread, its co-occurrence frequency is 5, even if it appears three times in each of them. This is the simplest way of counting. In practice, different weights can be set according to where and how often the keyword appears. For example, an occurrence in the topic post can be weighted higher than an occurrence in a reply post, and the more often the keyword appears in the forum thread, the higher the weight.
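The simple counting rule, and one possible position-weighted variant, can be sketched as follows. The function names and the weights 2.0 and 1.0 are assumptions; the weighted variant covers only the position of the post and does not model the number of occurrences.

```python
def co_occurrence(keyword, posts):
    """Count the posts that contain the keyword, ignoring repeats within a post."""
    return sum(1 for words in posts if keyword in words)


def weighted_co_occurrence(keyword, posts, topic_weight=2.0, reply_weight=1.0):
    """Weighted variant; posts[0] is assumed to be the topic post."""
    total = 0.0
    for floor, words in enumerate(posts):
        if keyword in words:
            total += topic_weight if floor == 0 else reply_weight
    return total


# Five posts that each mention "phone" three times, plus one that does not.
posts = [["phone", "phone", "phone", "good"] for _ in range(5)] + [["other"]]
print(co_occurrence("phone", posts), weighted_co_occurrence("phone", posts))
```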
Storing the co-occurrence frequency and/or term frequency of keywords in the index database allows search results to be sorted by co-occurrence frequency and/or term frequency before they are returned to the user, so that the forum threads that best match the user's query term are ranked first. The user can thus reach the desired content faster, which meets the user's needs and improves user satisfaction.
An index database provided by an embodiment of the present invention comprises a forum thread forward index table and a forum thread inverted index table. The forum thread forward index table is shown in Table 4:
Table 4. Forum thread forward index table
As shown in Table 4, the forum thread forward index table is indexed by forum thread and records the keyword set of each forum thread, together with information such as the term frequency and co-occurrence frequency of each keyword in the set.
The forum thread inverted index table is shown in Table 5:
Table 5. Forum thread inverted index table
As shown in Table 5, the forum thread inverted index table is indexed by keyword and records which forum threads contain the keyword, together with information such as the term frequency and co-occurrence frequency of the keyword in each of those forum threads.
Tables 4 and 5 describe only one way of implementing the index database. In practice a single one of these tables may suffice, or more tables may be built.
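The relationship between the two index tables can be sketched with nested dictionaries. The layout, the field names tf and co_occurrence, and the numeric values are made up for illustration and are not taken from Tables 4 and 5.

```python
from collections import defaultdict


def build_inverted_index(forward_index):
    """Turn {thread_id: {keyword: stats}} into {keyword: {thread_id: stats}}."""
    inverted = defaultdict(dict)
    for thread_id, keywords in forward_index.items():
        for keyword, stats in keywords.items():
            inverted[keyword][thread_id] = stats
    return dict(inverted)


# Forward index: one row per forum thread (illustrative values).
forward_index = {
    "Thread1": {
        "phone": {"tf": 0.04, "co_occurrence": 5},
        "price": {"tf": 0.01, "co_occurrence": 2},
    },
    "Thread2": {
        "phone": {"tf": 0.02, "co_occurrence": 1},
    },
}
inverted_index = build_inverted_index(forward_index)
print(sorted(inverted_index["phone"]))
```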
The present invention further provides a first method embodiment of web page search which, as shown in Fig. 4, comprises:
Step 401: obtain a user query term.
When a user needs to look something up, the user can enter the corresponding query term through the interface provided by the search engine.
Step 402: look up, in the index database, the forum threads corresponding to the user query term.
The index database can be established by the flow shown in Fig. 2.
After the user query term is obtained, it is used as a keyword to look up the corresponding forum threads in the index database.
Further, in practice the query term entered by the user may not meet the requirements of a keyword, so word segmentation and/or filtering is performed on it before the lookup in the index database. Word segmentation removes meaningless words from the user query term, such as " ", and yields words that are identical to keywords, which makes the search more accurate. Certain sensitive words are not permitted by law or other regulations, so the user query term also needs to be filtered.
Step 403: format the forum threads found, and output the formatted forum threads.
So that the user can grasp the information of each forum thread in the search results, the forum threads are formatted to some extent, for example by showing part of the post content and highlighting the keywords in it. The user can then learn the content without opening the corresponding web page link, and so finds the desired content as quickly as possible.
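Steps 401 to 403 can be sketched end to end as follows. Everything here is an assumption made for illustration: the stop-word list, the index contents, the rule that a thread matches when it is indexed under at least one query keyword, and the use of <b> tags for highlighting.

```python
import re

STOP_WORDS = {"the", "a", "of", "is"}   # hypothetical stop-word list


def query_keywords(query):
    """Segment the query and drop stop words (simplified preprocessing)."""
    return [w for w in re.findall(r"\w+", query.lower()) if w not in STOP_WORDS]


def find_threads(keywords, inverted_index):
    """Return ids of threads indexed under at least one of the keywords."""
    found = set()
    for keyword in keywords:
        found.update(inverted_index.get(keyword, ()))
    return found


def format_result(content, keywords, width=60):
    """Cut a snippet from the post content and wrap keywords in <b> tags."""
    snippet = content[:width]
    if not keywords:
        return snippet
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return pattern.sub(lambda m: f"<b>{m.group(0)}</b>", snippet)


# Made-up index and thread content.
inverted_index = {"phone": ["Thread1", "Thread2"], "price": ["Thread1"]}
contents = {"Thread1": "This phone has a good price.", "Thread2": "My phone broke."}

keywords = query_keywords("the price of a phone")
for thread_id in sorted(find_threads(keywords, inverted_index)):
    print(thread_id, format_result(contents[thread_id], keywords))
```

A production formatter would also escape the post content before inserting markup; that step is omitted here to keep the sketch short.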
With the technical solution of this embodiment, the forum threads corresponding to the user's query term are returned to the user, so the user obtains query results in units of forum threads, and the multiple web pages of a forum thread need not be processed separately, which makes the query results returned to the user more accurate.
The present invention also provides a second method embodiment of web page search which, as shown in Fig. 5, comprises:
Step 501: obtain a user query term.
Step 502: preprocess the user query term to obtain a query keyword.
Step 503: look up, in the index database, the forum threads corresponding to the query keyword, and obtain the ranking information of the query keyword.
Step 504: format the forum threads found, and output the formatted forum threads sorted according to the ranking information.
In practice, the ranking information can be any one or any combination of the co-occurrence frequency, the term frequency, and other information such as link quality or user click volume. If it is a single item, the forum threads can be sorted directly by its value or by a value derived from it. If it is a combination, a value is computed with a preset algorithm and the forum threads are sorted by the computed value. Sorting the forum threads makes it easier for the user to take in the search results.
For example, if only the term frequency is obtained, the inverse document frequency corresponding to the term frequency also needs to be computed, and the ratio of the term frequency to the inverse document frequency is then used as the basis for sorting. This ratio is widely used in existing web page search technology. It expresses how much weight a keyword appearing in a web page carries in the content of that page: the higher the value, the greater the weight of the keyword in the page content and the better the keyword represents the page. The term frequency (TF: Term Frequency) is the number of times the keyword appears in a web page divided by the total number of words in that page. The inverse document frequency (IDF: Inverse Document Frequency) is defined as follows: suppose a keyword w appears in Dw web pages; the larger Dw is, the smaller the weight of w, and vice versa. It is computed as log(D/Dw), where D is the total number of web pages.
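The two quantities defined above can be computed as in the following sketch. The function names are hypothetical, and the natural logarithm is an assumption, since the text does not state the base of the logarithm.

```python
import math


def term_frequency(keyword, words):
    """Occurrences of the keyword divided by the total word count of the page."""
    return words.count(keyword) / len(words) if words else 0.0


def inverse_document_frequency(total_pages, pages_with_keyword):
    """log(D / Dw) as given in the text; Dw must be greater than 0."""
    return math.log(total_pages / pages_with_keyword)


page_words = ["phone", "is", "a", "phone"]
print(term_frequency("phone", page_words))
print(inverse_document_frequency(1000, 10))
```

Note that log(D/Dw) is 0 when the keyword appears in every page, so any value obtained by dividing by the IDF needs a guard for that case.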
If only the co-occurrence frequency is obtained, the forum threads can be sorted directly by its value.
If the co-occurrence frequency is obtained together with TF, TF is processed first to obtain the TF/IDF value, and the TF/IDF value and the co-occurrence frequency are then combined into a relevance value that represents how closely the keyword is related to the content of the forum thread. One feasible method is a weighted calculation: suppose the weight of TF/IDF is w1 and the weight of the co-occurrence frequency is w2 (w1 + w2 = 1); the relevance value is then calculated as w1 * TF/IDF + w2 * co-occurrence frequency.
Every forum thread has a co-occurrence frequency for the corresponding keyword, and this frequency reflects to some extent the degree of correlation between the forum thread and the keyword. Sorting the forum threads by co-occurrence frequency, with the more relevant ones first, therefore lets the user find the desired information faster. When several forum threads have the same relevance, they can be ordered randomly, or in the order in which they appear in the thread database, or by some other method.
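The weighted relevance value and the ordering of results can be sketched as follows. The function names, the default weights of 0.5, and the sample values are assumptions; the TF/IDF value is taken as an input rather than recomputed. Ties are resolved by keeping the original order, which corresponds to the option of ordering by position in the thread database.

```python
def relevance(tf_idf_value, co_occurrence, w1=0.5, w2=0.5):
    """Weighted sum w1 * (TF/IDF value) + w2 * co-occurrence, with w1 + w2 = 1."""
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("w1 and w2 must sum to 1")
    return w1 * tf_idf_value + w2 * co_occurrence


def rank_threads(threads):
    """Sort by descending relevance; equal values keep their original order."""
    # sorted() is stable, also with reverse=True, so ties stay in input order.
    return sorted(threads, key=lambda t: t["relevance"], reverse=True)


# Made-up TF/IDF values and co-occurrence frequencies.
threads = [
    {"id": "Thread1", "relevance": relevance(0.2, 5)},
    {"id": "Thread2", "relevance": relevance(0.4, 1)},
    {"id": "Thread3", "relevance": relevance(0.2, 5)},
]
print([t["id"] for t in rank_threads(threads)])
```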
Likewise, if the ranking information obtained includes not only TF and the co-occurrence frequency but also information such as link quality and user click volume, a weight is set for each item of ranking information and the relevance value is computed with a corresponding algorithm.
In the technical solution of this embodiment, the forum threads are further sorted by their relevance to the user query term, so that the forum threads that better match the user query term are ranked higher. The user can find the desired information as quickly as possible, which improves user satisfaction.
To describe the implementation of the technical solution more clearly, an embodiment of the present invention further provides a third method embodiment of web page search, which describes the whole flow from obtaining the original web page to outputting the web page search results. As shown in Fig. 6, it comprises:
Step 601: obtain an unprocessed original web page.
Step 602: identify the forum thread template corresponding to the original web page by using the preset forum thread template library.
Step 603: extract from the original web page the forum thread identified by the corresponding forum thread template.
In practice, after the forum thread is extracted, this information can be saved to the forum thread database.
Step 604: perform word segmentation and filtering on the forum thread to obtain the keyword set representing the forum thread.
Step 605: count the TF and co-occurrence frequency of the keywords in the keyword set.
Step 606: save the forum thread, the keyword set, and the TF and co-occurrence frequency of the keywords to the index database.
Step 607: obtain a user query term.
Step 608: perform word segmentation and filtering on the user query term to obtain a query keyword.
Step 609: look up, in the index database, the forum threads corresponding to the query keyword.
Step 610: format the forum threads found.
Step 611: obtain the TF and co-occurrence frequency of the query keyword from the index database.
Step 612: compute the IDF of the query keyword, calculate TF/IDF, and use TF/IDF and the co-occurrence frequency to calculate the relevance value between the query keyword and each forum thread.
The IDF is computed by counting how many forum threads in the whole current index database contain the query keyword.
Step 613: output the formatted forum threads sorted by relevance value.
With this embodiment, after an original web page is obtained, the forum thread corresponding to it is determined, the corresponding information is extracted, the keyword set representing the forum thread is obtained, and the TF and co-occurrence frequency of the keywords in the set are counted. When a user's query keyword matches a keyword in the set, the forum thread is determined to meet the user's needs. The index database will of course contain many forum threads that meet the user's needs, so the relevance value between each forum thread and the query keyword is obtained from TF/IDF and the co-occurrence frequency, and the forum threads are output sorted by relevance value. The user thus obtains the forum threads related to the query keyword, ordered so that the higher the relevance value, the earlier the thread appears, and can find the desired information as quickly as possible, which improves user satisfaction.
An embodiment of the present invention provides a device 70 for web page search which, as shown in Fig. 7, comprises:
A user query term acquiring unit 701, configured to obtain a user query term;
A forum thread lookup unit 702, configured to look up, in the index database 32, the forum threads corresponding to the user query term;
A forum thread output unit 703, configured to format the forum threads found and output the formatted forum threads to the user.
With the technical solution of this embodiment, the forum threads corresponding to the user's query term are returned to the user, so the user obtains query results in units of forum threads, and the multiple web pages of a forum thread need not be processed separately, which makes the query results returned to the user more accurate.
Further, in practice the query term entered by the user may not meet the requirements of a keyword, so the device embodiment for web page search may further comprise:
A query keyword acquiring unit, configured to perform word segmentation and filtering on the user query term to obtain a query keyword.
The forum thread lookup unit can then look up, in the index database, the forum threads corresponding to the query keyword. Because the query keyword is derived from the user query term, the forum threads found also correspond to the user query term.
Further, so that the user can find the desired information as quickly as possible, the forum threads that are output can be sorted, and the device embodiment for web page search may therefore also comprise:
A ranking information acquiring unit, configured to obtain the ranking information of the query keyword in the forum threads.
The ranking information can be TF and/or the co-occurrence frequency, among others. After information such as TF and the co-occurrence frequency has been obtained, the forum thread output unit sorts the forum threads by the calculated TF/IDF value, by the co-occurrence frequency value, or by the calculated relevance value, and outputs the forum threads to the user in the sorted order. The forum threads that better match the user query term are thus ranked higher, so the user finds the desired information as quickly as possible, which improves user satisfaction.
As shown in Fig. 8, the system for web page search provided by an embodiment of the present invention comprises:
A device 801 for establishing a forum thread database, configured to obtain an unprocessed original web page; identify the forum thread template corresponding to the original web page by using the preset forum thread template library; extract from the original web page the information identified by the forum thread template, the information including the forum identifier; and save the information in the entry of the forum thread database that corresponds to the forum identifier;
A device 802 for establishing an index database, configured to obtain from the forum thread database the forum thread corresponding to a forum thread identifier; perform word segmentation and filtering on the forum thread to obtain the keyword set representing the forum thread; and save the forum thread and the keyword set to the index database;
A device 803 for web page search, configured to obtain a user query term; look up, in the index database, the forum threads corresponding to the user query term; format the forum threads found; and output the formatted forum threads.
With the technical solution of this embodiment, the forum threads corresponding to the user's query term are returned to the user, so the user obtains query results in units of forum threads, and the multiple web pages of a forum thread need not be processed separately, which makes the query results returned to the user more accurate.
It can be understood that the method, device, and system for web page search provided by the embodiments of the present invention can be applied in a web page search engine, which may be a dedicated forum search engine or a general-purpose search engine. When searching forum web pages, the search engine then processes them in units of forum threads, which improves the accuracy of the information it returns and improves user satisfaction.
A person of ordinary skill in the art will understand that all or part of the steps of the methods in the foregoing embodiments can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs the following steps:
Obtain a user query term;
Look up, in the index database, the forum threads corresponding to the user query term;
Format the forum threads found, and output the formatted forum threads.
The storage medium mentioned above can be a read-only memory (ROM), a magnetic disk, an optical disc, or the like.
The method, device, and system for web page search and the device for establishing an index database provided by the embodiments of the present invention have been described in detail above. The description of the embodiments is intended only to aid understanding of the method of the present invention and its idea. A person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, this description should not be construed as limiting the present invention.