CN101650742A - System and method for prompting search condition during English search - Google Patents

System and method for prompting search condition during English search

Info

Publication number
CN101650742A
Authority
CN
China
Prior art keywords
similar word
retrieval
search
term
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200910171271A
Other languages
Chinese (zh)
Other versions
CN101650742B (en
Inventor
卢建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing hi tech Enterprise Incubator Co., Ltd.
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN200910171271.2A
Publication of CN101650742A
Priority to PCT/CN2010/072737 (WO2011022995A1)
Application granted
Publication of CN101650742B
Expired - Fee Related
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a system and a method for prompting a search condition during English search. The method comprises the following steps: dividing the search condition into one or more search terms; retrieving similar words of the search terms from a search log index database; and calculating the prompt content according to the similar word retrieval results and a configured strategy. The invention not only adds a prompting function to a search engine for English search, but also makes the prompt content more reasonable. In addition, the search log is generated and maintained using the search engine's own basic indexing and retrieval technology, which reduces the system resources consumed by a continually growing search log and exploits the search engine's strengths in indexing and retrieval to improve the performance of search condition prompting during English search.

Description

A system and method for prompting a search condition during English search
Technical field
The present invention relates to the field of search engines, and in particular to a system and method for prompting a search condition during English search.
Background
With the continuous development of the computer information age, search engines have come into wide use both for Internet search and for search within enterprises. Beyond the basic function of returning results relevant to the search condition entered by the user, search engines are steadily gaining more intelligent functions. However, when a user enters an English search condition, a traditional search engine performs no check on that condition: whether the input is misspelled or the condition returns no results at all, the search engine gives no prompt about the search condition.
Current search engine systems generally have no prompting function for English search conditions. The few systems that do prompt simply apply a log method or a dictionary method. The simple log method records every search condition and, on a new search request, matches the new search condition against the conditions in the log database: if there is an exact match, no prompt is given; otherwise the most similar condition is prompted. The dictionary method indexes words into a dictionary database in advance, uses that database in place of the log database, and does not record the user's search conditions. Its decision process is the same as that of the log method: it first checks whether the search condition exists in the dictionary database; if so, no prompt is given; if not, the most similar word is prompted.
Search engine systems are now used ever more widely, and users depend on them more and more. Beyond returning results for the entered condition, the system is expected to behave more intelligently: when the search condition contains a misspelled word, or the user enters an imprecise condition that returns no results, the search engine system should give a reasonable prompt. The prior art therefore has the following defects:
1) In a search engine system using the simple log method, the log records the user's whole search condition rather than the individual terms obtained by segmenting it. As a result, many conditions contain words that have appeared before and yet still receive no reasonable prompt.
2) In the simple log method, no prompt is given whenever an identical search condition already exists in the log database. An erroneous or imprecise condition that was entered before is therefore likely to receive no reasonable prompt when it is entered again.
3) A prompt obtained by the simple log method is derived only from historical search conditions, and those conditions carry no indication of whether they returned any results. Even when a prompt is given, the prompted condition may itself return no results, which is unfriendly to the user.
4) A search engine system using the dictionary method follows the same process as the log method and so has the same problems. In addition, the dictionary database may be incomplete, and because the decision relies entirely on the dictionary, a user who intends to search for one word may be prompted with a different one, which is equally unfriendly.
Summary of the invention
The technical problem to be solved by the present invention is to provide a system and method for prompting a search condition during English search, so as to address both the lack of prompting when a search engine searches English content and the unreasonable prompts of the prior art.
To solve the above technical problem, the invention provides a method for prompting a search condition during English search, comprising:
Dividing the search condition into one or more search terms;
Retrieving similar words of the search terms from a search log index database;
Calculating the prompt content according to the similar word retrieval results and a configured strategy.
Further, the above method may also have the following features:
Before the step of dividing the search condition into one or more search terms, configuration parameters are also set. The configuration parameters comprise: a minimum similarity for similar word retrieval, a fixed prefix character match count, a retrieval count weight factor, and a record total weight factor.
Further, the above method may also have the following features:
The step of dividing the search condition into one or more search terms specifically comprises:
After a search request entered by the user is received, determining whether to perform word segmentation. If the search type of the search request is exact search, no word segmentation is performed and the search condition of the request is treated as a single search term. If the search type is fuzzy search, word segmentation is performed on the search condition, dividing it into one or more search terms.
Further, the above method may also have the following features:
After the step of dividing the search condition into one or more search terms, the following is also performed:
Retrieving the search terms from a content index database, and updating the search log index database according to the total number of records retrieved for each search term and the number of times each search term has been retrieved.
Further, the above method may also have the following features:
The step of retrieving similar words of the search terms from the search log index database specifically comprises:
Retrieving, from the search log index database, the similar words of each search term that satisfy the minimum similarity and the fixed prefix character match count, and obtaining the similarity, retrieval count, and record total of each similar word.
Further, the above method may also have the following features:
The step of calculating the prompt content according to the similar word retrieval results and the configured strategy specifically comprises:
For each search term, calculating the proportion P1 of each similar word's retrieval count among all the similar words, and multiplying P1 by the retrieval count weight factor to obtain a retrieval count score R1; and calculating the proportion P2 of each similar word's record total among all the similar words, and multiplying P2 by the record total weight factor to obtain a record total score R2;
Multiplying (R1+R2) of each similar word by the similarity of that similar word to obtain a prompt score T. If the similar word with the highest prompt score T differs from the search term, that similar word is presented as the prompt content.
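The scoring rule can be written compactly as follows. This is one reading of the text, not notation from the patent: the symbols c_i (retrieval count of similar word i), n_i (its record total), s_i (its similarity), and w_c, w_n (the two weight factors) are introduced here for illustration, and each "proportion" is assumed to be a word's share of the sum over all similar words returned for the same search term.

```latex
P_{1,i} = \frac{c_i}{\sum_j c_j}, \qquad R_{1,i} = w_c \, P_{1,i}

P_{2,i} = \frac{n_i}{\sum_j n_j}, \qquad R_{2,i} = w_n \, P_{2,i}

T_i = s_i \, \bigl(R_{1,i} + R_{2,i}\bigr)
```

Under this reading, a similar word is favored when it has been searched often, when it matched many records the last time it was searched, and when it is close to what the user typed.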
Further, the above method may also have the following features:
The search log index database is an index database generated using the search engine's indexing technology;
Similar words are retrieved from the search log index database using the search engine's retrieval technology.
To solve the above technical problem, the invention also provides a system for prompting a search condition during English search, applied in a search engine. The system comprises a word segmentation module, a similar word retrieval module, a prompt content calculation center, and a search log index database;
The word segmentation module is configured to divide the search condition into one or more search terms and pass them to the similar word retrieval module;
The similar word retrieval module is configured to retrieve similar words of the search terms from the search log index database;
The prompt content calculation center is configured to calculate the prompt content according to the retrieval results of the similar word retrieval module and a configured strategy.
Further, the above system may also have the following features:
The system further comprises a configuration module,
The configuration module is configured to set configuration parameters, to pass the minimum similarity for similar word retrieval and the fixed prefix character match count in the configuration parameters to the word segmentation module, and to pass the retrieval count weight factor and the record total weight factor to the prompt content calculation center.
Further, the above system may also have the following features:
The word segmentation module is further configured to determine, after receiving a search request entered by the user, whether to perform word segmentation. If the search type of the search request is exact search, no word segmentation is performed and the search condition of the request is treated as a single search term. If the search type is fuzzy search, word segmentation is performed on the search condition, dividing it into one or more search terms.
Further, the above system may also have the following features:
The system further comprises a search term retrieval module and a content index database,
The search term retrieval module is configured to retrieve the search terms from the content index database, and to update the search log index database according to the total number of records retrieved for each search term and the number of times each search term has been retrieved.
Further, the above system may also have the following features:
The similar word retrieval module is further configured to retrieve, from the search log index database, the similar words of each search term that satisfy the minimum similarity and the fixed prefix character match count, and to obtain the similarity, retrieval count, and record total of each similar word.
Further, the above system may also have the following features:
The prompt content calculation center is further configured, for each search term, to calculate the proportion P1 of each similar word's retrieval count among all the similar words and multiply P1 by the retrieval count weight factor to obtain a retrieval count score R1, and to calculate the proportion P2 of each similar word's record total among all the similar words and multiply P2 by the record total weight factor to obtain a record total score R2. It multiplies (R1+R2) of each similar word by the similarity of that similar word to obtain a prompt score T. If the similar word with the highest prompt score T differs from the search term, that similar word is presented as the prompt content.
Compared with the prior art, the beneficial technical effects of the invention are:
1) The search condition is segmented before similar words are retrieved and the log index database is generated. This allows every search term in the search condition to be checked as thoroughly as possible, and continually enriches the search log index database, which serves as the data source for prompting later requests;
2) Besides the historical search terms, the search log index database records the retrieval count of each term and the total number of records found in the content index database the last time the term was searched. Combined with the user-configured retrieval count weight and result count weight, this allows the system to decide more reasonably, according to what the user cares about most, whether a prompt is needed and what the prompt content should be;
3) Besides the search terms generated from historical requests, the system also presets a dictionary database in the search log index database. The dictionary database contains standard English words and phrases, as well as some user-defined special words or phrases. When the volume of search requests is low or the request log database is not yet rich, the dictionary database allows prompting decisions to remain reasonable, and it also provides word spelling correction;
4) The search log database is maintained using the search engine's indexing technology, with all search logs built into one index database, and similar word retrieval against the log database likewise uses the search engine's basic retrieval technology. This not only strengthens the search engine system itself but also improves its cohesion. In particular, using indexing and retrieval technology brings the search engine's high performance on large data volumes to bear, further improving prompting efficiency.
Description of drawings
Fig. 1 is a flowchart of an embodiment of the invention;
Fig. 2 is a schematic diagram of the system of an embodiment of the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the method flow of the embodiment of the invention comprises the following steps:
Step 101: set the configuration parameters;
This step is optional. If no configuration parameters are set, the system's default configuration parameters are used;
The configuration parameters comprise: a minimum similarity for similar word retrieval, a fixed prefix character match count, a retrieval count weight factor, and a record total weight factor;
The fixed prefix character match count is an optional parameter, that is, it may be 0. If the fixed prefix character match count is set to n, then during similar word retrieval the first n characters of a similar word must match those of the search term;
Specifically, the minimum similarity for similar word retrieval and the fixed prefix character match count are configured according to the needs of the actual application. If a similar word search does not dynamically specify a minimum similarity and a fixed prefix character match count, the system retrieves using the default configuration parameters. The retrieval count weight factor and record total weight factor used by the prompt content calculation center must also be configured. Only one of the two factors needs to be configured; the other is 1 minus the configured factor. Each factor takes a value from 0 to 1;
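The four parameters of step 101 can be sketched as a small configuration object. This is a minimal illustration, not the patent's implementation: the class name `PromptConfig`, the field names, the default values, and the assumption that similarity lies between 0 and 1 are all hypothetical. The one rule taken directly from the text is that only one weight factor is configured and the other is 1 minus it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptConfig:
    """Hypothetical container for the step 101 configuration parameters."""

    min_similarity: float = 0.7          # assumed default; the source gives none
    fixed_prefix_length: int = 0         # 0 disables the prefix requirement
    retrieval_count_weight: float = 0.5  # assumed default; must lie in [0, 1]

    def __post_init__(self) -> None:
        # Assumption: similarity is expressed on a 0..1 scale.
        if not 0.0 <= self.min_similarity <= 1.0:
            raise ValueError("min_similarity must be between 0 and 1")
        if self.fixed_prefix_length < 0:
            raise ValueError("fixed_prefix_length must not be negative")
        if not 0.0 <= self.retrieval_count_weight <= 1.0:
            raise ValueError("retrieval_count_weight must be between 0 and 1")

    @property
    def record_total_weight(self) -> float:
        # Only one factor is configured; the other is 1 minus it.
        return 1.0 - self.retrieval_count_weight
```

Deriving the second weight rather than storing it keeps the two factors from drifting out of sync when the configuration changes.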
Step 102: divide the search condition into one or more search terms;
Specifically, after receiving a search request entered by the user, the word segmentation module determines whether to perform word segmentation. If the search type of the request is exact search, no word segmentation is performed and the search condition of the request is treated as a single search term. If the search type is fuzzy search, word segmentation is performed on the search condition, dividing it into one or more search terms;
The search condition is extracted from the search request. Besides the content of the condition itself, the request carries other related information, such as the search type (fuzzy search or exact search);
When segmenting, the word segmentation module breaks the source condition down into the smallest search term units. It can also filter out specific words, such as the meaningless words "a", "an", and "the"; these words can be maintained in the system through configuration. The one or more search terms obtained are used in the subsequent search log index database retrieval and content index database retrieval, and are also updated into the search log index database;
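The segmentation decision in step 102 can be sketched as below. The function name, the `"exact"`/`"fuzzy"` string values, whitespace splitting, and lowercasing are assumptions made for illustration; the source only states that exact searches are not segmented and that configurable words such as "a", "an", and "the" may be filtered out.

```python
from typing import FrozenSet, List

# Assumed default filter list, taken from the examples in the text.
DEFAULT_FILTERED_WORDS: FrozenSet[str] = frozenset({"a", "an", "the"})


def split_search_condition(
    condition: str,
    search_type: str,
    filtered_words: FrozenSet[str] = DEFAULT_FILTERED_WORDS,
) -> List[str]:
    """Return the search terms for one search condition."""
    if search_type == "exact":
        # Exact search: the whole condition is a single search term.
        return [condition]
    # Fuzzy search: split on whitespace (an assumption) and drop filtered words.
    words = condition.lower().split()
    return [word for word in words if word not in filtered_words]
```

For example, a fuzzy search for "the quick brown fox" yields three search terms, while the same condition under exact search stays as one.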
Step 103: retrieve similar words of the search terms from the search log index database;
The similar word retrieval module retrieves, from the search log index database, the similar words of each search term that satisfy the minimum similarity and the fixed prefix character match count, and obtains the similarity, retrieval count, and record total of each similar word;
Specifically, for the search terms obtained in step 102, the search engine's extension interface is called to retrieve similar words from the search log index database, which is itself an index database generated using the search engine's indexing technology. Similar word retrieval takes each search term produced by the word segmentation module and applies the minimum similarity and fixed prefix character match count, taken either from the system's default configuration or from dynamically specified values, through the similar word retrieval interface of the search engine extension. If the results contain a word identical to the search term, that word has the highest similarity and is ranked first; the other results are ordered by similarity and fixed prefix character match count. Because of the minimum similarity and fixed prefix character match count conditions, the retrieval may return no similar words at all. If there are results, each result record contains the similar word, its similarity, its cumulative retrieval count, and the total number of records in the content index database the last time it was searched. These results are used later by the prompt content calculation center;
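The filtering and ordering described in step 103 can be sketched with an in-memory stand-in. The patent retrieves similar words through the search engine's own extension interface and does not name a similarity measure; here Python's `difflib.SequenceMatcher` ratio is swapped in purely for illustration, and the dictionary-based `log_index` (word mapped to retrieval count and record total) is a hypothetical substitute for the search log index database.

```python
from difflib import SequenceMatcher
from typing import Dict, List, NamedTuple, Tuple


class SimilarWord(NamedTuple):
    word: str
    similarity: float
    retrieval_count: int
    record_total: int


def find_similar_words(
    term: str,
    log_index: Dict[str, Tuple[int, int]],
    min_similarity: float,
    fixed_prefix_length: int = 0,
) -> List[SimilarWord]:
    """Return similar words that pass both filters, most similar first."""
    prefix = term[:fixed_prefix_length]
    hits: List[SimilarWord] = []
    for word, (retrieval_count, record_total) in log_index.items():
        # The first n characters must match when a fixed prefix is required.
        if fixed_prefix_length and word[:fixed_prefix_length] != prefix:
            continue
        # Stand-in similarity measure; an identical word scores 1.0.
        similarity = SequenceMatcher(None, term, word).ratio()
        if similarity >= min_similarity:
            hits.append(SimilarWord(word, similarity, retrieval_count, record_total))
    hits.sort(key=lambda hit: hit.similarity, reverse=True)
    return hits
```

A linear scan like this would not scale to a large log; the point of the patent's design is that a real index performs this lookup efficiently.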
Step 104: retrieve the search terms from the content index database, and update the search log index database according to the total number of records retrieved for each search term and the number of times each search term has been retrieved;
This step is performed after step 102. It may be performed before or after step 103, or in parallel with step 103;
Specifically, the search terms produced by the word segmentation module in step 102 are retrieved from the content index database to obtain the matching results. As each search term is retrieved, the term itself and its record total are placed in a historical search log queue for later use in updating the search log index database. Retrieval against the content index database may run serially or in parallel with the similar word retrieval. In the ordinary case, after word segmentation, the search log index database is searched first and the content index database second. If an application has particularly demanding performance requirements, the two can run in parallel, with the content index database searched at the same time as the search log index database, making full use of the machine's parallel processing capability to improve system performance;
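The choice between serial and parallel execution of the two lookups can be sketched as follows. The function `run_lookups` and its two callable parameters are hypothetical; they stand in for the similar word retrieval against the search log index database and the search term retrieval against the content index database, whose real interfaces the source does not specify.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def run_lookups(
    terms: Sequence[str],
    search_log_index: Callable[[str], object],
    search_content_index: Callable[[str], object],
    parallel: bool = False,
) -> Tuple[List[object], List[object]]:
    """Run both lookups for every term, serially or on two threads."""

    def log_lookups() -> List[object]:
        return [search_log_index(term) for term in terms]

    def content_lookups() -> List[object]:
        return [search_content_index(term) for term in terms]

    if not parallel:
        # Ordinary case: search the log index first, then the content index.
        return log_lookups(), content_lookups()

    # Performance-sensitive case: run the two lookups at the same time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        log_future = pool.submit(log_lookups)
        content_future = pool.submit(content_lookups)
        return log_future.result(), content_future.result()
```

Both modes return the same results; only the scheduling differs, so a deployment could switch between them with a configuration flag.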
The search terms in the search log index database come from two sources. One is the dictionary that the system loads in advance, which usually consists of standard words, phrases, and some special words. The other is the search terms obtained dynamically from each search request. Whether the terms come from the dictionary or from requests, the search log index database is managed and maintained using the search engine's indexing technology, which builds all of these terms into one index database. Before the system is put into use, the preloaded terms in the index database have a retrieval count and a record total of 0; only the terms themselves exist. Once the system is in formal use, the search log index database is maintained from each search request. At run time this is done asynchronously: a separate thread takes search terms and their record totals from the search log queue and adds or updates them in the search log index database through the indexing interface. If the search term already exists in the search log index database, its retrieval count is incremented by one and its record total is overwritten with the latest record count. If the search term does not exist, its information is added directly to the search log index database: the term itself, a retrieval count of one, and the latest record count. This thread maintains the search log index database on its own. It checks the queue periodically: if the queue holds data, it processes the data immediately until the queue is empty; if there is no data, it sleeps for the configured period and then checks again;
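The maintenance rules above (preloaded dictionary entries start at zero, an existing term has its count incremented and its record total overwritten, a new term is added with a count of one) can be sketched with an in-memory stand-in. The class `SearchLogIndex` and the functions below are hypothetical; the patent stores this data in an index built by the search engine, not in a Python dictionary.

```python
import queue
import threading
from typing import Dict, Iterable, List, Optional, Tuple


class SearchLogIndex:
    """In-memory stand-in: term -> [retrieval_count, record_total]."""

    def __init__(self, dictionary_words: Iterable[str] = ()) -> None:
        self._lock = threading.Lock()
        # Preloaded dictionary words start with both counters at 0.
        self._entries: Dict[str, List[int]] = {w: [0, 0] for w in dictionary_words}

    def upsert(self, term: str, record_total: int) -> None:
        with self._lock:
            entry = self._entries.get(term)
            if entry is None:
                # New term: retrieval count of one plus the latest record total.
                self._entries[term] = [1, record_total]
            else:
                entry[0] += 1            # one more retrieval
                entry[1] = record_total  # overwrite with the latest total

    def get(self, term: str) -> Optional[Tuple[int, int]]:
        with self._lock:
            entry = self._entries.get(term)
            return (entry[0], entry[1]) if entry else None


def drain_log_queue(log_queue: "queue.Queue", index: SearchLogIndex) -> int:
    """Apply every queued (term, record_total) pair; return how many were handled."""
    handled = 0
    while True:
        try:
            term, record_total = log_queue.get_nowait()
        except queue.Empty:
            return handled
        index.upsert(term, record_total)
        handled += 1


def maintenance_loop(log_queue: "queue.Queue", index: SearchLogIndex,
                     period_seconds: float, stop_event: threading.Event) -> None:
    """Body of the maintenance thread: drain, sleep for the period, repeat."""
    while not stop_event.is_set():
        drain_log_queue(log_queue, index)
        stop_event.wait(period_seconds)
```

The loop would typically run on a daemon thread started with `threading.Thread(target=maintenance_loop, ...)`, so request handling never waits on index updates.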
Step 105: calculate the prompt content according to the similar word retrieval results and the configured strategy.
Preferably, this step comprises: for each search term, calculating the proportion P1 of each similar word's retrieval count among all the similar words, and multiplying P1 by the retrieval count weight factor to obtain a retrieval count score R1; calculating the proportion P2 of each similar word's record total among all the similar words, and multiplying P2 by the record total weight factor to obtain a record total score R2; and multiplying (R1+R2) of each similar word by the similarity of that similar word to obtain a prompt score T. If the similar word with the highest prompt score T differs from the search term, that similar word is presented as the prompt content;
Specifically, the system's main thread takes the similar word results obtained from the search log index database for each search term in step 103, which comprise the similar word itself, its similarity, its retrieval count, and the total number of records in the content index database the last time the word was searched. It then calculates the prompt score of each similar word using the retrieval count weight factor and record total weight factor configured in step 101, and takes the similar word with the highest prompt score T as the candidate prompt content. If this candidate is identical to the source search term, no prompt is needed. If it differs, a prompt is needed and the prompt content is that similar word, which can be substituted for the search term at the corresponding position in the original search condition. For a condition with multiple search terms, the whole condition is prompted as long as any one term meets the prompting requirement.
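The scoring in step 105 can be sketched as follows. The names `SimilarWord`, `choose_prompt`, and `rewrite_condition` are hypothetical, each "proportion" is assumed to mean a word's share of the sum over all similar words for the same search term, and joining the rewritten terms with spaces is an assumption about how the prompted condition is rebuilt.

```python
from typing import List, NamedTuple, Optional, Sequence


class SimilarWord(NamedTuple):
    word: str
    similarity: float
    retrieval_count: int
    record_total: int


def choose_prompt(term: str, similar_words: Sequence[SimilarWord],
                  retrieval_count_weight: float) -> Optional[str]:
    """Return the word to prompt for one search term, or None if no prompt is needed."""
    if not similar_words:
        return None
    record_total_weight = 1.0 - retrieval_count_weight
    count_sum = sum(w.retrieval_count for w in similar_words)
    total_sum = sum(w.record_total for w in similar_words)
    best_word, best_score = None, -1.0
    for w in similar_words:
        # P1, P2: this word's share among all similar words (0 if the sum is 0).
        p1 = w.retrieval_count / count_sum if count_sum else 0.0
        p2 = w.record_total / total_sum if total_sum else 0.0
        r1 = p1 * retrieval_count_weight
        r2 = p2 * record_total_weight
        t = (r1 + r2) * w.similarity  # prompt score T
        if t > best_score:
            best_word, best_score = w.word, t
    # A top-scoring word identical to the search term means no prompt.
    return None if best_word == term else best_word


def rewrite_condition(terms: Sequence[str],
                      prompts: Sequence[Optional[str]]) -> Optional[str]:
    """Substitute prompted words into the condition; None if nothing changed."""
    if all(p is None for p in prompts):
        return None
    rewritten: List[str] = [p or t for t, p in zip(terms, prompts)]
    return " ".join(rewritten)
```

With equal weights, a misspelled term that has been searched once and matched nothing loses to a close, frequently searched word with many matching records, so the latter is prompted.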
As shown in Fig. 2, the system of the embodiment of the invention is applied in a search engine and comprises: a configuration module, a word segmentation module, a similar word retrieval module, a prompt content calculation center, a search log index database, a search term retrieval module, and a content index database;
The configuration module is optional. It is configured to set configuration parameters, to pass the minimum similarity for similar word retrieval and the fixed prefix character match count in the configuration parameters to the word segmentation module, and to pass the retrieval count weight factor and the record total weight factor to the prompt content calculation center. Specifically, this module mainly configures some of the system's default parameters, including the minimum similarity and fixed prefix character match count used when similar words are retrieved from the search log database for the search terms obtained by segmentation. It is also responsible for configuring the retrieval count weight factor and record total weight factor used by the prompt content calculation center. Generally only one of the two factors needs to be configured; the other is 1 minus the configured factor. Each factor takes a value from 0 to 1;
The word segmentation module is configured to divide the search condition into one or more search terms and pass them to the similar word retrieval module. Specifically, after receiving a search request entered by the user, this module determines whether to perform word segmentation. If the search type of the request is exact search, no word segmentation is performed and the search condition of the request is treated as a single search term. If the search type is fuzzy search, word segmentation is performed on the search condition, dividing it into one or more search terms;
The similar word retrieval module is configured to retrieve similar words of the search terms from the search log index database. Specifically, this module uses the search engine's retrieval technology to retrieve, from the search log index database, the similar words of each search term that satisfy the minimum similarity and the fixed prefix character match count, and obtains the similarity, retrieval count, and record total of each similar word;
The search log index database records information about historical search terms. It is an index database generated using the search engine's indexing technology and contains each search term, its retrieval count, and the total number of records in the content index database the last time it was searched. Besides the historical search terms, the system also preloads the English dictionary words from the dictionary database; in other words, the system's search log database already holds all the dictionary data in advance, with retrieval counts and record totals of 0;
The prompt content calculation center is configured to calculate the prompt content according to the retrieval results of the similar word retrieval module and the configured strategy. Specifically, for each search term, it calculates the proportion P1 of each similar word's retrieval count among all the similar words and multiplies P1 by the retrieval count weight factor to obtain a retrieval count score R1, and calculates the proportion P2 of each similar word's record total among all the similar words and multiplies P2 by the record total weight factor to obtain a record total score R2. It multiplies (R1+R2) of each similar word by the similarity of that similar word to obtain a prompt score T. If the similar word with the highest prompt score T differs from the search term, that similar word is presented as the prompt content.
The search term retrieval module is configured to retrieve the search terms from the content index database, and to update the search log index database according to the total number of records retrieved for each search term and the number of times each search term has been retrieved.
In this way, when a user retrieves English data with the search engine, the search engine can determine, by reasonable calculation based on the historical retrieval request counts and the amount of indexed data, whether a search condition prompt is needed and what the prompt content should be; such prompts also cover basic checking of misspelled words.
In summary, the present invention not only adds a prompt function to the search engine for English retrieval, but also makes the prompt content more reasonable. At the same time, the generation and maintenance of the retrieval log are also based on the basic indexing and retrieval technology of the search engine, which not only reduces the system resources consumed by the ever-growing retrieval log, but also makes full use of the search engine's strengths in indexing and retrieval to greatly improve the performance of search condition prompting during English retrieval.
Of course, the present invention may also have various other embodiments. Without departing from the spirit and essence of the present invention, those of ordinary skill in the art can make various corresponding changes and modifications according to the present invention, but all such changes and modifications shall fall within the protection scope of the appended claims of the present invention.

Claims (13)

1. A method for prompting a search condition during English search, comprising:
dividing the search condition into one or more search terms;
retrieving similar words of the search terms from a retrieval log index database; and
calculating the prompt content according to the result of the similar word retrieval and a configured strategy.
2, the method for claim 1 is characterized in that,
Before the described step that search condition is divided into one or more terms, configuration parameter also is set, and described configuration parameter comprises: the minimum similarity of similar word retrieval, fixedly prefix character coupling number, retrieve the number of times weight factor and write down total weight factor.
3, the method for claim 1 is characterized in that,
Described search condition is divided into the step of one or more terms, specifically comprises:
After receiving the retrieval request of user input, judge whether to carry out word segmentation processing,, then do not carry out word segmentation processing if the retrieval type of described retrieval request is accurate retrieval, with the search condition of described retrieval request as a term; If the retrieval type of described retrieval request is fuzzy search, then the search condition of described retrieval request is carried out word segmentation processing, described search condition is divided into one or more terms.
4, the method for claim 1 is characterized in that,
The described step that search condition is divided into one or more terms is also carried out after carrying out:
From the described term of content indexing library searching, and, upgrade the retrieve log index database according to the record sum that retrieves each term and the number of times that is retrieved of each term.
5. The method of any one of claims 1 to 4, characterized in that,
the step of retrieving similar words of the search terms from the retrieval log index database specifically comprises:
retrieving, from the retrieval log index database, the similar words of each search term that satisfy the minimum similarity and the fixed prefix character match count, and obtaining the similarity, the retrieval count, and the total record count of each similar word.
6. The method of claim 5, characterized in that,
the step of calculating the prompt content according to the result of the similar word retrieval and the configured strategy specifically comprises:
for each search term, calculating the proportion P1 of each similar word's retrieval count among the retrieval counts of all similar words, and multiplying P1 of each similar word by the retrieval count weight factor to obtain the retrieval count score R1; and calculating the proportion P2 of each similar word's total record count among the total record counts of all similar words, and multiplying P2 of each similar word by the total record count weight factor to obtain the total record count score R2;
multiplying (R1+R2) of each similar word by the similarity of that similar word to obtain the prompt score T, and, if the similar word with the highest prompt score T differs from the search term, presenting that similar word as the prompt content.
7. The method of any one of claims 1 to 4, characterized in that,
the retrieval log index database is an index database generated with the indexing technology of a search engine;
the similar words are retrieved from the retrieval log index database with the search technology of the search engine.
8. A system for prompting a search condition during English search, applied in a search engine, characterized in that the system comprises a word segmentation module, a similar word retrieval module, a prompt content computing center, and a retrieval log index database;
the word segmentation module is used to divide the search condition into one or more search terms and to pass them to the similar word retrieval module;
the similar word retrieval module is used to retrieve similar words of the search terms from the retrieval log index database;
the prompt content computing center is used to calculate the prompt content according to the retrieval results of the similar word retrieval module and a configured strategy.
9. The system of claim 8, characterized in that the system further comprises a configuration module,
the configuration module is used to set configuration parameters, to inform the word segmentation module of the minimum similarity for similar word retrieval and the fixed prefix character match count among the configuration parameters, and to inform the prompt content computing center of the retrieval count weight factor and the total record count weight factor.
10. The system of claim 8, characterized in that,
the word segmentation module is further used to determine, after receiving a retrieval request input by a user, whether to perform word segmentation; if the retrieval type of the retrieval request is exact retrieval, no word segmentation is performed and the search condition of the retrieval request is taken as a single search term; if the retrieval type of the retrieval request is fuzzy retrieval, word segmentation is performed on the search condition of the retrieval request to divide the search condition into one or more search terms.
11. The system of claim 8, characterized in that the system further comprises a term retrieval module and a content index database,
the term retrieval module is used to retrieve the search terms from the content index database, and to update the retrieval log index database according to the total number of records retrieved for each search term and the retrieval count of each search term.
12. The system of any one of claims 8 to 11, characterized in that,
the similar word retrieval module is further used to retrieve, from the retrieval log index database, the similar words of each search term that satisfy the minimum similarity and the fixed prefix character match count, and to obtain the similarity, the retrieval count, and the total record count of each similar word.
13. The system of claim 12, characterized in that,
the prompt content computing center is further used, for each search term, to calculate the proportion P1 of each similar word's retrieval count among the retrieval counts of all similar words, and to multiply P1 of each similar word by the retrieval count weight factor to obtain the retrieval count score R1; and to calculate the proportion P2 of each similar word's total record count among the total record counts of all similar words, and to multiply P2 of each similar word by the total record count weight factor to obtain the total record count score R2; (R1+R2) of each similar word is multiplied by the similarity of that similar word to obtain the prompt score T, and, if the similar word with the highest prompt score T differs from the search term, that similar word is presented as the prompt content.
CN200910171271.2A 2009-08-27 2009-08-27 System and method for prompting search condition during English search Expired - Fee Related CN101650742B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN200910171271.2A CN101650742B (en) 2009-08-27 2009-08-27 System and method for prompting search condition during English search
PCT/CN2010/072737 WO2011022995A1 (en) 2009-08-27 2010-05-13 Search condition prompt system and method for english word search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200910171271.2A CN101650742B (en) 2009-08-27 2009-08-27 System and method for prompting search condition during English search

Publications (2)

Publication Number Publication Date
CN101650742A true CN101650742A (en) 2010-02-17
CN101650742B CN101650742B (en) 2015-01-28

Family

ID=41672980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910171271.2A Expired - Fee Related CN101650742B (en) 2009-08-27 2009-08-27 System and method for prompting search condition during English search

Country Status (2)

Country Link
CN (1) CN101650742B (en)
WO (1) WO2011022995A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011022995A1 (en) * 2009-08-27 2011-03-03 中兴通讯股份有限公司 Search condition prompt system and method for english word search
CN104572836A (en) * 2014-12-10 2015-04-29 百度在线网络技术(北京)有限公司 Method and device for confirming comprehensive relevancy of candidate inquiry sequence
WO2016155626A1 (en) * 2015-04-02 2016-10-06 北京奇虎科技有限公司 Search prompt implementation apparatus, system and method
CN106777254A (en) * 2016-12-27 2017-05-31 努比亚技术有限公司 One kind application search method and device
CN106933883A (en) * 2015-12-31 2017-07-07 中移(苏州)软件技术有限公司 Point of interest Ordinary search word sorting technique, device based on retrieval daily record
CN110134970A (en) * 2019-07-10 2019-08-16 北京百度网讯科技有限公司 Header error correction method and apparatus
CN110457189A (en) * 2019-07-02 2019-11-15 平安科技(深圳)有限公司 A kind of blog management method and system, relevant device of application program
CN111858830A (en) * 2020-03-27 2020-10-30 北京梦天门科技股份有限公司 Health supervision law enforcement data retrieval system and method based on natural language processing
CN113157869A (en) * 2021-05-06 2021-07-23 日照蓝鸥信息科技有限公司 Method and system for accurately positioning and retrieving documents

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112667775A (en) * 2020-12-25 2021-04-16 平安科技(深圳)有限公司 Keyword prompt-based retrieval method and device, electronic equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1335574A (en) * 2001-09-05 2002-02-13 罗笑南 Intelligent semantic searching method
CN100437585C (en) * 2006-09-04 2008-11-26 北京航空航天大学 Method for carrying out retrieval hint based on inverted list
CN101246475B (en) * 2007-02-14 2010-05-19 北京书生国际信息技术有限公司 Retrieval methodology base on layout information
CN100595759C (en) * 2007-04-25 2010-03-24 北大方正集团有限公司 Method and device for enquire enquiry extending as well as related searching word stock
CN100517330C (en) * 2007-06-06 2009-07-22 华东师范大学 Word sense based local file searching method
CN101206673A (en) * 2007-12-25 2008-06-25 北京科文书业信息技术有限公司 Intelligent error correcting system and method in network searching process
CN101408879A (en) * 2008-11-19 2009-04-15 张琼 Method and system for searching product based on search engine
CN101650742B (en) * 2009-08-27 2015-01-28 中兴通讯股份有限公司 System and method for prompting search condition during English search

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011022995A1 (en) * 2009-08-27 2011-03-03 中兴通讯股份有限公司 Search condition prompt system and method for english word search
CN104572836A (en) * 2014-12-10 2015-04-29 百度在线网络技术(北京)有限公司 Method and device for confirming comprehensive relevancy of candidate inquiry sequence
WO2016155626A1 (en) * 2015-04-02 2016-10-06 北京奇虎科技有限公司 Search prompt implementation apparatus, system and method
CN106933883A (en) * 2015-12-31 2017-07-07 中移(苏州)软件技术有限公司 Point of interest Ordinary search word sorting technique, device based on retrieval daily record
CN106933883B (en) * 2015-12-31 2019-12-27 中移(苏州)软件技术有限公司 Method and device for classifying common search terms of interest points based on search logs
CN106777254A (en) * 2016-12-27 2017-05-31 努比亚技术有限公司 One kind application search method and device
CN110457189A (en) * 2019-07-02 2019-11-15 平安科技(深圳)有限公司 A kind of blog management method and system, relevant device of application program
CN110134970A (en) * 2019-07-10 2019-08-16 北京百度网讯科技有限公司 Header error correction method and apparatus
CN110134970B (en) * 2019-07-10 2019-10-22 北京百度网讯科技有限公司 Header error correction method and apparatus
CN111858830A (en) * 2020-03-27 2020-10-30 北京梦天门科技股份有限公司 Health supervision law enforcement data retrieval system and method based on natural language processing
CN111858830B (en) * 2020-03-27 2023-11-14 北京梦天门科技股份有限公司 Health supervision law enforcement data retrieval system and method based on natural language processing
CN113157869A (en) * 2021-05-06 2021-07-23 日照蓝鸥信息科技有限公司 Method and system for accurately positioning and retrieving documents

Also Published As

Publication number Publication date
WO2011022995A1 (en) 2011-03-03
CN101650742B (en) 2015-01-28

Similar Documents

Publication Publication Date Title
CN101650742A (en) System and method for prompting search condition during English search
US20120166414A1 (en) Systems and methods for relevance scoring
CN107247707B (en) Enterprise association relation information extraction method and device based on completion strategy
US8250053B2 (en) Intelligent enhancement of a search result snippet
CN102591880B (en) Information providing method and device
EP3301591A1 (en) System and method for identifying related queries for languages with multiple writing systems
US10585927B1 (en) Determining a set of steps responsive to a how-to query
US20080162456A1 (en) Structure extraction from unstructured documents
US9953185B2 (en) Identifying query patterns and associated aggregate statistics among search queries
EP2592572A1 (en) Facilitating extraction and discovery of enterprise services
CN102722499B (en) Search engine and implementation method thereof
CN102722498A (en) Search engine and implementation method thereof
CN101833579A (en) Method and system for automatically detecting academic misconduct literature
CN102737021A (en) Search engine and realization method thereof
CN102799586B (en) A kind of escape degree defining method for search results ranking and device
CN103530298A (en) Information searching method and device
CN103064847A (en) Indexing equipment, indexing method, search device, search method and search system
CN113254588A (en) Data searching method and system
CN103207682A (en) Syllable segmentation-based Uighur, Kazakh and Kirghiz intelligent input method
CN112115228A (en) Searching method, searching device, terminal and storage medium
US11347780B2 (en) System and method for automatic suggestion and or correcting of search keywords
CN113590792A (en) User problem processing method and device and server
CN112395856A (en) Text matching method, text matching device, computer system and readable storage medium
CN110389988A (en) A kind of the user data processing method and system of real-time high-efficiency
CN103064840A (en) Indexing equipment, indexing method, search device, search method and search system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20161215

Address after: 408402 Chongqing Nanchuan District West Street office Longhua Road No. 12 (General Chamber of Commerce Building 1 building 2-12-2)

Patentee after: Chongqing hi tech Enterprise Incubator Co., Ltd.

Address before: 518057 Nanshan District high tech Industrial Park, Guangdong, South Road, science and technology, ZTE building, legal department

Patentee before: ZTE Corporation

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150128

Termination date: 20190827