CN104715065A - Long query word searching method and device - Google Patents


Publication number
CN104715065A
Authority
CN
China
Prior art keywords
keyword
document
hit
numbering
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510149927.6A
Other languages
Chinese (zh)
Other versions
CN104715065B (en)
Inventor
陈进平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qiyuan Technology Co ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201510149927.6A priority Critical patent/CN104715065B/en
Publication of CN104715065A publication Critical patent/CN104715065A/en
Application granted granted Critical
Publication of CN104715065B publication Critical patent/CN104715065B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a long query word searching method and device. The method comprises the following steps: acquiring an input long query word; extracting the N keywords contained in the long query word, where N is a natural number; recalling documents that hit at least M of the N keywords, where M is less than or equal to N; and generating search results from the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced. Moreover, because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is greatly improved.

Description

Long query word searching method and device
Technical field
The present invention relates to the field of Internet search technology, and in particular to a method and device for searching with a long query word.
Background
In some real search scenarios, a user may enter a fairly long sentence as the query; such a query is called a long query word.
In the prior art, long query words are searched with a union (OR-merge) method to achieve fuzzy matching: each keyword contained in the long query word is queried separately, and the query results are then merged. In practice this method performs very poorly. Suppose a long query word contains N keywords and each keyword recalls L documents on average; the union can then return N*L documents. The number of recalled documents is very large, which increases the amount of computation, and the search results also match the query poorly.
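To make the cost of the union approach concrete, the following is a minimal Python sketch. The `postings` mapping (keyword to the numbers of the documents that hit it) and the name `union_recall` are illustrative assumptions and do not come from the patent.

```python
def union_recall(postings):
    """Recall every document that hits at least one keyword (OR-merge)."""
    recalled = set()
    for doc_numbers in postings.values():
        recalled.update(doc_numbers)
    return sorted(recalled)


# Hypothetical posting lists for three keywords.
postings = {
    "t1": [1, 4, 9],
    "t2": [2, 4, 7],
    "t3": [3, 9, 11],
}
print(union_recall(postings))
```

Every document that hits even a single keyword is returned, so the result grows with the total length of the posting lists and reaches N*L when no document is shared between keywords.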
Fig. 1 is a schematic diagram of search results obtained with the existing union method. As Fig. 1 shows, the recalled documents match the original long query word very poorly.
Clearly, an efficient search solution for long query words is urgently needed.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a long query word searching method and device that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a long query word searching method is provided, the method comprising:
obtaining an input long query word;
extracting the N keywords contained in the long query word, N being a natural number;
recalling documents that hit at least M of the N keywords, M being less than or equal to N;
generating search results from the recalled documents.
Optionally, recalling documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, in ascending order of document number, one document hit by that keyword and taking it as the current document of that keyword;
if none of the current documents hits at least M of the N keywords, sorting the numbers of the N current documents of the N keywords in ascending order and assigning the M-th number in the sorted order to Dm;
filtering out the documents numbered less than Dm, and searching the remaining documents for documents that hit at least M of the N keywords;
recalling the documents so found that hit at least M of the N keywords.
Optionally, recalling documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, in ascending order of document number, one document hit by that keyword and taking it as the current document of that keyword;
if one of the current documents hits at least M of the N keywords, recalling that document and assigning its number to Dm;
filtering out the documents numbered less than or equal to Dm, and searching the remaining documents for documents that hit at least M of the N keywords;
recalling the documents so found that hit at least M of the N keywords.
Optionally, the method further comprises:
if two or more of the current documents hit at least M of the N keywords, recalling those two or more documents and assigning the largest of their document numbers to Dm.
Optionally, searching the remaining documents for documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding among the remaining documents, in ascending order of document number, one document hit by that keyword and taking it as the current document of that keyword;
judging whether any of the current documents hits at least M of the N keywords;
if so, recalling the document that hits at least M of the N keywords and assigning its number to Dm; filtering out the documents numbered less than or equal to Dm, and then searching the remaining documents for documents that hit at least M of the N keywords;
if not, sorting the numbers of the N current documents of the N keywords in ascending order and assigning the M-th number in the sorted order to Dm; filtering out the documents numbered less than Dm, and then searching the remaining documents for documents that hit at least M of the N keywords.
According to another aspect of the present invention, a long query word searching device is provided, the device comprising:
an acquiring unit, adapted to obtain an input long query word;
an extraction unit, adapted to extract the N keywords contained in the long query word, N being a natural number;
a recall unit, adapted to recall documents that hit at least M of the N keywords, M being less than or equal to N;
a generation unit, adapted to generate search results from the recalled documents.
Optionally, the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents of the N keywords in ascending order and assign the M-th number in the sorted order to Dm; filter out the documents numbered less than Dm and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents so found.
Optionally, the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out the documents numbered less than or equal to Dm and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents so found.
Optionally, the recall unit is further adapted to, when two or more of the current documents hit at least M of the N keywords, recall those two or more documents and assign the largest of their document numbers to Dm.
Optionally, the recall unit is adapted to: for each of the N keywords, find among the remaining documents, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords, assign its number to Dm, filter out the documents numbered less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents of the N keywords in ascending order, assign the M-th number in the sorted order to Dm, filter out the documents numbered less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords; and repeat these steps until all documents to be searched have been searched.
The technical scheme of the present invention obtains an input long query word, extracts the N keywords contained in it, recalls documents that hit at least M of the N keywords (M being less than or equal to N), and generates search results from the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 is a schematic diagram of search results obtained with the existing union method;
Fig. 2 is a flowchart of a long query word searching method according to an embodiment of the invention;
Fig. 3 is a schematic diagram of search results for a long query word according to an embodiment of the invention;
Fig. 4 is a structural diagram of a long query word searching device according to an embodiment of the invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
Fig. 2 is a flowchart of a long query word searching method according to an embodiment of the invention. As shown in Fig. 2, the method comprises:
Step S210: obtain the input long query word.
In this step, the long query word entered by the user can be obtained from the search box of a search engine page, or from the search box or address bar of a browser.
Step S220: extract the N keywords contained in the long query word; N is a natural number.
In this step, word segmentation is performed on the long query word to extract each keyword it contains; the number of keywords is denoted N.
Step S230: recall documents that hit at least M of the N keywords; M is less than or equal to N.
In one embodiment of the invention, M = 2*N/3, that is, M is two thirds of N. In other embodiments of the invention, the value of M can of course be set according to actual conditions.
Step S240: generate search results from the recalled documents.
With the method shown in Fig. 2, only documents that hit at least M of the N keywords are recalled, so the number of recalled documents is greatly reduced; and every recalled document hits at least M of the N keywords, so the degree of matching with the original long query word is also greatly improved. The efficiency and accuracy of long query search are therefore improved.
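The embodiment above sets M = 2*N/3 but does not say how to round when N is not a multiple of 3. The short sketch below assumes rounding up, which is only one possible choice; the function name `choose_m` is hypothetical.

```python
def choose_m(n: int) -> int:
    """Return M = ceil(2 * n / 3), computed with integer arithmetic.

    Rounding up is an assumption; the source only states M = 2*N/3.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (2 * n + 2) // 3


for n in (3, 5, 6, 9):
    print(n, choose_m(n))
```

With N = 6 this gives M = 4, the values used in the worked examples of this description.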
In one embodiment of the invention, the recalling of documents that hit at least M of the N keywords in step S230 of the method shown in Fig. 2 can be carried out in the following different ways:
(1) Find the documents that hit at least one keyword, then filter out the documents that hit fewer than M keywords, and recall the documents that remain after filtering.
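Approach (1) can be sketched in Python as a count-then-filter pass. The `postings` mapping and the name `recall_by_counting` are illustrative assumptions; the document numbers are made up for the example.

```python
from collections import Counter


def recall_by_counting(postings, m):
    """Count keyword hits per document, then keep documents with >= m hits."""
    hits = Counter()
    for doc_numbers in postings.values():
        # set() guards against a document listed twice under one keyword.
        hits.update(set(doc_numbers))
    return sorted(doc for doc, count in hits.items() if count >= m)


# Hypothetical posting lists for N = 6 keywords, with M = 4.
postings = {
    "t1": [10, 13, 40],
    "t2": [15, 40],
    "t3": [7, 13, 40],
    "t4": [30, 40],
    "t5": [5, 13],
    "t6": [13, 20],
}
print(recall_by_counting(postings, 4))
```

This variant still touches every document that hits at least one keyword before filtering, which is the cost the cursor-based approaches below aim to avoid.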
(2) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword. If none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents of the N keywords in ascending order and assign the M-th number in the sorted order to Dm. Filter out the documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents so found.
Suppose the current long query word contains N = 6 keywords, t1, t2, t3, t4, t5, and t6, and M is 4. The documents to be queried have already been numbered. For each keyword ti (i = 1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Assume the current document numbers of t1, t2, t3, t4, t5, and t6 are 10, 15, 7, 30, 5, and 13 respectively. By inspection, the documents numbered 5, 7, and 10 cannot satisfy the requirement of hitting 4 keywords: even if the next document number for both t3 and t5 is 10, document 10 can hit only 3 keywords. Document 13, by contrast, may hit 4 keywords, provided that t1, t3, and t5 also appear in document 13. The 4th document in ascending order of number (document 13) is therefore the next document that can possibly satisfy the condition, so 13 is assigned to Dm, that is, Dm = 13. Documents numbered less than 13 cannot satisfy the condition and are filtered out directly. The documents remaining after filtering are then filtered again in the same way.
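The selection of Dm in this worked example can be checked with a few lines of Python. The name `skip_threshold` is hypothetical; the numbers are those given in the example above.

```python
def skip_threshold(current_numbers, m):
    """Return Dm: the m-th smallest of the current document numbers."""
    ordered = sorted(current_numbers)
    return ordered[m - 1]


# Current document number of each keyword, as in the example (N = 6, M = 4).
current = {"t1": 10, "t2": 15, "t3": 7, "t4": 30, "t5": 5, "t6": 13}
dm = skip_threshold(current.values(), 4)
print(dm)
```

A document numbered below Dm can appear only in the lists of keywords whose current document number is below Dm, and there are at most M-1 such keywords, which is why those documents can be skipped without being examined.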
(3) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword. If one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm. Filter out the documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents so found.
Again take as an example a long query word containing N = 6 keywords, t1, t2, t3, t4, t5, and t6, with M = 4. For each keyword ti (i = 1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Assume the current document numbers of t1, t2, t3, t4, t5, and t6 are 10, 15, 7, 30, 5, and 13 respectively. If document 13 hits t1, t3, t5, and t6 simultaneously, document 13 must be recalled; Dm = 13, and the documents numbered less than or equal to 13 are filtered out (document 13 itself is removed from the documents to be queried because it has already been recalled). The documents remaining after filtering are then filtered again in the same way.
In one embodiment of the invention, if two or more of the current documents hit at least M of the N keywords, those two or more documents are recalled and the largest of their document numbers is assigned to Dm. For instance, if in the example above document 15 also hits at least 4 keywords in addition to document 13 hitting t1, t3, t5, and t6, then documents 13 and 15 must both be recalled; Dm = 15, and the documents numbered less than or equal to 15 are filtered out. The documents remaining after filtering are then filtered again in the same way.
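The recall step for one round of current documents, including the case of two qualifying documents, might look like the sketch below. It assumes that "hits" is tested by membership in each keyword's posting set; the `postings` data and the name `recall_current_hits` are hypothetical and chosen to reproduce the example with documents 13 and 15.

```python
def recall_current_hits(postings, current, m):
    """Return the current documents hitting >= m keywords, and Dm (or None)."""
    def hit_count(doc):
        return sum(1 for doc_numbers in postings.values() if doc in doc_numbers)

    hits = sorted({doc for doc in current.values() if hit_count(doc) >= m})
    dm = max(hits) if hits else None
    return hits, dm


# Hypothetical posting sets in which documents 13 and 15 each hit 4 keywords.
postings = {
    "t1": {10, 13, 15},
    "t2": {15},
    "t3": {7, 13, 15},
    "t4": {30},
    "t5": {5, 13, 15},
    "t6": {13},
}
current = {"t1": 10, "t2": 15, "t3": 7, "t4": 30, "t5": 5, "t6": 13}
hits, dm = recall_current_hits(postings, current, 4)
print(hits, dm)
```

When no current document qualifies, the function returns an empty list and `None`, and Dm would instead be taken as the M-th smallest current number as in approach (2).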
(4) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword.
If none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents of the N keywords in ascending order and assign the M-th number in the sorted order to Dm. Filter out the documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents so found.
If one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm. Filter out the documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents so found.
Here, searching the remaining documents for documents that hit at least M of the N keywords is done as follows. For each of the N keywords, find among the remaining documents, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword. Judge whether any of the current documents hits at least M of the N keywords. If so, recall the document that hits at least M of the N keywords and assign its number to Dm; filter out the documents numbered less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords. If not, sort the numbers of the N current documents of the N keywords in ascending order and assign the M-th number in the sorted order to Dm; filter out the documents numbered less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords.
These steps are repeated until all documents have been queried.
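The repeated procedure can be sketched as a cursor walk over sorted posting lists. This is a simplified variant under stated assumptions, not the patent's exact control flow: each round it jumps to Dm (the M-th smallest current number), filters out smaller numbers, tests only document Dm, and then filters out Dm itself. The names `recall_at_least_m`, `lists`, and `cursors` are hypothetical, and 1 <= m <= N is assumed.

```python
from bisect import bisect_left, bisect_right


def recall_at_least_m(postings, m):
    """Recall documents that appear in at least m of the posting lists."""
    lists = [sorted(set(doc_numbers)) for doc_numbers in postings.values()]
    cursors = [0] * len(lists)  # index of each keyword's current document
    recalled = []
    while True:
        current = sorted(
            lst[c] for lst, c in zip(lists, cursors) if c < len(lst)
        )
        if len(current) < m:
            break  # fewer than m keywords have documents left
        dm = current[m - 1]
        # Filter out documents numbered less than Dm.
        cursors = [bisect_left(lst, dm, c) for lst, c in zip(lists, cursors)]
        hit_count = sum(
            1 for lst, c in zip(lists, cursors) if c < len(lst) and lst[c] == dm
        )
        if hit_count >= m:
            recalled.append(dm)
        # Dm has now been fully examined; filter out numbers <= Dm.
        cursors = [bisect_right(lst, dm, c) for lst, c in zip(lists, cursors)]
    return recalled


# Hypothetical posting lists whose first entries match the worked example.
postings = {
    "t1": [10, 13, 40],
    "t2": [15, 40],
    "t3": [7, 13, 40],
    "t4": [30, 40],
    "t5": [5, 13],
    "t6": [13, 20],
}
print(recall_at_least_m(postings, 4))
```

Testing one document per round keeps the sketch from skipping a candidate whose number lies between two qualifying current documents, at the cost of not recalling several documents in a single round.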
Because each step of this method directly filters out the documents numbered less than Dm, documents that cannot satisfy the condition are discarded quickly, which speeds up the query. Compared with the traditional union scheme, speed improves by more than 10 times. The scheme therefore greatly improves both the efficiency and the quality of long query word search, and it is widely applicable.
Fig. 3 is a schematic diagram of search results for a long query word according to an embodiment of the invention. As Fig. 3 shows, the search results obtained with the scheme of the invention match the long query word better than the search results shown in Fig. 1, and the quality of the results is clearly improved.
Fig. 4 is a structural diagram of a long query word searching device according to an embodiment of the invention. As shown in Fig. 4, the long query word searching device 400 comprises:
an acquiring unit 410, adapted to obtain the input long query word. The long query word entered by the user can be obtained from the search box of a search engine page, or from the search box or address bar of a browser.
an extraction unit 420, adapted to extract the N keywords contained in the long query word; N is a natural number. Word segmentation can be performed on the long query word to extract each keyword it contains; the number of keywords is denoted N.
a recall unit 430, adapted to recall documents that hit at least M of the N keywords; M is less than or equal to N. M can be set flexibly according to actual conditions.
a generation unit 440, adapted to generate search results from the recalled documents.
With the device shown in Fig. 4, only documents that hit at least M of the N keywords are recalled, so the number of recalled documents is greatly reduced; and every recalled document hits at least M of the N keywords, so the degree of matching with the original long query word is also greatly improved. The efficiency and accuracy of long query search are therefore improved.
In one embodiment of the invention, the recall unit 430 is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents of the N keywords in ascending order and assign the M-th number in the sorted order to Dm; filter out the documents numbered less than Dm and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents so found.
Suppose the current long query word contains N = 6 keywords, t1, t2, t3, t4, t5, and t6, and M is 4. The documents to be queried have already been numbered. For each keyword ti (i = 1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Assume the current document numbers of t1, t2, t3, t4, t5, and t6 are 10, 15, 7, 30, 5, and 13 respectively. By inspection, the documents numbered 5, 7, and 10 cannot satisfy the requirement of hitting 4 keywords: even if the next document number for both t3 and t5 is 10, document 10 can hit only 3 keywords. Document 13, by contrast, may hit 4 keywords, provided that t1, t3, and t5 also appear in document 13. The 4th document in ascending order of number (document 13) is therefore the next document that can possibly satisfy the condition, so 13 is assigned to Dm, that is, Dm = 13. Documents numbered less than 13 cannot satisfy the condition and are filtered out directly. The documents remaining after filtering are then filtered again in the same way.
In another embodiment of the invention, the recall unit 430 is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out the documents numbered less than or equal to Dm and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents so found.
Again take as an example a long query word containing N = 6 keywords, t1, t2, t3, t4, t5, and t6, with M = 4. For each keyword ti (i = 1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Assume the current document numbers of t1, t2, t3, t4, t5, and t6 are 10, 15, 7, 30, 5, and 13 respectively. If document 13 hits t1, t3, t5, and t6 simultaneously, document 13 must be recalled; Dm = 13, and the documents numbered less than or equal to 13 are filtered out (document 13 itself is removed from the documents to be queried because it has already been recalled). The documents remaining after filtering are then filtered again in the same way.
In this embodiment of the invention, the recall unit 430 is further adapted to, when two or more of the current documents hit at least M of the N keywords, recall those two or more documents and assign the largest of their document numbers to Dm.
In the above embodiments of the invention, the remaining documents are searched for documents that hit at least M of the N keywords as follows. The recall unit 430 is adapted to: for each of the N keywords, find among the remaining documents, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords, assign its number to Dm, filter out the documents numbered less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents of the N keywords in ascending order, assign the M-th number in the sorted order to Dm, filter out the documents numbered less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords; and repeat these steps until all documents to be searched have been searched.
In summary, the technical scheme of the present invention obtains an input long query word, extracts the N keywords contained in it, recalls documents that hit at least M of the N keywords (M being less than or equal to N), and generates search results from the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved.
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual device, or other apparatus. Various general-purpose devices may also be used with the teachings herein. From the description above, the structure required to construct such devices is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that various programming languages may be used to implement the invention described herein, and that the above description of a specific language is given to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of an embodiment may be combined into one module or unit or component, and may furthermore be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described herein include some features but not others that are included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the long query word search device according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order. These words may be interpreted as names.

Claims (10)

1. A search method for a long query word, wherein the method comprises:
obtaining an input long query word;
extracting N keywords contained in the long query word, N being a natural number;
recalling documents that hit at least M of the N keywords, M being less than or equal to N;
generating search results from the recalled documents.
2. the method for claim 1, wherein described in the document of an at least M keyword of recalling in the described N number of keyword of hit comprise:
For each keyword in described N number of keyword, go out a document of this keyword hit as current document corresponding to this keyword according to document code sequential search from small to large;
If there is not the document of at least M keyword in the described N number of keyword of hit in each current document, then the numbering of N number of current document corresponding for described N number of keyword is sorted by order from small to large, by M numbering assignment after sequence to Dm;
Filter out the document that numbering is less than Dm, from remaining document, search the document of at least M keyword in the described N number of keyword of hit;
The document of at least M keyword in described for the hit found N number of keyword is recalled.
3. the method for claim 1, wherein described in the document of an at least M keyword of recalling in the described N number of keyword of hit comprise:
For each keyword in described N number of keyword, go out a document of this keyword hit as current document corresponding to this keyword according to document code sequential search from small to large;
If there is the document of at least M keyword in the described N number of keyword of hit in each current document, the document of at least M keyword in the described N number of keyword of hit is recalled, and by the numbering assignment of the document to Dm;
Filter out the document that numbering is less than or equal to Dm, from remaining document, search the document of at least M keyword in the described N number of keyword of hit;
The document of at least M keyword in described for the hit found N number of keyword is recalled.
4. The method of claim 3, wherein the method further comprises:
if the current documents include two or more documents that hit at least M of the N keywords, recalling those two or more documents and assigning the largest document number among them to Dm.
5. The method of any one of claims 2-4, wherein searching the remaining documents for documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding among the remaining documents, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword;
judging whether the current documents include a document that hits at least M of the N keywords;
if so, recalling the document that hits at least M of the N keywords and assigning its number to Dm; filtering out the documents numbered less than or equal to Dm, and then searching the remaining documents for documents that hit at least M of the N keywords;
if not, sorting the numbers of the N current documents corresponding to the N keywords in ascending order, and assigning the M-th number after sorting to Dm; filtering out the documents numbered less than Dm, and then searching the remaining documents for documents that hit at least M of the N keywords.
6. A search device for a long query word, wherein the device comprises:
an acquiring unit, adapted to obtain an input long query word;
an extraction unit, adapted to extract N keywords contained in the long query word, N being a natural number;
a recall unit, adapted to recall documents that hit at least M of the N keywords, M being less than or equal to N;
a generation unit, adapted to generate search results from the recalled documents.
7. The device of claim 6, wherein
the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; if the current documents include no document that hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order, and assign the M-th number after sorting to Dm; filter out the documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
8. The device of claim 6, wherein
the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; if the current documents include a document that hits at least M of the N keywords, recall that document and assign its number to Dm; filter out the documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
9. The device of claim 8, wherein
the recall unit is further adapted to, when the current documents include two or more documents that hit at least M of the N keywords, recall those two or more documents and assign the largest document number among them to Dm.
10. The device of any one of claims 7-9, wherein
the recall unit is adapted to: for each of the N keywords, find among the remaining documents, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; judge whether the current documents include a document that hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords and assign its number to Dm; filter out the documents numbered less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order, and assign the M-th number after sorting to Dm; filter out the documents numbered less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords; and repeat these steps until all documents to be searched have been searched.
CN201510149927.6A 2015-03-31 2015-03-31 Long query word searching method and device Expired - Fee Related CN104715065B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510149927.6A CN104715065B (en) 2015-03-31 2015-03-31 Long query word searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510149927.6A CN104715065B (en) 2015-03-31 2015-03-31 Long query word searching method and device

Publications (2)

Publication Number Publication Date
CN104715065A true CN104715065A (en) 2015-06-17
CN104715065B CN104715065B (en) 2017-04-19

Family

ID=53414391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510149927.6A Expired - Fee Related CN104715065B (en) 2015-03-31 2015-03-31 Long query word searching method and device

Country Status (1)

Country Link
CN (1) CN104715065B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1808430A (en) * 2004-11-01 2006-07-26 西安迪戈科技有限责任公司 Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation
US20120150836A1 (en) * 2010-12-08 2012-06-14 Microsoft Corporation Training parsers to approximately optimize ndcg
CN102541960A (en) * 2010-12-31 2012-07-04 北大方正集团有限公司 Method and device of fuzzy retrieval
CN103049495A (en) * 2012-12-07 2013-04-17 百度在线网络技术(北京)有限公司 Method, device and equipment for providing searching advice corresponding to inquiring sequence
CN103810220A (en) * 2012-11-15 2014-05-21 腾讯科技(深圳)有限公司 Microblog search method and device
CN104361042A (en) * 2014-10-29 2015-02-18 中国建设银行股份有限公司 Information retrieval method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108536740A (en) * 2018-03-07 2018-09-14 上海连尚网络科技有限公司 A kind of method, medium and the equipment of determining search result
CN108536740B (en) * 2018-03-07 2020-06-26 上海连尚网络科技有限公司 Method, medium and equipment for determining search result
CN110929125A (en) * 2019-11-15 2020-03-27 腾讯科技(深圳)有限公司 Search recall method, apparatus, device and storage medium thereof
CN110929125B (en) * 2019-11-15 2023-07-11 腾讯科技(深圳)有限公司 Search recall method, device, equipment and storage medium thereof

Also Published As

Publication number Publication date
CN104715065B (en) 2017-04-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20170310

Address after: 100016 Chaoyang District Road, Jiuxianqiao, No. 10, building No. 3, floor 15, floor 17, 1701-15B,

Applicant after: BEIJING QIYUAN TECHNOLOGY CO.,LTD.

Address before: 100088 Beijing city Xicheng District xinjiekouwai Street 28, block D room 112 (Desheng Park)

Applicant before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Applicant before: Qizhi software (Beijing) Co.,Ltd.

GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170419