CN104715065B - Long query word searching method and device - Google Patents

Info

Publication number
CN104715065B
Authority
CN
China
Prior art keywords
document
key word
hit
word
numbering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510149927.6A
Other languages
Chinese (zh)
Other versions
CN104715065A (en)
Inventor
陈进平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qiyuan Technology Co ltd
Original Assignee
Beijing Yuan Yuan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yuan Yuan Technology Co Ltd
Priority to CN201510149927.6A
Publication of CN104715065A
Application granted
Publication of CN104715065B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a search method and device for long query words. The method comprises the following steps: obtaining an input long query word; extracting N keywords contained in the long query word, where N is a natural number; recalling documents that hit at least M of the N keywords, where M is less than or equal to N; and generating search results according to the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced. Moreover, every recalled document hits at least M of the N keywords, so the degree of matching with the original long query word is greatly improved.

Description

A search method and device for long query words
Technical field
The present invention relates to the field of Internet search technology, and in particular to a search method and device for long query words.
Background art
In some real search scenarios, a user enters a relatively long sentence as the query; such a query is referred to as a long query word.
In the prior art, searches for long query words use an OR (union) method to achieve fuzzy matching: each keyword contained in the long query word is queried separately, and the query results are then merged. In practice this method performs very poorly. Suppose a long query word contains N keywords and each keyword recalls L documents on average; the OR method then returns up to N*L documents. The number of recalled documents is very large, which leads to heavy computation, and the matching quality of the search results is also poor.
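The growth of the recall set under the OR method can be illustrated with a minimal Python sketch. The toy index `postings`, the keywords, and the function name `or_recall` are assumptions made for illustration only; they do not come from the patent.

```python
def or_recall(postings, keywords):
    """Prior-art style recall: the union of the documents hit by each keyword."""
    recalled = set()
    for kw in keywords:
        recalled.update(postings.get(kw, []))
    return sorted(recalled)


# Hypothetical toy index: keyword -> ascending list of document numbers.
postings = {
    "cheap": [1, 4, 9],
    "flights": [2, 4, 7],
    "beijing": [3, 4, 8],
}

docs = or_recall(postings, ["cheap", "flights", "beijing"])
print(len(docs), docs)
```

In this toy index only document 4 contains all three keywords, yet the union recalls every document that contains any one of them, which is the behaviour the background section criticises.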
Fig. 1 is a schematic diagram of the search results of the existing OR method. As shown in Fig. 1, the recalled documents match the original long query word very poorly.
It can be seen that an effective search solution for long query words is urgently needed.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a search method and device for long query words that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a search method for long query words is provided, the method comprising:
obtaining an input long query word;
extracting N keywords contained in the long query word, where N is a natural number;
recalling documents that hit at least M of the N keywords, where M is less than or equal to N;
generating search results according to the recalled documents.
Optionally, recalling documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword;
if none of the current documents hits at least M of the N keywords, sorting the numbers of the N current documents corresponding to the N keywords in ascending order, and assigning the M-th number after sorting to Dm;
filtering out documents whose number is less than Dm, and searching the remaining documents for documents that hit at least M of the N keywords;
recalling the documents found that hit at least M of the N keywords.
Optionally, recalling documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword;
if one of the current documents hits at least M of the N keywords, recalling that document and assigning its number to Dm;
filtering out documents whose number is less than or equal to Dm, and searching the remaining documents for documents that hit at least M of the N keywords;
recalling the documents found that hit at least M of the N keywords.
Optionally, the method further comprises:
if two or more of the current documents hit at least M of the N keywords, recalling those two or more documents, and assigning the largest document number among them to Dm.
Optionally, searching the remaining documents for documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding from the remaining documents, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword;
judging whether any of the current documents hits at least M of the N keywords;
if so, recalling the document that hits at least M of the N keywords and assigning its number to Dm; filtering out documents whose number is less than or equal to Dm, and then searching the remaining documents for documents that hit at least M of the N keywords;
if not, sorting the numbers of the N current documents corresponding to the N keywords in ascending order and assigning the M-th number after sorting to Dm; filtering out documents whose number is less than Dm, and then searching the remaining documents for documents that hit at least M of the N keywords.
According to another aspect of the present invention, a search device for long query words is provided, the device comprising:
an acquiring unit, adapted to obtain an input long query word;
an extraction unit, adapted to extract N keywords contained in the long query word, where N is a natural number;
a recall unit, adapted to recall documents that hit at least M of the N keywords, where M is less than or equal to N;
a generating unit, adapted to generate search results according to the recalled documents.
Optionally, the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out documents whose number is less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Optionally, the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; if one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out documents whose number is less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Optionally, the recall unit is further adapted to: when two or more of the current documents hit at least M of the N keywords, recall those two or more documents and assign the largest document number among them to Dm.
Optionally, the recall unit is adapted to: for each of the N keywords, find from the remaining documents, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords and assign its number to Dm, filter out documents whose number is less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm, filter out documents whose number is less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords; and repeat the above steps until all documents to be searched have been searched.
The technical solution of the present invention obtains an input long query word, extracts the N keywords contained in it, recalls documents that hit at least M of the N keywords (M being less than or equal to N), and generates search results according to the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the description, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a schematic diagram of the search results of the existing OR method;
Fig. 2 is a flow chart of a search method for long query words according to an embodiment of the invention;
Fig. 3 is a schematic diagram of the search results for a long query word according to an embodiment of the invention;
Fig. 4 is a structure diagram of a search device for long query words according to an embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the invention, it should be understood that the invention may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the invention will be understood more thoroughly and so that its scope can be conveyed completely to those skilled in the art.
Fig. 2 is a flow chart of a search method for long query words according to an embodiment of the invention. As shown in Fig. 2, the method comprises:
Step S210: obtain an input long query word.
In this step, the long query word entered by the user may be obtained from the search box of a search engine page, or from the search box or address bar of a browser.
Step S220: extract the N keywords contained in the long query word, where N is a natural number.
In this step, word segmentation is performed on the long query word to extract each keyword contained in it; the number of keywords is denoted N.
Step S230: recall documents that hit at least M of the N keywords, where M is less than or equal to N.
In one embodiment of the invention, M may be set to M=2*N/3, i.e. M is two thirds of N. In other embodiments of the invention, the value of M may of course be set according to the actual situation.
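The two-thirds setting can be sketched in Python as follows. The text does not say how M is rounded when N is not divisible by 3, so rounding up and clamping M to the range 1..N are assumptions of this sketch, as is the function name `min_hits`.

```python
def min_hits(n, num=2, den=3):
    """Return M = ceil(n * num / den), clamped to 1..n.

    Rounding up and clamping are assumptions; the source only states M = 2*N/3.
    """
    if n < 1:
        raise ValueError("n must be a positive number of keywords")
    m = -((-n * num) // den)  # integer ceiling division
    return max(1, min(n, m))


print(min_hits(6))  # N = 6 gives M = 4, the values used in the worked example
```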
Step S240: generate search results according to the recalled documents.
In the method shown in Fig. 2, only documents that hit at least M of the N keywords are recalled, so the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved. The efficiency and accuracy of long query searches are therefore improved.
In one embodiment of the invention, recalling documents that hit at least M of the N keywords in step S230 of the method shown in Fig. 2 may use any of the following methods:
(1) Find the documents that hit at least one keyword, then filter out the documents that hit fewer than M keywords, and recall the documents that remain after filtering.
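Method (1) can be sketched in Python by counting hits per document. The index layout (a dict from keyword to an ascending list of document numbers, each document listed once per keyword) and the function name `recall_by_counting` are assumptions made for illustration.

```python
from collections import Counter


def recall_by_counting(postings, keywords, m):
    """Collect every document that hits any keyword, then keep those with >= m hits."""
    hits = Counter()
    for kw in set(keywords):              # count each distinct keyword once
        for doc in postings.get(kw, []):
            hits[doc] += 1
    return sorted(doc for doc, count in hits.items() if count >= m)


# Hypothetical toy index.
postings = {"a": [1, 4, 9], "b": [2, 4, 9], "c": [3, 4, 8]}
print(recall_by_counting(postings, ["a", "b", "c"], 2))
```

This sketch still touches every document that hits at least one keyword before filtering, which is the cost that methods (2) to (4) try to avoid by skipping ahead by document number.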
(2) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword. If none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm. Filter out documents whose number is less than Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents found that hit at least M of the N keywords.
Assume that the current long query word contains N=6 keywords, t1, t2, t3, t4, t5 and t6, and that M is 4. The documents to be queried have already been numbered. For each keyword ti (i=1,2,3,4,5,6), one document hit by that keyword is found in ascending order of document number and taken as the current document corresponding to that keyword. Suppose the current document numbers corresponding to the six keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. By inspection, the documents numbered 5, 7 and 10 cannot satisfy the requirement of hitting 4 keywords: even if the next document numbers of t3 and t5 were both 10, document 10 could hit only 3 keywords. Document 13, on the other hand, may hit 4 keywords, provided that t1, t3 and t5 also appear in document 13. The document ranked fourth in ascending order of number (document 13) is therefore the next document that can possibly satisfy the condition, and 13 is assigned to Dm, i.e. Dm=13. Documents numbered less than 13 cannot satisfy the condition and are filtered out directly. The documents remaining after filtering can be filtered repeatedly in the same way.
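The choice of Dm in this worked example can be reproduced with a few lines of Python. The function name `pivot_dm` and the dict of current documents are illustrative assumptions; the numbers are those given in the example above.

```python
def pivot_dm(current_docs, m):
    """Return the m-th smallest current document number (1-based)."""
    if not 1 <= m <= len(current_docs):
        raise ValueError("m must satisfy 1 <= m <= N")
    return sorted(current_docs)[m - 1]


# Current document of each keyword, as in the example (N = 6, M = 4).
current = {"t1": 10, "t2": 15, "t3": 7, "t4": 30, "t5": 5, "t6": 13}

dm = pivot_dm(list(current.values()), 4)
skipped = sorted(d for d in current.values() if d < dm)
print(dm, skipped)  # 13 [5, 7, 10]
```

Any document numbered below Dm can be hit by at most M-1 keywords, because only M-1 keywords have a current document below Dm; that is why those documents can be skipped without being examined.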
(3) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword. If one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm. Filter out documents whose number is less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents found that hit at least M of the N keywords.
Again take a long query word containing N=6 keywords, t1, t2, t3, t4, t5 and t6, with M=4. For each keyword ti (i=1,2,3,4,5,6), one document hit by that keyword is found in ascending order of document number and taken as the current document corresponding to that keyword. Suppose the current document numbers corresponding to the six keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. If document 13 hits t1, t3, t5 and t6 simultaneously, document 13 needs to be recalled, Dm=13, and documents numbered less than or equal to 13 are filtered out (document 13 has already been recalled and is therefore removed from the documents to be queried). The documents remaining after filtering can be filtered repeatedly in the same way.
In one embodiment of the invention, if two or more of the current documents hit at least M of the N keywords, those two or more documents are recalled, and the largest document number among them is assigned to Dm. For example, if in the example above document 15 also hits at least 4 keywords in addition to document 13 hitting t1, t3, t5 and t6, then documents 13 and 15 both need to be recalled, Dm=15, and documents numbered less than or equal to 15 are filtered out. The documents remaining after filtering can be filtered repeatedly in the same way.
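The hit branch can be sketched in Python as below. This is only one possible reading: it assumes that "a current document hits at least M keywords" is detected by counting how many keywords currently point at the same document number. The function name `hit_step` and the sample lists are hypothetical.

```python
from collections import Counter


def hit_step(current_docs, m):
    """Return (recalled, dm) for the hit branch.

    recalled: current document numbers shared by at least m keywords.
    dm: the largest recalled number, or None if no current document qualifies.
    """
    counts = Counter(current_docs)
    recalled = sorted(doc for doc, c in counts.items() if c >= m)
    dm = recalled[-1] if recalled else None
    return recalled, dm


# Six keywords, M = 4: four keywords currently point at document 13.
print(hit_step([13, 15, 13, 30, 13, 13], 4))  # ([13], 13)

# Hypothetical case with M = 2 in which two current documents qualify.
print(hit_step([13, 13, 15, 15, 30, 40], 2))  # ([13, 15], 15)
```

After this step, documents numbered less than or equal to the returned `dm` would be filtered out and the search would continue on the remaining documents.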
(4) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword.
If none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out documents whose number is less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; recall the documents found that hit at least M of the N keywords.
If one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out documents whose number is less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; recall the documents found that hit at least M of the N keywords.
Here, searching the remaining documents for documents that hit at least M of the N keywords is specifically as follows. For each of the N keywords, find from the remaining documents, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword. Judge whether any of the current documents hits at least M of the N keywords. If so, recall the document that hits at least M of the N keywords and assign its number to Dm; filter out documents whose number is less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords. If not, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out documents whose number is less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords.
The above steps are repeated until all documents have been queried.
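The repeated procedure can be sketched in Python as a single loop. This is a sketch under stated assumptions, not the patent's reference implementation: each keyword is assumed to have a strictly ascending list of document numbers, the two branches are folded into one check at the M-th smallest current number Dm, and the names `recall_at_least_m` and `lists` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def recall_at_least_m(posting_lists, m):
    """Recall document numbers that occur in at least m of the ascending posting lists."""
    if m < 1:
        raise ValueError("m must be at least 1")
    pos = [0] * len(posting_lists)  # cursor of each keyword into its list
    recalled = []
    while True:
        # Current document of every keyword whose list is not exhausted.
        current = sorted(pl[p] for pl, p in zip(posting_lists, pos) if p < len(pl))
        if len(current) < m:
            return recalled         # fewer than m keywords remain, no hit is possible
        dm = current[m - 1]         # M-th smallest current document number
        # Filter out documents numbered below Dm.
        pos = [bisect_left(pl, dm, p) for pl, p in zip(posting_lists, pos)]
        hits = sum(1 for pl, p in zip(posting_lists, pos) if p < len(pl) and pl[p] == dm)
        if hits >= m:
            recalled.append(dm)
            # Filter out documents numbered less than or equal to Dm.
            pos = [bisect_right(pl, dm, p) for pl, p in zip(posting_lists, pos)]


# Posting lists chosen so that the first current documents are 10, 15, 7, 30, 5, 13.
lists = [[10, 13, 20], [15, 20], [7, 13, 20], [30], [5, 13], [13, 20]]
print(recall_at_least_m(lists, 4))  # [13, 20]
```

With these lists the first pass sets Dm=13, skips documents 5, 7 and 10, and recalls document 13; the second pass sets Dm=20 and recalls document 20. Using binary search to advance the cursors is a design choice of the sketch; a real index might use skip pointers instead.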
Because each step of this method can directly filter out the documents numbered less than Dm, documents that do not satisfy the condition are discarded quickly, which speeds up the query. Compared with the traditional OR scheme, the speed improves by more than 10 times. The scheme therefore substantially improves both the query efficiency and the query quality for long query words, and has broader applicability.
Fig. 3 is a schematic diagram of the search results for a long query word according to an embodiment of the invention. Referring to Fig. 3, the search results obtained with the scheme of the invention match the long query word better than the search results shown in Fig. 1, and the query quality is clearly improved.
Fig. 4 is a structure diagram of a search device for long query words according to an embodiment of the invention. As shown in Fig. 4, the search device 400 for long query words comprises:
an acquiring unit 410, adapted to obtain an input long query word. The long query word entered by the user may be obtained from the search box of a search engine page, or from the search box or address bar of a browser.
an extraction unit 420, adapted to extract the N keywords contained in the long query word, where N is a natural number. Word segmentation may be performed on the long query word to extract each keyword contained in it; the number of keywords is denoted N.
a recall unit 430, adapted to recall documents that hit at least M of the N keywords, where M is less than or equal to N. M can be set flexibly according to the actual situation.
a generating unit 440, adapted to generate search results according to the recalled documents.
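How the four units fit together can be sketched in Python as a small pipeline. Everything concrete here is an assumption for illustration: the class name `LongQuerySearcher`, the `choose_m` callable, whitespace splitting as a stand-in for word segmentation, the toy index, and the naive recall function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LongQuerySearcher:
    acquire: Callable[[], str]                     # acquiring unit 410
    extract: Callable[[str], List[str]]            # extraction unit 420
    choose_m: Callable[[int], int]                 # how M is derived from N (assumption)
    recall: Callable[[List[str], int], List[int]]  # recall unit 430
    generate: Callable[[List[int]], List[str]]     # generating unit 440

    def search(self) -> List[str]:
        query = self.acquire()
        keywords = self.extract(query)
        m = self.choose_m(len(keywords))
        return self.generate(self.recall(keywords, m))


# Hypothetical toy index and a naive recall function used only for this demo.
postings = {"long": [1, 2], "query": [2, 3], "search": [2, 3]}


def naive_recall(keywords, m):
    docs = {d for k in keywords for d in postings.get(k, [])}
    return sorted(d for d in docs if sum(d in postings.get(k, []) for k in keywords) >= m)


searcher = LongQuerySearcher(
    acquire=lambda: "long query search",
    extract=str.split,                 # whitespace split stands in for word segmentation
    choose_m=lambda n: n - 1,
    recall=naive_recall,
    generate=lambda docs: [f"doc-{d}" for d in docs],
)
results = searcher.search()
print(results)  # ['doc-2', 'doc-3']
```

Keeping each unit as a separate callable mirrors the unit boundaries of Fig. 4 and lets the recall strategy be swapped without touching the other units.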
In the device shown in Fig. 4, only documents that hit at least M of the N keywords are recalled, so the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved. The efficiency and accuracy of long query searches are therefore improved.
In one embodiment of the invention, the recall unit 430 is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out documents whose number is less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Assume that the current long query word contains N=6 keywords, t1, t2, t3, t4, t5 and t6, and that M is 4. The documents to be queried have already been numbered. For each keyword ti (i=1,2,3,4,5,6), one document hit by that keyword is found in ascending order of document number and taken as the current document corresponding to that keyword. Suppose the current document numbers corresponding to the six keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. By inspection, the documents numbered 5, 7 and 10 cannot satisfy the requirement of hitting 4 keywords: even if the next document numbers of t3 and t5 were both 10, document 10 could hit only 3 keywords. Document 13, on the other hand, may hit 4 keywords, provided that t1, t3 and t5 also appear in document 13. The document ranked fourth in ascending order of number (document 13) is therefore the next document that can possibly satisfy the condition, and 13 is assigned to Dm, i.e. Dm=13. Documents numbered less than 13 cannot satisfy the condition and are filtered out directly. The documents remaining after filtering can be filtered repeatedly in the same way.
In another embodiment of the invention, the recall unit 430 is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; if one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out documents whose number is less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Again take a long query word containing N=6 keywords, t1, t2, t3, t4, t5 and t6, with M=4. For each keyword ti (i=1,2,3,4,5,6), one document hit by that keyword is found in ascending order of document number and taken as the current document corresponding to that keyword. Suppose the current document numbers corresponding to the six keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. If document 13 hits t1, t3, t5 and t6 simultaneously, document 13 needs to be recalled, Dm=13, and documents numbered less than or equal to 13 are filtered out (document 13 has already been recalled and is therefore removed from the documents to be queried). The documents remaining after filtering can be filtered repeatedly in the same way.
In this embodiment of the invention, the recall unit 430 is further adapted to: when two or more of the current documents hit at least M of the N keywords, recall those two or more documents and assign the largest document number among them to Dm.
In the above embodiments of the invention, documents that hit at least M of the N keywords are searched for in the remaining documents as follows. The recall unit 430 is adapted to: for each of the N keywords, find from the remaining documents, in ascending order of document number, one document hit by that keyword as the current document corresponding to that keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords and assign its number to Dm, filter out documents whose number is less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm, filter out documents whose number is less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords; and repeat the above steps until all documents to be searched have been searched.
In summary, the technical solution of the present invention obtains an input long query word, extracts the N keywords contained in it, recalls documents that hit at least M of the N keywords (M being less than or equal to N), and generates search results according to the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved.
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual device, or other apparatus. Various general-purpose devices may also be used with the teachings herein. From the above description, the structure required to construct such devices is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the above description of a specific language is given in order to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail, so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof in the above description of exemplary embodiments of the invention. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may furthermore be divided into a plurality of sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include some features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the searching device for a long query word according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order. These words may be interpreted as names.

Claims (8)

1. A searching method for a long query word, wherein the method comprises:
Obtaining an input long query word;
Extracting N keywords contained in the long query word, N being a natural number;
Recalling documents hitting at least M of the N keywords, M being less than or equal to N;
Generating search results according to the recalled documents;
Wherein recalling the documents hitting at least M of the N keywords comprises:
For each of the N keywords, searching in ascending order of document number for one document hit by the keyword, as the current document corresponding to the keyword;
If none of the current documents hits at least M of the N keywords, sorting the numbers of the N current documents corresponding to the N keywords in ascending order, and assigning the M-th number after sorting to Dm; filtering out the documents whose numbers are less than Dm, and searching the remaining documents for documents hitting at least M of the N keywords; and recalling the found documents hitting at least M of the N keywords.
2. The method of claim 1, wherein recalling the documents hitting at least M of the N keywords further comprises:
If one of the current documents hits at least M of the N keywords, recalling the document hitting at least M of the N keywords, and assigning the number of that document to Dm; filtering out the documents whose numbers are less than or equal to Dm, and searching the remaining documents for documents hitting at least M of the N keywords; and recalling the found documents hitting at least M of the N keywords.
3. The method of claim 2, wherein the method further comprises:
If two or more of the current documents hit at least M of the N keywords, recalling the two or more documents, and assigning the largest document number among the two or more documents to Dm.
4. The method of any one of claims 1-3, wherein searching the remaining documents for documents hitting at least M of the N keywords comprises:
For each of the N keywords, searching the remaining documents in ascending order of document number for one document hit by the keyword, as the current document corresponding to the keyword;
Judging whether any of the current documents hits at least M of the N keywords;
If so, recalling the document hitting at least M of the N keywords, and assigning the number of that document to Dm; filtering out the documents whose numbers are less than or equal to Dm, and then searching the remaining documents for documents hitting at least M of the N keywords;
If not, sorting the numbers of the N current documents corresponding to the N keywords in ascending order, and assigning the M-th number after sorting to Dm; filtering out the documents whose numbers are less than Dm, and then searching the remaining documents for documents hitting at least M of the N keywords.
5. A searching device for a long query word, wherein the device comprises:
An acquiring unit, adapted to obtain an input long query word;
An extraction unit, adapted to extract N keywords contained in the long query word, N being a natural number;
A recall unit, adapted to recall documents hitting at least M of the N keywords, M being less than or equal to N;
A generating unit, adapted to generate search results according to the recalled documents;
Wherein the recall unit is adapted to: for each of the N keywords, search in ascending order of document number for one document hit by the keyword, as the current document corresponding to the keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order, and assign the M-th number after sorting to Dm; filter out the documents whose numbers are less than Dm, and search the remaining documents for documents hitting at least M of the N keywords; and recall the found documents hitting at least M of the N keywords.
6. The device of claim 5, wherein
The recall unit is further adapted to: when one of the current documents hits at least M of the N keywords, recall the document hitting at least M of the N keywords, and assign the number of that document to Dm; filter out the documents whose numbers are less than or equal to Dm, and search the remaining documents for documents hitting at least M of the N keywords; and recall the found documents hitting at least M of the N keywords.
7. The device of claim 6, wherein
The recall unit is further adapted to: when two or more of the current documents hit at least M of the N keywords, recall the two or more documents, and assign the largest document number among the two or more documents to Dm.
8. The device of any one of claims 5-7, wherein
The recall unit is adapted to: for each of the N keywords, search the remaining documents in ascending order of document number for one document hit by the keyword, as the current document corresponding to the keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document hitting at least M of the N keywords, and assign the number of that document to Dm; filter out the documents whose numbers are less than or equal to Dm, and then search the remaining documents for documents hitting at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order, and assign the M-th number after sorting to Dm; filter out the documents whose numbers are less than Dm, and then search the remaining documents for documents hitting at least M of the N keywords; and repeat the above steps until all documents to be searched have been searched.
CN201510149927.6A 2015-03-31 2015-03-31 Long query word searching method and device Expired - Fee Related CN104715065B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510149927.6A CN104715065B (en) 2015-03-31 2015-03-31 Long query word searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510149927.6A CN104715065B (en) 2015-03-31 2015-03-31 Long query word searching method and device

Publications (2)

Publication Number Publication Date
CN104715065A (en) 2015-06-17
CN104715065B (en) 2017-04-19

Family

ID=53414391

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510149927.6A Expired - Fee Related CN104715065B (en) 2015-03-31 2015-03-31 Long query word searching method and device

Country Status (1)

Country Link
CN (1) CN104715065B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108536740B (en) * 2018-03-07 2020-06-26 上海连尚网络科技有限公司 Method, medium and equipment for determining search result
CN110929125B (en) * 2019-11-15 2023-07-11 腾讯科技(深圳)有限公司 Search recall method, device, equipment and storage medium thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1808430A (en) * 2004-11-01 2006-07-26 西安迪戈科技有限责任公司 Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation
US8473486B2 (en) * 2010-12-08 2013-06-25 Microsoft Corporation Training parsers to approximately optimize NDCG
CN102541960A (en) * 2010-12-31 2012-07-04 北大方正集团有限公司 Method and device of fuzzy retrieval
CN103810220B (en) * 2012-11-15 2018-02-27 腾讯科技(深圳)有限公司 A kind of microblogging searching method and device
CN103049495A (en) * 2012-12-07 2013-04-17 百度在线网络技术(北京)有限公司 Method, device and equipment for providing searching advice corresponding to inquiring sequence
CN104361042B (en) * 2014-10-29 2019-02-12 中国建设银行股份有限公司 A kind of information retrieval method and device

Also Published As

Publication number Publication date
CN104715065A (en) 2015-06-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
    Effective date of registration: 20170310
    Address after: 100016 Chaoyang District Road, Jiuxianqiao, No. 10, building No. 3, floor 15, floor 17, 1701-15B,
    Applicant after: BEIJING QIYUAN TECHNOLOGY CO.,LTD.
    Address before: 100088 Beijing city Xicheng District xinjiekouwai Street 28, block D room 112 (Desheng Park)
    Applicant before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.
    Applicant before: Qizhi software (Beijing) Co.,Ltd.
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
    Granted publication date: 20170419