CN104715065B - Long query word searching method and device - Google Patents
- Publication number
- CN104715065B CN201510149927.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a long query word searching method and device. The method comprises: acquiring an input long query word; extracting the N keywords contained in the long query word, where N is a natural number; recalling documents that hit at least M of the N keywords, where M is less than or equal to N; and generating search results from the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced. Moreover, since every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is greatly improved.
Description
Technical field
The present invention relates to the field of Internet search technology, and in particular to a search method and device for long query words.
Background
In some real search scenarios, a user enters a fairly long sentence as the query; this is referred to as a long query word.
In the prior art, searches for such long query words use a union (OR) method to achieve fuzzy matching: each keyword contained in the long query word is queried separately, and the query results are then merged. In practice this method performs very poorly. Suppose a long query word contains N keywords and each keyword recalls L documents on average; the union can then return N*L documents. The number of recalled documents is very large, which makes the computation expensive, and the matching quality of the search results is also poor.
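To illustrate how large the union recall set becomes, here is a minimal Python sketch of the union method described above. The `postings` mapping (keyword to ascending document numbers) and its contents are hypothetical sample data, not taken from the patent.

```python
def union_recall(postings):
    """Recall every document hit by at least one keyword (the union method)."""
    recalled = set()
    for doc_ids in postings.values():
        recalled.update(doc_ids)
    return sorted(recalled)


# Hypothetical inverted index: keyword -> ascending document numbers.
postings = {
    "t1": [10, 13, 40],
    "t2": [15, 40],
    "t3": [7, 13, 40],
    "t4": [30, 40],
    "t5": [5, 13],
    "t6": [13, 50],
}

print(union_recall(postings))
```

Every document containing any single keyword is returned, so the result grows with the combined length of all posting lists, however few keywords each document actually matches.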
Fig. 1 shows a schematic diagram of search results produced by the existing union method. As shown in Fig. 1, the recalled documents match the original long query word very poorly.
It can be seen that an effective search solution for long query words is urgently needed.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a long query word search method and device that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a search method for a long query word is provided, the method comprising:
obtaining an input long query word;
extracting the N keywords contained in the long query word, where N is a natural number;
recalling documents that hit at least M of the N keywords, where M is less than or equal to N;
generating search results from the recalled documents.
Optionally, recalling documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, in ascending order of document number, one document hit by that keyword and taking it as the current document of that keyword;
if none of the current documents hits at least M of the N keywords, sorting the numbers of the N current documents corresponding to the N keywords in ascending order, and assigning the M-th number after sorting to Dm;
filtering out the documents numbered less than Dm, and searching the remaining documents for documents that hit at least M of the N keywords;
recalling the documents found that hit at least M of the N keywords.
Optionally, recalling documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, in ascending order of document number, one document hit by that keyword and taking it as the current document of that keyword;
if one of the current documents hits at least M of the N keywords, recalling that document and assigning its number to Dm;
filtering out the documents numbered less than or equal to Dm, and searching the remaining documents for documents that hit at least M of the N keywords;
recalling the documents found that hit at least M of the N keywords.
Optionally, the method further comprises:
if two or more of the current documents each hit at least M of the N keywords, recalling those two or more documents, and assigning the largest document number among them to Dm.
Optionally, searching the remaining documents for documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding from the remaining documents, in ascending order of document number, one document hit by that keyword and taking it as the current document of that keyword;
judging whether any of the current documents hits at least M of the N keywords;
if so, recalling the document that hits at least M of the N keywords and assigning its number to Dm; filtering out the documents numbered less than or equal to Dm, and again searching the remaining documents for documents that hit at least M of the N keywords;
if not, sorting the numbers of the N current documents corresponding to the N keywords in ascending order and assigning the M-th number after sorting to Dm; filtering out the documents numbered less than Dm, and again searching the remaining documents for documents that hit at least M of the N keywords.
According to another aspect of the present invention, a search device for a long query word is provided, the device comprising:
an acquiring unit, adapted to obtain an input long query word;
an extracting unit, adapted to extract the N keywords contained in the long query word, where N is a natural number;
a recall unit, adapted to recall documents that hit at least M of the N keywords, where M is less than or equal to N;
a generating unit, adapted to generate search results from the recalled documents.
Optionally, the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out the documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Optionally, the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out the documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Optionally, the recall unit is further adapted to, when two or more of the current documents each hit at least M of the N keywords, recall those two or more documents and assign the largest document number among them to Dm.
Optionally, the recall unit is adapted to: for each of the N keywords, find from the remaining documents, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords and assign its number to Dm, filter out the documents numbered less than or equal to Dm, and again search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm, filter out the documents numbered less than Dm, and again search the remaining documents for documents that hit at least M of the N keywords; and repeat the above steps until all documents to be searched have been searched.
In the technical solution of the present invention, an input long query word is obtained, the N keywords contained in the long query word are extracted, documents that hit at least M of the N keywords are recalled (M being less than or equal to N), and search results are generated from the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a schematic diagram of search results produced by the existing union method;
Fig. 2 shows a flow chart of a search method for a long query word according to an embodiment of the invention;
Fig. 3 shows a schematic diagram of search results for a long query word according to an embodiment of the invention;
Fig. 4 shows a structural diagram of a search device for a long query word according to an embodiment of the invention.
Detailed description
Exemplary embodiments of the present invention are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the invention, it should be understood that the invention may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the invention can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
Fig. 2 shows a flow chart of a search method for a long query word according to an embodiment of the invention. As shown in Fig. 2, the method comprises:
Step S210: obtain the input long query word.
In this step, the long query word entered by the user may be obtained from the search box of a search engine interface, or from the search box or address bar of a browser.
Step S220: extract the N keywords contained in the long query word, where N is a natural number.
In this step, word segmentation is performed on the long query word to extract the keywords it contains; their number is denoted N.
Step S230: recall documents that hit at least M of the N keywords, where M is less than or equal to N.
In one embodiment of the invention, M may be set to 2*N/3, i.e. M is two thirds of N. Of course, in other embodiments of the invention the value of M may be set according to the actual situation.
Step S240: generate search results from the recalled documents.
With the method shown in Fig. 2, because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved. The efficiency and accuracy of long query word search are therefore improved.
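Steps S210 to S240 can be sketched end to end in Python as follows. This is only an illustration under several assumptions that are not in the patent: whitespace splitting stands in for real word segmentation, M = 2N/3 is rounded down (and kept at 1 or more), recall is done by simple per-document hit counting, and the ordering of results by hit count is a placeholder ranking. The names `min_hits`, `search_long_query` and the sample `index` are hypothetical.

```python
from collections import Counter


def min_hits(n):
    """M = 2N/3, rounded down and kept at 1 or more (the rounding is an assumption)."""
    return max(1, (2 * n) // 3)


def search_long_query(query, index):
    # S210: the long query word arrives as `query`.
    # S220: extract keywords; whitespace splitting stands in for word segmentation.
    keywords = list(dict.fromkeys(query.lower().split()))
    m = min_hits(len(keywords))
    # S230: count, per document, how many distinct keywords hit it.
    hit_counts = Counter()
    for keyword in keywords:
        hit_counts.update(set(index.get(keyword, ())))
    recalled = [doc for doc, hits in hit_counts.items() if hits >= m]
    # S240: build the result list; ordering by hit count is a placeholder ranking.
    return sorted(recalled, key=lambda doc: (-hit_counts[doc], doc))


# Hypothetical inverted index: keyword -> document numbers.
index = {
    "cheap": [1, 2, 3],
    "flights": [2, 3],
    "to": [1, 2, 3, 4],
    "paris": [3, 4],
    "in": [1, 4],
    "may": [3],
}

print(search_long_query("cheap flights to paris in may", index))
```

With N=6 the threshold is M=4, so only document 3 (which hits five of the six keywords) is recalled, whereas a union would return all four documents.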
In one embodiment of the invention, the recalling of documents that hit at least M of the N keywords in step S230 of the method shown in Fig. 2 may use any of the following methods:
(1) Find the documents that hit at least one keyword, then filter out the documents that hit fewer than M keywords, and recall the documents remaining after filtering.
(2) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword. If none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm. Filter out the documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents found that hit at least M of the N keywords.
Suppose the current long query word contains N=6 keywords, t1, t2, t3, t4, t5 and t6, and M is 4. The documents to be queried have already been numbered. For each keyword ti (i=1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Suppose the current document numbers of the 6 keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. By inspection, the documents numbered 5, 7 and 10 cannot satisfy the requirement of hitting 4 keywords: even if the next document number for both t3 and t5 were 10, document 10 could hit only 3 keywords. Document 13, on the other hand, may hit 4 keywords, provided that t1, t3 and t5 also appear in document 13. The 4th document in ascending order of number (document 13) is therefore the next document that can possibly satisfy the condition, so 13 is assigned to Dm, i.e. Dm=13. Documents numbered less than 13 all fail the condition and are filtered out directly. The documents remaining after filtering can be filtered repeatedly in the same way.
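The choice of Dm in this example can be checked with a few lines of Python. The helper name `mth_smallest` is hypothetical; the numbers are the ones from the example above.

```python
def mth_smallest(current_docs, m):
    """Return the M-th smallest current document number (M counted from 1)."""
    return sorted(current_docs)[m - 1]


# Current document number of each keyword, as in the example.
current = {"t1": 10, "t2": 15, "t3": 7, "t4": 30, "t5": 5, "t6": 13}

dm = mth_smallest(current.values(), 4)
print(dm)  # 13
```

The reasoning behind the cut-off: at most M-1 keywords have a current document below Dm, so no document numbered below Dm can be hit by M keywords.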
(3) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword. If one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm. Filter out the documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords. Recall the documents found that hit at least M of the N keywords.
Again take as the example a current long query word containing N=6 keywords, t1, t2, t3, t4, t5 and t6, with M=4. For each keyword ti (i=1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Suppose the current document numbers of the 6 keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. If document 13 hits t1, t3, t5 and t6 at the same time, document 13 needs to be recalled and Dm=13; documents numbered less than or equal to 13 are filtered out (document 13 has already been recalled, so it is also removed from the documents to be queried). The documents remaining after filtering can be filtered repeatedly in the same way.
In one embodiment of the invention, if two or more of the current documents each hit at least M of the N keywords, those two or more documents are recalled, and the largest document number among them is assigned to Dm. For example, in the example above, if document 15 also hits at least 4 keywords in addition to document 13 hitting t1, t3, t5 and t6, then both documents 13 and 15 need to be recalled, Dm=15, and documents numbered less than or equal to 15 are filtered out. The documents remaining after filtering can be filtered repeatedly in the same way.
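The hit check on the current documents can be sketched as below. This is one possible reading, stated as an assumption: a current document "hits" a keyword when it is that keyword's current document, so a document is recalled when at least M keywords currently point at it. The function name `recall_from_current` is hypothetical. The first call assumes the cursors of t1, t3 and t5 have already advanced to document 13, as the example supposes; the second call uses made-up numbers with M=2 to show the case of two qualifying documents.

```python
from collections import Counter


def recall_from_current(current_docs, m):
    """Return (documents shared by at least m keywords, new Dm or None)."""
    counts = Counter(current_docs)
    hits = sorted(doc for doc, count in counts.items() if count >= m)
    # With several qualifying documents, Dm is the largest of their numbers.
    dm = hits[-1] if hits else None
    return hits, dm


print(recall_from_current([13, 15, 13, 30, 13, 13], 4))  # ([13], 13)
print(recall_from_current([13, 15, 13, 15, 30, 40], 2))  # ([13, 15], 15)
```

When no document qualifies, the function returns an empty list and `None`, which corresponds to the branch where Dm is instead taken as the M-th smallest current document number.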
(4) For each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword.
If none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out the documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; recall the documents found that hit at least M of the N keywords.
If one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out the documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; recall the documents found that hit at least M of the N keywords.
Here, searching the remaining documents for documents that hit at least M of the N keywords specifically means: for each of the N keywords, find from the remaining documents, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords and assign its number to Dm, filter out the documents numbered less than or equal to Dm, and again search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm, filter out the documents numbered less than Dm, and again search the remaining documents for documents that hit at least M of the N keywords.
The above steps are repeated until all documents have been queried.
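The complete loop of method (4) can be sketched in Python as follows. It is an illustrative reading of the steps above, not the patent's own code: each keyword is assumed to have a sorted posting list of document numbers, a cursor per keyword plays the role of the "current document", and `bisect` is used to skip filtered documents. The function name and sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def recall_at_least_m(postings, m):
    """postings: one ascending list of document numbers per keyword."""
    pos = [0] * len(postings)  # cursor into each posting list
    recalled = []
    while True:
        # Current document of every keyword that still has documents left.
        current = [p[i] for p, i in zip(postings, pos) if i < len(p)]
        if len(current) < m:
            break  # fewer than m keywords remain, so no further hit is possible
        counts = Counter(current)
        hits = sorted(doc for doc, c in counts.items() if c >= m)
        if hits:
            recalled.extend(hits)
            dm = hits[-1]
            # Filter out documents numbered <= Dm.
            pos = [bisect_right(p, dm, lo=i) for p, i in zip(postings, pos)]
        else:
            dm = sorted(current)[m - 1]  # M-th smallest current number
            # Filter out documents numbered < Dm.
            pos = [bisect_left(p, dm, lo=i) for p, i in zip(postings, pos)]
    return recalled


# Hypothetical posting lists for t1..t6; the first entries match the example.
postings = [
    [10, 13, 40],  # t1
    [15, 40],      # t2
    [7, 13, 40],   # t3
    [30, 40],      # t4
    [5, 13],       # t5
    [13, 50],      # t6
]

print(recall_at_least_m(postings, 4))  # [13, 40]
```

On this data the first round sets Dm=13 without a hit, the second round recalls document 13, and two further rounds reach and recall document 40. One caveat of the sketch: skipping everything up to Dm after a hit is only clearly lossless when M is greater than N/2 (as with M = 2N/3); for smaller M a lower-numbered qualifying document could be skipped.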
Because each step of this method can directly filter out the documents numbered less than Dm, documents that do not satisfy the condition are filtered out quickly, which speeds up the query. Compared with the traditional union scheme, speed improves by more than 10 times. The scheme therefore substantially improves the search efficiency and search quality for long query words, and is more widely applicable.
Fig. 3 shows a schematic diagram of search results for a long query word according to an embodiment of the invention. Referring to Fig. 3, the search results obtained with the solution of the invention match the long query word better than the search results shown in Fig. 1, and the search quality is clearly improved.
Fig. 4 shows a structural diagram of a search device for a long query word according to an embodiment of the invention. As shown in Fig. 4, the search device 400 for a long query word comprises:
An acquiring unit 410, adapted to obtain an input long query word. The long query word entered by the user may be obtained from the search box of a search engine interface, or from the search box or address bar of a browser.
An extracting unit 420, adapted to extract the N keywords contained in the long query word, where N is a natural number. Word segmentation may be performed on the long query word to extract the keywords it contains; their number is denoted N.
A recall unit 430, adapted to recall documents that hit at least M of the N keywords, where M is less than or equal to N. M can be set flexibly according to the actual situation.
A generating unit 440, adapted to generate search results from the recalled documents.
With the device shown in Fig. 4, because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved. The efficiency and accuracy of long query word search are therefore improved.
In one embodiment of the invention, the recall unit 430 is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out the documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Suppose the current long query word contains N=6 keywords, t1, t2, t3, t4, t5 and t6, and M is 4. The documents to be queried have already been numbered. For each keyword ti (i=1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Suppose the current document numbers of the 6 keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. By inspection, the documents numbered 5, 7 and 10 cannot satisfy the requirement of hitting 4 keywords: even if the next document number for both t3 and t5 were 10, document 10 could hit only 3 keywords. Document 13, on the other hand, may hit 4 keywords, provided that t1, t3 and t5 also appear in document 13. The 4th document in ascending order of number (document 13) is therefore the next document that can possibly satisfy the condition, so 13 is assigned to Dm, i.e. Dm=13. Documents numbered less than 13 all fail the condition and are filtered out directly. The documents remaining after filtering can be filtered repeatedly in the same way.
In another embodiment of the invention, the recall unit 430 is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; if one of the current documents hits at least M of the N keywords, recall that document and assign its number to Dm; filter out the documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
Again take as the example a current long query word containing N=6 keywords, t1, t2, t3, t4, t5 and t6, with M=4. For each keyword ti (i=1, 2, 3, 4, 5, 6), one document hit by that keyword is found in ascending order of document number and taken as the current document of that keyword. Suppose the current document numbers of the 6 keywords t1, t2, t3, t4, t5 and t6 are 10, 15, 7, 30, 5 and 13 respectively. If document 13 hits t1, t3, t5 and t6 at the same time, document 13 needs to be recalled and Dm=13; documents numbered less than or equal to 13 are filtered out (document 13 has already been recalled, so it is also removed from the documents to be queried). The documents remaining after filtering can be filtered repeatedly in the same way.
In this embodiment of the invention, the recall unit 430 is further adapted to, when two or more of the current documents each hit at least M of the N keywords, recall those two or more documents and assign the largest document number among them to Dm.
In the above embodiments of the invention, documents that hit at least M of the N keywords are searched for in the remaining documents as follows. The recall unit 430 is adapted to: for each of the N keywords, find from the remaining documents, in ascending order of document number, one document hit by that keyword and take it as the current document of that keyword; judge whether any of the current documents hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords and assign its number to Dm, filter out the documents numbered less than or equal to Dm, and again search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm, filter out the documents numbered less than Dm, and again search the remaining documents for documents that hit at least M of the N keywords; and repeat the above steps until all documents to be searched have been searched.
In summary, in the technical solution of the present invention, an input long query word is obtained, the N keywords contained in the long query word are extracted, documents that hit at least M of the N keywords are recalled (M being less than or equal to N), and search results are generated from the recalled documents. Because only documents that hit at least M of the N keywords are recalled, the number of recalled documents is greatly reduced; and because every recalled document hits at least M of the N keywords, the degree of matching with the original long query word is also greatly improved.
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual device or other apparatus. Various general-purpose devices may also be used with the teachings herein. From the description above, the structure required to construct such devices is obvious. Furthermore, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and that the above description of a specific language is given in order to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided herein. It is to be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the present invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the long query word search device according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.
Claims (8)
1. A method for searching with a long query word, wherein the method comprises:
obtaining an input long query word;
extracting N keywords contained in the long query word, N being a natural number;
recalling documents that hit at least M of the N keywords, M being less than or equal to N;
generating search results from the recalled documents;
wherein recalling the documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, in ascending order of document number, one document hit by the keyword as the current document corresponding to that keyword;
if none of the current documents hits at least M of the N keywords, sorting the numbers of the N current documents corresponding to the N keywords in ascending order and assigning the M-th number after sorting to Dm; filtering out documents numbered less than Dm, and searching the remaining documents for documents that hit at least M of the N keywords; and recalling the documents found that hit at least M of the N keywords.
2. The method of claim 1, wherein recalling the documents that hit at least M of the N keywords further comprises:
if the current documents include a document that hits at least M of the N keywords, recalling that document and assigning its number to Dm; filtering out documents numbered less than or equal to Dm, and searching the remaining documents for documents that hit at least M of the N keywords; and recalling the documents found that hit at least M of the N keywords.
3. The method of claim 2, wherein the method further comprises:
if the current documents include two or more documents that hit at least M of the N keywords, recalling the two or more documents and assigning the largest document number among the two or more documents to Dm.
4. The method of any one of claims 1 to 3, wherein searching the remaining documents for documents that hit at least M of the N keywords comprises:
for each of the N keywords, finding, among the remaining documents and in ascending order of document number, one document hit by the keyword as the current document corresponding to that keyword;
judging whether the current documents include a document that hits at least M of the N keywords;
if so, recalling the document that hits at least M of the N keywords and assigning its number to Dm; filtering out documents numbered less than or equal to Dm, and then searching the remaining documents for documents that hit at least M of the N keywords;
if not, sorting the numbers of the N current documents corresponding to the N keywords in ascending order and assigning the M-th number after sorting to Dm; filtering out documents numbered less than Dm, and then searching the remaining documents for documents that hit at least M of the N keywords.
5. A device for searching with a long query word, wherein the device comprises:
an acquisition unit, adapted to obtain an input long query word;
an extraction unit, adapted to extract N keywords contained in the long query word, N being a natural number;
a recall unit, adapted to recall documents that hit at least M of the N keywords, M being less than or equal to N;
a generation unit, adapted to generate search results from the recalled documents;
wherein the recall unit is adapted to: for each of the N keywords, find, in ascending order of document number, one document hit by the keyword as the current document corresponding to that keyword; if none of the current documents hits at least M of the N keywords, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out documents numbered less than Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
6. The device of claim 5, wherein
the recall unit is further adapted to: when the current documents include a document that hits at least M of the N keywords, recall that document and assign its number to Dm; filter out documents numbered less than or equal to Dm, and search the remaining documents for documents that hit at least M of the N keywords; and recall the documents found that hit at least M of the N keywords.
7. The device of claim 6, wherein
the recall unit is further adapted to: when the current documents include two or more documents that hit at least M of the N keywords, recall the two or more documents and assign the largest document number among the two or more documents to Dm.
8. The device of any one of claims 5 to 7, wherein
the recall unit is adapted to: for each of the N keywords, find, among the remaining documents and in ascending order of document number, one document hit by the keyword as the current document corresponding to that keyword; judge whether the current documents include a document that hits at least M of the N keywords; if so, recall the document that hits at least M of the N keywords and assign its number to Dm; filter out documents numbered less than or equal to Dm, and then search the remaining documents for documents that hit at least M of the N keywords; if not, sort the numbers of the N current documents corresponding to the N keywords in ascending order and assign the M-th number after sorting to Dm; filter out documents numbered less than Dm, and then search the remaining documents for documents that hit at least M of the N keywords; and repeat the above steps until all documents to be searched have been examined.
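As a reading aid for claims 1 to 3, the decision made in a single round can be sketched in Python. The function name `one_round` and its return shape are hypothetical, and the sketch assumes that a current document "hits" a keyword exactly when it is that keyword's current document, so hits are counted by comparing current document numbers. The claims do not specify how a hit is determined, so this counting rule is an assumption rather than the claimed method.

```python
from collections import Counter
from typing import List, Tuple


def one_round(current_numbers: List[int], m: int) -> Tuple[List[int], int, bool]:
    """Decide one round from the current document numbers of the N keywords.

    Returns (recalled, dm, inclusive). Documents numbered less than dm are to
    be filtered out next, or less than or equal to dm when inclusive is True.
    Assumes 1 <= m <= len(current_numbers).
    """
    # Assumption: a document is hit once per keyword whose current document it is.
    counts = Counter(current_numbers)
    hits = sorted(doc for doc, n in counts.items() if n >= m)
    if hits:
        # Claims 2 and 3: recall every such document; Dm is the largest number.
        return hits, hits[-1], True
    # Claim 1: no hit, so Dm is the M-th smallest current document number.
    dm = sorted(current_numbers)[m - 1]
    return [], dm, False
```

For example, with current document numbers 3, 4, 5, 7 and M = 2, no number occurs twice, so the round recalls nothing and sets Dm to 4, the second smallest number.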
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510149927.6A CN104715065B (en) | 2015-03-31 | 2015-03-31 | Long query word searching method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510149927.6A CN104715065B (en) | 2015-03-31 | 2015-03-31 | Long query word searching method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104715065A CN104715065A (en) | 2015-06-17 |
CN104715065B true CN104715065B (en) | 2017-04-19 |
Family
ID=53414391
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510149927.6A Expired - Fee Related CN104715065B (en) | 2015-03-31 | 2015-03-31 | Long query word searching method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104715065B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108536740B (en) * | 2018-03-07 | 2020-06-26 | 上海连尚网络科技有限公司 | Method, medium and equipment for determining search result |
CN110929125B (en) * | 2019-11-15 | 2023-07-11 | 腾讯科技(深圳)有限公司 | Search recall method, device, equipment and storage medium thereof |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1808430A (en) * | 2004-11-01 | 2006-07-26 | 西安迪戈科技有限责任公司 | Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation |
US8473486B2 (en) * | 2010-12-08 | 2013-06-25 | Microsoft Corporation | Training parsers to approximately optimize NDCG |
CN102541960A (en) * | 2010-12-31 | 2012-07-04 | 北大方正集团有限公司 | Method and device of fuzzy retrieval |
CN103810220B (en) * | 2012-11-15 | 2018-02-27 | 腾讯科技(深圳)有限公司 | A kind of microblogging searching method and device |
CN103049495A (en) * | 2012-12-07 | 2013-04-17 | 百度在线网络技术(北京)有限公司 | Method, device and equipment for providing searching advice corresponding to inquiring sequence |
CN104361042B (en) * | 2014-10-29 | 2019-02-12 | 中国建设银行股份有限公司 | A kind of information retrieval method and device |
Also Published As
Publication number | Publication date |
---|---|
CN104715065A (en) | 2015-06-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105095440B (en) | A kind of search recommended method and device | |
CN104036009B (en) | A kind of method, image searching method and device for searching for matching picture | |
CN106815263B (en) | The searching method and device of legal provision | |
CN104715064B (en) | It is a kind of to realize the method and server that keyword is marked on webpage | |
CN105653537B (en) | Paging query method and device for database application system | |
CN108227954A (en) | A kind of method, apparatus and electronic equipment that search input associational word is provided | |
CN105095391A (en) | Device and method for identifying organization name by word segmentation program | |
CN103559313B (en) | Searching method and device | |
CN106598827A (en) | Method and device for extracting log data | |
CN105095175A (en) | Method and device for obtaining truncated web title | |
CN104715065B (en) | Long query word searching method and device | |
CN107678968A (en) | Sample extraction method, apparatus, computing device and the storage medium of source code function | |
CN109003170A (en) | Acquisition methods and device for the shop material shown in the page | |
CN102467544A (en) | Information smart searching method and system based on space fuzzy coding | |
CN106649385B (en) | Data reordering method and device based on HBase database | |
CN109101651A (en) | Method for analyzing metadata full link | |
CN106599062A (en) | Data processing method and device in SparkSQL system | |
CN108090200A (en) | A kind of sequence type hides the acquisition methods of grid database data | |
CN103353900B (en) | Method, device and system for accessing and certificating web address through search bar | |
CN106570058A (en) | Searching method and search engine | |
CN104462519A (en) | Search query method and device | |
CN108920484B (en) | Search content processing method and device, storage device and computer device | |
CN104715068B (en) | Method and device for generating document indexes and searching method and device | |
CN105528414B (en) | A kind of crawler method and system for collecting deep network data complete or collected works | |
US20070203895A1 (en) | Recursive search engine using correlative words |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20170310 Address after: 100016 Chaoyang District Road, Jiuxianqiao, No. 10, building No. 3, floor 15, floor 17, 1701-15B, Applicant after: BEIJING QIYUAN TECHNOLOGY CO.,LTD. Address before: 100088 Beijing city Xicheng District xinjiekouwai Street 28, block D room 112 (Desheng Park) Applicant before: BEIJING QIHOO TECHNOLOGY Co.,Ltd. Applicant before: Qizhi software (Beijing) Co.,Ltd. |
|
TA01 | Transfer of patent application right | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20170419 |
|
CF01 | Termination of patent right due to non-payment of annual fee |