Embodiment
To reduce the false-hit rate in page searching, that is, the proportion of returned pages that contain the target keyword but are not the pages actually sought, the invention provides a method and a device for retrieving pages by keyword. The main idea is to determine the paragraph to which the target keyword belongs, and then to filter pages by searching that paragraph a second time for a keyword to be rejected.
Referring to Figure 1, the method of the embodiment comprises the following main steps:
S1: search for the target keyword in a page.
S2: determine the paragraph in which the target keyword is located, according to the position of the target keyword in the page.
S3: search the above paragraph for the keyword to be rejected.
S4: filter out of the search results any page in which the keyword to be rejected is found.
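Steps S1 to S4 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that paragraphs are delimited by newline characters and that keyword matching is a plain substring test, and the function names (`paragraph_bounds`, `should_filter`, `retrieve`) are hypothetical.

```python
def paragraph_bounds(page: str, position: int) -> tuple[int, int]:
    """S2: return (start, end) of the paragraph that contains `position`.

    Assumption: a paragraph is a run of text delimited by newlines.
    """
    start = page.rfind("\n", 0, position) + 1  # 0 when no newline precedes
    end = page.find("\n", position)
    if end == -1:
        end = len(page)
    return start, end


def should_filter(page: str, target: str, rejected: str) -> bool:
    """Return True when the page must be removed from the search results."""
    if not target:
        raise ValueError("the target keyword must not be empty")
    position = page.find(target)                       # S1
    while position != -1:
        start, end = paragraph_bounds(page, position)  # S2
        if rejected in page[start:end]:                # S3
            return True                                # S4
        # Not found in this paragraph: continue in text order.
        position = page.find(target, end)
    return False


def retrieve(pages: list[str], target: str, rejected: str) -> list[str]:
    """Keep pages that contain the target keyword and are not filtered."""
    return [
        page for page in pages
        if target in page and not should_filter(page, target, rejected)
    ]
```

In this sketch, a page is kept when the keyword to be rejected appears only in paragraphs that do not contain the target keyword.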
More specifically, a search involves at least one target keyword and at least one keyword to be rejected, and a correspondence exists between each target keyword and each keyword to be rejected. For example, the target keywords and the keywords to be rejected may be in one-to-one correspondence; as a further example, one target keyword may correspond to at least two keywords to be rejected.
If one target keyword corresponds to at least two keywords to be rejected, the decision logic of step S4 may be either of the following: the page is filtered out of the search results if any one of the keywords to be rejected that correspond to the target keyword is found in said paragraph; or the page is filtered out of the search results only if all of the keywords to be rejected that correspond to the target keyword are found in said paragraph.
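The two alternative decision logics for step S4 can be illustrated with a short sketch. The function name `paragraph_triggers_rejection` and the `mode` parameter are assumptions introduced for illustration only; matching is again assumed to be a plain substring test.

```python
def paragraph_triggers_rejection(
    paragraph: str, rejected_keywords: list[str], mode: str = "any"
) -> bool:
    """Decide whether a paragraph causes its page to be filtered out.

    mode="any": one matching keyword to be rejected is enough.
    mode="all": every keyword to be rejected must be present.
    """
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    if not rejected_keywords:
        # Nothing to reject, so the paragraph cannot trigger filtering.
        return False
    hits = (keyword in paragraph for keyword in rejected_keywords)
    return any(hits) if mode == "any" else all(hits)
```

The "any" logic filters more aggressively, while the "all" logic removes a page only when every corresponding keyword to be rejected appears in the same paragraph as the target keyword.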
The specific implementation is described below, taking as the page to be searched the text recorded in the Background section above, with a one-to-one correspondence between the target keyword and the keyword to be rejected. The target keyword is "keyword" and the keyword to be rejected is "prior art".
S101: search the Background section of the present invention for "keyword" in text order; "keyword" is found in the first paragraph.
S102: the paragraph in which the found "keyword" is located is determined to be the first paragraph.
S103: search the first paragraph for "prior art"; it is not found, so the search continues in text order.
S104: "keyword" is found in the second paragraph of the Background section.
S105: the paragraph in which the found "keyword" is located is determined to be the second paragraph.
S106: search the second paragraph for "prior art"; it is found, so this page is filtered out of the search results.
Afterwards, if other pages remain to be searched, the search continues with those pages.
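The sequence S101 to S106 can be traced with a small sketch. The Background text itself is not reproduced in this section, so the `page` string below is a hypothetical stand-in with the same property (the first paragraph contains only "keyword", the second contains both keywords); the `trace` function is likewise an illustrative assumption, with paragraphs taken to be newline-delimited.

```python
def trace(page: str, target: str, rejected: str) -> list[tuple[int, bool]]:
    """Record (paragraph number, rejected keyword found) in text order.

    Only paragraphs that contain the target keyword are examined, and
    the trace stops as soon as the page is filtered out.
    """
    steps = []
    for number, paragraph in enumerate(page.split("\n"), start=1):
        if target not in paragraph:
            continue
        found = rejected in paragraph
        steps.append((number, found))
        if found:
            break  # page is filtered; no further searching of this page
    return steps


# Hypothetical stand-in for the page to be searched.
page = (
    "Users enter a keyword to search for pages.\n"
    "In the prior art, every page containing the keyword is returned.\n"
    "A third paragraph that also mentions the keyword."
)
steps = trace(page, "keyword", "prior art")
```

For this stand-in page, `steps` is `[(1, False), (2, True)]`: the first paragraph is passed over, the second paragraph causes the page to be filtered out, and the third paragraph is never examined.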
Referring to Figure 2, the device of the embodiment comprises a first retrieval unit, a positioning unit, a second retrieval unit and a filter element.
The first retrieval unit is configured to search for the target keyword in a page.
The positioning unit is configured to determine the paragraph in which the target keyword is located, according to the position in the page of the target keyword found by the first retrieval unit.
The second retrieval unit is configured to search said paragraph for the keyword to be rejected.
The filter element is configured to filter out of the search results any page in which the keyword to be rejected is found.
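One possible way to model the four units in software is sketched below. The class and method names are hypothetical, the units are modeled as plain Python classes purely for illustration, and paragraphs are assumed to be delimited by newline characters.

```python
class FirstRetrievalUnit:
    """Searches a page for the target keyword."""

    def find(self, page: str, target: str, start: int = 0) -> int:
        return page.find(target, start)  # -1 when no further occurrence


class PositioningUnit:
    """Determines the paragraph that contains a given position."""

    def locate(self, page: str, position: int) -> tuple[int, int]:
        start = page.rfind("\n", 0, position) + 1
        end = page.find("\n", position)
        return start, (len(page) if end == -1 else end)


class SecondRetrievalUnit:
    """Searches a paragraph for the keyword to be rejected."""

    def found(self, paragraph: str, rejected: str) -> bool:
        return rejected in paragraph


class FilterElement:
    """Holds the search results and removes rejected pages."""

    def __init__(self, results: list[str]) -> None:
        self.results = list(results)

    def filter_out(self, page: str) -> None:
        self.results = [p for p in self.results if p != page]


def run_device(pages: list[str], target: str, rejected: str) -> list[str]:
    """Wire the four units together and return the filtered results."""
    if not target:
        raise ValueError("the target keyword must not be empty")
    first, positioning = FirstRetrievalUnit(), PositioningUnit()
    second = SecondRetrievalUnit()
    element = FilterElement([p for p in pages if target in p])
    for page in list(element.results):
        position = first.find(page, target)
        while position != -1:
            start, end = positioning.locate(page, position)
            if second.found(page[start:end], rejected):
                element.filter_out(page)
                break  # this page is not searched any further
            position = first.find(page, target, end)
    return element.results
```

Keeping each unit behind its own small interface mirrors the division of responsibilities in Figure 2; a real device could equally implement the units in hardware or as separate services.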
More specifically, the device may further comprise a database unit, configured to store the correspondence between each target keyword and each keyword to be rejected; said correspondence is retrieved when the first retrieval unit needs to search for at least one target keyword and the second retrieval unit needs to search for at least one keyword to be rejected. For example, the correspondence stored by the database unit may be a one-to-one correspondence between target keywords and keywords to be rejected; as a further example, the correspondence stored by the database unit may be between one target keyword and at least two keywords to be rejected.
If the correspondence stored by the database unit is between one target keyword and at least two keywords to be rejected, the filter logic may be either of the following: the filter element filters the page out of the search results if the second retrieval unit finds, in said paragraph, any one of the keywords to be rejected that correspond to the target keyword; or the filter element filters the page out of the search results only if the second retrieval unit finds, in said paragraph, all of the keywords to be rejected that correspond to the target keyword.
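The role of the database unit can be sketched with an in-memory mapping. This is an assumption made for illustration: the text does not specify how the correspondence is stored, and the names `DatabaseUnit`, `store`, `rejected_for` and `page_rejected` are hypothetical. Paragraphs are again taken to be newline-delimited.

```python
class DatabaseUnit:
    """Stores, per target keyword, the keywords to be rejected."""

    def __init__(self) -> None:
        self._correspondence: dict[str, list[str]] = {}

    def store(self, target: str, rejected_keywords: list[str]) -> None:
        self._correspondence[target] = list(rejected_keywords)

    def rejected_for(self, target: str) -> list[str]:
        return list(self._correspondence.get(target, []))


def page_rejected(
    page: str, target: str, database: DatabaseUnit, mode: str = "any"
) -> bool:
    """Apply the "any" or "all" filter logic paragraph by paragraph."""
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    rejected = database.rejected_for(target)
    if not rejected:
        return False  # no correspondence stored for this target keyword
    combine = any if mode == "any" else all
    for paragraph in page.split("\n"):
        if target in paragraph and combine(k in paragraph for k in rejected):
            return True
    return False


database = DatabaseUnit()
database.store("keyword", ["rejecting", "prior art"])
```

With this correspondence stored, a page whose first paragraph contains "keyword" and "rejecting" is filtered under the "any" logic, but under the "all" logic it is filtered only if some single paragraph contains "keyword", "rejecting" and "prior art" together.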
The specific implementation is described below, again taking as the page to be searched the text recorded in the Background section above. In this example, the correspondence stored by the database unit is between one target keyword and two keywords to be rejected, and the filter logic is that the filter element filters the page out of the search results if the second retrieval unit finds, in said paragraph, any one of the keywords to be rejected that correspond to the target keyword. The target keyword is "keyword", and the keywords to be rejected are "rejecting" and "prior art".
First, the first retrieval unit searches the Background section of the present invention for "keyword" in text order, and finds "keyword" in the first paragraph.
Next, the positioning unit determines that the paragraph in which the "keyword" found by the first retrieval unit is located is the first paragraph.
Then, the second retrieval unit searches the first paragraph for "rejecting" and finds it, so the filter element filters this page out of the search results. The search of this page is not continued.
Afterwards, if other pages remain to be searched, the search continues with those pages.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope; for example, the target keywords and the keywords to be rejected may also be in a many-to-many relationship. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to encompass them.