[ summary of the invention ]
The technical problem to be solved by the invention is that existing retrieval platforms return only simple matching results for the keywords a user enters and provide no big data analysis beyond that. Obtaining the final target retrieval result therefore depends heavily on the professional experience of the searcher and on manual analysis of the intermediate retrieval results, which reduces both retrieval efficiency and the accuracy with which the target result is found.
The invention adopts the following technical scheme:
In a first aspect, the invention provides an intelligent retrieval method in which a server acquires a first group of keywords for retrieval, obtains a retrieval result by querying a database, and returns the retrieval result to a first request end; the retrieval result carries summary information of one or more retrieval result objects, and the method comprises the following steps:
the server receives a request message for browsing one or more retrieval result objects; and/or receiving a request message for downloading one or more retrieval result objects;
the server records the operation information corresponding to the request message; the operation information comprises one or more items of identification information of a retrieval result object, a request message type and retrieval result related information;
when receiving the second group of keywords of the first request end, the server matches the first group of keywords and/or the second group of keywords with the content of the retrieval result object carried by the corresponding operation information to obtain one or more target result objects with the matching degree larger than a first preset threshold value;
the server establishes a mapping relation between the target result object and the corresponding key phrase or, where the mapping relation already exists, increments the effective access count of the target result object for that key phrase.
Preferably, the establishing a mapping relationship between the target result object and the corresponding key phrase includes:
the mapping relation comprises a key phrase, identification information of the target result object and the total number of times that the corresponding target object is effectively accessed.
Preferably, the method further comprises:
the server counts the number of retrieval result objects contained in each round of retrieval results, and identifies the accessed retrieval result objects from the total retrieval result objects; wherein the accessing comprises the browsing and/or the downloading;
returning a retrieval report to the first request end; the retrieval report comprises identification information of the retrieval result objects that the user has accessed, identification information of the remaining retrieval result objects that have not been accessed, and, for each retrieval result object, the historical effective access count that matches the keyword group sent by the request end.
Preferably, when the server receives a retrieval request from a second request end and the retrieval request carries an Xth group of keywords, the method further includes:
the server queries a database according to the Xth group of keywords to obtain one or more retrieval results;
determining, in the retrieval result, one or more retrieval result objects for which a keyword-group mapping relation exists, matching the Xth group of keywords against each keyword group mapped to those retrieval result objects, and obtaining one or more recommended retrieval result objects whose matching result is greater than a second preset threshold;
and marking the one or more recommended retrieval result objects in the retrieval result; or displaying the one or more recommended retrieval result objects at a higher rank.
Preferably, the identifying the one or more recommended search result objects in the search result specifically includes:
using an icon and/or highlighting on the corresponding recommended retrieval result object to indicate that it is recommended.
Preferably, the matching of the content of the search result object carried by the corresponding operation information with the first group of keywords and/or the second group of keywords to obtain one or more target result objects with matching degrees greater than a first preset threshold specifically includes:
determining one or more entries having differences between the second set of keywords and the first set of keywords;
matching the differing entries against the retrieval result objects recorded in the operation information;
obtaining one or more target result objects with the matching degree larger than a first preset threshold value; the target result object is a retrieval result object which is effectively accessed by a user through a browsing mode and/or a downloading mode.
Preferably, the method further comprises:
the server acquires a patent file to be retrieved of a request end;
the server queries, from the mapping relation and according to the identification information of the patent document to be retrieved, a key phrase associated with that identification information; and/or,
feeding back, as reference information, other patent documents to be retrieved that historically had the present patent document to be retrieved as a target result object.
Preferably, the method further comprises:
determining the entries shared by the Nth group of keywords and the Mth group of keywords from the same request end, and if the shared entries indicate that the similarity between the Nth group of keywords and the Mth group of keywords is greater than a third preset threshold, establishing an associated retrieval relationship between a target result object corresponding to the Nth group of keywords and a target result object corresponding to the Mth group of keywords;
the associated retrieval relationship is used as follows: when another user imports the same patent document to be retrieved, the server feeds back, according to the associated retrieval relationship, the retrieval results comprising the target result object corresponding to the Nth group of keywords and the target result object corresponding to the Mth group of keywords.
Preferably, the first request end is a smart phone, a tablet computer, a personal computer or an all-in-one machine.
In a second aspect, the present invention further provides an intelligent retrieval apparatus, configured to implement the intelligent retrieval method in the first aspect, where the apparatus includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions programmed to perform the intelligent retrieval method of the first aspect.
In a third aspect, the present invention further provides a non-volatile computer storage medium storing computer-executable instructions for execution by one or more processors for performing the intelligent retrieval method of the first aspect.
By analyzing the retrieval result objects that have been effectively accessed, establishing mapping relations with the key phrases involved in that analysis, and accumulating the effective access counts for the key phrases so mapped, the invention marks retrieval result objects of high retrieval value and thereby gives subsequent users a new dimension of reference in their own retrieval.
In addition, the method analyzes the first keyword group and the second keyword group used in a continuous retrieval process, together with the retrieval result objects browsed or downloaded between the two, to determine which of those retrieval result objects contain the entries by which the two keyword groups differ. This serves as evidence that the corresponding retrieval result objects were effectively accessed, so the mapping relations established are highly reliable.
[ detailed description of the embodiments ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
The request end in embodiments of the present invention may exist in various forms, including but not limited to:
(1) Mobile communication devices: such devices are characterized by mobile communication capabilities and are primarily intended to provide voice and data communication. Such terminals include smart phones (e.g., the iPhone), multimedia phones, feature phones, and low-end phones, among others.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.
(3) A server: the device for providing the computing service comprises a processor, a hard disk, a memory, a system bus and the like, and the server is similar to a general computer architecture, but has higher requirements on processing capacity, stability, reliability, safety, expandability, manageability and the like because of the need of providing high-reliability service.
(4) Other electronic devices with internet connectivity.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
Embodiment 1 of the invention provides an intelligent retrieval method in which a server acquires a first group of keywords for retrieval, obtains a retrieval result by querying a database, and returns the retrieval result to a first request end; the retrieval result carries summary information of one or more retrieval result objects. Taking a patent website as an example, a retrieval result object may be a patent document, and the summary information may be the abstract, the abstract drawing, the application information, and so on. As shown in fig. 1, the method of embodiment 1 of the present invention includes:
in step 201, a server receives a request message for browsing one or more retrieval result objects; and/or receiving a request message for downloading one or more retrieval result objects.
The request message may be triggered by the user at the first request end after browsing the summary information of a retrieval result object. The "browsing" and "downloading" listed in step 201 are two operations that are conventional on patent retrieval websites and similar sites.
"Browsing" may take place at different levels of content display. For example, when the summary information of many retrieval result objects is shown on a limited display screen, a first level of browsing may be clicking the summary information area of a specific retrieval result object (an area that usually shows only part of the summary information) so that the complete summary information is presented; a second level of browsing may then be opening the complete retrieval result object through a designated link shown on the complete summary information interface.
"Downloading" may likewise use different formats or download modes: the retrieval result object may be downloaded in Word format or in PDF format, and retrieval result objects may be downloaded one by one or in batches.
The operations on retrieval result objects described in step 201 are therefore only the basic ways of obtaining content information; any other prior-art means of obtaining the content information contained in a retrieval result object also falls within the protection scope of the embodiment of the present invention.
In step 202, the server records operation information corresponding to the request message. Wherein the operation information comprises one or more items of identification information of the retrieval result object, the type of the request message and the related information of the retrieval result.
The server records the operation information corresponding to the request message in preparation for the analysis in step 203. The underlying principle is that only retrieval result objects on which the operations of step 201 were performed, that is, objects that were browsed and/or downloaded, can be candidates for the target retrieval result objects effectively accessed by the user at the request end.
The analysis in step 203 is required because not every browsed and/or downloaded retrieval result object is a valid one: some may have been browsed or downloaded by the user without yielding any useful information, and step 203 filters these out.
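As a non-limiting illustration, the operation information recorded in step 202 might be modelled as follows. This is a minimal sketch: the names `RequestType`, `OperationRecord` and `OperationLog` are hypothetical, and storing the keyword group as the "retrieval result related information" is an assumption made for the sketch, not something the embodiment specifies.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class RequestType(Enum):
    BROWSE = "browse"
    DOWNLOAD = "download"


@dataclass(frozen=True)
class OperationRecord:
    object_id: str              # identification information of the retrieval result object
    request_type: RequestType   # type of the request message
    keywords: Tuple[str, ...]   # assumed form of the retrieval-result-related information


@dataclass
class OperationLog:
    # Records are kept per request end, in arrival order.
    records: Dict[str, List[OperationRecord]] = field(default_factory=dict)

    def record(self, request_end, object_id, request_type, keywords):
        rec = OperationRecord(object_id, request_type, tuple(keywords))
        self.records.setdefault(request_end, []).append(rec)
        return rec

    def accessed_ids(self, request_end):
        """Return the distinct object ids browsed or downloaded by a request end."""
        seen = []
        for rec in self.records.get(request_end, []):
            if rec.object_id not in seen:
                seen.append(rec.object_id)
        return seen
```

The objects returned by `accessed_ids` are only candidates; whether they were effectively accessed is decided by the later matching in step 203.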
In step 203, when receiving the second group of keywords of the first request end, the server matches the first group of keywords and/or the second group of keywords with the content of the retrieval result object carried by the corresponding operation information, and obtains one or more target result objects with matching degrees greater than a first preset threshold.
In one implementation, when the server receives the second group of keywords from the first request end, it fetches the content of each retrieval result object according to the identification information recorded in the operation information and then matches that content against the first group of keywords and/or the second group of keywords. Alternatively, a cache time may be set for retrieval result objects held in server memory (the cache time may be determined from the average interval between two rounds of keyword search by typical users), so that when the second group of keywords arrives, the server can read the retrieval result object content directly from memory that has not yet been released and match it against the first group of keywords and/or the second group of keywords.
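The caching alternative described above might be sketched as a small time-limited cache. The class name `ObjectCache`, the `loader` callback standing in for a database read, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class ObjectCache:
    """Keeps retrieval result object content in memory for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds   # e.g. the average interval between two search rounds
        self._clock = clock       # injectable so that expiry can be exercised in tests
        self._items = {}          # object_id -> (expiry time, content)

    def put(self, object_id, content):
        self._items[object_id] = (self._clock() + self._ttl, content)

    def get(self, object_id, loader):
        """Return cached content, or call loader(object_id) when missing or expired."""
        entry = self._items.get(object_id)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        content = loader(object_id)
        self.put(object_id, content)
        return content
```

With this shape, a second keyword group that arrives within the cache time is matched against content already in memory, and a later one falls back to the database read.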
The matching here can be achieved at least in the following way:
by matching the first group of keywords, the second group of keywords and the retrieval result objects (here, the retrieval result objects browsed and/or downloaded in step 201 and recorded in step 202), the server identifies which retrieval result objects contain the entries by which the second group of keywords differs from the first group of keywords.
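The differing-entry matching can be sketched as below. The function names are hypothetical, and case-insensitive substring containment is a simplifying assumption; the embodiment does not prescribe a particular text-matching technique.

```python
def difference_entries(first_group, second_group):
    """Entries of the second keyword group that are absent from the first."""
    seen = {k.lower() for k in first_group}
    result = []
    for keyword in second_group:
        if keyword.lower() not in seen:
            seen.add(keyword.lower())
            result.append(keyword)
    return result


def objects_containing(entries, accessed_objects):
    """Map each accessed object id to the differing entries found in its text.

    accessed_objects is a dict of object id -> text content for the objects
    that were browsed and/or downloaded between the two keyword groups.
    """
    hits = {}
    for object_id, text in accessed_objects.items():
        lowered = text.lower()
        found = [e for e in entries if e.lower() in lowered]
        if found:
            hits[object_id] = found
    return hits
```

For example, if the second keyword group adds "mapping relation" to the first, only the accessed objects whose text contains that entry are reported.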
In step 204, a mapping relationship between the target result object and the corresponding key phrase is established or, where the mapping relationship already exists, the effective access count of the target result object for that key phrase is incremented.
For the differing-entry matching described above, establishing the mapping relationship between the target result object and the corresponding keyword group may specifically mean establishing a mapping relationship between the target result object and the differing entries. The keyword group described in the embodiment of the present invention may therefore be a combination of several keywords or, in some cases, a single keyword.
By analyzing the retrieval result objects that have been effectively accessed, establishing mapping relationships with the key phrases involved in that analysis, and accumulating the effective access counts for the key phrases so mapped, the embodiment of the invention marks retrieval result objects of high retrieval value and thereby gives subsequent users a new dimension of reference in their own retrieval.
In addition, the embodiment of the present invention analyzes the first keyword group and the second keyword group used in a continuous retrieval process, together with the retrieval result objects browsed or downloaded between the two, to determine which of those retrieval result objects contain the entries by which the two keyword groups differ. This serves as evidence that the corresponding retrieval result objects were effectively accessed, so the mapping relationships established are highly reliable.
In combination with the embodiment of the present invention, a simple implementation is provided for establishing the mapping relationship between the target result object and the corresponding keyword group: the mapping relationship comprises the keyword group, the identification information of the target result object, and the total number of times the target result object has been effectively accessed. The effective access count proposed here is somewhat similar to the citation count of an academic paper. In patent retrieval and similar fields, however, there is no explicit textual marker comparable to a citation (in a paper, the list of cited documents at the end), so the prior art has no way to track whether a result found during patent retrieval was actually adopted by the user, or how many times. The mapping mechanism built on the analysis result of step 203 therefore provides a dimension for verifying whether a browsed and/or downloaded retrieval result object was what the user needed. Specifically, the similarity between the differing entries and the recorded retrieval result objects is matched to determine whether a retrieval result object contains the keywords introduced in the new round of retrieval (for example, those in the second keyword group that are new relative to the first keyword group).
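A mapping relationship of this form (keyword group, object identification, total effective access count) might be held as in the sketch below. The class `MappingStore` and its method names are hypothetical, and normalising a keyword group to a sorted lower-case tuple is an assumption made so that the same group always produces the same key.

```python
from collections import defaultdict


class MappingStore:
    """Stores (keyword group, target object id) -> total effective access count."""

    def __init__(self):
        self._counts = defaultdict(int)

    @staticmethod
    def _key(keyword_group):
        # Order and letter case of the keywords are ignored (assumption).
        return tuple(sorted(k.lower() for k in keyword_group))

    def record_effective_access(self, keyword_group, object_id):
        """Create the mapping on first use, otherwise accumulate the count."""
        key = (self._key(keyword_group), object_id)
        self._counts[key] += 1
        return self._counts[key]

    def count(self, keyword_group, object_id):
        return self._counts.get((self._key(keyword_group), object_id), 0)

    def groups_for(self, object_id):
        """All keyword groups mapped to a given target result object."""
        return sorted(g for (g, oid) in self._counts if oid == object_id)
```

A single call thus covers both branches of step 204: the first call establishes the mapping with a count of 1, and later calls accumulate the count.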
In addition to the mapping-relationship function provided by steps 201 to 204 of the main embodiment, an extended implementation of the intelligent retrieval method is also provided. As shown in fig. 2, the method further includes:
in step 205, the server counts the number of search result objects included in each round of search results, and identifies the accessed search result objects from the total search result objects.
Wherein the access includes browsing and/or the downloading as described in embodiment 1 of the present invention.
In step 206, a retrieval report is returned to the first request end. The retrieval report includes identification information of the retrieval result objects that the user (i.e., the user at the first request end) has accessed, identification information of the retrieval result objects that have not been accessed, and, for each retrieval result object, the historical effective access count that matches the keyword group sent by the request end.
Taking the differing entries described in embodiment 1 as the example, the deeper meaning of the historical effective access count matched to the keyword group sent by the request end is that the corresponding retrieval result objects have historically acted as "inflection points": they gave earlier users an important reference for adjusting their keyword combinations. The embodiment of the present invention also regards such retrieval result objects as content likely to resonate with different searchers or to offer greater reference value.
Compared with a retrieval report in the prior art, the retrieval report provided by the embodiment of the invention not only marks which retrieval result objects have been accessed and which have not, but also feeds back the keyword group similarity analysis and the corresponding historical effective access counts, so that the user can quickly locate the retrieval result objects that are most meaningful to him or her.
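The content of such a retrieval report might be assembled as in the following sketch. The function name, the dictionary field names, and the assumption that the historical effective access counts for the current keyword group have already been looked up are all illustrative.

```python
def build_retrieval_report(result_ids, accessed_ids, effective_counts):
    """Assemble the report of steps 205 and 206 for one round of retrieval.

    result_ids       -- ids of all retrieval result objects in this round
    accessed_ids     -- ids the user browsed and/or downloaded
    effective_counts -- id -> historical effective access count matching the
                        keyword group sent by the request end (assumed given)
    """
    accessed_set = set(accessed_ids)
    return {
        "total_results": len(result_ids),
        "accessed": [oid for oid in result_ids if oid in accessed_set],
        "not_accessed": [oid for oid in result_ids if oid not in accessed_set],
        "historical_effective_access": {
            oid: effective_counts.get(oid, 0) for oid in result_ids
        },
    }
```

Objects that have never been effectively accessed simply report a count of 0, which keeps the report complete for every object in the round.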
Embodiment 1 of the invention provides an intelligent retrieval method that offers more useful intelligent analysis for other users' retrieval by introducing the effective-access dimension of a retrieval result object and a statistical dimension, namely the total number of effective accesses. The foregoing, however, covers only how the intelligent retrieval proposed by the invention marks retrieval result objects. A further extension is available: when the server receives a retrieval request from a second request end and the retrieval request carries an Xth group of keywords, then, as shown in fig. 3, the method further includes:
in step 301, the server queries the database according to the Xth group of keywords to obtain one or more search results.
Here, the description of the X-th group of keywords is only for distinguishing the first group of keywords and the second group of keywords from the first request side, and is not particularly limited.
In step 302, the server determines which retrieval result objects in the retrieval result have a keyword-group mapping relationship, matches the Xth group of keywords against the keyword groups of those mapping relationships, and obtains one or more recommended retrieval result objects whose matching result is greater than a second preset threshold. The second preset threshold may be set according to experience and practical results and is not described further here.
The retrieval result objects that have a keyword-group mapping relationship must be determined first because the database as a whole may contain retrieval result objects for which no mapping relationship exists or has yet been created.
In step 303, the one or more recommended retrieval result objects are marked in the retrieval result; or the one or more recommended retrieval result objects are displayed at a higher rank.
To mark the recommended retrieval result objects in the retrieval result, an icon and/or highlighting may be applied to each of them to indicate that it is recommended.
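Steps 301 to 303 might be sketched as follows. The embodiment does not fix how the Xth group of keywords is compared with a mapped keyword group, so Jaccard similarity over keyword sets is used here purely as an assumption, as are the function names and the default threshold of 0.5.

```python
def jaccard(group_a, group_b):
    """Share of keywords common to both groups (assumed similarity measure)."""
    a = {k.lower() for k in group_a}
    b = {k.lower() for k in group_b}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_with_recommendations(result_ids, mappings, x_group, second_threshold=0.5):
    """Return (object id, recommended flag) pairs, recommended objects first.

    mappings maps an object id to the keyword groups it is mapped to; objects
    without a mapping relationship are never recommended.
    """
    flagged = []
    for object_id in result_ids:
        groups = mappings.get(object_id, [])
        best = max((jaccard(x_group, g) for g in groups), default=0.0)
        flagged.append((object_id, best > second_threshold))
    # Stable sort: recommended objects move up, original order is otherwise kept.
    return sorted(flagged, key=lambda item: not item[1])
```

The flag can drive an icon or highlighting, and the sort order corresponds to the higher-rank display alternative; an implementation may use either or both.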
In combination with the embodiment of the present invention, a specific implementation is further provided for matching the first group of keywords and/or the second group of keywords against the content of the retrieval result objects carried in the corresponding operation information to obtain one or more target result objects whose matching degree is greater than a first preset threshold. As shown in fig. 4, the implementation includes:
in step 401, one or more entries by which the second group of keywords differs from the first group of keywords are determined. In the embodiment of the present invention, an entry is equivalent to a single keyword.
In a specific implementation, step 402 may be performed with a single differing entry (i.e., one keyword), which keeps the matching operation simple and easy to calibrate.
In step 402, the differing entries (specifically, one or more keywords) are matched against the retrieval result objects recorded in the operation information.
Matching with a single differing entry does not always fully capture the inspiration that a retrieval result object gave the user. A preferred mode of step 402 therefore matches hierarchically: combinations of differing entries are tried first, and the number of entries in the combination is reduced step by step until the last level, where a single differing entry is the matching object. The more entries the combination at the successful level contains, the more the matched retrieval result object contributed as a source of keyword hints, and the more prominently it deserves to be presented in step 303.
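The hierarchical mode might look like the sketch below, which tries the largest combinations of differing entries first and stops at the first level that matches. The function name is hypothetical and substring containment again stands in for whatever text matching a real system would use.

```python
from itertools import combinations


def match_level(diff_entries, text):
    """Return (level, combination) for the largest matching combination.

    level is the number of differing entries in the combination that matched;
    (0, ()) means that not even a single entry was found in the text.
    """
    lowered = text.lower()
    for size in range(len(diff_entries), 0, -1):
        for combo in combinations(diff_entries, size):
            if all(entry.lower() in lowered for entry in combo):
                return size, combo
    return 0, ()
```

The returned level could then be used in step 303 to decide how prominently a recommended object is shown. The number of combinations grows quickly with the number of differing entries, so this direct enumeration is only reasonable for the handful of entries that usually separate two keyword groups.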
In step 403, one or more target result objects with matching degrees greater than a first preset threshold are obtained; the target result object is a retrieval result object which is effectively accessed by a user through a browsing mode and/or a downloading mode.
In the embodiment of the present invention, the first preset threshold is introduced because synonyms are taken into account. For an exact match, the matching degree may be expressed as 100%; for synonyms, such as an English abbreviation and a Chinese full name with the same meaning, the similarity is correspondingly lower, for example between 80% and 90%. For a keyword group containing multiple entries the matching result is more complex, so the first preset threshold is preferably about 80%.
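A matching degree that accounts for synonyms could be computed as in this sketch. The synonym table, the score of 0.85 for a synonym hit, and averaging the per-entry scores over a keyword group are all assumptions made for illustration; the embodiment only states that an exact match counts as 100%, that synonyms score lower, and that a threshold of about 80% is preferred.

```python
# Hypothetical synonym table: each pair of equivalent terms with a similarity.
SYNONYMS = {frozenset({"uav", "unmanned aerial vehicle"}): 0.85}


def entry_score(entry, text):
    """1.0 for an exact occurrence, the synonym similarity for a synonym, else 0.0."""
    lowered = text.lower()
    term = entry.lower()
    if term in lowered:
        return 1.0
    best = 0.0
    for pair, similarity in SYNONYMS.items():
        if term in pair:
            other = next(iter(pair - {term}))
            if other in lowered:
                best = max(best, similarity)
    return best


def matching_degree(entries, text):
    """Average entry score over the keyword group (assumed aggregation)."""
    if not entries:
        return 0.0
    return sum(entry_score(e, text) for e in entries) / len(entries)


def target_objects(entries, accessed_objects, first_threshold=0.8):
    """Ids of accessed objects whose matching degree exceeds the threshold."""
    return [
        object_id
        for object_id, text in accessed_objects.items()
        if matching_degree(entries, text) > first_threshold
    ]
```

Under these assumptions a single synonym hit (0.85) still passes a threshold of 0.8, whereas a two-entry group in which only one entry matches through a synonym does not.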
Example 2:
Embodiment 1 above describes an intelligent retrieval method that records effective accesses, but the intelligent retrieval method provided by the present invention is not limited to the content of embodiment 1. Building on the effective-access dimension added in embodiment 1, the existing retrieval process can be improved further: the retrieval processes of different users are associated through the mapping relationship of embodiment 1, which improves retrieval efficiency for every user of the retrieval platform. As shown in fig. 5, the method specifically includes:
in step 501, the server obtains a patent document to be retrieved by the request end.
The request end here is generally similar to the first request end in embodiment 1 of the present invention, and the patent document to be retrieved is transmitted through an interface between the request end and the server. For example, the patent document is imported through a unified login platform that the server provides for each request end.
In step 502, the server queries, according to the identification information of the patent document to be retrieved, a keyword group associated with the identification information of the patent document to be retrieved from the mapping relationship.
The identification information may be an application number, a patent name, and the like of the patent document, and can uniquely identify the content of the patent document.
In step 503, the queried keyword group is fed back to the request end for reference by the user of the request end.
In addition to directly feeding back the queried keyword group to the user at the request end through step 503, the embodiment of the present invention further provides another way of feeding back reference information, which is specifically described in step 504 below:
in step 504, other patent documents to be retrieved that historically had the present patent document to be retrieved as a target result object are fed back as reference information.
Step 504 and step 503 may each, as alternatives, form a complete scheme with step 501 and step 502 (for example, as shown in fig. 6), or may be implemented in combination, for example, as shown in fig. 7, where step 501, step 502 and step 504 form a complete scheme. Step 503 and step 504 have no strict order, and their timing may be adjusted according to design requirements.
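The two kinds of feedback in embodiment 2 might be sketched as follows. The data layouts are assumptions: `mappings` holds, per document id, the keyword groups mapped to it with their effective access counts, and `history` holds, per previously submitted patent document to be retrieved, the target result objects found for it.

```python
def keyword_groups_for(doc_id, mappings):
    """Steps 502 and 503: keyword groups mapped to the document, most used first."""
    groups = mappings.get(doc_id, {})
    return sorted(groups, key=lambda group: (-groups[group], group))


def related_query_documents(doc_id, history):
    """Step 504: earlier documents to be retrieved that had doc_id as a target."""
    return [
        query_id
        for query_id, targets in history.items()
        if query_id != doc_id and doc_id in targets
    ]
```

Sorting by descending effective access count is a design choice of the sketch; it puts the keyword groups that most often led users to the document at the front of the feedback.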
In combination with the embodiment of the present invention, an optional extended scheme is also provided. It relies on two things: the mapping relationship and the effective-access reference dimension proposed in embodiment 1, and the approach proposed in embodiment 2 of performing retrieval by sending a patent document to be retrieved to the server, either instead of or in addition to a key phrase. Where both are available, the extended scheme can be implemented; as shown in fig. 8, it specifically includes:
in step 601, the entries shared by the Nth group of keywords and the Mth group of keywords from the same request end are determined; if the shared entries indicate that the similarity between the Nth group of keywords and the Mth group of keywords is greater than a third preset threshold, an associated retrieval relationship is established between the target result object corresponding to the Nth group of keywords and the target result object corresponding to the Mth group of keywords. The third preset threshold may be set according to practice and experience and is not described further here.
In step 602, when another user imports the same patent document to be retrieved, the server feeds back, according to the associated retrieval relationship, the retrieval results comprising the target result object corresponding to the Nth group of keywords and the target result object corresponding to the Mth group of keywords.
For display, a collapsible form may be used, so that the order of the keyword groups is reflected in how the corresponding target result objects are expanded or collapsed. For example, the target result object corresponding to the first key phrase in the order is shown directly, while the target result objects corresponding to the later key phrases (each key phrase represents one adjustment of the same user's retrieval strategy before the final result was found) are collapsed below it and shown only after the user expands them through an interaction such as a click.
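Step 601 might be sketched as below. The similarity derived from the shared entries is taken to be the Jaccard ratio of the two keyword sets, which is an assumption, as are the function names, the default threshold of 0.5, and the representation of an association as an unordered pair of object ids.

```python
from itertools import combinations


def group_similarity(group_n, group_m):
    """Similarity derived from the entries shared by two keyword groups."""
    n = {k.lower() for k in group_n}
    m = {k.lower() for k in group_m}
    if not n or not m:
        return 0.0
    return len(n & m) / len(n | m)


def build_associations(rounds, third_threshold=0.5):
    """Link target objects of sufficiently similar keyword groups.

    rounds is a list of (keyword group, target object ids) pairs, all from
    the same request end.
    """
    links = set()
    for (group_n, targets_n), (group_m, targets_m) in combinations(rounds, 2):
        if group_similarity(group_n, group_m) > third_threshold:
            for a in targets_n:
                for b in targets_m:
                    if a != b:
                        links.add(frozenset((a, b)))
    return links
```

When another user later imports the same patent document to be retrieved, the stored links indicate which target result objects should be fed back together in step 602.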
Example 3:
fig. 9 is a schematic structural diagram of an intelligent retrieval device according to an embodiment of the present invention. The intelligent retrieval device of the present embodiment includes one or more processors 21 and a memory 22. In fig. 9, one processor 21 is taken as an example.
The processor 21 and the memory 22 may be connected by a bus or other means, and fig. 9 illustrates the connection by a bus as an example.
The memory 22, as a non-volatile computer-readable storage medium for an intelligent retrieval method and apparatus, can be used for storing a non-volatile software program and a non-volatile computer-executable program, such as the intelligent retrieval method in embodiment 1. The processor 21 executes the intelligent retrieval method by executing non-volatile software programs and instructions stored in the memory 22.
The memory 22 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the intelligent retrieval method of embodiment 1 described above, for example, perform the steps shown in fig. 1 to 8 described above.
It should be noted that, because the contents of information interaction, execution process, and the like between the modules and units in the device are based on the same concept as the processing method embodiment of the present invention, specific contents may refer to the description in the method embodiment of the present invention, and are not described herein again.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.