CN110309416B - Client, server, retrieval method and system thereof - Google Patents

Client, server, retrieval method and system thereof

Info

Publication number
CN110309416B
Authority
CN
China
Prior art keywords
document
retrieval result
result data
data
document retrieval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810323375.XA
Other languages
Chinese (zh)
Other versions
CN110309416A (en)
Inventor
裘钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suoyi Interactive Beijing Information Technology Co ltd
Original Assignee
Suoyi Interactive Beijing Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suoyi Interactive Beijing Information Technology Co ltd
Publication of CN110309416A
Application granted
Publication of CN110309416B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A client, comprising: a first receiving unit and a second receiving unit configured to receive a first document retrieval condition and a second document retrieval condition, respectively, the conditions corresponding to a first document retrieval result data set and a second document retrieval result data set, wherein at least one piece of document retrieval result data in the first document retrieval result data set does not belong to the second document retrieval result data set, and at least one piece of document retrieval result data in the second document retrieval result data set does not belong to the first document retrieval result data set; and an output unit configured to output a document retrieval result data combination including at least a first piece of document retrieval result data, a second piece of document retrieval result data, and a third piece of data, wherein at least one of a first document corresponding to the first piece of document retrieval result data and a second document corresponding to the second piece of document retrieval result data has a document citation relation with a third document corresponding to the third piece of data. The present disclosure can thus directly output a group of retrieved documents associated by a common-citing or common-cited relationship.

Description

Client, server, retrieval method and system thereof
Technical Field
The present disclosure relates to the field of information processing, and for example, to a client, a server, a retrieval method, and a system thereof.
Background
In the prior art, document retrieval can only provide a first retrieval result for a given search query, or a result obtained by further filtering that first retrieval result; retrieval results that are associated with one another cannot be obtained directly during retrieval.
Disclosure of Invention
In order to solve the above problem, the present disclosure provides a document retrieval method including:
step S100: receiving a first document retrieval condition corresponding to a first document retrieval result data set and a second document retrieval condition corresponding to a second document retrieval result data set, wherein:
the first document retrieval result data set and the second document retrieval result data set both belong to a retrieval data set;
at least one piece of document retrieval result data in the first document retrieval result data set does not belong to the second document retrieval result data set, and at least one piece of document retrieval result data in the second document retrieval result data set does not belong to the first document retrieval result data set;
step S200: in response to the first document retrieval condition and the second document retrieval condition, performing retrieval in the retrieval data set to obtain the first document retrieval result data set and the second document retrieval result data set; wherein the retrieval data set comprises the first document retrieval result data set and the second document retrieval result data set;
step S300: outputting a document retrieval result data combination, wherein the document retrieval result data combination comprises at least: a first piece of document retrieval result data, a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from the first document retrieval result data set, the second piece of document retrieval result data is from the second document retrieval result data set, the third piece of data is from the retrieval data set, and:
at least one of a first document corresponding to the first piece of document retrieval result data and a second document corresponding to the second piece of document retrieval result data has a document citation relation with a third document corresponding to the third piece of data.
In addition, the present disclosure also provides a client, including:
a first receiving unit configured to receive a first document retrieval condition;
a second receiving unit configured to receive a second document retrieval condition;
wherein the first document retrieval condition corresponds to a first document retrieval result data set, the second document retrieval condition corresponds to a second document retrieval result data set, and:
the first document retrieval result data set and the second document retrieval result data set both belong to a retrieval data set;
at least one piece of document retrieval result data in the first document retrieval result data set does not belong to the second document retrieval result data set, and at least one piece of document retrieval result data in the second document retrieval result data set does not belong to the first document retrieval result data set;
an output unit configured to output a document retrieval result data combination, the document retrieval result data combination comprising at least: a first piece of document retrieval result data, a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from the first document retrieval result data set, the second piece of document retrieval result data is from the second document retrieval result data set, the third piece of data is from the retrieval data set, and:
at least one of a first document corresponding to the first piece of document retrieval result data and a second document corresponding to the second piece of document retrieval result data has a document citation relation with a third document corresponding to the third piece of data.
In addition, the present disclosure also provides a server, including:
a first receiving unit configured to receive a first document retrieval condition;
a second receiving unit configured to receive a second document retrieval condition;
wherein the first document retrieval condition corresponds to a first document retrieval result data set, the second document retrieval condition corresponds to a second document retrieval result data set, and:
the first document retrieval result data set and the second document retrieval result data set both belong to a retrieval data set;
at least one piece of document retrieval result data in the first document retrieval result data set does not belong to the second document retrieval result data set, and at least one piece of document retrieval result data in the second document retrieval result data set does not belong to the first document retrieval result data set;
a retrieval unit configured to, in response to the first document retrieval condition and the second document retrieval condition, perform retrieval in the retrieval data set and obtain the first document retrieval result data set and the second document retrieval result data set; wherein the retrieval data set comprises the first document retrieval result data set and the second document retrieval result data set;
an output unit configured to output a document retrieval result data combination, the document retrieval result data combination comprising at least: a first piece of document retrieval result data, a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from the first document retrieval result data set, the second piece of document retrieval result data is from the second document retrieval result data set, the third piece of data is from the retrieval data set, and:
at least one of a first document corresponding to the first piece of document retrieval result data and a second document corresponding to the second piece of document retrieval result data has a document citation relation with a third document corresponding to the third piece of data.
In addition, the present disclosure also provides a retrieval system, which performs any of the above-described methods.
In addition, the disclosure also provides a retrieval system, which comprises any one of the clients and any one of the servers.
Thus, the present disclosure can directly output a group of retrieved documents associated by a common-citing or common-cited relationship.
Drawings
FIG. 1 is a schematic illustration of a method according to one embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a client according to one embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a server according to one embodiment of the present disclosure.
Detailed Description
To help those skilled in the art understand the technical solutions of the present disclosure, the technical solutions of the various embodiments are described below with reference to the embodiments and the related drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. The terms "first," "second," and the like as used in this disclosure distinguish between different objects and do not describe a particular order. Furthermore, "include" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may alternatively include other steps or elements not expressly listed or inherent to such process, method, system, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the disclosure. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It will be appreciated by those skilled in the art that the embodiments described herein may be combined with other embodiments.
Referring to FIG. 1, in one embodiment, the present disclosure discloses a document retrieval method, comprising:
step S100: receiving a first document retrieval condition corresponding to a first document retrieval result data set and a second document retrieval condition corresponding to a second document retrieval result data set, wherein:
the first document retrieval result data set and the second document retrieval result data set both belong to a retrieval data set;
at least one piece of document retrieval result data in the first document retrieval result data set does not belong to the second document retrieval result data set, and at least one piece of document retrieval result data in the second document retrieval result data set does not belong to the first document retrieval result data set;
step S200: in response to the first document retrieval condition and the second document retrieval condition, performing retrieval in the retrieval data set to obtain the first document retrieval result data set and the second document retrieval result data set; wherein the retrieval data set comprises the first document retrieval result data set and the second document retrieval result data set;
step S300: outputting a document retrieval result data combination, wherein the document retrieval result data combination comprises at least: a first piece of document retrieval result data, a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from the first document retrieval result data set, the second piece of document retrieval result data is from the second document retrieval result data set, the third piece of data is from the retrieval data set, and:
at least one of a first document corresponding to the first piece of document retrieval result data and a second document corresponding to the second piece of document retrieval result data has a document citation relation with a third document corresponding to the third piece of data.
For ease of understanding, this embodiment is described below with specific retrieval conditions. Those skilled in the art will recognize that this detailed description does not limit the retrieval method. Specific examples are as follows:
assume that retrieval in the retrieval data set is implemented in a database. Viewed as sets, the first and second document retrieval result data sets and the third piece of data all belong to the retrieval data set, and the set to which the third piece of data belongs falls into at least two cases:
case 1: relative to the retrieval data set, the third piece of data belongs to the complement of the union of the first and second document retrieval result data sets, i.e., it belongs to the retrieval data set but to neither the first nor the second document retrieval result data set;
case 2: the third piece of data belongs to the first document retrieval result data set or the second document retrieval result data set;
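The two cases can be sketched with ordinary set operations. This is only an illustrative sketch: the identifiers (`"1-1"`, `"3-1"`, and so on) follow the numbering used in the worked example of this description, and the function name `classify_third_data` is hypothetical rather than part of the disclosure.

```python
# Illustrative identifiers only; they mirror the numbering of the worked example.
retrieval_data_set = {"1-1", "1-2", "1-3", "1-4", "1-5",
                      "2-1", "2-2", "2-3", "3-1", "3-2"}
first_result_set = {"1-1", "1-2", "1-3", "1-4", "1-5"}
second_result_set = {"2-1", "2-2", "2-3"}


def classify_third_data(item, first, second, universe):
    """Return 1 or 2 depending on which case a candidate third piece of data falls into."""
    if item not in universe:
        raise ValueError(f"{item!r} is not in the retrieval data set")
    # Case 2: the item lies in the first or second result set.
    # Case 1: the item lies in the complement of their union.
    return 2 if item in (first | second) else 1


# Case 1 candidates: the complement of the union, relative to the retrieval data set.
case1_candidates = retrieval_data_set - (first_result_set | second_result_set)
print(sorted(case1_candidates))
```

Under these assumptions, `3-1` and `3-2` are the only case 1 candidates, while any member of either result set, such as `1-5` or `2-3`, is a case 2 candidate.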
for case 1:
assume that the first retrieval condition is a condition for retrieving the following result: documents of organization A in the communications field, where all the document retrieval result data in that result form the first document retrieval result data set; for example, the first retrieval condition may be a Boolean search over two fields, "document belonging organization" and "full text", where the content of the "document belonging organization" field is "organization A" and the content of the "full text" field is "communication"; alternatively, the first retrieval condition may be a semantic search for communications documents of organization A;
similarly, assume that the second retrieval condition is a condition for retrieving the following result: documents relating to LTE, where all the document retrieval result data in that result form the second document retrieval result data set;
based on this specific example, the above steps S100 and S200 are easily understood by those skilled in the art.
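Steps S100 and S200 for this example can be sketched as a minimal in-memory Boolean search. Everything here is an assumption made for illustration: the `Record` type, its field names, the sample records, and the substring-based matching stand in for whatever database and query engine an implementation actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    doc_id: str
    organization: str   # stands in for the "document belonging organization" field
    full_text: str      # stands in for the "full text" field


def boolean_search(records, organization=None, full_text_term=None):
    """AND together the supplied field conditions and return the matching ids."""
    hits = set()
    for r in records:
        if organization is not None and r.organization != organization:
            continue
        if full_text_term is not None and full_text_term.lower() not in r.full_text.lower():
            continue
        hits.add(r.doc_id)
    return hits


def result_sets_are_distinct(first, second):
    """S100 constraint: each result set holds at least one item the other lacks."""
    return bool(first - second) and bool(second - first)


# Hypothetical sample records.
records = [
    Record("1-1", "A", "a communication method for base stations"),
    Record("2-1", "B", "LTE handover procedure"),
    Record("3-1", "C", "antenna design"),
]

first = boolean_search(records, organization="A", full_text_term="communication")
second = boolean_search(records, full_text_term="LTE")
print(first, second, result_sets_are_distinct(first, second))
```

The check in `result_sets_are_distinct` rules out the degenerate situation in which one result set is contained in the other, which is the condition the method places on the two result sets.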
Step S300, by contrast, reflects the focus of the present embodiment: outputting the document retrieval result data in combination, as described below:
assume that the first document retrieval result data set comprises 5 pieces of document retrieval result data for communications documents of organization A, numbered 1-1, 1-2, 1-3, 1-4 and 1-5 respectively;
assume that the second document retrieval result data set comprises 3 pieces of document retrieval result data relating to LTE, numbered 2-1, 2-2 and 2-3 respectively;
1. under case 1, in addition to the above retrieval result data 1-1, 1-2, 1-3, 1-4, 1-5, and 2-1, 2-2, 2-3, the retrieval data set also includes data that is not hit by either of the two retrieval conditions: 3-1 and 3-2:
(1) when the data 3-1 is the third piece of data, the following situation may exist:
suppose that the 3 documents corresponding to the document retrieval result data 1-1, 1-2 and 1-3 all cite the document corresponding to the third data 3-1; the first cited index of the third piece of data 3-1 can then be considered to be 3;
suppose that the 2 documents corresponding to the document retrieval result data 2-1 and 2-2 also cite the document corresponding to the third data 3-1; the second cited index of the third piece of data 3-1 can then be considered to be 2;
that is, a total of 5 documents, corresponding to the document retrieval result data 1-1, 1-2, 1-3, 2-1, 2-2, commonly cite the document corresponding to the third data 3-1;
apart from these, no other document among the retrieval result data 1-1, 1-2, 1-3, 1-4, 1-5, and 2-1, 2-2, 2-3 has a citation relationship with the document corresponding to the third data 3-1;
then, for step S300, the document retrieval result data combination that it outputs can be exemplarily expressed as:
combination 1: { third data 3-1; document retrieval result data 1-1, 1-2, 1-3, 2-1, 2-2 };
further, for the first retrieval condition and the second retrieval condition, the multi-cited index of the third piece of data 3-1 can be defined as the sum of all the cited indexes of that piece of data; the multi-cited index of the third piece of data 3-1 is then 5, namely the sum of the aforementioned first cited index 3 and second cited index 2;
(2) when the data 3-2 is the third piece of data, the following situation may exist:
suppose that the document corresponding to the third piece of data 3-2 cites the documents corresponding to the document retrieval result data 1-1 and 1-2; the first citing index of the third piece of data 3-2 can then be considered to be 2;
further, suppose that the document corresponding to the third piece of data 3-2 also cites the document corresponding to the document retrieval result data 2-1; the second citing index of the third piece of data 3-2 can then be considered to be 1;
that is, the document corresponding to the third piece of data 3-2 cites a total of 3 documents, corresponding to the document retrieval result data 1-1, 1-2, 2-1;
then, for step S300, the document retrieval result data combination that it outputs can be exemplarily expressed as:
combination 2: { document retrieval result data 1-1, 1-2, 2-1; third data 3-2 };
further, for the first retrieval condition and the second retrieval condition, the multi-citing index of the third piece of data 3-2 can be defined as the sum of all the citing indexes of that piece of data; the multi-citing index of the third piece of data 3-2 is then 3, i.e., the sum of the aforementioned first citing index 2 and second citing index 1.
The above is a specific exemplary illustration of case 1. It involves data numbered 3-1 or data numbered 3-2, neither of which belongs to the first or second document retrieval result data set; both belong to the complement of the union of the two document retrieval result data sets, taken relative to the entire retrieval data set. Case 1 means that the present embodiment can retrieve, from the complement outside the first and second document retrieval result data sets, other data that has a citing or cited relationship with retrieval results in both the first document retrieval result data set and the second document retrieval result data set.
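The index arithmetic of case 1 can be reproduced with a small sketch. The citation graph below simply encodes the assumptions of the worked example; the dictionary `cites` and the helper names `cited_by`, `cites_into`, and `combination` are hypothetical and are not taken from the disclosure.

```python
# cites[x] is the set of documents that document x cites (toy data from the example).
cites = {
    "1-1": {"3-1"}, "1-2": {"3-1"}, "1-3": {"3-1"},
    "2-1": {"3-1"}, "2-2": {"3-1"},
    "3-2": {"1-1", "1-2", "2-1"},
}
first_result_set = {"1-1", "1-2", "1-3", "1-4", "1-5"}
second_result_set = {"2-1", "2-2", "2-3"}


def cited_by(third, result_set):
    """Members of result_set whose documents cite the third document."""
    return {d for d in result_set if third in cites.get(d, set())}


def cites_into(third, result_set):
    """Members of result_set whose documents are cited by the third document."""
    return cites.get(third, set()) & result_set


def combination(third, relation):
    """Build one output combination and its per-set and summed indexes."""
    first_hits = relation(third, first_result_set - {third})
    second_hits = relation(third, second_result_set - {third})
    return {
        "third": third,
        "members": sorted(first_hits | second_hits),
        "first_index": len(first_hits),
        "second_index": len(second_hits),
        # The summed index is defined in the text as the sum of the per-set indexes.
        "multi_index": len(first_hits) + len(second_hits),
    }


combination_1 = combination("3-1", cited_by)    # 3-1 is cited by result documents
combination_2 = combination("3-2", cites_into)  # 3-2 cites result documents
print(combination_1)
print(combination_2)
```

With this toy graph, `combination_1` has per-set indexes 3 and 2 and a summed index of 5, and `combination_2` has per-set indexes 2 and 1 and a summed index of 3, matching the figures in the example. The same helpers work unchanged when the third piece of data is itself a member of a result set, because the candidate is excluded from the set it is compared against.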
2. under case 2, the retrieval data set likewise includes the retrieval result data corresponding to the first and second retrieval conditions, 1-1, 1-2, 1-3, 1-4, 1-5, and 2-1, 2-2, 2-3, plus the data not hit by either retrieval condition: 3-1 and 3-2:
(1) when the data 1-5 is the third piece of data, the following situation may exist:
suppose that the 3 documents corresponding to the document retrieval result data 1-1, 1-2 and 1-3 all cite the document corresponding to the third data 1-5; the first cited index of the third piece of data 1-5 can then be considered to be 3;
suppose that the 2 documents corresponding to the document retrieval result data 2-1 and 2-2 also cite the document corresponding to the third data 1-5; the second cited index of the third piece of data 1-5 can then be considered to be 2;
that is, a total of 5 documents, corresponding to the document retrieval result data 1-1, 1-2, 1-3, 2-1, 2-2, commonly cite the document corresponding to the third data 1-5;
apart from these, no other document among the retrieval result data 1-1, 1-2, 1-3, 1-4, 1-5, and 2-1, 2-2, 2-3 has a citation relationship with the document corresponding to the third data 1-5;
then, for step S300, the document retrieval result data combination that it outputs can be exemplarily expressed as:
combination 1: { third data 1-5; document retrieval result data 1-1, 1-2, 1-3, 2-1, 2-2 };
further, for the first retrieval condition and the second retrieval condition, the multi-cited index of the third piece of data 1-5 can be defined as the sum of all the cited indexes of that piece of data; the multi-cited index of the third piece of data 1-5 is then 5, i.e., the sum of the aforementioned first cited index 3 and second cited index 2;
(2) when the data 2-3 is the third piece of data, the following situation may exist:
suppose that the document corresponding to the third piece of data 2-3 cites the documents corresponding to the document retrieval result data 1-1 and 1-2; the first citing index of the third piece of data 2-3 can then be considered to be 2;
further, suppose that the document corresponding to the third piece of data 2-3 also cites the document corresponding to the document retrieval result data 2-1; the second citing index of the third piece of data 2-3 can then be considered to be 1;
that is, the document corresponding to the third piece of data 2-3 cites a total of 3 documents, corresponding to the document retrieval result data 1-1, 1-2, 2-1;
then, for step S300, the document retrieval result data combination that it outputs can be exemplarily expressed as:
combination 2: { document retrieval result data 1-1, 1-2, 2-1; third data 2-3 };
further, for the first retrieval condition and the second retrieval condition, the multi-citing index of the third piece of data 2-3 can be defined as the sum of all the citing indexes of that piece of data; the multi-citing index of the third piece of data 2-3 is then 3, i.e., the sum of the aforementioned first citing index 2 and second citing index 1.
The above is a specific exemplary illustration of case 2. It involves data numbered 1-5 or data numbered 2-3, each of which belongs to either the first or the second document retrieval result data set. Case 2 means that the present embodiment can retrieve, from within the first and second document retrieval result data sets themselves, other data that has a citing or cited relationship with retrieval results in both the first document retrieval result data set and the second document retrieval result data set.
It can be understood that the above embodiment can output a document retrieval result data combination whose members share a citation relationship. The citation relationship here may be a relationship in which one document is commonly cited by a plurality of documents, or a relationship in which one document cites a plurality of documents. This makes it much easier for document researchers to identify, further study, and compare the relevant documents. A citation relationship is one kind of association relationship between documents. How the client displays the document retrieval result data combination is not limited by the present disclosure.
Combination 1 and combination 2 are referred to here as document retrieval result data combinations, but they may equally be called document families or document clusters; whatever the definition or name, each represents a combination.
Incidentally, the above-described embodiments can be applied not only to patent documents and academic journal documents, but also to web documents and any other documents having a citation relationship.
The retrieval condition may be received directly or indirectly through various receiving means, such as an input box or a menu. In the input-box mode, the retrieval condition can be any of various retrieval expressions, for example one typed into the input box with a keyboard; in the menu mode, the retrieval condition may be selected text, for example text selected with a mouse, with the retrieval condition activated through a menu popped up by a left or right click.
In another embodiment,
at least one of the first document corresponding to the first piece of document retrieval result data and the second document corresponding to the second piece of document retrieval result data shares at least one identical keyword in the full text with the third document corresponding to the third piece of data.
This embodiment can further present the association between documents through keywords in the full text.
Taking combination 1: { third data 3-1; document retrieval result data 1-1, 1-2, 1-3, 2-1, 2-2 } as an example, assume the following:
the full texts of the documents corresponding to the document retrieval result data 1-1 and 2-2 include the keyword: mobile terminal;
the full text of the document corresponding to the third data 3-1 also includes the keyword: mobile terminal;
this means that sub-combination 1.1: { third data 3-1; document retrieval result data 1-1, 2-2 } can be derived from combination 1. The corresponding documents have not only the relationship in which one document is commonly cited by a plurality of documents but also a keyword association, so the association among the 3 documents corresponding to the document retrieval result data 1-1, 2-2 and the third data 3-1 is stronger and easier to identify.
It should be noted that "the same keyword" does not require the word or phrase to be literally identical, because synonyms and terms in other languages exist. For example, those skilled in the art will understand that in the communications field a mobile terminal may also be expressed by a synonymous term, as user equipment, or as UE (the abbreviation of user equipment).
For this embodiment, a keyword index may also be defined by analogy with the citation-related indexes of the previous embodiment. The more identical keywords are involved, the stronger the association between the documents.
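One possible way to compute such a keyword index, with synonyms folded together, is sketched below. The synonym table, the per-document keyword lists, and the function names are all hypothetical; the disclosure does not specify how keywords are extracted or how synonyms are resolved.

```python
# Hypothetical synonym table mapping variants to one canonical keyword.
SYNONYMS = {"user equipment": "mobile terminal", "ue": "mobile terminal"}


def normalize(keyword):
    """Lower-case a keyword and map known synonyms to their canonical form."""
    k = keyword.strip().lower()
    return SYNONYMS.get(k, k)


def keyword_index(keywords_a, keywords_b):
    """Number of distinct keywords two documents share after normalization."""
    return len({normalize(k) for k in keywords_a} & {normalize(k) for k in keywords_b})


# Hypothetical full-text keywords for the documents of combination 1.
doc_keywords = {
    "1-1": ["mobile terminal", "handover"],
    "1-2": ["antenna"],
    "2-2": ["UE", "scheduling"],
    "3-1": ["user equipment", "handover"],
}

combination_1_members = ["1-1", "1-2", "1-3", "2-1", "2-2"]
sub_combination = sorted(
    d for d in combination_1_members
    if keyword_index(doc_keywords.get(d, []), doc_keywords["3-1"]) > 0
)
print(sub_combination)
```

With this toy data, only `1-1` and `2-2` share a keyword with `3-1`, which corresponds to sub-combination 1.1 in the example; `1-1` shares two keywords and therefore has the higher keyword index.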
In another embodiment,
at least one of the first document corresponding to the first piece of document retrieval result data and the second document corresponding to the second piece of document retrieval result data shares at least one identical semantic concept or at least one similar semantic concept in the full text with the third document corresponding to the third piece of data.
It can be understood that, unlike the keywords of the previous embodiment, this embodiment further reflects the association between documents through semantic concepts in the full text. Keywords usually relate to traditional Boolean search, whereas emerging semantic search relates to identical or similar semantic concepts.
Taking combination 1: { third data 3-1; document retrieval result data 1-1, 1-2, 1-3, 2-1, 2-2 } as an example, assume the following:
the full text of the document corresponding to the document retrieval result data 1-2 includes the semantic concept: smartphone;
the full text of the document corresponding to the third data 3-1 includes the same semantic concept: iPhone; it can be understood that the iPhone is a specific smartphone product and can therefore be considered to belong to the same semantic concept as smartphone;
the full text of the document corresponding to the document retrieval result data 1-3 includes a similar semantic concept: iPad; although an iPad is not a smartphone, it is a smart tablet and therefore belongs to a similar semantic concept.
This means that sub-combination 1.2: { third data 3-1; document retrieval result data 1-2 } and sub-combination 1.3: { third data 3-1; document retrieval result data 1-3 } can be derived from combination 1, and the corresponding documents have not only a citation relationship but also an association of identical or similar semantics. The association among the 3 documents corresponding to the document retrieval result data 1-2, 1-3 and the third data 3-1 is therefore stronger and easier for document researchers to identify.
It will be appreciated that this embodiment may also define a semantic concept index by analogy with the indexes of the previous embodiments. The more identical or similar semantic concepts are involved, the stronger the association between the documents.
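The distinction between identical and similar semantic concepts can be illustrated with a toy lookup table. This is a deliberately simplified sketch: the `CONCEPTS` table, its two-level concept/family structure, and the pair-counting definition of the index are assumptions for illustration, and a practical semantic search would more likely rely on an ontology or learned text representations.

```python
# Hypothetical table: term -> (semantic concept, broader concept family).
CONCEPTS = {
    "smartphone": ("smartphone", "smart device"),
    "iphone": ("smartphone", "smart device"),
    "ipad": ("tablet", "smart device"),
    "antenna": ("antenna", "rf hardware"),
}


def semantic_relation(term_a, term_b):
    """Return 'same', 'similar', or None for two terms."""
    a = CONCEPTS.get(term_a.lower())
    b = CONCEPTS.get(term_b.lower())
    if a is None or b is None:
        return None
    if a[0] == b[0]:
        return "same"      # identical semantic concept
    if a[1] == b[1]:
        return "similar"   # different concept, same family
    return None


def semantic_concept_index(terms_a, terms_b):
    """Count the term pairs that are semantically the same or similar (assumed definition)."""
    return sum(
        1 for x in terms_a for y in terms_b
        if semantic_relation(x, y) is not None
    )


print(semantic_relation("smartphone", "iPhone"))
print(semantic_relation("iPad", "iPhone"))
print(semantic_concept_index(["smartphone"], ["iPhone"]))
```

In this sketch, smartphone and iPhone map to the same concept, while iPad and iPhone are only similar because they share a family, which mirrors the reasoning behind sub-combinations 1.2 and 1.3.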
In another embodiment,
at least one of the first document corresponding to the first piece of document retrieval result data and the second document corresponding to the second piece of document retrieval result data shares at least one similar picture in the full text with the third document corresponding to the third piece of data.
Unlike the keywords and semantic concepts of the previous embodiments, this embodiment focuses on the association between documents reflected by similar pictures. Any existing search-by-image technique can be used in this embodiment, such as www.tineye.com, the Baidu image search function, or other similar technologies. Similarly, this embodiment may also define a picture index: the more similar pictures are involved, the stronger the association between the documents.
It can be understood that the citation relationships, identical keywords, identical semantic concepts, similar semantic concepts, and similar pictures referred to in the above embodiments are different kinds of association relationships. The more kinds of association relationship are involved, and the higher the index measured for each kind, the stronger the association between the documents.
In another embodiment,
step S300 further includes: outputting the document retrieval result data combination in response to a sorting condition.
Taking combination 1: {third data 3-1; document retrieval result data 1-1, 1-2, 1-3, 2-1, 2-2} as an example:
as described above, the third data 3-1 has a multi-citation index of 5;
suppose that there is a fourth data 4-1 whose corresponding document is cited by the documents corresponding to the document retrieval result data 1-1 and 1-2, that is, the fourth data 4-1 has a multi-citation index of 2; the corresponding combination 3 may be: {fourth data 4-1; document retrieval result data 1-1, 1-2};
if the sorting condition is descending order, combination 1 is ranked before combination 3; otherwise, combination 3 is ranked before combination 1.
That is, by ranking the combinations, this embodiment improves the user experience and facilitates later statistics or other processing. It will be appreciated that the fourth piece of data, like the third piece of data, also comes from the retrieval data set.
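The ranking of combination 1 and combination 3 can be sketched as follows. The sketch assumes that the index used for sorting equals the number of result documents that cite the third (or fourth) document, which gives 5 and 2 for the example above. The `Combination` class and `sort_combinations` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Combination:
    """A cited document together with the result data that cite it."""
    cited: str      # e.g. third data "3-1" or fourth data "4-1"
    results: list   # document retrieval result data citing `cited`

    @property
    def citation_count(self) -> int:
        # Assumed to equal the index used for sorting in the example.
        return len(self.results)


def sort_combinations(combinations, descending=True):
    """Order combinations by citation count according to the sorting condition."""
    return sorted(combinations, key=lambda c: c.citation_count, reverse=descending)


combination_1 = Combination("3-1", ["1-1", "1-2", "1-3", "2-1", "2-2"])
combination_3 = Combination("4-1", ["1-1", "1-2"])

descending_order = [c.cited for c in sort_combinations([combination_3, combination_1])]
ascending_order = [
    c.cited for c in sort_combinations([combination_1, combination_3], descending=False)
]
```

With a descending sorting condition the combination built around 3-1 comes first; with an ascending condition the order is reversed.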
The data set described in the above embodiments denotes a collection of data; it may be stored in a database or in any other form, without limitation.
Further, referring to fig. 2, the present disclosure also discloses in one embodiment a corresponding client, comprising:
a first receiving unit configured to receive a first document retrieval condition;
a second receiving unit configured to receive a second document retrieval condition;
wherein the first document retrieval condition corresponds to a first document retrieval result dataset, the second document retrieval condition corresponds to a second document retrieval result dataset, and:
the first document retrieval result data set and the second document retrieval result data set belong to a retrieval data set;
at least one document retrieval result data in the first document retrieval result data set is not affiliated with a second document retrieval result data set, and at least one document retrieval result data in the second document retrieval result data set is not affiliated with the first document retrieval result data set;
an output unit configured to output a combination of document retrieval result data, the combination of document retrieval result data including at least: a first piece of document retrieval result data and a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from a first document retrieval result data set, the second piece of document retrieval result data is from a second document retrieval result data set, the third piece of data is from a retrieval data set, and:
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data has a document citation relation with a third document corresponding to the third data.
Similar to the method related embodiment, this embodiment discloses a technical solution corresponding to the client through the corresponding functional unit.
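The condition placed on the two result data sets (each must contain at least one item that the other lacks) amounts to requiring that neither set is a subset of the other. A minimal check is sketched below; the function name and the sample identifiers are assumptions made for illustration.

```python
def result_sets_are_distinct(first_results, second_results):
    """Return True when each result set has at least one item the other lacks."""
    first, second = set(first_results), set(second_results)
    return bool(first - second) and bool(second - first)


# Overlapping sets that still satisfy the condition.
ok = result_sets_are_distinct({"1-1", "1-2", "x"}, {"2-1", "2-2", "x"})

# The first set is contained in the second, so the condition fails.
not_ok = result_sets_are_distinct({"1-1"}, {"1-1", "2-1"})
```

Note that the two sets may overlap; the condition only excludes the cases where one set is identical to or wholly contained in the other.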
With reference to the foregoing embodiments, it is preferred,
at least one document in the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one same keyword in the whole text.
With reference to the foregoing embodiments, it is preferred,
at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one same semantic concept or at least one approximate semantic concept in the whole text.
With reference to the foregoing embodiments, it is preferred,
at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one approximate picture in the whole text.
With reference to the foregoing embodiments, it is preferred,
the output unit is also used for responding to a sorting condition and outputting the document retrieval result data combination.
Similar to the related embodiment of the method, this embodiment discloses a technical solution corresponding to the server side through corresponding functional units:
referring to FIG. 3, the present disclosure discloses in one embodiment a server comprising:
a first receiving unit configured to receive a first document retrieval condition;
a second receiving unit configured to receive a second document retrieval condition;
wherein the first document retrieval condition corresponds to a first document retrieval result dataset, the second document retrieval condition corresponds to a second document retrieval result dataset, and:
the first document retrieval result data set and the second document retrieval result data set belong to a retrieval data set;
at least one document retrieval result data in the first document retrieval result data set is not affiliated with a second document retrieval result data set, and at least one document retrieval result data in the second document retrieval result data set is not affiliated with the first document retrieval result data set;
a retrieval unit configured to, in response to the first document retrieval condition and the second document retrieval condition, perform retrieval in the retrieval data set to obtain the first document retrieval result data set and the second document retrieval result data set; wherein the retrieval data set comprises the first document retrieval result data set and the second document retrieval result data set;
an output unit configured to output a combination of document retrieval result data, the combination of document retrieval result data including at least: a first piece of document retrieval result data and a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from a first document retrieval result data set, the second piece of document retrieval result data is from a second document retrieval result data set, the third piece of data is from a retrieval data set, and:
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data has a document citation relation with a third document corresponding to the third data.
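The cooperation of the retrieval unit and the output unit can be sketched roughly as follows. The sketch assumes that a retrieval condition is a plain keyword matched against document text and that citation relations are available as a mapping from a citing document to the documents it cites. Both assumptions, along with the names `retrieve`, `build_combinations`, `dataset`, and `cites`, are illustrative and not taken from the disclosure.

```python
from collections import defaultdict


def retrieve(dataset, keyword):
    """Hypothetical retrieval unit: ids of documents whose text contains the keyword."""
    return [doc_id for doc_id, text in dataset.items() if keyword in text]


def build_combinations(first_results, second_results, cites):
    """Hypothetical output unit: group result data by the document they cite.

    `cites` maps a document id to the ids of the documents it cites.
    """
    combined = list(first_results) + [d for d in second_results if d not in first_results]
    citing_by_target = defaultdict(list)
    for doc_id in combined:
        for target in cites.get(doc_id, ()):
            citing_by_target[target].append(doc_id)
    return dict(citing_by_target)


# Illustrative retrieval data set and citation relations.
dataset = {
    "1-1": "graph search",
    "1-2": "graph index",
    "2-1": "image hashing",
    "3-1": "survey",
}
cites = {"1-1": ["3-1"], "1-2": ["3-1"], "2-1": ["3-1"]}

first_results = retrieve(dataset, "graph")
second_results = retrieve(dataset, "image")
combinations = build_combinations(first_results, second_results, cites)
```

In this sketch the document 3-1 plays the role of the third data: it belongs to the retrieval data set, matches neither condition, and is surfaced only because results of both conditions cite it.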
With reference to the foregoing embodiments, it is preferred,
at least one document in the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one same keyword in the whole text.
With reference to the foregoing embodiments, it is preferred,
at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one same semantic concept or at least one approximate semantic concept in the whole text.
With reference to the foregoing embodiments, it is preferred,
at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one approximate picture in the whole text.
With reference to the foregoing embodiments, it is preferred,
the output unit is also used for responding to a sorting condition and outputting the document retrieval result data combination.
Similar to the foregoing embodiments, the present disclosure further discloses the following technical solutions of the system through the following embodiments:
a retrieval system, said system performing any of the retrieval methods described above.
Similar to the foregoing embodiments, the present disclosure further discloses the following technical solutions of the system through the following embodiments:
a retrieval system, the system comprising a client as described in any of the preceding, and a server as described in any of the preceding.
The steps in the method of the embodiment of the present disclosure may be sequentially adjusted, combined, and deleted according to actual needs.
The units in the device of the embodiment of the disclosure can be combined, divided and deleted according to actual needs. It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Furthermore, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts, modules, and elements described herein are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus may be implemented in other ways. The device embodiments described above are merely illustrative: the division into units is only a logical division, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between units or components may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical or in another form.
The units described as separate parts may or may not be physically separate, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a smartphone, a personal digital assistant, a wearable device, a laptop, or a tablet computer) to perform all or part of the steps of the method according to the embodiments of the present disclosure. The aforementioned storage medium includes a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and various other media capable of storing program code.
As described above, the above embodiments are only used to illustrate the technical solutions of the present disclosure, and not to limit the same; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.

Claims (5)

1. A client, comprising:
a first receiving unit configured to receive a first document retrieval condition;
a second receiving unit configured to receive a second document retrieval condition;
wherein the first document retrieval condition corresponds to a first document retrieval result dataset, the second document retrieval condition corresponds to a second document retrieval result dataset, and:
the first document retrieval result data set and the second document retrieval result data set belong to a retrieval data set;
at least one document retrieval result data in the first document retrieval result data set is not affiliated with a second document retrieval result data set, and at least one document retrieval result data in the second document retrieval result data set is not affiliated with the first document retrieval result data set;
an output unit configured to output a combination of document retrieval result data, the combination of document retrieval result data including at least: a first piece of document retrieval result data and a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from a first document retrieval result data set, the second piece of document retrieval result data is from a second document retrieval result data set, the third piece of data is from a retrieval data set, and:
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data has a document reference relation with a third document corresponding to a third piece of data;
for the first search condition and the second search condition, the multi-indexed number of the third piece of data is defined as: the sum of all the referenced indices of the piece of data;
the client is used for directly obtaining a retrieval result with relevance in retrieval;
wherein,
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one same keyword in the whole text; or, for at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data, and the third document corresponding to the third data, the whole text includes at least one identical semantic concept or at least one approximate semantic concept; or, for at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data, and the third document corresponding to the third data, the whole text of the third document includes at least one approximate picture;
the output unit is also used for responding to a sorting condition and outputting the document retrieval result data combination.
2. A server, comprising:
a first receiving unit configured to receive a first document retrieval condition;
a second receiving unit configured to receive a second document retrieval condition;
wherein the first document retrieval condition corresponds to a first document retrieval result dataset, the second document retrieval condition corresponds to a second document retrieval result dataset, and:
the first document retrieval result data set and the second document retrieval result data set belong to a retrieval data set;
at least one document retrieval result data in the first document retrieval result data set is not affiliated with a second document retrieval result data set, and at least one document retrieval result data in the second document retrieval result data set is not affiliated with the first document retrieval result data set;
a retrieval unit configured to, in response to the first document retrieval condition and the second document retrieval condition, perform retrieval in the retrieval data set to obtain the first document retrieval result data set and the second document retrieval result data set; wherein the retrieval data set comprises the first document retrieval result data set and the second document retrieval result data set;
an output unit configured to output a combination of document retrieval result data, the combination of document retrieval result data including at least: a first piece of document retrieval result data and a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from a first document retrieval result data set, the second piece of document retrieval result data is from a second document retrieval result data set, the third piece of data is from a retrieval data set, and:
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data has a document reference relation with a third document corresponding to a third piece of data;
for the first search condition and the second search condition, the multi-indexed number of the third piece of data is defined as: the sum of all the referenced indices of the piece of data;
the server is used for directly obtaining a retrieval result with relevance in retrieval;
wherein,
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one same keyword in the whole text; or, for at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data, and the third document corresponding to the third data, the whole text includes at least one identical semantic concept or at least one approximate semantic concept; or, for at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data, and the third document corresponding to the third data, the whole text of the third document includes at least one approximate picture;
the output unit is also used for responding to a sorting condition and outputting the document retrieval result data combination.
3. A document retrieval method, comprising:
step S100: receiving a first document retrieval condition corresponding to a first document retrieval result dataset and a second document retrieval condition corresponding to a second document retrieval result dataset, and:
the first document retrieval result data set and the second document retrieval result data set belong to a retrieval data set;
at least one document retrieval result data in the first document retrieval result data set is not affiliated with a second document retrieval result data set, and at least one document retrieval result data in the second document retrieval result data set is not affiliated with the first document retrieval result data set;
step S200: in response to the first document retrieval condition and the second document retrieval condition, performing retrieval in the retrieval data set to obtain the first document retrieval result data set and the second document retrieval result data set; wherein the retrieval data set comprises the first document retrieval result data set and the second document retrieval result data set;
step S300: outputting a document retrieval result data combination, wherein the document retrieval result data combination at least comprises: a first piece of document retrieval result data and a second piece of document retrieval result data, and a third piece of data;
wherein the first piece of document retrieval result data is from a first document retrieval result data set, the second piece of document retrieval result data is from a second document retrieval result data set, the third piece of data is from a retrieval data set, and:
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data has a document reference relation with a third document corresponding to a third piece of data;
for the first search condition and the second search condition, the multi-indexed number of the third piece of data is defined as: the sum of all the referenced indices of the piece of data;
the method is used for directly obtaining a retrieval result with relevance in retrieval;
at least one document in a first document corresponding to the first document retrieval result data and a second document corresponding to the second document retrieval result data and a third document corresponding to the third data comprise at least one same keyword in the whole text; or, for at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data, and the third document corresponding to the third data, the whole text includes at least one identical semantic concept or at least one approximate semantic concept; or, for at least one of the first document corresponding to the first document retrieval result data and the second document corresponding to the second document retrieval result data, and the third document corresponding to the third data, the whole text of the third document includes at least one approximate picture;
step S300 further includes: and responding to a sorting condition to output the document retrieval result data combination.
4. A retrieval system, said system performing the method of claim 3 above.
5. A retrieval system comprising the client of claim 1, the server of claim 2.
CN201810323375.XA 2018-02-05 2018-04-11 Client, server, retrieval method and system thereof Active CN110309416B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN2018101164702 2018-02-05
CN2018101164670 2018-02-05
CN201810116467 2018-02-05
CN201810116470 2018-02-05

Publications (2)

Publication Number Publication Date
CN110309416A (en) 2019-10-08
CN110309416B (en) 2021-11-30

Family

ID=67779039

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810323759.1A Active CN110209779B (en) 2018-02-05 2018-04-11 Client, server, retrieval method and system thereof
CN201810323375.XA Active CN110309416B (en) 2018-02-05 2018-04-11 Client, server, retrieval method and system thereof

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201810323759.1A Active CN110209779B (en) 2018-02-05 2018-04-11 Client, server, retrieval method and system thereof

Country Status (1)

Country Link
CN (2) CN110209779B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103257985A (en) * 2012-05-30 2013-08-21 韩俊 Device and method for simultaneously searching, inserting and displaying multiple cross-domain databases
CN103761307A (en) * 2014-01-22 2014-04-30 华为技术有限公司 Data processing device and data processing method
CN105938493A (en) * 2016-04-14 2016-09-14 乐视控股(北京)有限公司 Resource search method and apparatus
CN105956125A (en) * 2016-05-06 2016-09-21 长沙市麓智信息科技有限公司 Patent monitoring system and method
CN105989142A (en) * 2015-02-28 2016-10-05 华为技术有限公司 Data query method and device
CN106557493A (en) * 2015-09-25 2017-04-05 索意互动(北京)信息技术有限公司 A kind of data retrieval method, device and data retrieval server
CN107463566A (en) * 2016-06-02 2017-12-12 索意互动(北京)信息技术有限公司 A kind of document retrieval method and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001167087A (en) * 1999-12-14 2001-06-22 Fujitsu Ltd Device and method for retrieving structured document, program recording medium for structured document retrieval and index preparing method for structured document retrieval
JP2007183864A (en) * 2006-01-10 2007-07-19 Fujitsu Ltd File retrieval method and system therefor
CN100573531C (en) * 2008-07-04 2009-12-23 华中科技大学 A kind of document retrieval method based on association analysis
CN102279893B (en) * 2011-09-19 2015-07-22 索意互动(北京)信息技术有限公司 Many-to-many automatic analysis method of document group
CN103886063B (en) * 2014-03-18 2017-03-08 国家电网公司 A kind of text searching method and device
CN104346446A (en) * 2014-10-27 2015-02-11 百度在线网络技术(北京)有限公司 Paper associated information recommendation method and device based on mapping knowledge domain
CN107180059A (en) * 2016-03-11 2017-09-19 北大方正集团有限公司 Data retrieval method and data retrieval system
CN106445916A (en) * 2016-09-19 2017-02-22 合肥清浊信息科技有限公司 Semantic analysis method for patent retrieval

Also Published As

Publication number Publication date
CN110209779A (en) 2019-09-06
CN110309416A (en) 2019-10-08
CN110209779B (en) 2021-11-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant