CN116069957A - Information retrieval method, device and equipment - Google Patents

Publication number
CN116069957A
CN116069957A CN202111302360.3A CN202111302360A CN116069957A CN 116069957 A CN116069957 A CN 116069957A CN 202111302360 A CN202111302360 A CN 202111302360A CN 116069957 A CN116069957 A CN 116069957A
Authority
CN
China
Prior art keywords
search
document
vector
data
filtering vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111302360.3A
Other languages
Chinese (zh)
Inventor
季琰
涂敬伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Suzhou Software Technology Co Ltd
Priority to CN202111302360.3A
Publication of CN116069957A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Computational Linguistics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Library & Information Science (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an information retrieval method, device and equipment, wherein the method comprises the following steps: receiving a document filtering vector set sent by a data owner and a retrieval filtering vector sent by a data user, wherein the document filtering vector set comprises at least one document filtering vector, and the document filtering vector is obtained by performing dimension reduction processing according to a document vector of a plaintext document; the search filtering vector is obtained by performing dimension reduction processing according to the search vector of the search keyword input by the data user; and obtaining a target search result according to the document filtering vector and the search filtering vector, and sending the target search result to the data user. By this method, the efficiency and accuracy of ciphertext retrieval for multiple data owners in a hybrid cloud environment are improved.

Description

Information retrieval method, device and equipment
Technical Field
The present invention relates to the field of data retrieval technologies, and in particular, to an information retrieval method, apparatus, and device.
Background
Data encryption is a common solution to the privacy protection problem of cloud outsourced data. However, because encrypted data cannot be used or searched directly, a data user can obtain outsourced data only by downloading the ciphertext and then decrypting it. This approach is suitable only for small-scale data outsourcing scenarios: with larger data volumes or more frequent data access, it consumes a large amount of bandwidth and of encryption and decryption computing resources, which severely limits the scheme in practice. How to implement a privacy-preserving ciphertext retrieval mechanism in an outsourced cloud environment remains a challenging research topic.
Most existing technical schemes are developed only for public cloud environments and are not suitable for hybrid cloud environments. In addition, existing schemes pay little attention to parallelizable ciphertext search and offer few models for multiple data owners.
The existing ciphertext retrieval scheme has at least the following problems:
(1) Privacy and security cannot be guaranteed in application scenarios with multiple data owners.
(2) Retrieval efficiency in hybrid cloud environments still needs to be improved.
(3) When the data size is large, problems such as server memory overflow, calculation overflow and the like may occur.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention provide an information retrieval method, apparatus, and device that overcome or at least partially solve the foregoing problems.
According to an aspect of an embodiment of the present invention, there is provided an information retrieval method, the method including:
receiving a document filtering vector set sent by a data owner and a retrieval filtering vector sent by a data user, wherein the document filtering vector set comprises at least one document filtering vector, and the document filtering vector is obtained by performing dimension reduction processing according to a document vector of a plaintext document; the search filtering vector is obtained by performing dimension reduction processing according to the search vector of the search keyword input by the data user;
and obtaining a target search result according to the document filtering vector and the search filtering vector, and sending the target search result to the data user.
According to another aspect of an embodiment of the present invention, there is provided an information retrieval apparatus including:
the receiving module is used for receiving a document filtering vector set sent by a data owner and a retrieval filtering vector sent by a data user, wherein the document filtering vector set comprises at least one document filtering vector, and the document filtering vector is obtained by performing dimension reduction processing according to a document vector of a plaintext document; the search filtering vector is obtained by performing dimension reduction processing according to the search vector of the search keyword input by the data user;
and the processing module is used for obtaining a target search result according to the document filtering vector and the search filtering vector and sending the target search result to the data user.
According to yet another aspect of an embodiment of the present invention, there is provided a computing device including: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the information retrieval method.
According to still another aspect of the embodiments of the present invention, there is provided a computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the above-described information retrieval method.
According to the scheme provided by the embodiment of the invention, the document filtering vector set and the searching filtering vector are matched to obtain the target searching result, so that the searching efficiency of a data user is higher while the confidentiality of the data of multiple data owners is ensured, the searching complexity is reduced, and the searching efficiency and the searching accuracy are improved.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the specification, specific implementations of the embodiments of the present invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 shows a flow chart of an information retrieval method provided by an embodiment of the invention;
FIG. 2 is a system model diagram of an information retrieval method according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an information retrieval device according to an embodiment of the present invention;
FIG. 4 illustrates a schematic diagram of a computing device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 shows a flowchart of an information retrieval method provided by an embodiment of the present invention. As shown in fig. 1, the method comprises the steps of:
step 11, receiving a document filtering vector set sent by a data owner and a retrieval filtering vector sent by a data user, wherein the document filtering vector set comprises at least one document filtering vector, and the document filtering vector is obtained by performing dimension reduction processing according to a document vector of a plaintext document; the search filtering vector is obtained by performing dimension reduction processing according to the search vector of the search keyword input by the data user;
and step 12, obtaining a target search result according to the document filtering vector and the search filtering vector, and sending the target search result to the data user.
In this example, the data owner can use a vector space model to vectorize a set of documents, mapping each plaintext document to a document vector to obtain the document vector set of the plaintext documents. Each document vector in the set is then reduced in dimension according to the keyword relevance in a keyword dictionary, yielding the document filtering vector set, which is sent to the private cloud server. Reducing the dimension of the document vectors lowers their complexity and improves the accuracy of subsequent processing. At the same time, the document set and the document vector set are encrypted to obtain an encrypted document set and an encrypted document vector set, which are sent to the public cloud server; this guarantees the confidentiality of the document data while keeping the subsequent steps accurate;
the data user can likewise use a vector space model to vectorize the set of search keywords, mapping the search keywords to a search vector. The search vector is then reduced in dimension according to the keyword relevance in the keyword dictionary to obtain the search filtering vector, which is sent to the private cloud server and processed together with the document filtering vector set to obtain the target search result.
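The vectorization and dimension reduction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment: the dictionary `DICTIONARY`, the keyword grouping `GROUPS`, and the rule "one bit per group of correlated keywords" are all hypothetical, since the description does not specify how keyword relevance is measured or how the reduced vector is built.

```python
from collections import Counter

# Hypothetical keyword dictionary; position j of every vector corresponds to DICTIONARY[j].
DICTIONARY = ["cloud", "server", "cipher", "key", "search", "index"]

# Hypothetical groups of correlated keywords (assumed; the source does not
# say how keyword relevance is measured or how groups are formed).
GROUPS = [("cloud", "server"), ("cipher", "key"), ("search", "index")]


def document_vector(text):
    """Map plaintext to a term-count vector over DICTIONARY."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in DICTIONARY]


def filter_vector(vector):
    """Reduce a vector to one bit per keyword group: 1 if any keyword of the group occurs."""
    position = {word: j for j, word in enumerate(DICTIONARY)}
    return [int(any(vector[position[word]] > 0 for word in group)) for group in GROUPS]


doc_vec = document_vector("the cloud server stores the cipher text on the server")
doc_filter = filter_vector(doc_vec)
# The search keywords are vectorized and reduced with the same dictionary and groups.
query_filter = filter_vector(document_vector("search key"))
```

Using the same dictionary and grouping on both sides is what makes the two filtering vectors comparable position by position on the private cloud server.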
In an alternative embodiment of the present invention, step 12 may include:
step 121, obtaining a search candidate document set through a private cloud server according to the document filtering vector and the search filtering vector;
step 122, the search candidate document set is sent to a public cloud server, and the public cloud server obtains a search result according to the search candidate document set and the received search trapdoor and sends the search result to the data user; the search trapdoor is obtained by encrypting according to the search vector of the search keyword input by the data user.
In this example, the private cloud server processes the document filtering vector set and the search filtering vector, and narrows the search range according to the degree of matching between them to obtain a search candidate document set. This further ensures the accuracy of data processing and improves the efficiency and accuracy of the subsequent search. The candidate document set is sent to the public cloud server, where it provides the basis for, and optimizes, the subsequent search step. The data user encrypts the search vector to obtain a search trapdoor and sends the search trapdoor to the public cloud server, which determines the search result by combining the search candidate document set with the search trapdoor, thereby improving search accuracy.
Preferably, the search trapdoor may be obtained by encrypting the search vector with a key shared by the data owner and the data user.
In an alternative embodiment of the present invention, step 121 may include:
step 1211, performing an inner product operation between each document filtering vector in the document filtering vector set and the search filtering vector through a private cloud server to obtain an operation result;
step 1212, determining a search candidate document set according to the operation result.
In this example, an inner product operation is performed between each document filtering vector and the search filtering vector, and the search range is narrowed according to the degree of matching between them; the documents selected by the inner product results form the search candidate document set. During the inner product operation, a balanced binary tree data structure replaces the existing index structure, which further improves pre-filtering efficiency and thus search efficiency. The resulting search candidate document set is sent to the public cloud server as the basis for the subsequent search.
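The pre-filtering on the private cloud server can be sketched as below, assuming binary filtering vectors so that the inner product reduces to counting positions where both vectors are 1. The function name `prefilter` and the threshold `min_overlap` are hypothetical, and the sketch scans the documents linearly instead of using the balanced binary tree mentioned above.

```python
def prefilter(doc_filters, query_filter, min_overlap=1):
    """Return IDs of documents sharing at least min_overlap set bits with the query filter."""
    candidates = []
    for doc_id, doc_filter in doc_filters.items():
        # Inner product of two 0/1 vectors = number of positions set in both.
        overlap = sum(d & q for d, q in zip(doc_filter, query_filter))
        if overlap >= min_overlap:
            candidates.append(doc_id)
    return candidates


# Hypothetical filtering vectors for three documents.
doc_filters = {"d1": [1, 1, 0], "d2": [0, 0, 1], "d3": [1, 0, 0]}
candidates = prefilter(doc_filters, [0, 1, 1])
```

Only the candidate IDs leave the private cloud, so the public cloud server scores a smaller set of encrypted vectors in the next step.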
In an alternative embodiment of the present invention, step 122 may include:
step 1221, determining, according to the search trapdoor, the K documents in the search candidate document set that have the highest relevance scores with respect to the search keywords, and taking them as the search result, wherein K is a positive integer.
In this example, the relevance is that between an encrypted document on the public cloud server and the search keywords. Because each document is mapped to a document vector and the search keywords are mapped to a search vector, the relevance score can be regarded as a vector dot product, and the inner product of the ciphertext vectors is the same as the inner product of the plaintext vectors.
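Why a ciphertext inner product can equal the plaintext inner product is easiest to see in a split-and-multiply construction of the kind commonly used for secure inner-product search. The derivation below assumes that construction; the description does not state the encryption formulas, so the symbols $V_d^{(1)}, V_d^{(2)}, V_Q^{(1)}, V_Q^{(2)}$ (the two shares of each vector) and the roles given to $M_1$ and $M_2$ are assumptions. The shares are assumed to be chosen, under control of the 0/1 vector $S$, so that $V_d^{(1)} \cdot V_Q^{(1)} + V_d^{(2)} \cdot V_Q^{(2)} = V_d \cdot V_Q$.

```latex
\begin{aligned}
E(V_d) &= \left(M_1^{\top} V_d^{(1)},\; M_2^{\top} V_d^{(2)}\right), \qquad
T = \left(M_1^{-1} V_Q^{(1)},\; M_2^{-1} V_Q^{(2)}\right) \\
E(V_d) \cdot T
  &= \left(M_1^{\top} V_d^{(1)}\right)^{\top} M_1^{-1} V_Q^{(1)}
   + \left(M_2^{\top} V_d^{(2)}\right)^{\top} M_2^{-1} V_Q^{(2)} \\
  &= \left(V_d^{(1)}\right)^{\top} M_1 M_1^{-1} V_Q^{(1)}
   + \left(V_d^{(2)}\right)^{\top} M_2 M_2^{-1} V_Q^{(2)} \\
  &= V_d^{(1)} \cdot V_Q^{(1)} + V_d^{(2)} \cdot V_Q^{(2)} = V_d \cdot V_Q
\end{aligned}
```

The matrices cancel in pairs, which is why they must be invertible, and the server never needs the plaintext vectors to rank documents.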
According to the received search trapdoor, the public cloud server determines, from the search candidate document set, the K encrypted documents with the highest relevance scores with respect to the search keywords, and returns these K documents to the data user as the search result.
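A minimal sketch of the Top-K selection on the public cloud server follows. The names `top_k`, `encrypted_vectors` and `trapdoor` are hypothetical, and each encrypted vector is treated as a flat list of numbers whose dot product with the trapdoor gives the relevance score.

```python
import heapq


def top_k(candidate_ids, encrypted_vectors, trapdoor, k):
    """Return the k candidate IDs whose encrypted vector has the largest dot product with the trapdoor."""
    def score(doc_id):
        return sum(a * b for a, b in zip(encrypted_vectors[doc_id], trapdoor))

    # nlargest avoids sorting the whole candidate set when k is small.
    return heapq.nlargest(k, candidate_ids, key=score)


# Hypothetical values standing in for encrypted vectors and a trapdoor.
encrypted_vectors = {"d1": [1, 2, 0], "d2": [0, 1, 3], "d3": [2, 0, 1]}
trapdoor = [1, 0, 2]
result = top_k(["d1", "d2", "d3"], encrypted_vectors, trapdoor, k=2)
```

Restricting `candidate_ids` to the set produced by the private cloud is what keeps the scoring cost proportional to the candidate set rather than to the whole collection.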
In an optional embodiment of the present invention, the K documents are encrypted documents in a plurality of encrypted documents sent by the data owner, where the encrypted documents are obtained by encrypting a plaintext document by the data owner according to a shared key.
In this example, the encrypted documents are obtained by the data owner encrypting its own plaintext document data with a key, which ensures the security of the data;
further, the data user decrypts the K documents using the key shared with the data owner, thereby obtaining the plaintext search result.
In an alternative embodiment of the present invention, the shared key is a key shared by the data owner and the data user;
in this embodiment, the shared key is generated by the data owner, and the shared key is shared only by the owner and the data user, so that confidentiality of the data document, the document vector and the retrieval trapdoor is guaranteed, and accuracy of subsequent steps is also guaranteed;
the shared key may be represented as $SK_2 = \{S, M_1, M_2, g\}$, where $SK_2$ is the shared key; $S$ is an $n$-dimensional random 0/1 vector; $M_1$ and $M_2$ are $n \times n$ random invertible matrices; and $g$ is the document encryption key.
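How a key of this shape might be generated and used on vectors can be sketched as follows. The split-and-multiply encryption shown here is an assumption (a secure inner-product style construction), since the description lists the key components but not the encryption formulas; all function names are hypothetical. Exact rational arithmetic is used so that the equality check is exact, and the document encryption key g is not exercised.

```python
import random
from fractions import Fraction


def invert(m):
    """Gauss-Jordan inversion over the rationals; returns None if m is singular."""
    n = len(m)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def random_invertible_matrix(n, rng):
    """Draw random integer matrices until one is invertible; return (M, M_inverse)."""
    while True:
        m = [[Fraction(rng.randint(-5, 5)) for _ in range(n)] for _ in range(n)]
        inverse = invert(m)
        if inverse is not None:
            return m, inverse


def split(vector, s, split_when, rng):
    """Random additive split where s[j] == split_when; plain copy into both shares otherwise."""
    a, b = [], []
    for value, bit in zip(vector, s):
        if bit == split_when:
            r = Fraction(rng.randint(-9, 9))
            a.append(r)
            b.append(value - r)
        else:
            a.append(Fraction(value))
            b.append(Fraction(value))
    return a, b


def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


rng = random.Random(0)  # fixed seed for a repeatable demo; not suitable for real keys
n = 4
S = [rng.randint(0, 1) for _ in range(n)]
M1, M1_inv = random_invertible_matrix(n, rng)
M2, M2_inv = random_invertible_matrix(n, rng)

doc_vector = [3, 0, 1, 2]
query_vector = [1, 2, 0, 1]

d1, d2 = split(doc_vector, S, 1, rng)    # document shares split where S[j] == 1
q1, q2 = split(query_vector, S, 0, rng)  # query shares split where S[j] == 0
encrypted_doc = mat_vec(transpose(M1), d1) + mat_vec(transpose(M2), d2)
trapdoor = mat_vec(M1_inv, q1) + mat_vec(M2_inv, q2)

# The ciphertext inner product matches the plaintext inner product.
assert dot(encrypted_doc, trapdoor) == dot(doc_vector, query_vector)
```

Splitting documents and queries on complementary positions of S is the design choice that makes the two share-wise products add up to the original dot product while hiding the individual entries.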
In an alternative embodiment of the present invention, the plurality of data owners have identities therebetween for indicating that data among the plurality of data owners are not shared with each other, and the identities are configured for the plurality of data owners by a trusted third party.
In this example, there may be multiple data owners, so that data processing remains accurate and timely when the data size is large. Each data owner carries a mark representing its own identity, and the data owners are independent of one another. The trusted third party can use a secret sharing mechanism to distribute a key to the multiple data owners as their identity marks or data marks. Data are not shared or communicated among the data owners: the distributed key ensures that the data stored by each data owner remain unshared, so that each party's data is secure and independent, while a data user can still obtain the required results by searching with search keywords. Preferably, the distributed key can be expressed as $SK_1 = \{G, k, G\}$, where $G$ and $k$ are the prime numbers that generate $G$.
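The description does not name the secret sharing mechanism used by the trusted third party. As one well-known possibility, the sketch below shows Shamir's threshold scheme, in which any `threshold` shares recover the secret; the modulus `PRIME`, the function names, and the parameters are assumptions made for illustration only.

```python
import random

# Field modulus: the Mersenne prime 2^127 - 1 (an assumed parameter).
PRIME = 2**127 - 1


def make_shares(secret, threshold, num_shares, rng):
    """Split secret into num_shares points on a random polynomial of degree threshold - 1."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def recover(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# random.Random is used for a repeatable demo only; a real deployment would
# need a cryptographically secure source such as the secrets module.
rng = random.Random(1)
shares = make_shares(123456789, threshold=3, num_shares=5, rng=rng)
```

Here `recover(shares[:3])` and `recover(shares[2:5])` both return the original secret, whereas fewer than three shares do not determine it.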
The above method will be described with reference to a specific system model diagram, as shown in fig. 2:
In a specific application scenario of the above method, the system model mainly includes a trusted third party (Trusted Third Party, TTP), data owners (Data Owner, DO), data users (Data User, DU), a private cloud (Private Clouds) and a public cloud (Public Clouds). The specific procedure is as follows:
Step 21, the TTP uses a secret sharing mechanism to distribute the key $SK_1 = \{G, k, G\}$ (where $G$ and $k$ are the prime numbers that generate $G$) to the DOs, so that data are not shared or communicated among the multiple DOs. The distributed key serves as the data identifier of the multiple DOs. The identity identifier indicating that data are not shared among the data owners is not limited to the key $SK_1$; other identifiers are also possible.
Step 22, the DO generates the key $SK_2 = \{S, M_1, M_2, g\}$, where $S$ is an $n$-dimensional random 0/1 vector, $M_1$ and $M_2$ are $n \times n$ random invertible matrices, and $g$ is the document encryption key. The key $SK_2$ is shared only by the DO and the DU.
Step 23, DO owns the data and processes the data to generate an encrypted document set, an encrypted document vector set, and a document filter vector set, wherein:
the plaintext document set may be represented as $D = \{d_1, d_2, \dots, d_m\}$;
The document vector set $V_D = \{V_{d_1}, V_{d_2}, \dots, V_{d_m}\}$ is the vector set generated from the plaintext document set. The document vector $V_{d_i}$ is the vector corresponding to $d_i$; each bit of the vector stores the frequency with which the keyword corresponding to that bit occurs in the document, i.e.
$$V_{d_i}[j] = TF_{d_i, w_j}$$
where TF is the frequency with which a keyword appears in the document and $w_j$ is the $j$-th bit;
the document filtering vector set is obtained by performing dimension reduction on each document vector in the plaintext document vector set according to the keyword relevance, and can be expressed as $VF_D = \{VF_{d_1}, VF_{d_2}, \dots, VF_{d_m}\}$. The plaintext documents and the document vector set are encrypted with $SK_2$ to obtain the encrypted documents and the encrypted document vector set.
Step 24, the DU forms a search keyword set Q from the search keywords it is interested in and initiates a Top-K search. The Top-K search is a ciphertext search performed on the public cloud server, in which the DU wants to retrieve K encrypted documents from the public cloud server. The DU generates a search trapdoor and a search filtering vector, sends the search trapdoor to the public cloud server, and sends the search filtering vector to the private cloud server;
wherein the DU encrypts the search vector $V_Q$ with $SK_2$ to generate the search trapdoor $T$, and reduces the dimension of $V_Q$ according to the correlation between keywords to generate the search filtering vector $VF_Q$. $T$ and $VF_Q$ are sent as search instructions to the public cloud server and the private cloud server respectively. The search vector can be expressed as:
$$V_Q[j] = IDF_{w_j}$$
where IDF is the inverse document frequency of the keyword and $w_j$ is the $j$-th bit. In this scheme, the encrypted document set, the encrypted document vector set and the search trapdoor $T$ are all sent to the public cloud server; the document filtering vector set $VF_D$ and the search filtering vector $VF_Q$ are sent to the private cloud server.
Step 25, the private cloud server determines a search result candidate document ID set according to the inner product results between the search filtering vector and the document filtering vectors, and submits the candidate document ID set to the public cloud server. The inner product operation here is a bitwise AND between the search filtering vector and each filtering vector in the document filtering vector set.
Step 26, the public cloud server uses the received search trapdoor and the search result candidate document ID set to determine the K encrypted documents $d_i$ with the highest relevance scores with respect to the search keywords Q, namely the target search result, and returns them to the DU. Finally, the DU decrypts the ciphertext data using the key shared with the DO to obtain the plaintext search result. Here the relevance is that between a document and the search keywords: because each document is mapped to a document vector and the search keywords are mapped to a search vector, the relevance score can be regarded as a vector dot product, and the inner product of the ciphertext vectors is the same as the inner product of the plaintext vectors.
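The relevance score used in steps 23 to 26 can be illustrated on plaintext vectors, before any encryption is applied. The sketch below assumes particular formulas that the description does not fix: TF as the keyword count divided by document length, IDF as ln(m / df), and zero entries in the search vector for keywords outside the query. The function names and sample data are hypothetical.

```python
import math
from collections import Counter


def tf_vector(tokens, dictionary):
    """Document vector: term frequency of each dictionary keyword (assumed: count / length)."""
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in dictionary]


def idf_query_vector(query_keywords, documents, dictionary):
    """Search vector: IDF for queried keywords, 0 elsewhere (assumed: ln(m / df))."""
    m = len(documents)
    vector = []
    for word in dictionary:
        df = sum(1 for doc in documents if word in doc)
        vector.append(math.log(m / df) if word in query_keywords and df else 0.0)
    return vector


dictionary = ["cloud", "key", "search", "index"]
documents = [["cloud", "key", "key"], ["cloud", "search"], ["index", "cloud"]]

query = idf_query_vector({"key", "search"}, documents, dictionary)
# Relevance score of each document = dot product of its TF vector with the search vector.
scores = [sum(t * q for t, q in zip(tf_vector(doc, dictionary), query)) for doc in documents]
ranking = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
```

In the scheme itself the same dot products would be evaluated on the encrypted document vectors and the search trapdoor, and only for the candidate documents selected by the private cloud server.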
According to the embodiment of the invention, the trusted third party issues keys to the multiple data owners (DOs) using the secret sharing mechanism, so that the data of the DOs are invisible to one another and are neither shared nor communicated; this protects the confidentiality of the data of the multiple data owners while ensuring the accuracy of the operation results. The inner product operation between the search filtering vector and the document filtering vectors in the private cloud server further improves data filtering efficiency and thus the efficiency of the subsequent search. By sharing the key $SK_2$ between the DO and the DU, the confidentiality of the data, the vectors and the trapdoor is guaranteed.
Fig. 3 is a schematic structural diagram of an information retrieval apparatus 30 according to an embodiment of the present invention. As shown in fig. 3, the apparatus 30 includes:
the receiving module 31 is configured to receive a document filtering vector set sent by a data owner and a search filtering vector sent by a data user, where the document filtering vector set includes at least one document filtering vector, and the document filtering vector is obtained by performing dimension reduction processing according to a document vector of a plaintext document; the search filtering vector is obtained by performing dimension reduction processing according to the search vector of the search keyword input by the data user;
and the processing module 32 is used for obtaining a target search result according to the document filtering vector and the search filtering vector and sending the target search result to the data user.
Optionally, the processing module 32 is configured to obtain a target search result according to the document filtering vector and the search filtering vector, and send the target search result to the data user, where the processing module includes:
obtaining a search candidate document set through a private cloud server according to the document filtering vector and the search filtering vector;
the search candidate document set is sent to a public cloud server, and the public cloud server obtains a search result according to the search candidate document set and the received search trapdoor and sends the search result to the data user; the search trapdoor is obtained by encrypting according to the search vector of the search keyword input by the data user.
Optionally, the processing module 32 obtains, by a private cloud server, a search candidate document set according to the document filtering vector and the search filtering vector, including:
performing an inner product operation between each document filtering vector in the document filtering vector set and the search filtering vector through a private cloud server to obtain an operation result;
and determining a search candidate document set according to the operation result.
Optionally, the processing module 32 obtains a search result according to the search candidate document set and the received search trapdoor, including:
and determining, according to the search trapdoor, the K documents in the search candidate document set that have the highest relevance scores with respect to the search keywords, and taking them as the search result, wherein K is a positive integer.
Optionally, the K documents are encrypted documents in a plurality of encrypted documents sent by the data owner and received by the public cloud server, and the encrypted documents are obtained by encrypting a plaintext document by the data owner according to a shared key.
Optionally, the shared key is a key shared by the data owner and the data user, $SK_2 = \{S, M_1, M_2, g\}$, where $S$ is an $n$-dimensional random 0/1 vector, $M_1$ and $M_2$ are $n \times n$ random invertible matrices, and $g$ is the document encryption key.
Optionally, there are multiple data owners, and the data owners have identities indicating that data are not shared among them, the identities being configured for the multiple data owners by a trusted third party.
It should be noted that, the apparatus 30 is an apparatus corresponding to the above method, and all implementation manners in the above method embodiments are applicable to the embodiment of the apparatus, so that the same technical effects can be achieved.
Embodiments of the present invention provide a non-volatile computer storage medium storing at least one executable instruction that may perform the information retrieval method of any of the method embodiments described above.
FIG. 4 illustrates a schematic diagram of a computing device according to an embodiment of the present invention, and the embodiment of the present invention is not limited to a specific implementation of the computing device.
As shown in fig. 4, the computing device may include: a processor (processor), a communication interface (Communications Interface), a memory (memory), and a communication bus.
The processor, the communication interface, and the memory communicate with each other via the communication bus. The communication interface is used for communicating with network elements of other devices, such as clients or other servers. The processor is used for executing a program, and in particular may perform the relevant steps in the above-described information retrieval method embodiments for a computing device.
In particular, the program may include program code including computer-operating instructions.
The processor may be a central processing unit, CPU, or specific integrated circuit ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included by the computing device may be the same type of processor, such as one or more CPUs; but may also be different types of processors such as one or more CPUs and one or more ASICs.
And the memory is used for storing programs. The memory may comprise high-speed RAM memory or may further comprise non-volatile memory, such as at least one disk memory.
The program may in particular be operative to cause the processor to perform the information retrieval method of any of the method embodiments described above. For the specific implementation of each step in the program, refer to the corresponding steps and the corresponding descriptions of the units in the above information retrieval method embodiments, which are not repeated herein. It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the apparatus and modules described above may refer to the corresponding procedure descriptions in the foregoing method embodiments, which are not repeated herein.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It will be appreciated that the teachings of embodiments of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the embodiments of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting an intention that the claimed embodiments of the invention require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functionality of some or all of the components according to embodiments of the present invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). Embodiments of the present invention may also be implemented as a device or apparatus program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the embodiments of the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. Embodiments of the invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.

Claims (10)

1. An information retrieval method, the method comprising:
receiving a document filtering vector set sent by a data owner and a retrieval filtering vector sent by a data user, wherein the document filtering vector set comprises at least one document filtering vector, and the document filtering vector is obtained by performing dimension reduction processing according to a document vector of a plaintext document; the search filtering vector is obtained by performing dimension reduction processing according to the search vector of the search keyword input by the data user;
and obtaining a target search result according to the document filtering vector and the search filtering vector, and sending the target search result to the data user.
2. The information retrieval method according to claim 1, wherein obtaining a target retrieval result from the document filtering vector and the retrieval filtering vector, and transmitting to the data user, comprises:
obtaining a search candidate document set through a private cloud server according to the document filtering vector and the search filtering vector;
sending the search candidate document set to a public cloud server, wherein the public cloud server obtains a search result according to the search candidate document set and a received search trapdoor and sends the search result to the data user; the search trapdoor is obtained by encrypting the search vector of the search keyword input by the data user.
3. The information retrieval method according to claim 2, wherein obtaining a search candidate document set through a private cloud server according to the document filtering vector and the search filtering vector comprises:
performing, through the private cloud server, an inner product operation on each document filtering vector in the document filtering vector set and the search filtering vector to obtain an operation result;
and determining a search candidate document set according to the operation result.
4. The information retrieval method according to claim 2, wherein obtaining a retrieval result from the retrieval candidate document set and the received retrieval trapdoor, comprises:
determining, according to the search trapdoor, the K documents in the search candidate document set that have the highest relevance scores with respect to the search keyword as the search result, wherein K is a positive integer.
5. The information retrieval method according to claim 4, wherein the K documents are encrypted documents among a plurality of encrypted documents transmitted by the data owner, which are encrypted by the data owner based on a shared key.
6. The information retrieval method according to claim 5, wherein the shared key is a key shared by the data owner and the data user, the shared key SK_2 = {S, M_1, M_2, g}, where S is an n-dimensional random 0/1 vector, M_1 and M_2 are n × n random invertible matrices, and g is a document encryption key.
7. The information retrieval method according to any one of claims 1 to 6, wherein there are a plurality of data owners, and an identity indicating that data is not shared among the plurality of data owners is set among them, the identity being configured for the plurality of data owners by a trusted third party.
8. An information retrieval apparatus, the apparatus comprising:
the receiving module is used for receiving a document filtering vector set sent by a data owner and a retrieval filtering vector sent by a data user, wherein the document filtering vector set comprises at least one document filtering vector, and the document filtering vector is obtained by performing dimension reduction processing according to a document vector of a plaintext document; the search filtering vector is obtained by performing dimension reduction processing according to the search vector of the search keyword input by the data user;
and the processing module is used for obtaining a target search result according to the document filtering vector and the search filtering vector and sending the target search result to the data user.
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction, where the executable instruction causes the processor to perform operations corresponding to the information retrieval method according to any one of claims 1 to 7.
10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the information retrieval method of any one of claims 1-7.
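The two-stage flow recited in claims 2 to 4 (inner-product filtering on the private cloud, then selection of the K highest-scoring documents on the public cloud) can be sketched roughly as follows. This is a plaintext illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, the rule "keep a document when the inner product exceeds a threshold" is an assumed way of determining the candidate set from the operation result, and the relevance scores are supplied by a stand-in callable rather than computed from an encrypted index and a search trapdoor.

```python
import heapq


def inner_product(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def filter_candidates(doc_filter_vectors, query_filter_vector, threshold=0):
    """Private-cloud step (assumed rule): keep the ids of documents whose
    filtering vector has an inner product with the query filtering vector
    that is strictly greater than the threshold."""
    return [
        doc_id
        for doc_id, vec in doc_filter_vectors.items()
        if inner_product(vec, query_filter_vector) > threshold
    ]


def top_k(candidates, score, k):
    """Public-cloud step: return the k candidates with the highest score.
    `score` is a stand-in for trapdoor-based relevance scoring."""
    return heapq.nlargest(k, candidates, key=score)


# Hypothetical data for illustration only.
doc_filter_vectors = {"d1": [1, 0, 1], "d2": [0, 1, 0], "d3": [1, 1, 0]}
query_filter = [1, 0, 0]
scores = {"d1": 0.4, "d2": 0.9, "d3": 0.7}

candidates = filter_candidates(doc_filter_vectors, query_filter)
result = top_k(candidates, scores.get, 1)
```

In this toy run, `d2` has the highest stand-in score but is dropped in the filtering step because its inner product with the query filtering vector is zero, so only `d1` and `d3` reach the scoring step. This shows how the filtering stage narrows the set that the second stage has to rank.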
CN202111302360.3A 2021-11-04 2021-11-04 Information retrieval method, device and equipment Pending CN116069957A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111302360.3A CN116069957A (en) 2021-11-04 2021-11-04 Information retrieval method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111302360.3A CN116069957A (en) 2021-11-04 2021-11-04 Information retrieval method, device and equipment

Publications (1)

Publication Number Publication Date
CN116069957A true CN116069957A (en) 2023-05-05

Family

ID=86182586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111302360.3A Pending CN116069957A (en) 2021-11-04 2021-11-04 Information retrieval method, device and equipment

Country Status (1)

Country Link
CN (1) CN116069957A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521743A (en) * 2023-06-27 2023-08-01 北京中科江南信息技术股份有限公司 Ciphertext retrieval method and device, storage medium and electronic equipment
CN117235803A (en) * 2023-11-15 2023-12-15 联通(广东)产业互联网有限公司 Data security authentication method and device based on data elements and electronic equipment
CN117235803B (en) * 2023-11-15 2024-02-27 联通(广东)产业互联网有限公司 Data security authentication method and device based on data elements and electronic equipment

Similar Documents

Publication Publication Date Title
US11973889B2 (en) Searchable encrypted data sharing method and system based on blockchain and homomorphic encryption
CN108062485A (en) A kind of fuzzy keyword searching method of multi-service oriented device multi-user
CN108363689B (en) Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud
CN114860735A (en) Method and device for inquiring hiding trace
CN116069957A (en) Information retrieval method, device and equipment
Qayyum Data security in mobile cloud computing: A state of the art review
Fan et al. Secure ultra-lightweight RFID mutual authentication protocol based on transparent computing for IoV
Wang et al. PeGraph: A system for privacy-preserving and efficient search over encrypted social graphs
Gahi et al. Privacy preserving scheme for location-based services
CN109783456B (en) Duplication removing structure building method, duplication removing method, file retrieving method and duplication removing system
US11133926B2 (en) Attribute-based key management system
Sultan et al. A novel image-based homomorphic approach for preserving the privacy of autonomous vehicles connected to the cloud
CN117951730A (en) Cloud security searchable encryption method based on hash index
Sun et al. Public data integrity auditing without homomorphic authenticators from indistinguishability obfuscation
Yan et al. Privacy-preserving content-based image retrieval in edge environment
Yan et al. Secure and efficient big data deduplication in fog computing
WO2023196016A1 (en) Secure computation using multi-party computation and a trusted execution environment
EP3827572B1 (en) Systems and methods for protecting data
Ghunaim et al. Secure kNN query of outsourced spatial data using two-cloud architecture
Xue et al. Privacy-Preserving Location Sharing via LWE-based Private Information Retrieval
Thiyagarajan et al. Ensuring Security for Data Storage in Cloud Computing using HECC-ElGamal Cryptosystem and GSO Optimization
Bülbül et al. Privacy preserving data retrieval on data clouds with fully homomorphic encryption
JP7440662B2 (en) Multi-key information search
Amaithi Rajan et al. EdgeShield: Attack resistant secure and privacy-aware remote sensing image retrieval system for military and geological applications using edge computing
Wang et al. A Secure Searchable Image Retrieval Scheme with Correct Retrieval Identity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination