CN114519202A - Cross-modal privacy semantic retrieval method, system and storage medium - Google Patents

Cross-modal privacy semantic retrieval method, system and storage medium

Info

Publication number
CN114519202A
Authority
CN
China
Prior art keywords
semantic
retrieval
data
modal
dense
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210089487.XA
Other languages
Chinese (zh)
Inventor
束建钢
张伟哲
程正涛
杨帆
邹庆胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peng Cheng Laboratory
Original Assignee
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peng Cheng Laboratory
Priority to CN202210089487.XA
Publication of CN114519202A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cross-modal privacy semantic retrieval method, system and storage medium, and relates to the technical field of data processing. In the method, the client extracts semantic features from multi-modal data using a multi-modal joint representation model to obtain a semantic representation vector, and encrypts that vector to obtain an encrypted semantic representation vector. The server receives the encrypted semantic representation vector sent by the client, determines the semantic retrieval keywords associated with it according to a preset retrieval index table, looks up the data addresses corresponding to those keywords in the preset retrieval index table to obtain an encrypted semantic retrieval result, and sends the result to the client. The client decrypts and displays the encrypted semantic retrieval result. The invention addresses the low retrieval accuracy of the prior art and improves the accuracy of semantic retrieval results while preserving the privacy of both queries and stored data.

Description

Cross-modal privacy semantic retrieval method, system and storage medium
Technical Field
The invention relates to the technical field of data processing, in particular to a cross-modal privacy semantic retrieval method, a cross-modal privacy semantic retrieval system and a storage medium.
Background
With the development of internet technology and the spread of big-data cloud services, query retrieval has become an indispensable operation for users who access cloud data in data-sharing services. During a query, however, both the user's query request and the data stored in the cloud are in plaintext, and the user loses control over the data, so the conflict between big-data sharing and privacy protection is pronounced in data search scenarios.
In addition, as internet technology advances and people's ways of working and living change, data modalities have diversified, and the demand for retrieval across multi-modal data has become increasingly prominent. Cross-modal retrieval finds the semantic relationships between samples of different modalities and uses data of one modality to search for semantically similar data of other modalities.
Existing schemes can perform cross-modal retrieval on plaintext data, or retrieval within the corresponding modality on private data, but few schemes perform cross-modal retrieval on private data. In existing methods for cross-modal semantic retrieval over private data, the extracted keywords are not semantically associated with the searched content, so search results may be biased and the accuracy of cross-modal private semantic retrieval is low.
Disclosure of Invention
The main purpose of the invention is to provide a cross-modal privacy semantic retrieval method, system and storage medium that solve the technical problem of low accuracy in prior-art methods for retrieving multi-modal private data.
To achieve this purpose, the invention adopts the following technical solutions:
In a first aspect, the present invention provides a cross-modal privacy semantic retrieval method, applied to a server, where the method includes:
receiving an encrypted semantic representation vector sent by a client; the encrypted semantic representation vector is obtained by the client by extracting semantic features from multi-modal data using a multi-modal joint representation model to obtain a semantic representation vector, and then encrypting the semantic representation vector;
determining, according to a preset retrieval index table, the semantic retrieval keywords associated with the encrypted semantic representation vector; the preset retrieval index table comprises a mapping between semantic retrieval keywords and the data addresses of stored data;
and looking up, in the preset retrieval index table, the data addresses corresponding to the semantic retrieval keywords to obtain an encrypted semantic retrieval result, and sending the encrypted semantic retrieval result to the client so that the client decrypts and displays it.
Optionally, in the above cross-modal privacy semantic retrieval method, the step of determining, according to a preset retrieval index table, the semantic retrieval keywords associated with the encrypted semantic representation vector includes:
determining, according to the preset retrieval index table, the relevance values between the encrypted semantic representation vector and all retrieval keywords in the preset retrieval index table;
determining the semantic retrieval keywords associated with the encrypted semantic representation vector according to the relevance values; the semantic retrieval keywords comprise retrieval keywords sorted by relevance value.
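The ranking step above can be illustrated with a minimal sketch. It assumes cosine similarity as the relevance value and uses plaintext vectors as stand-ins; the source does not name the relevance measure, and in the described scheme the computation would run over encrypted vectors. The names `cosine_similarity`, `rank_keywords`, `lookup_addresses` and the layout of `index_table` are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_keywords(query_vector, index_table, top_k=3):
    """Score every keyword in the table against the query and keep the top_k."""
    scored = [
        (keyword, cosine_similarity(query_vector, entry["vector"]))
        for keyword, entry in index_table.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

def lookup_addresses(ranked, index_table):
    """Collect data addresses for the ranked keywords, preserving rank order."""
    addresses = []
    for keyword, _ in ranked:
        for address in index_table[keyword]["addresses"]:
            if address not in addresses:
                addresses.append(address)
    return addresses

# Hypothetical index table: keyword id -> representation vector and data addresses.
index_table = {
    "kw_cat": {"vector": [1.0, 0.0], "addresses": ["addr/0"]},
    "kw_dog": {"vector": [0.0, 1.0], "addresses": ["addr/1"]},
    "kw_kitten": {"vector": [0.9, 0.1], "addresses": ["addr/2", "addr/0"]},
}
ranked = rank_keywords([1.0, 0.0], index_table, top_k=2)
```

Sorting by relevance rather than exact keyword match is what lets a semantically close keyword (here the hypothetical `kw_kitten`) contribute results for a query that matches `kw_cat`.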
Optionally, in the above cross-modal privacy semantic retrieval method, before the step of receiving an encrypted semantic representation vector sent by a client, the method further includes:
receiving encrypted data to be stored and the corresponding encrypted keywords sent by the client; the encrypted data to be stored is obtained by the client encrypting the data to be stored, and the encrypted keywords are obtained by the client extracting semantic features from the data to be stored using the multi-modal joint representation model to obtain keywords, and then encrypting those keywords;
storing the encrypted data to be stored to obtain the stored data and its data addresses;
and constructing an index table from the encrypted keywords and the data addresses of the stored data to obtain the preset retrieval index table.
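The storage and indexing steps above can be sketched as follows. This is a minimal illustration, assuming an in-memory store and opaque strings as encrypted keywords; `ToyStore`, `build_index_table` and the `addr/N` address format are hypothetical and not taken from the source.

```python
from collections import defaultdict

class ToyStore:
    """In-memory stand-in for the server's storage of encrypted data."""

    def __init__(self):
        self._blobs = []

    def put(self, encrypted_blob: bytes) -> str:
        """Store one encrypted blob and return its data address."""
        self._blobs.append(encrypted_blob)
        return f"addr/{len(self._blobs) - 1}"

    def get(self, address: str) -> bytes:
        """Fetch the encrypted blob stored at the given address."""
        return self._blobs[int(address.split("/")[1])]

def build_index_table(store, uploads):
    """Build a mapping from encrypted keyword to the addresses of stored data.

    uploads: iterable of (encrypted_blob, [encrypted_keyword, ...]) pairs,
    as the client would send them.
    """
    table = defaultdict(list)
    for encrypted_blob, encrypted_keywords in uploads:
        address = store.put(encrypted_blob)
        for keyword in encrypted_keywords:
            table[keyword].append(address)
    return dict(table)

store = ToyStore()
table = build_index_table(store, [(b"c0", ["k1", "k2"]), (b"c1", ["k1"])])
```

The server only ever handles ciphertext blobs and encrypted keywords here; decryption stays on the client.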
Optionally, in the above cross-modal privacy semantic retrieval method, the step of constructing an index table from the encrypted keywords and the data addresses of the stored data to obtain the preset retrieval index table includes:
constructing an index table from the encrypted keywords and the data addresses of the stored data to obtain a retrieval index table;
determining the similarity values between the encrypted keywords according to the retrieval index table;
performing clustering according to the similarity values to obtain semantic retrieval keywords; each semantic retrieval keyword comprises a cluster of encrypted keywords whose similarity values are smaller than a preset threshold;
and obtaining the preset retrieval index table from the mapping between the semantic retrieval keywords and the data addresses of the corresponding stored data.
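A minimal sketch of the clustering step follows. The source does not specify the clustering algorithm or how similarity is computed over encrypted keywords, so this sketch makes two labeled assumptions: the "similarity value" is treated as a Euclidean distance on plaintext stand-in vectors (smaller means closer, consistent with "smaller than a preset threshold"), and a simple greedy leader clustering is used. All function names are hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance, used here as the assumed 'similarity value'."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cluster_keywords(keyword_vectors, threshold):
    """Greedy leader clustering: a keyword joins the first cluster whose
    leader is closer than the threshold, otherwise it starts a new cluster."""
    clusters = []  # each cluster: {"leader": vector, "members": [keyword, ...]}
    for keyword, vector in keyword_vectors.items():
        for cluster in clusters:
            if distance(vector, cluster["leader"]) < threshold:
                cluster["members"].append(keyword)
                break
        else:
            clusters.append({"leader": vector, "members": [keyword]})
    return [cluster["members"] for cluster in clusters]

def build_semantic_index(clusters, index_table):
    """Map each keyword cluster to the union of its members' data addresses."""
    semantic_index = {}
    for members in clusters:
        addresses = []
        for keyword in members:
            for address in index_table.get(keyword, []):
                if address not in addresses:
                    addresses.append(address)
        semantic_index[tuple(members)] = addresses
    return semantic_index

keyword_vectors = {"cat": (0.0, 0.0), "kitten": (0.1, 0.0), "car": (5.0, 5.0)}
index_table = {"cat": ["a0"], "kitten": ["a1", "a0"], "car": ["a2"]}
clusters = cluster_keywords(keyword_vectors, threshold=1.0)
semantic_index = build_semantic_index(clusters, index_table)
```

Grouping related keywords into one index entry is what allows a query to reach data indexed under semantically related keywords, not only under an exact match.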
In a second aspect, the present invention provides a cross-modal privacy semantic retrieval method, applied to a client, where the method includes:
obtaining multi-modal data;
extracting semantic features of the multi-modal data through a multi-modal joint representation model to obtain a semantic representation vector;
encrypting the semantic representation vector to obtain an encrypted semantic representation vector;
sending the encrypted semantic representation vector to a server so that the server determines the semantic retrieval keywords associated with the encrypted semantic representation vector according to a preset retrieval index table, looks up the data addresses corresponding to the semantic retrieval keywords in the preset retrieval index table to obtain an encrypted semantic retrieval result, and sends the encrypted semantic retrieval result to the client; the preset retrieval index table comprises a mapping between semantic retrieval keywords and the data addresses of stored data;
and receiving the encrypted semantic retrieval result sent by the server, and decrypting and displaying it.
Optionally, in the above cross-modal privacy semantic retrieval method, the step of extracting semantic features from the multi-modal data by using a multi-modal joint representation model to obtain a semantic representation vector includes:
preprocessing the multi-modal data to obtain preprocessed multi-modal data; the preprocessing comprises any one of format conversion, text encoding, size scaling and noise removal;
inputting the preprocessed multi-modal data into the multi-modal joint representation model and outputting the semantic representation vector; the multi-modal joint representation model comprises a representation learning model that performs modality conversion on the multi-modal data.
In a third aspect, the present invention provides a server comprising a processor and a memory, where the memory stores a cross-modal privacy semantic retrieval program which, when executed by the processor, implements the cross-modal privacy semantic retrieval method described above.
In a fourth aspect, the present invention provides a client comprising a processor and a memory, where the memory stores a cross-modal privacy semantic retrieval program which, when executed by the processor, implements the cross-modal privacy semantic retrieval method described above.
In a fifth aspect, the present invention provides a cross-modal privacy semantic retrieval system, including:
the server as described above;
the client as described above;
where the server is in communication connection with the client.
In a sixth aspect, the present invention provides a computer-readable storage medium storing a computer program executable by one or more processors to implement the cross-modal privacy semantic retrieval method described above.
One or more technical solutions provided by the present invention may have the following advantages or at least achieve the following technical effects:
In the cross-modal privacy semantic retrieval method, system and storage medium, the client extracts semantic features from multi-modal data using a multi-modal joint representation model to obtain a semantic representation vector, encrypts it to obtain an encrypted semantic representation vector, and sends the encrypted vector to the server. The server determines the semantic retrieval keywords associated with the encrypted semantic representation vector according to a preset retrieval index table, looks up the data addresses corresponding to those keywords in the table to obtain an encrypted semantic retrieval result, and returns the result to the client, which decrypts and displays it, thereby achieving private semantic retrieval over multi-modal data. Because the semantic representation does not depend on a keyword library, the semantic correlation between the retrieval keywords and the retrieval results can be ensured, which improves the accuracy of semantic retrieval results while preserving query privacy and stored-data privacy.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating a cross-modal privacy semantic retrieval method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a hardware architecture of a server according to the present invention;
FIG. 3 is a flowchart illustrating a cross-modal privacy semantic retrieval method according to a second embodiment of the present invention;
FIG. 4 is a diagram illustrating a hardware structure of a client according to the present invention;
FIG. 5 is a functional block diagram of the cross-modal privacy semantic retrieval system according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that, in the present invention, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or system that comprises the element. In addition, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the description of the present invention and have no specific meaning in themselves. Thus, "module", "component" and "unit" may be used interchangeably.
The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation. In addition, the technical solutions of the respective embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when technical solutions contradict each other or cannot be realized, the combination should be considered not to exist and falls outside the protection scope of the present invention.
Analysis of the prior art shows that, with the development of internet technology and the spread of big-data cloud services, information resources and services face risks to data security and of privacy disclosure during data sharing, which creates a conflict between data fusion on the one hand and security and privacy on the other.
Query retrieval is an indispensable operation for a user who accesses cloud data to obtain information. Because the big data resides in the cloud, the query request must be uploaded to the cloud, where the cloud service provider performs the retrieval. In today's untrusted internet environment, the user's query request, the cloud data and the returned result therefore constantly face threats such as tampering, deletion and theft, which readily lead to privacy and security problems such as privacy disclosure and data tampering.
In addition, as internet technology advances and people's ways of working and living change, internet application scenarios have grown richer, applications reach into every corner of daily life, and data modalities have become more diverse. Because of this diversity, the demand for retrieval across multi-modal data has become increasingly prominent. Cross-modal retrieval finds the semantic relationships between samples of different modalities and uses a sample of one modality to search for semantically similar samples of other modalities. For example, a query in one modality (such as text) is used to retrieve semantically similar data in another modality (such as images). Data modalities are highly diverse: besides common text, images and voice, data such as video, physiological signals and point clouds can also form part of the content retrieved in cross-modal retrieval.
In the prior art, cross-modal retrieval can be performed on plaintext data, and retrieval within the corresponding modality can be performed on private data, but few schemes perform cross-modal retrieval on private data. Existing methods for cross-modal retrieval of private data have the following problems:
1. multi-modal data has a complex structure, which makes it difficult to extract keywords effectively;
2. keyword extraction from private data depends on a predefined keyword library, and the extracted keywords are semantically independent of one another and poorly related;
3. a keyword-based search matches only the keywords themselves and cannot find other words related to them, so the keywords are not semantically associated with the retrieved content, the results may be biased, and the accuracy of cross-modal retrieval is low.
In view of the technical problem that prior-art methods for retrieving multi-modal private data have low accuracy, the invention provides a cross-modal privacy semantic retrieval method whose general idea is as follows:
The client acquires multi-modal data; extracts semantic features from the multi-modal data using a multi-modal joint representation model to obtain a semantic representation vector; encrypts the semantic representation vector to obtain an encrypted semantic representation vector; sends the encrypted semantic representation vector to the server; and receives the encrypted semantic retrieval result sent by the server, then decrypts and displays it.
The server receives the encrypted semantic representation vector sent by the client; determines the semantic retrieval keywords associated with the encrypted semantic representation vector according to a preset retrieval index table, which comprises a mapping between semantic retrieval keywords and the data addresses of stored data; looks up the data addresses corresponding to the semantic retrieval keywords in the preset retrieval index table to obtain an encrypted semantic retrieval result; and sends the encrypted semantic retrieval result to the client.
In this technical solution, the client first extracts semantic features from multi-modal data using a multi-modal joint representation model to obtain a semantic representation vector, encrypts it to obtain an encrypted semantic representation vector, and sends the encrypted vector to the server. The server determines the semantic retrieval keywords associated with the encrypted semantic representation vector according to a preset retrieval index table, looks up the data addresses corresponding to those keywords in the table to obtain an encrypted semantic retrieval result, and returns the result to the client, which decrypts and displays it, thereby achieving private semantic retrieval over multi-modal data. Because the semantic representation does not depend on a keyword library, the semantic correlation between the retrieval keywords and the retrieval results can be ensured, which improves the accuracy of semantic retrieval results while preserving query privacy and stored-data privacy.
The cross-modal privacy semantic retrieval method, the cross-modal privacy semantic retrieval system and the storage medium provided by the invention are described in detail by specific embodiments and implementation manners with reference to the drawings.
Example one
Referring to the flowchart illustration of fig. 1, a first embodiment of the cross-modal privacy semantic retrieval method according to the present invention is provided, and the cross-modal privacy semantic retrieval method is applied to a server. The server may be a network device capable of implementing network connection, or may be a cloud service platform capable of implementing network connection.
Fig. 2 is a schematic diagram of a hardware structure of the server. The server may include: a processor 1001, such as a CPU (Central Processing Unit), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. Those skilled in the art will appreciate that the hardware configuration shown in fig. 2 is not intended to be limiting of the present invention, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
Specifically, the communication bus 1002 provides connection and communication among these components. The user interface 1003 connects to a user terminal and exchanges data with it; it may include an output unit such as a display screen and an input unit such as a keyboard, and optionally other input/output interfaces such as a standard wired interface and a wireless interface. The network interface 1004 connects to a backend server and exchanges data with it; it may include an input/output interface such as a standard wired interface or a wireless interface, for example a Wi-Fi interface. The memory 1005 stores various types of data, which may include, for example, the instructions of any application program or method in the server and application-related data; it may be a high-speed RAM or a non-volatile memory such as a disk memory, and optionally may be a storage device independent of the processor 1001.
With continued reference to fig. 2, the memory 1005 may include an operating system, a network communication module, a user interface module and a cross-modal privacy semantic retrieval program, where the network communication module is mainly used to connect to the client and exchange data with it; the processor 1001 is configured to invoke the cross-modal privacy semantic retrieval program stored in the memory 1005 and perform the following operations:
receiving an encrypted semantic representation vector sent by a client; the encrypted semantic representation vector is obtained by the client by extracting semantic features from multi-modal data using a multi-modal joint representation model to obtain a semantic representation vector, and then encrypting the semantic representation vector;
determining, according to a preset retrieval index table, the semantic retrieval keywords associated with the encrypted semantic representation vector; the preset retrieval index table comprises a mapping between semantic retrieval keywords and the data addresses of stored data;
and looking up, in the preset retrieval index table, the data addresses corresponding to the semantic retrieval keywords to obtain an encrypted semantic retrieval result, and sending the encrypted semantic retrieval result to the client so that the client decrypts and displays it.
Based on the above server, the following describes in detail the cross-modal privacy semantic retrieval method according to this embodiment with reference to the flowchart shown in fig. 1. The method may comprise the steps of:
Step S120: receiving an encrypted semantic representation vector sent by a client; the encrypted semantic representation vector is obtained by the client by extracting semantic features from multi-modal data using a multi-modal joint representation model to obtain a semantic representation vector, and then encrypting the semantic representation vector.
Specifically, the server is in communication connection with the client to form a retrieval system. In the retrieval system, a user can directly input data to be retrieved on a client, and the input data can be multi-modal data, namely data information comprising a plurality of different modalities. The multimodal data may include one or more of different modality data, such as text data, image data, voice data, video data, and physiological signal data, and may be obtained by different applications of the client.
After the client acquires the multi-modal data, it can preprocess the data, including but not limited to data normalization, format conversion, noise removal, encoding and decoding, and size scaling, so that the preprocessed data can be input directly into the multi-modal joint representation model for feature extraction. The specific processing can be set according to the actual situation. This embodiment is described using multi-modal data that comprises text data and image data as an example.
Assume that the text data is a long sequence of English letters, digits and symbols. When the client's multi-modal joint representation model encodes the text data with a trained BERT (Bidirectional Encoder Representations from Transformers) model, the text data in the multi-modal data may first be preprocessed, that is, pre-encoded, with the BPE (Byte Pair Encoding) method, and the pre-encoded text data is then input into the BERT model.
The BPE method encodes text by byte pairs and can compress text data. It is an iterative, layer-by-layer process in which the most frequent pair of adjacent characters in a string is replaced by a character that does not occur in the string. Taking English as an example, the algorithm splits the training corpus into characters, combines adjacent characters into pairs, and sorts all pairs by frequency of occurrence, with more frequent pairs ranked higher; the first-ranked pair is the one with the highest frequency, and a single character is then chosen to replace it, thereby compressing the text data. In this embodiment, preprocessing the text data with this method yields the preprocessed text data. The client inputs the preprocessed text data into the BERT model, which segments the text into a sub-word sequence, maps the sub-words to a sub-word index sequence using the sub-word indexes in the BERT model's dictionary, and encodes the text data, that is, performs feature extraction on it, to obtain the corresponding semantic representation vector, a 768-dimensional row vector.
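The iterative pair replacement described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tokenizer used by any particular BERT implementation: instead of substituting an unused character, each merged pair becomes a single concatenated token, and merging stops when no adjacent pair occurs more than once. The function names and the `max_merges` parameter are hypothetical.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most frequent adjacent pair, or None if no pair repeats.
    Ties are broken in favor of the pair that appears first in the sequence."""
    counts = Counter(zip(tokens, tokens[1:]))
    if not counts:
        return None
    pair, frequency = counts.most_common(1)[0]
    return pair if frequency > 1 else None

def merge_pair(tokens, pair):
    """Replace every non-overlapping occurrence of the pair with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def bpe_compress(text, max_merges=10):
    """Start from single characters and repeatedly merge the most frequent pair."""
    tokens = list(text)
    merges = []
    for _ in range(max_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
        merges.append(pair)
    return tokens, merges

tokens, merges = bpe_compress("aaabdaaabac")
```

The token sequence gets shorter with each merge while remaining lossless, since joining the tokens reproduces the original string.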
To solve the problem of semantic retrieval over cross-modal data under privacy protection and to ensure the privacy and security of user data, the client also encrypts the extracted semantic representation vector. Specifically, it uses an encryption method that supports multiple relevance calculations and obtains the dense-state (that is, encrypted) semantic representation vector corresponding to the text data.
Similarly, the image data in the multi-modal data also undergoes preprocessing and feature extraction in sequence. When the multi-modal joint representation model of the client encodes the image data with a ResNet50 (Residual Network) model, the model places requirements on the size of its input. The image data in the multi-modal data can therefore be preprocessed, that is, scaled in size, with the transform method of the PyTorch framework, an open-source Python machine learning library, and the scaled image data is then input into the ResNet50 model. In this embodiment, preprocessing the image data with this method yields the preprocessed image data. The client inputs the preprocessed image data into the ResNet50 model to encode the image data, that is, to extract its features, and obtains the corresponding semantic representation vector, which is a 768-dimensional row vector. The client then encrypts this semantic representation vector to obtain the dense-state semantic representation vector of the image data.
In this way, the client preprocesses the multi-modal data, inputs the preprocessed multi-modal data into the multi-modal joint representation model to encode it, and encrypts the result. After obtaining the dense-state semantic representation vector corresponding to the multi-modal data, the client sends it to the server, and the server receives the dense-state semantic representation vector sent by the client.
Step S140: determining semantic retrieval keywords associated with the dense-state semantic representation vectors according to a preset retrieval index table; the preset retrieval index table comprises a mapping relation between a semantic retrieval keyword and a data address of stored data.
In the preset retrieval index table, a semantic retrieval keyword may be mapped to the data addresses of several items of stored data, and the table may also contain a cluster of retrieval keywords together with their corresponding data addresses. That is, a semantic retrieval keyword may have a one-to-many relationship with data addresses; alternatively, a semantic retrieval keyword may have a one-to-many relationship with retrieval keywords, and each retrieval keyword may in turn have a one-to-many relationship with data addresses. A table with this structure can be constructed with a clustering algorithm. Compared with comparing the query against every retrieval keyword individually, the clustered preset retrieval index table keeps the size of the index table from growing linearly with the amount of data and reduces retrieval complexity.
Specifically, step S140 may include:
Step S141: determining, according to the preset retrieval index table, the association values between the dense-state semantic representation vector and all retrieval keywords in the preset retrieval index table.
The association value is calculated as follows:
Figure BDA0003488717500000101
where En() denotes the encryption method, En(e) denotes the dense-state semantic representation vector, and En(e′) denotes a retrieval keyword in the preset retrieval index table (note that the retrieval keyword is in the dense state); n denotes a number, and i denotes any one of the n.
The server calculates the association values between the dense-state semantic representation vector and all retrieval keywords in the preset retrieval index table according to the table and the association value formula.
Step S142: determining semantic retrieval keywords associated with the dense-state semantic representation vector according to the association values; the semantic retrieval keywords comprise the retrieval keywords sorted by association value.
After calculating the association values between the dense-state semantic representation vector and all retrieval keywords in the preset retrieval index table, the server sorts the keywords in descending order of association value. The sorted result is the set of semantic retrieval keywords associated with the dense-state semantic representation vector.
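Steps S141 and S142 can be illustrated with a minimal sketch. It assumes that the association value is the inner product of the two dense-state vectors, which matches the relation derived later in this description but is not a reproduction of the formula figure; the names `association_value` and `rank_keywords` and the keyword identifiers are hypothetical.

```python
def association_value(query, keyword):
    """Inner product of two equal-length vectors (assumed form of the association value)."""
    if len(query) != len(keyword):
        raise ValueError("vectors must have the same dimension")
    return sum(q * k for q, k in zip(query, keyword))


def rank_keywords(query, index_keywords):
    """Return keyword ids sorted by descending association value with the query.

    index_keywords maps a keyword id to its dense-state keyword vector.
    """
    scores = {kid: association_value(query, vec) for kid, vec in index_keywords.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

The server in this sketch never sees plaintext vectors: both `query` and the values of `index_keywords` stand for vectors that were already encrypted on the client.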
Step S160: searching the preset retrieval index table for the data address corresponding to the semantic retrieval keyword to obtain a dense-state semantic retrieval result, and sending the dense-state semantic retrieval result to the client so that the client decrypts and displays the dense-state semantic retrieval result.
For the semantic retrieval keywords sorted by the calculated association values, the server looks up the data addresses corresponding to each keyword in the preset retrieval index table and thereby obtains a dense-state semantic retrieval result in list form. The server sends the dense-state semantic retrieval result to the client. After receiving it, the client decrypts the result with the method corresponding to the encryption method used when the extracted semantic representation vector was encrypted, and then displays the decrypted result. Throughout the cross-modal privacy semantic retrieval process, every operation on the server is performed on encrypted data, which ensures data privacy; the retrieval result returned to the client is decrypted before display, so the user sees it in plaintext and can read it directly.
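The address lookup of step S160 can be sketched as follows. The dictionary layout of the index table and the name `build_result_list` are assumptions made for illustration; the patent does not prescribe a concrete data structure.

```python
def build_result_list(ranked_keywords, index_table):
    """Collect data addresses in ranking order.

    ranked_keywords: keyword ids, best match first.
    index_table: dict mapping a keyword id to a list of data addresses.
    """
    results = []
    for keyword_id in ranked_keywords:
        # A keyword may map to several addresses (one-to-many relationship).
        for address in index_table.get(keyword_id, []):
            results.append({"keyword": keyword_id, "address": address})
    return results
```

The returned list preserves the descending order of association values, so the first entries point to the stored data most relevant to the query.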
In an embodiment, before step S120, the method further includes:
Step S111: receiving dense-state data to be stored and the corresponding dense-state keywords sent by the client; the dense-state data to be stored is obtained by the client encrypting the data to be stored, and the dense-state keywords are obtained by the client extracting semantic features of the data to be stored based on a multi-modal joint representation model to obtain keywords and then encrypting the keywords;
Before the retrieval system performs an actual retrieval, the server needs a preset retrieval index table and stored data. Therefore, before step S120, the preset retrieval index table must be established on the server and the user's private data must be stored there, that is, the server stores the dense-state data uploaded by the user through the client.
The server receives the dense-state data to be stored sent by the client. The client encrypts the data to be stored and uploads the result directly to the server; the data to be stored may be single-modal or multi-modal data. The server also receives the dense-state keyword corresponding to the dense-state data to be stored, which is a semantic feature vector obtained by the client preprocessing the data to be stored, extracting its features, and encrypting them. For the specific steps by which the client extracts and encrypts semantic features of the data to be stored through the multi-modal joint representation model, refer to the corresponding steps for the multi-modal data in step S120, which are not repeated here. Unlike in the foregoing steps, the data to be stored itself must also be encrypted separately to obtain the dense-state data to be stored, which is sent to the server for storage and serves as the database for subsequent retrieval.
In this embodiment, the client uses the multi-modal joint representation model in place of a traditional keyword extraction model that takes text as input. It obtains a representation vector rich in semantic features and uses that vector as the extracted keyword, which overcomes the lack of semantic relevance between retrieval keywords and retrieval results in traditional methods. The extracted keywords are also encrypted with a secure KNN searchable encryption method, so that the encryption supports multiple relevance calculations.
Step S112: storing the dense-state data to be stored, and obtaining the stored data and its data address.
The server receives the dense-state data to be stored and stores it in the database, where it becomes stored data. From the storage address of the data in the database, the server obtains the stored data and its data address.
Step S113: constructing an index table according to the dense-state keywords and the data addresses of the stored data to obtain a preset retrieval index table.
Because the stored data may be of complex modality, the stored data itself is not used directly as the searched object when the retrieval index table is constructed and searched. Instead, the storage address of the stored data is used as the searched object associated with the dense-state keyword. A preset retrieval index table comprising the mapping relation between semantic retrieval keywords and the data addresses of stored data can thus be obtained.
The client extracts a semantic representation vector from the data to be stored with the multi-modal joint representation model, and encrypts the extracted semantic representation vector and the data to be stored separately with a data encryption method. After the encrypted semantic representation vector, namely the dense-state keyword, and the encrypted data, namely the dense-state data to be stored, are uploaded to the server, the server constructs a retrieval index table from the dense-state keywords and the data addresses of the dense-state data to obtain the preset retrieval index table.
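The upload path of steps S111 to S113 can be sketched as a small in-memory store. This is a simplified illustration: the class name `ServerStore`, the address format `addr-N`, and the use of a Python dictionary as the database are all assumptions, not details from the patent.

```python
class ServerStore:
    """Toy server-side store: keeps encrypted blobs and a keyword-to-address index."""

    def __init__(self):
        self.database = {}     # data address -> dense-state (encrypted) data
        self.index_table = []  # list of (dense-state keyword, data address)

    def upload(self, dense_data, dense_keyword):
        """Store one encrypted item and index it by its encrypted keyword vector."""
        address = f"addr-{len(self.database)}"  # hypothetical address scheme
        self.database[address] = dense_data
        # The index maps the keyword to the address, not to the data itself.
        self.index_table.append((tuple(dense_keyword), address))
        return address
```

Each call to `upload` corresponds to one pass through steps S111 to S113: the server receives the pair, stores the data, and extends the index table with the new keyword and address.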
The user can store further data on the server by following steps S111 to S113, which enriches the stored data in the database, that is, the set of searched objects. The retrieval index table on the server is updated accordingly, so that the updated preset retrieval index table is used directly in the next retrieval and the retrieval results are up to date and accurate.
Specifically, step S113 may include:
Step S113.1: constructing an index table according to the dense-state keywords and the data addresses of the stored data to obtain a retrieval index table;
The retrieval index table comprises a one-to-one or one-to-many mapping relation between dense-state keywords and data addresses.
Step S113.2: determining similarity values among the dense-state keywords according to the retrieval index table;
Step S113.3: performing clustering according to the similarity values to obtain semantic retrieval keywords; the semantic retrieval keywords comprise a cluster of dense-state keywords whose similarity value is smaller than a preset threshold value;
Step S113.4: obtaining a preset retrieval index table according to the mapping relation between the semantic retrieval keywords and the corresponding data addresses of the stored data.
Specifically, if the mapping relationship is established directly between the stored data and the corresponding dense-state keywords, the resulting one-to-one index table may be large and structurally complex, which hinders fast semantic retrieval; the index table can therefore be clustered. First, the similarity values between the dense-state keywords in the retrieval index table constructed in step S113.1 are calculated. Dense-state keywords whose similarity values meet a preset requirement, for example a preset threshold, are treated as one class or cluster with semantic relevance and are gathered into an index sub-table. An index representing that class or cluster of dense-state keywords is then set for the sub-table, which gives the semantic retrieval keyword. The preset retrieval index table obtained in this way is used in steps S140 and S160.
Table 1 shows an example of a preset retrieval index table obtained by clustering according to the above method:
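One way to picture steps S113.2 to S113.4 is a greedy single-pass clustering. The patent does not name a clustering algorithm, so the following is only an assumed sketch: similarity is taken to be the inner product (higher means more similar), a keyword joins the first cluster whose representative it matches at or above `threshold`, and the first member of a cluster serves as its semantic retrieval keyword. If the similarity value is instead a distance, the comparison would be reversed.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cluster_index(index_table, threshold):
    """Group (keyword_vector, address) entries into clusters.

    Returns a list of clusters, each a dict with a 'representative' vector
    (the semantic retrieval keyword) and its 'members'.
    """
    clusters = []
    for vec, addr in index_table:
        for cluster in clusters:
            if dot(vec, cluster["representative"]) >= threshold:
                cluster["members"].append((vec, addr))
                break
        else:
            # No existing cluster is similar enough: open a new index sub-table.
            clusters.append({"representative": vec, "members": [(vec, addr)]})
    return clusters
```

At query time only the representatives need to be compared with the incoming vector, which is what keeps the number of comparisons below the number of stored keywords.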
TABLE 1
Figure BDA0003488717500000131
Based on this preset retrieval index table, retrieval only requires calculating the association value between the dense-state semantic representation vector and each of semantic retrieval keyword 1, semantic retrieval keyword 2, ..., and semantic retrieval keyword m; sorting these keywords by association value then yields the dense-state semantic retrieval result.
It should be noted that steps S111 to S113 may be executed as an independent branch for the purpose of uploading data; steps S120 to S160 may also be executed in sequence after steps S111 to S113 for the purpose of data retrieval.
It should be noted that, in this method, the client may use a conventional symmetric encryption method to encrypt and decrypt data. The encryption method for the vectors is chosen so that the cosine similarity cos(e, e′) between the keyword e, which the client obtains by semantic feature extraction from the data to be stored during data upload, and the semantic representation vector e′, which the client obtains by semantic feature extraction from the multi-modal data during data retrieval, is consistent with the cosine similarity cos(En(e), En(e′)) between the dense-state keyword En(e) and the dense-state semantic representation vector En(e′). That is, equation 1 is satisfied:

$$\cos(e, e') = \cos(\mathrm{En}(e), \mathrm{En}(e'))$$
The cosine similarity is calculated as follows:

$$\cos(e, e') = \frac{e \cdot e'}{\lVert e \rVert \, \lVert e' \rVert}$$

Thus, equation 1 can be converted to equation 2:

$$\frac{e \cdot e'}{\lVert e \rVert \, \lVert e' \rVert} = \frac{\mathrm{En}(e) \cdot \mathrm{En}(e')}{\lVert \mathrm{En}(e) \rVert \, \lVert \mathrm{En}(e') \rVert}$$

The keyword e, the semantic representation vector e′, the dense-state keyword En(e) and the dense-state semantic representation vector En(e′) are all vectors, and after L2 norm normalization the Euclidean length (L2 norm) of each vector is 1, that is:

$$\lVert e \rVert = \lVert e' \rVert = \lVert \mathrm{En}(e) \rVert = \lVert \mathrm{En}(e') \rVert = 1$$

Therefore, equation 2 can be rewritten as equation 3:

$$e \cdot e' = \mathrm{En}(e) \cdot \mathrm{En}(e')$$
It can be seen that, after L2 norm normalization, the cosine similarity cos(e, e′) between the keyword e and the semantic representation vector e′ equals the inner product of the two vectors, so equation 3 can be further written as equation 4:

$$\cos(e, e') = \cos(\mathrm{En}(e), \mathrm{En}(e')) = e \cdot e' = \mathrm{En}(e) \cdot \mathrm{En}(e')$$
It therefore only remains to choose an encryption method that guarantees equation 4. Because the product of a matrix and its inverse is the identity matrix, a matrix M can be used as the encryption matrix of the encryption method and the transpose of its inverse, $(M^{-1})^{T}$, as the decryption matrix, which gives equation 5:

$$e \cdot e' = \mathrm{En}(e) \cdot \mathrm{En}(e') = (e M) \cdot \left(e' (M^{-1})^{T}\right)$$

Here, the vector e is multiplied by the encryption matrix M to obtain the encrypted vector En(e), namely the dense-state keyword, and the vector e′ is multiplied by the matrix $(M^{-1})^{T}$ to obtain the encrypted vector En(e′), namely the dense-state semantic representation vector; the inner product of the plaintext vectors thus equals the inner product of the ciphertext vectors. The client may therefore use a randomly generated invertible matrix M as the encryption key and the transpose of its inverse, $(M^{-1})^{T}$, as the decryption matrix, for its encryption and decryption operations respectively.
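The inner-product-preserving property of equation 5 can be checked with a short sketch. It is an illustration only, under stated assumptions: it uses small integer matrices and exact fractions from the Python standard library so that equality holds exactly, whereas a real system would work with 768-dimensional floating-point vectors and a numerical library; all function names are hypothetical, and the sketch says nothing about the security of the scheme.

```python
from fractions import Fraction
import random


def invert(matrix):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def row_times_matrix(vec, matrix):
    """Multiply a row vector by a matrix."""
    return [sum(v * matrix[i][j] for i, v in enumerate(vec))
            for j in range(len(matrix[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def generate_keys(dim, rng):
    """Draw random integer matrices until one is invertible; return (M, (M^-1)^T)."""
    while True:
        m = [[rng.randint(-9, 9) for _ in range(dim)] for _ in range(dim)]
        try:
            return m, transpose(invert(m))
        except ValueError:
            continue
```

With `M, M_inv_T = generate_keys(4, random.Random(0))`, a stored keyword is encrypted as `row_times_matrix(e, M)` and a query as `row_times_matrix(e_query, M_inv_T)`; the `dot` of the two ciphertext vectors then equals the `dot` of the two plaintext vectors, which is the relation the server relies on when ranking.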
This method effectively improves the accuracy and semantic relevance of retrieval results while protecting query privacy and the privacy of cloud data.
In the cross-modal privacy semantic retrieval method provided by this embodiment, the client extracts semantic features from multi-modal data based on the multi-modal joint representation model to obtain a semantic representation vector, encrypts that vector to obtain a dense-state semantic representation vector, and sends it to the server. The server determines the semantic retrieval keywords associated with the dense-state semantic representation vector according to the preset retrieval index table, looks up the data addresses corresponding to those keywords in the table to obtain a dense-state semantic retrieval result, and returns the result to the client. The client decrypts and displays the result, thereby achieving privacy-preserving semantic retrieval over multi-modal data. Because the invention does not rely on a keyword library for semantic representation, the semantic relevance between retrieval keywords and retrieval results can be ensured, and the accuracy of semantic retrieval results is improved while query privacy and stored-data privacy are protected.
Example two
Based on the same inventive concept, referring to the flow diagram of fig. 1, a second embodiment of the cross-modal privacy semantic retrieval method of the present invention is provided, and the cross-modal privacy semantic retrieval method is applied to a client. The client is a terminal device capable of realizing network connection, and the client can be a mobile phone, a computer, a tablet computer, a portable computer and other terminal devices.
Fig. 4 is a schematic diagram of a hardware structure of the client. The client may include: a processor 2001, such as a CPU (Central Processing Unit), a communication bus 2002, a user interface 2003, a network interface 2004, and a memory 2005. Those skilled in the art will appreciate that the hardware configuration shown in fig. 4 is not meant to limit the present invention, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
Specifically, the communication bus 2002 is used to implement connection and communication between these components. The user interface 2003 is used for connecting to other user terminals and performing data communication with them; it may include an output unit such as a display screen and an input unit such as a keyboard, and optionally other input/output interfaces such as a standard wired interface and a wireless interface. The network interface 2004 is used for connecting to the backend server and performing data communication with it; it may include an input/output interface such as a standard wired interface or a wireless interface such as a Wi-Fi interface. The memory 2005 is used for storing various types of data, which may include, for example, instructions of any application program or method in the client and application-related data; the memory 2005 may be a high-speed RAM memory or a stable memory such as a disk memory, and optionally may be a storage device independent of the processor 2001.
Specifically, with continued reference to fig. 4, the memory 2005 may include an operating system, a network communication module, a user interface module, and a cross-modal privacy semantic retrieval program, where the network communication module is mainly used to connect to the server and perform data communication with it; the processor 2001 is configured to invoke the cross-modal privacy semantic retrieval program stored in the memory 2005 and perform the following operations:
Based on the client described above, the cross-modal privacy semantic retrieval method of this embodiment is described in detail below with reference to the flowchart shown in fig. 3. The method may comprise the following steps:
Step S210: acquiring multi-modal data.
The user can directly input data to be retrieved on the client, and the input data can be multi-modal data, namely data information comprising a plurality of different modalities. The multimodal data may include one or more of different modality data, such as text data, image data, voice data, video data, and physiological signal data, and may be obtained by different applications of the client.
Step S230: extracting semantic features of the multi-modal data through a multi-modal joint representation model to obtain a semantic representation vector.
Specifically, step S230 may include:
Step S231: preprocessing the multi-modal data to obtain preprocessed multi-modal data; wherein the preprocessing includes any one of format conversion, text encoding, size scaling, and noise elimination.
The preprocessing can arrange the multi-modal data into data which can be directly input into the multi-modal joint representation model, and the problem that the model cannot extract the feature vectors due to the fact that the data input into the model are not in accordance with requirements is avoided. For example, when the multi-modal data includes text data and image data, the text data may be preprocessed by using a BPE method to obtain preprocessed text data, the image data may be preprocessed by using a transform method to obtain preprocessed image data, and then the preprocessed different-modal data are summarized to obtain preprocessed multi-modal data.
Step S232: inputting the preprocessed multi-modal data into a multi-modal joint representation model, and outputting semantic feature vectors; wherein the multi-modal joint representation model comprises a representation learning model that performs modal transformation on the multi-modal data.
Specifically, the multi-modal joint representation model may include a BERT model for encoding text data into feature vectors and a ResNet50 model for encoding image data into feature vectors. More generally, a model can be selected for each modality type present in the multi-modal data, and the selected models are combined into the multi-modal joint representation model, which then extracts feature vectors from the multi-modal data.
The client inputs the preprocessed multi-modal data into the multi-modal joint representation model, which encodes the data to obtain the semantic feature vector of the multi-modal data.
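The idea of combining one encoder per modality into a joint model can be sketched as a simple dispatcher. The class name `JointRepresentationModel` is hypothetical, and the two toy encoders are deliberately trivial stand-ins for BERT and ResNet50: they produce fixed-length vectors but carry no semantic meaning.

```python
class JointRepresentationModel:
    """Routes each modality to its own encoder and returns one vector per modality."""

    def __init__(self, encoders):
        self.encoders = dict(encoders)  # modality name -> encoder callable

    def encode(self, multimodal_data):
        vectors = {}
        for modality, payload in multimodal_data.items():
            if modality not in self.encoders:
                raise ValueError(f"no encoder registered for modality {modality!r}")
            vectors[modality] = self.encoders[modality](payload)
        return vectors


def toy_text_encoder(text):
    # Stand-in for BERT: text length and digit count, not a semantic embedding.
    return [float(len(text)), float(sum(ch.isdigit() for ch in text))]


def toy_image_encoder(pixels):
    # Stand-in for ResNet50: mean and maximum intensity of a 2-D pixel grid.
    flat = [p for row in pixels for p in row]
    return [sum(flat) / len(flat), float(max(flat))]
```

Registering a new modality, such as voice or video data, then amounts to adding one more entry to the `encoders` mapping.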
Step S250: encrypting the semantic representation vector to obtain a dense-state semantic representation vector.
To ensure data privacy during cross-modal data retrieval, the client must encrypt the semantic representation vector before uploading it to the server. The encryption method must support multiple relevance calculations, so that the relevance between the semantic representation vectors of different modal data before encryption is consistent with the relevance between the corresponding dense-state semantic representation vectors after encryption. This prevents the encryption operation from distorting the semantic representation vector and degrading the accuracy of subsequent retrieval.
Step S270: sending the dense-state semantic representation vector to a server so that the server determines a semantic retrieval keyword associated with the dense-state semantic representation vector according to a preset retrieval index table, searches a data address corresponding to the semantic retrieval keyword in the preset retrieval index table to obtain a dense-state semantic retrieval result, and sends the dense-state semantic retrieval result to the client; the preset retrieval index table comprises a mapping relation between a semantic retrieval keyword and a data address of stored data;
Step S290: receiving the dense-state semantic retrieval result sent by the server, and decrypting and displaying the dense-state semantic retrieval result.
In one embodiment, the method may further comprise:
Step S221: determining the multi-modal data as data to be stored;
When a user needs to upload data, the user only has to perform the corresponding operation on the client and import the multi-modal data; the multi-modal data is then determined to be data to be stored rather than data to be retrieved.
Step S222: encrypting the data to be stored to obtain dense-state data to be stored;
The client encrypts the data to be stored to obtain the dense-state data to be stored.
Step S223: extracting semantic features of the data to be stored based on a multi-modal joint representation model to obtain keywords;
Specifically, the data to be stored may first be preprocessed, the preprocessed data is input into the multi-modal joint representation model for semantic feature extraction, and the extracted feature vectors are determined to be the keywords corresponding to the data to be stored.
Step S224: encrypting the keywords to obtain dense-state keywords;
Step S225: sending the dense-state data to be stored and the dense-state keywords to a server, so that the server, after receiving the dense-state data to be stored and the corresponding dense-state keywords, stores the dense-state data to be stored to obtain the stored data and its data address, and constructs an index table according to the dense-state keywords and the data addresses of the stored data to obtain a preset retrieval index table.
For further details of the specific implementation of the above method steps, reference may be made to the description of the specific implementation of the first embodiment, and for the sake of brevity of the description, repeated descriptions are omitted here.
In the cross-modal privacy semantic retrieval method provided by this embodiment, keyword extraction on the client does not depend on a predefined keyword library. The semantic keywords of the stored data that the user uploads to the server through the client can be used directly as indexes in the retrieval index table, and association values are calculated directly between these semantic keywords and the dense-state semantic representation vector, so that privacy-preserving retrieval based on semantic features is performed on multi-modal heterogeneous data with optimal performance. The accuracy of retrieval results is effectively improved while query privacy and cloud data privacy are protected, which supports cross-modal data retrieval that safeguards user privacy.
Example three
Based on the same inventive concept, referring to fig. 2, a schematic diagram of a hardware structure of the server of the present invention is shown. This embodiment provides a server, which may include a processor and a memory, where the memory stores a cross-modal privacy semantics retrieving program, and when the cross-modal privacy semantics retrieving program is executed by the processor, all or part of the steps of the first embodiment of the cross-modal privacy semantics retrieving method according to the present invention are implemented.
Specifically, the server may be a network device capable of implementing network connection, or may be a cloud service platform capable of implementing network connection.
It will be appreciated that the server may also include a communications bus, a user interface, and a network interface. Wherein the communication bus is used for realizing connection communication among the components. The user interface is used for connecting the user terminal and performing data communication with the user terminal, and the user interface may include an output unit such as a display screen and an input unit such as a keyboard. The network interface is used for connecting the background server and performing data communication with the background server, and the network interface may include an input/output interface, such as a standard wired interface, a wireless interface, such as a Wi-Fi interface.
The memory is used to store various types of data, which may include, for example, instructions for any application or method in the server, as well as application-related data. The Memory may be implemented by any type of volatile or non-volatile Memory device or combination thereof, such as Static Random Access Memory (SRAM), Random Access Memory (RAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic or optical disk, or alternatively, the Memory may be a storage device independent of the processor.
The Processor may be a Digital Signal Processor (DSP), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and is configured to perform all or part of the steps of the above-described first embodiment of the cross-modal privacy semantic retrieval method.
Example four
Based on the same inventive concept, referring to fig. 4, a schematic diagram of a hardware structure of the client is shown. This embodiment provides a client, which may include a processor and a memory, where the memory stores a cross-modal privacy semantic retrieval program, and when the cross-modal privacy semantic retrieval program is executed by the processor, all or part of the steps of the second embodiment of the cross-modal privacy semantic retrieval method according to the present invention are implemented.
Specifically, the client is a terminal device capable of implementing network connection, and may be a mobile phone, a computer, a tablet computer, a portable computer, or other terminal devices.
It will be appreciated that the client may also include a communication bus, a user interface, and a network interface. Wherein the communication bus is used for realizing connection communication among the components. The user interface is used for connecting other client terminals and carrying out data communication with the other client terminals, and the user interface can comprise an output unit such as a display screen and an input unit such as a keyboard. The network interface is used for connecting the background server and performing data communication with the background server, and the network interface may include an input/output interface, such as a standard wired interface, a wireless interface, such as a Wi-Fi interface.
The memory is used to store various types of data, which may include, for example, instructions for any application or method in the client, as well as application-related data. The memory may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Random Access Memory (RAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk. Alternatively, the memory may be a storage device independent of the processor.
The processor may be a Digital Signal Processor (DSP), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or another electronic component, and is configured to perform all or part of the steps of the second embodiment of the cross-modal privacy semantic retrieval method.
EXAMPLE five
Based on the same inventive concept, referring to fig. 5, a first embodiment of the cross-modal privacy semantic retrieval system of the present invention is provided, which may include:
the server 100 as described in embodiment three;
the client 200 as described in embodiment four;
wherein the server 100 is communicatively connected with the client 200.
The cross-modal privacy semantic retrieval system provided in this embodiment is described in detail below with reference to a functional module diagram shown in fig. 5.
The server 100 may include:
the data receiving module 101 is configured to receive a dense-state semantic representation vector sent by a client; the dense-state semantic representation vector is obtained by the client performing semantic feature extraction on multi-modal data based on a multi-modal joint representation model to obtain a semantic representation vector, and then encrypting the semantic representation vector;
the semantic association module 102 is configured to determine a semantic retrieval keyword associated with the dense-state semantic representation vector according to a preset retrieval index table; the preset retrieval index table comprises a mapping relation between a semantic retrieval keyword and a data address of stored data;
the semantic retrieval module 103 is configured to search a data address corresponding to the semantic retrieval keyword in the preset retrieval index table, obtain a dense-state semantic retrieval result, and send the dense-state semantic retrieval result to the client, so that the client decrypts and displays the dense-state semantic retrieval result.
Further, the semantic association module 102 may include:
the relevance calculation unit is used for determining, according to the preset retrieval index table, relevance values between the dense-state semantic representation vector and all retrieval keywords in the preset retrieval index table;
the semantic retrieval unit is used for determining the semantic retrieval keywords associated with the dense-state semantic representation vector according to the relevance values; the semantic retrieval keywords comprise retrieval keywords sorted according to the relevance values.
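The relevance calculation, ranking, and address lookup described above can be sketched as follows. This is a minimal illustration, not the method of the invention: it assumes that the encryption applied by the client preserves cosine similarity between vectors, so that the server can score dense-state vectors without decrypting them. The function names `cosine`, `rank_keywords`, and `lookup_addresses`, the `top_k` parameter, and the choice of cosine similarity as the relevance value are all assumptions introduced for this sketch.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_keywords(query: Vector, keywords: Dict[str, Vector],
                  top_k: int = 3) -> List[Tuple[str, float]]:
    """Score every retrieval keyword against the query, then sort descending."""
    scored = [(key_id, cosine(query, vec)) for key_id, vec in keywords.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def lookup_addresses(ranked: List[Tuple[str, float]],
                     index_table: Dict[str, List[str]]) -> List[str]:
    """Collect data addresses in ranked order, skipping duplicates."""
    seen, addresses = set(), []
    for key_id, _score in ranked:
        for address in index_table.get(key_id, []):
            if address not in seen:
                seen.add(address)
                addresses.append(address)
    return addresses


# Hypothetical dense-state keywords and index table (identifiers are made up).
keywords = {"k1": [1.0, 0.0], "k2": [0.0, 1.0], "k3": [1.0, 1.0]}
index_table = {"k1": ["addr/1"], "k3": ["addr/1", "addr/7"]}
ranked = rank_keywords([1.0, 0.1], keywords, top_k=2)
addresses = lookup_addresses(ranked, index_table)
```

The server only ever handles the transformed vectors and opaque addresses; what it returns to the client is still encrypted data fetched from those addresses.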
Further, the data receiving module 101 is further configured to receive dense-state data to be stored and corresponding dense-state keywords sent by the client; the dense-state data to be stored is obtained by the client encrypting the data to be stored, and the dense-state keywords are obtained by the client performing semantic feature extraction on the data to be stored based on the multi-modal joint representation model to obtain keywords, and then encrypting the keywords;
the server 100 may further include:
the storage module is used for storing the dense-state data to be stored and obtaining the stored data and the data address thereof;
and the index table building module is used for building an index table according to the dense-state keywords and the data addresses of the stored data to obtain the preset retrieval index table.
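As a rough illustration of the storage module and the index table building module, the sketch below stores an opaque ciphertext, obtains an address for it, and records that address under each dense-state keyword. The class `EncryptedStore`, the function `store_and_index`, the use of a SHA-256 digest as the data address, and the treatment of dense-state keywords as opaque string identifiers are assumptions made for the example; the embodiment does not prescribe how addresses are formed.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List


class EncryptedStore:
    """Keeps opaque ciphertext blobs and hands back an address for each one."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, ciphertext: bytes) -> str:
        # Content-derived address (an assumption; any unique locator would do).
        address = hashlib.sha256(ciphertext).hexdigest()
        self._blobs[address] = ciphertext
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


def store_and_index(store: EncryptedStore,
                    index_table: Dict[str, List[str]],
                    ciphertext: bytes,
                    dense_keywords: Iterable[str]) -> str:
    """Store the ciphertext, then map each dense-state keyword to its address."""
    address = store.put(ciphertext)
    for keyword in dense_keywords:
        if address not in index_table[keyword]:
            index_table[keyword].append(address)
    return address


store = EncryptedStore()
index_table: Dict[str, List[str]] = defaultdict(list)
address = store_and_index(store, index_table, b"\x01\x02", ["kw-a", "kw-b"])
```

The server never needs the plaintext of either the data or the keywords to maintain this table.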
Still further, the index table building module may include:
the initial construction unit is used for constructing an index table according to the dense-state keywords and the data addresses of the stored data to obtain a retrieval index table;
the similarity calculation unit is used for determining similarity values among the dense-state keywords according to the retrieval index table;
the clustering unit is used for carrying out clustering processing according to the similarity values to obtain semantic retrieval keywords; the semantic retrieval keywords comprise a cluster of dense-state keywords of which the similarity value is smaller than a preset threshold value;
and the mapping unit is used for acquiring a preset retrieval index table according to the mapping relation between the semantic retrieval key words and the corresponding data addresses of the stored data.
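The four units above can be sketched as a single pass that groups dense-state keywords and merges their data addresses. Because the text states that clustered keywords have a similarity value smaller than the preset threshold, the sketch treats the similarity value as a distance. The Euclidean distance, the greedy "first matching cluster" strategy, and the names `distance` and `build_semantic_index` are assumptions for illustration only; the embodiment does not name a clustering algorithm, and the sketch further assumes that the encryption preserves distances between keyword vectors.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Cluster = Tuple[List[Vector], List[str]]  # (member keywords, data addresses)


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance, used here as the similarity value."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_semantic_index(entries: List[Tuple[Vector, str]],
                         threshold: float) -> List[Cluster]:
    """Greedy clustering: a keyword joins the first cluster whose first
    member lies closer than the threshold, otherwise it starts a new one."""
    clusters: List[Cluster] = []
    for vec, address in entries:
        for members, addresses in clusters:
            if distance(vec, members[0]) < threshold:
                members.append(vec)
                if address not in addresses:
                    addresses.append(address)
                break
        else:
            clusters.append(([vec], [address]))
    return clusters


# Hypothetical (dense-state keyword, data address) pairs.
entries = [([0.0, 0.0], "a1"), ([0.1, 0.0], "a2"),
           ([5.0, 5.0], "a3"), ([0.0, 0.05], "a1")]
clusters = build_semantic_index(entries, threshold=0.5)
```

Each resulting cluster plays the role of one semantic retrieval keyword, mapped to every address contributed by its members.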
The client 200 may include:
a data acquisition module 201, configured to acquire multimodal data;
the feature extraction module 202 is configured to perform semantic feature extraction on the multi-modal data through a multi-modal joint representation model to obtain a semantic representation vector;
the encryption module 203 is configured to encrypt the semantic representation vector to obtain a dense-state semantic representation vector;
the data sending module 204 is configured to send the dense-state semantic representation vector to a server, so that the server determines a semantic retrieval keyword associated with the dense-state semantic representation vector according to a preset retrieval index table, searches a data address corresponding to the semantic retrieval keyword in the preset retrieval index table, obtains a dense-state semantic retrieval result, and sends the dense-state semantic retrieval result to the client; the preset retrieval index table comprises a mapping relation between a semantic retrieval keyword and a data address of stored data;
and the result feedback module 205 is configured to receive the dense-state semantic search result sent by the server, and decrypt and display the dense-state semantic search result.
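The client-side flow of modules 202 to 205 can be sketched as follows. The embodiment does not specify the encryption scheme at this point, so the example substitutes a toy signed permutation that preserves inner products between vectors; it is not secure and is not the scheme of the invention. `ToyVectorCipher`, `run_query`, and the injected callables `extract`, `send`, and `decrypt_result` are hypothetical names standing in for the multi-modal joint representation model, the network transport, and the result decryption.

```python
import random
from typing import Callable, List, Sequence


class ToyVectorCipher:
    """Signed permutation keyed by a secret seed. Illustration only, NOT secure."""

    def __init__(self, dim: int, seed: int) -> None:
        rng = random.Random(seed)
        self._perm = list(range(dim))
        rng.shuffle(self._perm)
        self._signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]

    def encrypt(self, vec: Sequence[float]) -> List[float]:
        # Reorder the components and flip some signs.
        return [s * vec[p] for s, p in zip(self._signs, self._perm)]

    def decrypt(self, enc: Sequence[float]) -> List[float]:
        out = [0.0] * len(enc)
        for i, p in enumerate(self._perm):
            out[p] = self._signs[i] * enc[i]
        return out


def run_query(data: object,
              extract: Callable[[object], Sequence[float]],
              cipher: ToyVectorCipher,
              send: Callable[[List[float]], bytes],
              decrypt_result: Callable[[bytes], str]) -> str:
    vector = extract(data)                  # feature extraction module 202
    dense_vector = cipher.encrypt(vector)   # encryption module 203
    dense_result = send(dense_vector)       # data sending module 204
    return decrypt_result(dense_result)     # result feedback module 205


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(u, v))


cipher = ToyVectorCipher(dim=4, seed=7)
a, b = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]
```

Because every vector is transformed with the same permutation and signs, `dot(cipher.encrypt(a), cipher.encrypt(b))` equals `dot(a, b)` up to rounding, which is the property a server-side relevance calculation over dense-state vectors would rely on.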
Further, the feature extraction module 202 may include:
the preprocessing unit is used for preprocessing the multi-modal data to obtain preprocessed multi-modal data; wherein the preprocessing comprises any one of format switching, text encoding, size scaling and noise elimination;
the semantic representation unit is used for inputting the preprocessed multi-modal data into the multi-modal joint representation model and outputting semantic feature vectors; wherein the multi-modal joint representation model comprises a representation learning model that performs modal transformation on the multi-modal data.
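Three of the preprocessing operations named above (text encoding, size scaling, and noise elimination) might look like the following in their simplest form. The specific choices here, namely NFC normalization with UTF-8 output, nearest-neighbour sampling of a grayscale grid, and a centred moving average, as well as the function names `encode_text`, `scale_image`, and `denoise`, are assumptions for illustration; the embodiment does not state which algorithms the preprocessing unit uses.

```python
import unicodedata
from typing import List


def encode_text(text: str) -> bytes:
    """Text encoding: NFC-normalize, collapse whitespace, emit UTF-8 bytes."""
    normalized = unicodedata.normalize("NFC", text)
    return " ".join(normalized.split()).encode("utf-8")


def scale_image(pixels: List[List[int]], width: int, height: int) -> List[List[int]]:
    """Size scaling by nearest-neighbour sampling of a row-major grid."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [[pixels[r * src_h // height][c * src_w // width]
             for c in range(width)]
            for r in range(height)]


def denoise(signal: List[float], window: int = 3) -> List[float]:
    """Noise elimination with a centred moving average (shorter at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


text_bytes = encode_text("  hello   world ")
upscaled = scale_image([[1, 2], [3, 4]], width=4, height=4)
smoothed = denoise([0.0, 3.0, 0.0])
```

In practice each modality would be routed to the operation that suits it before the result is passed to the joint representation model.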
It should be noted that, for the functions that each module of the cross-modal privacy semantic retrieval system provided by this embodiment can realize and the technical effects that can correspondingly be achieved, reference may be made to the description of the specific implementations in the embodiments of the cross-modal privacy semantic retrieval method of the present invention; for brevity, they are not described again here.
EXAMPLE six
Based on the same inventive concept, the present embodiment provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, etc., on which a computer program is stored, the computer program being executable by one or more processors, and the computer program, when executed by the processors, implementing all or part of the steps of the various embodiments of the cross-modal privacy semantic retrieval method of the present invention.
It should be noted that the above-mentioned serial numbers of the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments. The above description is only an alternative embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A cross-modal privacy semantic retrieval method, applied to a server, the method comprising the following steps:
receiving a dense-state semantic representation vector sent by a client; the dense-state semantic representation vector is obtained by the client performing semantic feature extraction on multi-modal data based on a multi-modal joint representation model to obtain a semantic representation vector, and then encrypting the semantic representation vector;
determining semantic retrieval keywords associated with the dense-state semantic representation vector according to a preset retrieval index table; the preset retrieval index table comprises a mapping relation between a semantic retrieval keyword and a data address of stored data;
and searching the preset retrieval index table for a data address corresponding to the semantic retrieval keyword to obtain a dense-state semantic retrieval result, and sending the dense-state semantic retrieval result to the client so that the client decrypts and displays the dense-state semantic retrieval result.
2. The cross-modal privacy semantic retrieval method of claim 1 wherein the step of determining semantic retrieval keywords associated with the dense-state semantic representation vector according to a preset retrieval index table comprises:
determining, according to the preset retrieval index table, relevance values between the dense-state semantic representation vector and all retrieval keywords in the preset retrieval index table;
determining the semantic retrieval keywords associated with the dense-state semantic representation vector according to the relevance values; the semantic retrieval keywords comprise retrieval keywords sorted according to the relevance values.
3. The cross-modal privacy semantic retrieval method of claim 1 wherein the step of receiving a dense-state semantic representation vector sent by a client is preceded by the method further comprising:
receiving dense-state data to be stored and corresponding dense-state keywords sent by the client; the dense-state data to be stored is obtained by the client encrypting the data to be stored, and the dense-state keywords are obtained by the client performing semantic feature extraction on the data to be stored based on a multi-modal joint representation model to obtain keywords, and then encrypting the keywords;
storing the dense-state data to be stored, and acquiring the stored data and the data address thereof;
and constructing an index table according to the dense-state keywords and the data addresses of the stored data to obtain the preset retrieval index table.
4. The cross-modal privacy semantic retrieval method of claim 3, wherein the step of constructing an index table according to the dense-state keywords and the data addresses of the stored data to obtain a preset retrieval index table comprises:
constructing an index table according to the dense-state keywords and the data addresses of the stored data to obtain a retrieval index table;
determining similarity values among the dense-state keywords according to the retrieval index table;
carrying out clustering processing according to the similarity values to obtain semantic retrieval keywords; the semantic retrieval keywords comprise a cluster of dense-state keywords of which the similarity value is smaller than a preset threshold value;
and obtaining the preset retrieval index table according to the mapping relation between the semantic retrieval keywords and the corresponding data addresses of the stored data.
5. A cross-modal privacy semantic retrieval method, applied to a client, the method comprising the following steps:
obtaining multi-modal data;
extracting semantic features of the multi-modal data through a multi-modal joint representation model to obtain a semantic representation vector;
encrypting the semantic representation vector to obtain a dense-state semantic representation vector;
sending the dense-state semantic representation vector to a server so that the server determines a semantic retrieval keyword associated with the dense-state semantic representation vector according to a preset retrieval index table, searches a data address corresponding to the semantic retrieval keyword in the preset retrieval index table to obtain a dense-state semantic retrieval result, and sends the dense-state semantic retrieval result to the client; the preset retrieval index table comprises a mapping relation between a semantic retrieval keyword and a data address of stored data;
and receiving a dense semantic retrieval result sent by the server, and decrypting and displaying the dense semantic retrieval result.
6. The cross-modal privacy semantic retrieval method of claim 5, wherein the step of performing semantic feature extraction on the multi-modal data through a multi-modal joint representation model to obtain a semantic representation vector comprises:
preprocessing the multi-modal data to obtain preprocessed multi-modal data; wherein the preprocessing comprises any one of format switching, text encoding, size scaling and noise elimination;
inputting the preprocessed multi-modal data into the multi-modal joint representation model, and outputting semantic feature vectors; wherein the multi-modal joint representation model comprises a representation learning model that performs modal transformation on the multi-modal data.
7. A server, characterized in that the server comprises a memory and a processor, the memory having stored thereon a cross-modal privacy semantic retrieval program that, when executed by the processor, implements a cross-modal privacy semantic retrieval method according to any one of claims 1 to 5.
8. A client, comprising a memory and a processor, the memory having stored thereon a cross-modal privacy semantic retrieval program that, when executed by the processor, implements a cross-modal privacy semantic retrieval method as recited in claim 6.
9. A cross-modal privacy semantic retrieval system, the system comprising:
the server of claim 7;
the client of claim 8;
the server is in communication connection with the client.
10. A computer-readable storage medium having stored thereon a computer program executable by one or more processors to implement a cross-modal privacy semantic retrieval method according to any one of claims 1 to 6.
CN202210089487.XA 2022-01-25 2022-01-25 Cross-modal privacy semantic retrieval method, system and storage medium Pending CN114519202A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210089487.XA CN114519202A (en) 2022-01-25 2022-01-25 Cross-modal privacy semantic retrieval method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210089487.XA CN114519202A (en) 2022-01-25 2022-01-25 Cross-modal privacy semantic retrieval method, system and storage medium

Publications (1)

Publication Number Publication Date
CN114519202A (en) 2022-05-20

Family

ID=81596595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210089487.XA Pending CN114519202A (en) 2022-01-25 2022-01-25 Cross-modal privacy semantic retrieval method, system and storage medium

Country Status (1)

Country Link
CN (1) CN114519202A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117113385A (en) * 2023-10-25 2023-11-24 成都乐超人科技有限公司 Data extraction method and system applied to user information encryption
CN117113385B (en) * 2023-10-25 2024-03-01 成都乐超人科技有限公司 Data extraction method and system applied to user information encryption
CN117951745A (en) * 2024-03-25 2024-04-30 腾讯科技(深圳)有限公司 Database construction method, device, equipment, storage medium and program product
CN117951745B (en) * 2024-03-25 2024-07-05 腾讯科技(深圳)有限公司 Database construction method, device, equipment, storage medium and program product

Similar Documents

Publication Publication Date Title
Yuan et al. SEISA: Secure and efficient encrypted image search with access control
EP3855324A1 (en) Associative recommendation method and apparatus, computer device, and storage medium
CN110597963B (en) Expression question-answering library construction method, expression search device and storage medium
US9197613B2 (en) Document processing method and system
CN114519202A (en) Cross-modal privacy semantic retrieval method, system and storage medium
US8819408B2 (en) Document processing method and system
CN109992978B (en) Information transmission method and device and storage medium
WO2020206910A1 (en) Product information pushing method, apparatus, computer device, and storage medium
CN111026788A (en) Homomorphic encryption-based multi-keyword ciphertext sorting and retrieving method in hybrid cloud
CN113434636B (en) Semantic-based approximate text searching method, semantic-based approximate text searching device, computer equipment and medium
CN111400513B (en) Data processing method, device, computer equipment and storage medium
CN115017107A (en) Data retrieval method and device based on privacy protection, computer equipment and medium
CN109348262B (en) Calculation method, device, equipment and storage medium for anchor similarity
CN114528588A (en) Cross-modal privacy semantic representation method, device, equipment and storage medium
Gong et al. A privacy-preserving image retrieval method based on improved bovw model in cloud environment
CN112598039A (en) Method for acquiring positive sample in NLP classification field and related equipment
CN116644146A (en) Document searching method, device and system, electronic equipment and storage medium
CN108319659B (en) Social contact discovery method based on encrypted image quick search
CN108509059B (en) Information processing method, electronic equipment and computer storage medium
CN116775980B (en) Cross-modal searching method and related equipment
CN114398883B (en) Presentation generation method and device, computer readable storage medium and server
CN116186708A (en) Class identification model generation method, device, computer equipment and storage medium
Fidaleo et al. Decoherence for Markov chains
CN115129976A (en) Resource recall method, device, equipment and storage medium
CN115203391A (en) Information retrieval method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination