CN118051653B - Multi-mode data retrieval method, system and medium based on semantic association

Info

Publication number: CN118051653B
Application number: CN202410451457.8A
Authority: CN
Inventors: 赵汝强; 李礼红; 汤冬儿; 朱栩; 邵德伟; 江晓锋; 唐庆宁; 陈铭
Original assignee: Guangzhou Yunqu Information Technology Co ltd
Current assignee: Guangzhou Yunqu Information Technology Co ltd
Priority date: 2024-04-16
Filing date: 2024-04-16
Publication date: 2024-07-05
Anticipated expiration: 2044-04-16
Also published as: CN118051653A

Abstract

The application provides a multi-modal data retrieval method, system and medium based on semantic association. The method comprises the following steps: acquiring data to be retrieved in different modalities and performing semantic recognition to generate semantic feature data for each modality; classifying the semantic feature data to generate multi-modal semantic feature data clusters of different categories; performing semantic recognition on user retrieval data and processing the result to obtain expanded retrieval keyword data; matching the expanded retrieval keyword data against the multi-modal semantic feature data clusters to obtain retrieval result data; processing user history retrieval log data to obtain a user retrieval preference model; and performing cross-modal semantic fusion on the retrieval result data, processing the fused data according to the user retrieval preference model to obtain optimized semantic fusion result data, and sending the optimized semantic fusion result data to the user. The method thereby retrieves multi-modal data quickly and adapts accurately to the user's retrieval requirements.

Description

Multi-mode data retrieval method, system and medium based on semantic association

Technical Field

The application relates to the technical field of big data and multi-modal data retrieval, in particular to a multi-modal data retrieval method, a system and a medium based on semantic association.

Background

Multi-modal data retrieval takes a single-modality query term as input and returns related multi-modal data. However, because multi-modal data sets are very large, retrieval is often slow and inefficient, and no technology is yet available that both retrieves massive multi-modal data quickly and adapts accurately to users' retrieval requirements.

In view of the above problems, an effective technical solution is currently needed.

Disclosure of Invention

The application aims to provide a multi-modal data retrieval method, system and medium based on semantic association. The multi-modal data is first classified to enable fast retrieval. The retrieval keywords are then expanded according to the semantic recognition result of the user retrieval data, so that a more accurate and comprehensive multi-modal retrieval result is obtained. The multi-modal retrieval result is then semantically fused to improve retrieval precision. Finally, the retrieval result is optimized according to the user's retrieval preferences, so that it accurately matches the user's retrieval requirements.

In a first aspect, the application provides a multi-modal data retrieval method based on semantic association, which comprises the following steps:

acquiring data to be retrieved of different modes, carrying out semantic recognition, and generating semantic feature data of different modes;

classifying the semantic feature data of different modes to generate multi-mode semantic feature data clusters of different categories;

acquiring user retrieval data, carrying out semantic recognition, and processing according to semantic recognition results to acquire expanded retrieval keyword data;

matching and identifying the expanded search keyword data and the multi-mode semantic feature data cluster to obtain search result data;

acquiring user history retrieval log data, and processing according to the user history retrieval log data to acquire a user retrieval preference model;

and performing cross-modal semantic fusion on the search result data, processing according to the user search preference model, obtaining optimized semantic fusion result data, and transmitting the optimized semantic fusion result data to a user.

Optionally, in the multi-mode data retrieval method based on semantic association according to the present application, the obtaining data to be retrieved of different modes and performing semantic recognition, generating semantic feature data of different modes includes:

acquiring data to be retrieved of different modalities and the corresponding modality data;

inputting the modality data into a preset semantic feature extraction algorithm library for matching and identification to obtain an adapted semantic feature extraction algorithm;

and extracting semantic features from the data to be retrieved of each modality according to the adapted semantic feature extraction algorithm corresponding to that modality, to generate semantic feature data of different modalities.

Optionally, in the multi-mode data retrieval method based on semantic association according to the present application, the classifying the semantic feature data of different modes to generate multi-mode semantic feature data clusters of different categories includes:

mapping the semantic feature data of different modalities into the same semantic space by using a preset cross-modal mapping model;

calculating the spatial distances between the semantic feature data of different modalities by using a preset spatial distance measurement algorithm to obtain spatial distance metric values;

classifying the semantic feature data of different modalities according to the spatial distance metric values to generate multi-modal semantic feature data clusters of different categories;

and adding a semantic tag to each of the multi-modal semantic feature data clusters of different categories, and assembling all the semantic tags into a multi-modal semantic tag data set.

Optionally, in the multi-mode data retrieval method based on semantic association according to the present application, the obtaining user retrieval data, performing semantic recognition on the user retrieval data, and processing to obtain expanded retrieval keyword data includes:

acquiring user retrieval data and extracting keywords to acquire retrieval keyword data;

carrying out semantic recognition on the user retrieval data, and extracting retrieval keyword demand data according to a semantic recognition result;

and carrying out keyword expansion on the search keyword data according to the search keyword demand data to obtain expanded search keyword data.

Optionally, in the multi-mode data retrieval method based on semantic association according to the present application, the matching and identifying the expanded search keyword data and the multi-mode semantic feature data cluster to obtain search result data includes:

matching and identifying the expanded retrieval keyword data against the multi-modal semantic tag data set by using a preset multi-modal contrastive learning algorithm to obtain a target semantic tag;

taking the multi-modal semantic feature data cluster corresponding to the target semantic tag as a target multi-modal semantic feature data cluster;

calculating the spatial distance between the semantic feature data of each modality in the target multi-modal semantic feature data cluster and the expanded retrieval keyword data by using a preset spatial distance measurement algorithm to obtain retrieval space distance metric values;

comparing each retrieval space distance metric value with a preset retrieval space distance metric threshold, and taking the semantic feature data whose retrieval space distance metric value meets the threshold comparison requirement as target semantic feature data;

summing the retrieval space distance metric values of all the target semantic feature data in the same modality to obtain a comprehensive retrieval space distance metric value for that modality;

comparing the comprehensive retrieval space distance metric values corresponding to the target semantic feature data of different modalities;

if the comparison result exceeds a preset comparison result threshold, differentially adjusting the preset retrieval space distance metric thresholds corresponding to the semantic feature data of different modalities to obtain optimized retrieval space distance metric thresholds;

comparing the retrieval space distance metric values corresponding to the semantic feature data of different modalities with the respective optimized retrieval space distance metric thresholds, and taking the semantic feature data whose retrieval space distance metric value meets the threshold comparison requirement as updated target semantic feature data;

and taking the data to be retrieved corresponding to the updated target semantic feature data as retrieval result data.

Optionally, in the multi-mode data retrieval method based on semantic association according to the present application, the obtaining the user history retrieval log data, and obtaining the user retrieval preference model according to the user history retrieval log data processing, includes:

obtaining user history retrieval log data, comprising: query modification data, click behavior data, browsing history data, and favorites (saved-item) data;

and processing the query modification data, the click behavior data, the browsing history data and the favorites data by using a preset machine learning algorithm to obtain a user retrieval preference model.

Optionally, in the multi-mode data retrieval method based on semantic association according to the present application, the cross-mode semantic fusion is performed on the retrieval result data, and processing is performed according to the user retrieval preference model, so as to obtain optimized semantic fusion result data, and the optimized semantic fusion result data is sent to a user, including:

performing cross-modal semantic fusion on the search result data by using a preset cross-modal semantic fusion algorithm to obtain semantic fusion result data;

and inputting the semantic fusion result data into the user retrieval preference model for processing, obtaining optimized semantic fusion result data and sending the optimized semantic fusion result data to a user.

In a second aspect, the present application provides a multi-modal data retrieval system based on semantic association, the system comprising a memory and a processor, wherein the memory stores a program of a multi-modal data retrieval method based on semantic association, and the program, when executed by the processor, implements the following steps:

Optionally, in the multi-mode data retrieval system based on semantic association according to the present application, the obtaining data to be retrieved of different modes and performing semantic recognition, generating semantic feature data of different modes includes:

In a third aspect, the present application also provides a computer readable storage medium storing a multi-modal data retrieval method program based on semantic association, which when executed by a processor, implements the steps of the multi-modal data retrieval method based on semantic association as described in any one of the above.

As can be seen from the above, the multi-modal data retrieval method, system and medium based on semantic association provided by the application first classify the multi-modal data to enable fast retrieval, then expand the retrieval keywords according to the semantic recognition result of the user retrieval data to obtain a more accurate and comprehensive multi-modal retrieval result, then semantically fuse the multi-modal retrieval result to improve retrieval precision, and finally optimize the retrieval result according to the user's retrieval preferences, so that the result accurately matches the user's retrieval requirements.

Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be considered as limiting its scope. A person skilled in the art can obtain other related drawings from these drawings without inventive effort.

FIG. 1 is a flowchart of a multi-modal data retrieval method based on semantic association according to an embodiment of the present application;

FIG. 2 is a flowchart of generating semantic feature data of different modalities according to a multi-modal data retrieval method based on semantic association provided by an embodiment of the present application;

FIG. 3 is a flowchart of generating multi-mode semantic feature data clusters of different categories according to a multi-mode data retrieval method based on semantic association provided by an embodiment of the present application;

FIG. 4 is a flowchart of obtaining expanded search keyword data according to a multi-mode data search method based on semantic association provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application as claimed, but is merely representative of selected embodiments of the application. All other embodiments that a person skilled in the art can obtain without inventive effort are intended to be within the scope of the present application.

It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.

Referring to FIG. 1, FIG. 1 is a flowchart of a multi-modal data retrieval method based on semantic association according to some embodiments of the present application. The method is used in terminal equipment such as computers and mobile phone terminals, and comprises the following steps:

S11, acquiring data to be retrieved in different modes, carrying out semantic recognition, and generating semantic feature data in different modes;

S12, classifying the semantic feature data of different modes to generate multi-mode semantic feature data clusters of different categories;

S13, acquiring user retrieval data, carrying out semantic recognition, and processing according to semantic recognition results to acquire expanded retrieval keyword data;

S14, matching and identifying the expanded search keyword data and the multi-mode semantic feature data cluster to obtain search result data;

S15, acquiring user history retrieval log data, and processing according to the user history retrieval log data to acquire a user retrieval preference model;

S16, performing cross-modal semantic fusion on the search result data, processing according to the user search preference model, obtaining optimized semantic fusion result data, and sending the optimized semantic fusion result data to the user.

It should be noted that the multi-modal data is first classified to enable fast retrieval. The retrieval keywords are then expanded according to the semantic recognition result of the user retrieval data, so that more accurate and comprehensive multi-modal data is obtained. The multi-modal retrieval result is then semantically fused to improve retrieval precision. Finally, the retrieval result is optimized according to the user's retrieval preferences, so that it accurately matches the user's retrieval requirements.

Referring to FIG. 2, FIG. 2 is a flowchart of generating semantic feature data of different modalities in a multi-modal data retrieval method based on semantic association according to some embodiments of the present application. According to an embodiment of the application, acquiring data to be retrieved in different modalities and performing semantic recognition to generate semantic feature data of different modalities comprises the following steps:

S21, acquiring data to be retrieved of different modes and corresponding mode data thereof;

S22, inputting the modal data into a preset semantic feature extraction algorithm library for matching recognition to obtain an adaptive semantic feature extraction algorithm;

S23, performing semantic feature extraction on the data to be retrieved of different modes according to the adaptive semantic feature extraction algorithm corresponding to the different modes, and generating semantic feature data of different modes.

It should be noted that, in order to improve the accuracy of extracting semantic features of data in different modes, different semantic feature extraction algorithms need to be adopted for data in different modes.
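Steps S21 to S23 can be illustrated with a minimal sketch in which the preset semantic feature extraction algorithm library is modelled as a lookup table keyed by modality. The extractor functions `extract_text` and `extract_image`, the table `EXTRACTOR_LIBRARY` and the toy feature vectors are hypothetical stand-ins introduced only for illustration; the application does not specify which extraction algorithms the library contains.

```python
from typing import Callable, Dict, List

# Hypothetical toy extractors standing in for real per-modality encoders.
def extract_text(raw: str) -> List[float]:
    # Toy features: character count and word count.
    return [float(len(raw)), float(raw.count(" ") + 1)]

def extract_image(raw: List[List[int]]) -> List[float]:
    # Toy features: mean and maximum pixel intensity.
    pixels = [p for row in raw for p in row]
    return [sum(pixels) / len(pixels), float(max(pixels))]

# Assumed model of the "preset semantic feature extraction algorithm library".
EXTRACTOR_LIBRARY: Dict[str, Callable] = {
    "text": extract_text,
    "image": extract_image,
}

def extract_features(modality: str, raw) -> List[float]:
    """Match the modality against the library and run the adapted extractor."""
    try:
        extractor = EXTRACTOR_LIBRARY[modality]
    except KeyError:
        raise ValueError(f"no extractor registered for modality {modality!r}")
    return extractor(raw)

items = [("text", "red sports car"), ("image", [[0, 128], [255, 64]])]
features = [(modality, extract_features(modality, raw)) for modality, raw in items]
```

Keeping the library as a table means a new modality can be supported by registering one more extractor, without changing the matching step.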

Referring to FIG. 3, FIG. 3 is a flowchart of generating multi-modal semantic feature data clusters of different categories according to a multi-modal data retrieval method based on semantic association according to some embodiments of the present application. According to an embodiment of the present application, classifying the semantic feature data of different modalities to generate multi-modal semantic feature data clusters of different categories includes:

S31, mapping semantic feature data of different modes into the same semantic space by using a preset cross-mode mapping model;

S32, calculating the spatial distances of the semantic feature data of different modes by using a preset spatial distance measurement algorithm to obtain a spatial distance measurement value;

S33, classifying semantic feature data of different modes according to the spatial distance metric value to generate multi-mode semantic feature data clusters of different categories;

S34, respectively adding semantic tags to the multi-mode semantic feature data clusters of different categories, and assembling all the semantic tags into a multi-mode semantic tag data set.

It should be noted that, in order to determine the semantic similarity of data of different modalities, the data must be mapped into the same semantic space, where the semantic space refers to a structured space for storing semantic feature data. The similarity of semantic features is determined by calculating the spatial distance between the semantic feature data of different modalities: the smaller the spatial distance, the higher the similarity. To improve retrieval speed, semantic feature data with high semantic similarity is classified into the same category.
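The classification in steps S32 to S34 can be sketched as follows, assuming the feature vectors have already been mapped into a shared semantic space. The application does not name the distance measure or the classification algorithm, so Euclidean distance and a simple greedy "leader" clustering are used here purely as assumptions; the item names, the `max_distance` parameter and the `cluster_N` tags are likewise hypothetical.

```python
import math
from typing import Dict, List, Tuple

# (item name, modality, vector already mapped into the shared semantic space)
Item = Tuple[str, str, Tuple[float, ...]]

def cluster_by_distance(items: List[Item], max_distance: float) -> List[List[Item]]:
    """Greedy leader clustering: an item joins the first cluster whose first
    member lies within max_distance, otherwise it starts a new cluster."""
    clusters: List[List[Item]] = []
    for item in items:
        vector = item[2]
        for cluster in clusters:
            leader_vector = cluster[0][2]
            if math.dist(vector, leader_vector) <= max_distance:
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters

items: List[Item] = [
    ("cat.txt", "text", (0.0, 0.1)),
    ("cat.jpg", "image", (0.1, 0.0)),
    ("car.txt", "text", (5.0, 5.0)),
    ("car.wav", "audio", (5.1, 4.9)),
]
clusters = cluster_by_distance(items, max_distance=1.0)

# Attach a placeholder semantic tag to each cluster; together the tags play
# the role of the multi-modal semantic tag data set.
tagged: Dict[str, List[str]] = {
    f"cluster_{i}": [name for name, _, _ in cluster]
    for i, cluster in enumerate(clusters)
}
```

Each resulting cluster mixes modalities, which is the point of measuring distance only after the cross-modal mapping.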

Referring to FIG. 4, FIG. 4 is a flowchart of obtaining expanded search keyword data according to a multi-modal data search method based on semantic association according to some embodiments of the present application. According to an embodiment of the present application, the obtaining user search data, performing semantic recognition on the user search data, and processing to obtain expanded search keyword data includes:

S41, acquiring user retrieval data and extracting keywords to acquire retrieval keyword data;

S42, carrying out semantic recognition on the user retrieval data, and extracting retrieval keyword demand data according to a semantic recognition result;

S43, carrying out keyword expansion on the search keyword data according to the search keyword demand data to obtain expanded search keyword data.

It should be noted that the retrieval keywords are expanded according to the semantic recognition result of the user retrieval data, so that more accurate and comprehensive multi-modal data is obtained.
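A minimal sketch of steps S41 to S43 is given below. The stop-word list `STOPWORDS` and the related-term table `RELATED_TERMS` are hypothetical; the table merely stands in for the output of the semantic recognition step, whose actual form the application does not specify.

```python
from typing import Dict, List

# Assumed stop-word list used when extracting keywords from the query.
STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "with"}

# Hypothetical stand-in for the retrieval keyword demand data produced by
# semantic recognition: each keyword maps to semantically related terms.
RELATED_TERMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "red": ["crimson"],
}

def extract_keywords(query: str) -> List[str]:
    """Lower-case the query, strip punctuation, and drop stop words and repeats."""
    keywords: List[str] = []
    for token in query.lower().split():
        token = token.strip(".,!?")
        if token and token not in STOPWORDS and token not in keywords:
            keywords.append(token)
    return keywords

def expand_keywords(keywords: List[str], related: Dict[str, List[str]]) -> List[str]:
    """Append related terms after the original keywords, without duplicates."""
    expanded = list(keywords)
    for keyword in keywords:
        for term in related.get(keyword, []):
            if term not in expanded:
                expanded.append(term)
    return expanded

keywords = extract_keywords("Pictures of a red car")
expanded = expand_keywords(keywords, RELATED_TERMS)
```

In this sketch the original keywords stay at the front of the expanded list, so a downstream matcher could still weight them more heavily than the added terms if desired.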

According to an embodiment of the present invention, the matching and identifying the expanded search keyword data and the multi-mode semantic feature data cluster to obtain search result data includes:

It should be noted that the retrieval space distance metric value represents semantic similarity: the smaller the value, the higher the semantic similarity between the two pieces of data. When the target semantic feature data is first obtained, the same preset retrieval space distance metric threshold is applied to the data of every modality. If the comprehensive retrieval space distance metric values of different modalities differ widely, the semantic feature data of the modality with the smaller comprehensive value is more similar to the retrieval keywords, and the data of that modality better meets the user's requirements. The preset thresholds of the different modalities are therefore adjusted differentially so that more data meeting the user's requirements is obtained.
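The two-pass threshold comparison described above can be sketched as follows. The application does not state how the differential adjustment is computed, so the rule used here (loosen the threshold of the modality with the smallest summed distance by a fixed `step` and tighten the others by the same amount) is an assumption, as are the parameter names `gap_limit` and `step` and all sample values.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Match = Tuple[str, str, float]  # (item name, modality, distance to the query)

def filter_by_threshold(matches: List[Match], thresholds: Dict[str, float]) -> List[Match]:
    """Keep items whose distance does not exceed their modality's threshold."""
    return [m for m in matches if m[2] <= thresholds[m[1]]]

def summed_distance_by_modality(targets: List[Match]) -> Dict[str, float]:
    """Sum the distances of the selected items per modality (comprehensive value)."""
    totals: Dict[str, float] = defaultdict(float)
    for _, modality, distance in targets:
        totals[modality] += distance
    return dict(totals)

def adjust_thresholds(thresholds: Dict[str, float], totals: Dict[str, float],
                      gap_limit: float, step: float) -> Dict[str, float]:
    """Assumed rule: if the comprehensive values differ by more than gap_limit,
    loosen the best-matching modality's threshold and tighten the others."""
    if len(totals) < 2:
        return dict(thresholds)
    best = min(totals, key=totals.get)
    if max(totals.values()) - totals[best] <= gap_limit:
        return dict(thresholds)
    return {
        modality: (t + step if modality == best else max(0.0, t - step))
        for modality, t in thresholds.items()
    }

matches: List[Match] = [
    ("a.txt", "text", 0.10), ("b.txt", "text", 0.20), ("c.txt", "text", 0.55),
    ("x.jpg", "image", 0.45), ("y.jpg", "image", 0.48), ("z.jpg", "image", 0.35),
]
preset = {"text": 0.5, "image": 0.5}          # same threshold for every modality
first_pass = filter_by_threshold(matches, preset)
totals = summed_distance_by_modality(first_pass)
optimized = adjust_thresholds(preset, totals, gap_limit=0.5, step=0.1)
results = [name for name, _, _ in filter_by_threshold(matches, optimized)]
```

With these sample values the text modality has the smaller comprehensive value, so its threshold is loosened and `c.txt` is admitted on the second pass, while two of the weaker image matches are dropped.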

According to an embodiment of the present invention, the obtaining user history retrieval log data, and obtaining a user retrieval preference model according to user history retrieval log data processing, includes:

The user retrieval preference model is generated by determining the user's retrieval preferences from the user's routine retrieval logs.
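As an illustration only, a preference model can be approximated by weighted event counting over the log. This is a deliberately simple stand-in for the preset machine learning algorithm, which the application does not specify; the log record format, the event names and the weights in `EVENT_WEIGHTS` are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Assumed weights: a saved item signals stronger interest than a click,
# and a click signals stronger interest than a page view.
EVENT_WEIGHTS = {
    "favorite": 2.0,
    "click": 1.0,
    "browse": 0.5,
    "query_modification": 0.25,
}

LogRecord = Tuple[str, str]  # (event type, modality of the item involved)

def build_preference_model(log: List[LogRecord]) -> Dict[str, float]:
    """Return a normalised preference weight per modality (weights sum to 1)."""
    scores: Counter = Counter()
    for event, modality in log:
        scores[modality] += EVENT_WEIGHTS.get(event, 0.0)
    total = sum(scores.values())
    if total == 0:
        return {}
    return {modality: score / total for modality, score in scores.items()}

log: List[LogRecord] = [
    ("click", "image"),
    ("favorite", "image"),
    ("browse", "text"),
    ("click", "text"),
    ("query_modification", "audio"),
]
preferences = build_preference_model(log)
```

A trained model could replace `build_preference_model` without changing how the resulting per-modality weights are consumed downstream.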

According to an embodiment of the present invention, the cross-modal semantic fusion is performed on the search result data, and processing is performed according to the user search preference model, so as to obtain optimized semantic fusion result data, and the optimized semantic fusion result data is sent to a user, including:

It should be noted that data of different modalities provides different information: for example, an image provides rich visual information, speech provides sound features, and text contains detailed semantic information. Because this information is complementary, it can be fused to obtain a more comprehensive and accurate data representation. The retrieval results of the different modalities are therefore fused and then processed by the user preference model, so that the results accurately match the user's retrieval requirements.
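The final step can be sketched as score-level (late) fusion: results from all modalities are merged into one list, and each item's similarity is boosted by the user's preference weight for its modality. This is an assumed simplification, since the application names neither the preset cross-modal semantic fusion algorithm nor the way the preference model is applied; the scoring formula and sample values below are hypothetical.

```python
from typing import Dict, List, Tuple

Match = Tuple[str, str, float]  # (item name, modality, distance to the query)

def fuse_and_rank(results: List[Match], preferences: Dict[str, float]) -> List[Tuple[str, str]]:
    """Merge per-modality results into one ranking, highest score first.

    Assumed scoring: similarity = 1 / (1 + distance), multiplied by
    (1 + preference weight of the item's modality)."""
    scored = []
    for name, modality, distance in results:
        similarity = 1.0 / (1.0 + distance)
        boost = 1.0 + preferences.get(modality, 0.0)
        scored.append((similarity * boost, name, modality))
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return [(name, modality) for _, name, modality in scored]

results: List[Match] = [
    ("a.txt", "text", 0.10),
    ("z.jpg", "image", 0.35),
    ("b.txt", "text", 0.20),
]
preferences = {"image": 0.7, "text": 0.2}
ranking = fuse_and_rank(results, preferences)
```

With these sample values the image result ranks first even though it is farther from the query than either text result, because the user's strong preference for images outweighs the difference in distance.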

The invention also discloses a multi-mode data retrieval system based on semantic association, which comprises a memory and a processor, wherein the memory stores a multi-mode data retrieval method program based on semantic association, and the multi-mode data retrieval method program based on semantic association realizes the following steps when being executed by the processor:

According to the embodiment of the invention, the method for acquiring the data to be retrieved in different modes and carrying out semantic recognition to generate semantic feature data in different modes comprises the following steps:

According to an embodiment of the present invention, classifying the semantic feature data of different modalities to generate multi-modal semantic feature data clusters of different categories includes:

According to an embodiment of the present invention, the obtaining user search data, performing semantic recognition on the user search data, and processing to obtain expanded search keyword data includes:

A third aspect of the present invention provides a readable storage medium storing therein a multi-modal data retrieval method program based on semantic association, which when executed by a processor, implements the steps of the multi-modal data retrieval method based on semantic association as described in any one of the above.

According to the multi-modal data retrieval method, system and medium based on semantic association, the multi-modal data is first classified to enable fast retrieval; the retrieval keywords are then expanded according to the semantic recognition result of the user retrieval data to obtain a more accurate and comprehensive multi-modal retrieval result; the multi-modal retrieval result is semantically fused to improve retrieval precision; and finally the retrieval result is optimized according to the user's retrieval preferences, so that it accurately matches the user's retrieval requirements.

In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical or in other forms.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.

Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes a removable storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code.

Alternatively, if the above-described integrated units of the invention are implemented in the form of software functional modules and sold or used as separate products, they may be stored in a readable storage medium. Based on such understanding, the technical solution of the embodiments of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. The storage medium includes a removable storage device, ROM, RAM, a magnetic disk, an optical disk, or any other medium capable of storing program code.

Claims

1. The multi-mode data retrieval method based on semantic association is characterized by comprising the following steps of:

Performing cross-modal semantic fusion on the search result data, processing according to the user search preference model, obtaining optimized semantic fusion result data, and sending the optimized semantic fusion result data to a user;

The obtaining the data to be retrieved of different modes and carrying out semantic recognition to generate semantic feature data of different modes comprises the following steps:

respectively extracting semantic features of the data to be searched in different modes according to the adaptive semantic feature extraction algorithm corresponding to the different modes to generate semantic feature data in different modes;

classifying the semantic feature data of different modalities to generate multi-modality semantic feature data clusters of different categories, including:

Respectively adding semantic tags to the multi-modal semantic feature data clusters of different categories, and generating a multi-modal semantic tag data set by all semantic tag sets;

The step of matching and identifying the expanded search keyword data and the multi-mode semantic feature data cluster to obtain search result data comprises the following steps:

2. The multi-modal data retrieval method based on semantic association according to claim 1, wherein the obtaining user retrieval data, performing semantic recognition on the user retrieval data, and processing to obtain expanded retrieval keyword data, includes:

3. The semantic association-based multimodal data retrieval method according to claim 2, wherein the obtaining the user history retrieval log data and obtaining the user retrieval preference model according to the user history retrieval log data processing comprises:

4. The multi-modal data retrieval method based on semantic association according to claim 3, wherein the cross-modal semantic fusion is performed on the retrieval result data, and processing is performed according to the user retrieval preference model, so as to obtain optimized semantic fusion result data, and the optimized semantic fusion result data is sent to a user, and the method comprises the following steps:

5. The multi-mode data retrieval system based on semantic association is characterized by comprising a memory and a processor, wherein the memory stores a program of a multi-mode data retrieval method based on semantic association, and the program of the multi-mode data retrieval method based on semantic association realizes the following steps when being executed by the processor:

6. A computer readable storage medium, characterized in that the computer readable storage medium stores therein a semantic association based multimodal data retrieval program, which when executed by a processor, implements the steps of the semantic association based multimodal data retrieval method according to any of claims 1 to 4.