CN117033724B - Multi-mode data retrieval method based on semantic association - Google Patents

Publication number
CN117033724B
Authority
CN
China
Prior art keywords
data
retrieval
mode
retrieval system
modal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311071657.2A
Other languages
Chinese (zh)
Other versions
CN117033724A (en)
Inventor
张鸡环
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Joysim Technology Co ltd
Original Assignee
Guangzhou Joysim Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Joysim Technology Co ltd filed Critical Guangzhou Joysim Technology Co ltd
Priority to CN202311071657.2A priority Critical patent/CN117033724B/en
Publication of CN117033724A publication Critical patent/CN117033724A/en
Application granted granted Critical
Publication of CN117033724B publication Critical patent/CN117033724B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90332 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9035 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multi-mode data retrieval method based on semantic association, which relates to the technical field of multi-mode data retrieval and comprises the following steps: collecting modal data information and retrieval evaluation index information while the multi-modal data retrieval system based on semantic association is operating; comprehensively analyzing this information to generate accuracy evaluation indexes; establishing a data set of the accuracy evaluation indexes and comprehensively analyzing the indexes in the set to generate operation state signals; and sending a different prompt for each operation state signal. By evaluating the accuracy of the multi-mode data retrieval system during semantic association modeling, the invention allows the system to sense a reduction in accuracy in time and to prompt the relevant maintenance personnel to take corresponding maintenance and optimization measures, so that the accuracy of the semantic association modeling is ensured, the model captures the semantic association among data well, a decrease in the correlation between the retrieval results returned by the system and the user query is effectively prevented, and misleading retrieval results are effectively prevented from being provided to the user.

Description

Multi-mode data retrieval method based on semantic association
Technical Field
The invention relates to the technical field of multi-mode data retrieval, in particular to a multi-mode data retrieval method based on semantic association.
Background
The multi-modal data retrieval system based on semantic association is a comprehensive software system that aims to establish semantic association among various data types (text, images, audio and the like), so that related cross-modal data can be retrieved when a user provides a query; such a system can provide the user with a more accurate, comprehensive and intelligent information retrieval experience.
The multi-mode data retrieval method based on semantic association generally comprises four processes: data preprocessing and feature extraction, semantic association modeling, query and retrieval, and feedback and optimization. For each data modality, preprocessing and feature extraction are performed first so that the data is represented in a vector form suitable for processing. For example, for text data, natural language processing techniques (e.g., word embedding, TF-IDF, etc.) may be used to convert text to a vector representation. For image data, a Convolutional Neural Network (CNN) may be used to extract image features, while for audio data, a spectrogram or Mel-frequency cepstral coefficients (MFCC) may be used to extract audio features. The objective of semantic association modeling is to map data of different modalities into a shared semantic space. After the semantic association modeling, when a user provides a query, the query data is converted into a shared semantic representation and matched against the data in the database that has already been converted into the shared semantic representation. Finally, the semantic association of the model can be further optimized according to the feedback of the user.
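By way of illustration only, the matching of a query against data already converted into the shared semantic representation can be sketched as follows. The background does not specify a matching function; cosine similarity is assumed here, and the item names and toy three-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_items(query_vec, items, top_k=3):
    """Return the top_k (item_id, score) pairs, most similar first."""
    scored = [(item_id, cosine_similarity(query_vec, vec))
              for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical shared-space vectors for items of three different modalities.
database = {
    "image_cat": [0.9, 0.1, 0.0],
    "audio_rain": [0.0, 0.2, 0.9],
    "text_kitten": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # hypothetical query already mapped to the shared space

results = rank_items(query, database, top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

In this toy example the image and text items lie close to the query in the shared space and are returned ahead of the unrelated audio item.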
The prior art has the following defects:
Semantic association modeling is the core of the multi-modal data retrieval system. If the accuracy of the semantic association modeling is poor, the multi-modal data retrieval system cannot sense this in time. Poor semantic association modeling accuracy means that the model cannot capture the semantic association between data well, so the correlation between the retrieval results returned by the system and the user query is reduced; this affects the retrieval accuracy of the whole multi-modal data retrieval system, and misleading retrieval results may be provided to users.
The above information disclosed in the background section is only for enhancement of understanding of the background of the disclosure and therefore it may include information that does not form the prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide a multi-mode data retrieval method based on semantic association, which evaluates the accuracy of the multi-mode data retrieval system during semantic association modeling. When that accuracy is reduced, the system senses the reduction in time and prompts the relevant maintenance personnel to take corresponding maintenance and optimization measures. This ensures the accuracy of the semantic association modeling, ensures that the model captures the semantic association between data well, effectively prevents a reduced correlation between the retrieval results returned by the system and the user query from affecting the retrieval accuracy of the whole multi-mode data retrieval system, and effectively prevents misleading retrieval results from being provided to the user, so as to solve the problems described in the background.
In order to achieve the above object, the present invention provides the following technical solutions: the multi-mode data retrieval method based on semantic association comprises the following steps:
S100, acquiring a plurality of pieces of data information while the multi-mode data retrieval system based on semantic association operates, wherein the data information comprises modal data information and retrieval evaluation index information, and processing the modal data information and the retrieval evaluation index information after acquisition;
S200, comprehensively analyzing the processed modal data information and retrieval evaluation index information during operation of the multi-modal data retrieval system to generate an accuracy evaluation index;
S300, establishing a data set of a plurality of accuracy evaluation indexes generated when the multi-mode data retrieval system operates, and comprehensively analyzing the accuracy evaluation indexes in the data set to generate an operation state signal;
S400, sending different prompts according to the operation state signals generated when the multi-mode data retrieval system operates.
Preferably, the modal data information comprises a modal sample data volume balance coefficient and a modal data volume similarity degree anomaly coefficient; after acquisition, the modal sample data volume balance coefficient and the modal data volume similarity degree anomaly coefficient are respectively calibrated as $PH_{MT}$ and $XS_{MT}$. The retrieval evaluation index information comprises a retrieval recall rate anomaly concealment coefficient; after acquisition, the retrieval recall rate anomaly concealment coefficient is calibrated as $JS_{YC}$.
Preferably, the logic for obtaining the modal sample data size balance coefficient is as follows:
S101, acquiring the sample data amounts of the different modalities at the same moment during operation of the multi-mode data retrieval system, and calibrating the sample data amounts as $\beta_{SJ\,x}$, wherein x represents the number of the modality among the different modalities at the same moment during operation of the multi-mode data retrieval system, x = 1, 2, 3, 4, ..., m, and m is a positive integer;
S102, calculating the standard deviation of the sample data amounts of the different modalities at the same moment during operation of the multi-mode data retrieval system, and calibrating the standard deviation of the sample data amounts as R, wherein:
$$R = \sqrt{\frac{1}{m}\sum_{x=1}^{m}\left(\beta_{SJ\,x} - \bar{\beta}_{SJ}\right)^{2}}$$
Wherein $\bar{\beta}_{SJ}$ is the average value of the sample data amounts of the different modalities at the same moment during operation of the multi-mode data retrieval system, calculated as:
$$\bar{\beta}_{SJ} = \frac{1}{m}\sum_{x=1}^{m}\beta_{SJ\,x}$$
S103, obtaining the standard deviations of the sample data amounts generated at different moments within time T while the multi-mode data retrieval system operates, and recalibrating these standard deviations as $R_y$, wherein y represents the number of the standard deviation of the sample data amounts generated at different moments within time T, y = 1, 2, 3, 4, ..., n, and n is a positive integer;
S104, establishing a data set of the sample data amount standard deviations generated within the operation time T of the multi-mode data retrieval system, sorting the sample data amount standard deviations in the data set in order, and calibrating the maximum sample data amount standard deviation in the data set as $R_{max}$;
S105, calculating the modal sample data volume balance coefficient from the maximum sample data amount standard deviation $R_{max}$ in the data set, wherein the calculated expression is as follows:
Preferably, the logic for obtaining the similarity degree anomaly coefficient of the modal data volume is as follows:
S201, converting all modal data into vector representations;
S202, normalizing each vector so that the data of different modalities have the same weight in the distance calculation and the vectors have unit norm;
S203, for each modality, calculating the internal Euclidean distance of the modality;
For the i-th modality, assuming that its vector is expressed as $A_{iv}$, the Euclidean distance calculation formula is:
$$Distance = \sqrt{\sum_{v=1}^{p}\left(A_{iv} - A'_{iv}\right)^{2}}$$
wherein $A'_{iv}$ is the element, in the same dimension, of the other vector that corresponds to $A_{iv}$, v represents the number of the dimension within the i-th modality, v = 1, 2, 3, 4, ..., p, and p is a positive integer.
S204, acquiring the internal Euclidean distances of each modality of the multi-mode data retrieval system at different moments within time T, and calibrating the internal Euclidean distances as $Distance_j$, wherein j represents the number of the internal Euclidean distance of each modality at different moments within time T, j = 1, 2, 3, 4, ..., q, and q is a positive integer;
S205, establishing a data set of the internal Euclidean distances of each modality of the multi-mode data retrieval system within time T, sorting the internal Euclidean distances in the data set in order, and calibrating the maximum internal Euclidean distance of each modality as $Distance_{max}$;
S206, calculating the modal data volume similarity degree anomaly coefficient, wherein the calculated expression is as follows: wherein x represents the number of the modality of the multi-mode data retrieval system, x = 1, 2, 3, 4, ..., m, and m is a positive integer.
Preferably, the logic for retrieving the recall anomaly concealment coefficients is as follows:
S301, acquiring the optimal retrieval recall rate range of the multi-mode data retrieval system, and calibrating the optimal retrieval recall rate range as $\gamma_{ZH\,min} \sim \gamma_{ZH\,max}$;
S302, acquiring the retrieval recall rates of the multi-mode data retrieval system in different time periods within time T, and calibrating the retrieval recall rates as $\gamma_{ZH\,r}$, wherein r represents the number of the retrieval recall rate in the different time periods within time T, r = 1, 2, 3, 4, ..., a, and a is a positive integer;
the retrieval recall rate is calculated as: recall rate = number of relevant data items retrieved / total number of all relevant data items;
S303, calibrating each retrieval recall rate that is smaller than the optimal retrieval recall rate range $\gamma_{ZH\,min} \sim \gamma_{ZH\,max}$ as $\gamma_{ZH\,u}$, wherein u represents the number of the retrieval recall rate that is smaller than the optimal retrieval recall rate range, u = 1, 2, 3, 4, ..., e, and e is a positive integer;
S304, calculating the retrieval recall rate anomaly concealment coefficient, wherein the calculated expression is as follows:
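As a non-limiting illustration of steps S301 to S303, the recall rate of each time period can be computed from the stated formula and the periods falling below the lower bound of the optimal range can be collected. The expression of step S304 is not reproduced in this text, so the sketch stops before the coefficient itself; the function names, the per-period counts and the lower bound of 0.8 are hypothetical.

```python
def recall_rate(relevant_retrieved, relevant_total):
    """Recall = number of relevant data retrieved / total number of relevant data."""
    if relevant_total <= 0:
        raise ValueError("relevant_total must be positive")
    return relevant_retrieved / relevant_total


def below_optimal_range(recalls, lower_bound):
    """Return (period_number, recall) for every recall below the range's lower bound.

    Period numbers start at 1, mirroring the numbering r = 1, 2, ..., a.
    """
    return [(r, value) for r, value in enumerate(recalls, start=1)
            if value < lower_bound]


# Hypothetical (relevant retrieved, relevant total) counts for four periods in T.
period_counts = [(45, 50), (30, 50), (48, 50), (20, 50)]
recalls = [recall_rate(hit, total) for hit, total in period_counts]

# Hypothetical lower bound of the optimal retrieval recall rate range.
flagged = below_optimal_range(recalls, lower_bound=0.8)
print(flagged)
```

Only the lower bound is used here because step S303 selects recall rates that are smaller than the optimal range.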
Preferably, after the modal sample data volume balance coefficient $PH_{MT}$, the modal data volume similarity degree anomaly coefficient $XS_{MT}$ and the retrieval recall rate anomaly concealment coefficient $JS_{YC}$ are obtained, an evaluation model is established, and an accuracy evaluation index $\theta_{zqd\,w}$ is generated according to the following formula:
Wherein x1, x2 and x3 are respectively the preset scale coefficients of the modal sample data volume balance coefficient $PH_{MT}$, the modal data volume similarity degree anomaly coefficient $XS_{MT}$ and the retrieval recall rate anomaly concealment coefficient $JS_{YC}$, and x1, x2 and x3 are all larger than 0.
Preferably, a data set is established from the plurality of accuracy evaluation indexes generated when the multi-mode data retrieval system operates, and the data set is calibrated as F, then $F = \{\theta_{zqd\,w}\} = \{\theta_{zqd\,1}, \theta_{zqd\,2}, \ldots, \theta_{zqd\,s}\}$, wherein w = 1, 2, 3, 4, ..., s, and s is a positive integer;
Calculating the average value and standard deviation of a plurality of accuracy evaluation indexes in a data set, respectively calibrating the accuracy evaluation index average value and the accuracy evaluation index standard deviation as P1 and P2, and respectively comparing the accuracy evaluation index average value P1 and the accuracy evaluation index standard deviation P2 with a preset accuracy evaluation index reference threshold K1 and a preset standard deviation reference threshold K2 to generate the following conditions:
If P1 is greater than or equal to K1, generating a first running state signal;
If P1 is smaller than K1 and P2 is larger than or equal to K2, generating a second running state signal;
if P1 is less than K1 and P2 is less than K2, a third operating state signal is generated.
Preferably, when the first running state signal is obtained, a first-level accuracy early warning prompt is sent out to prompt relevant maintainers that the accuracy is poor in semantic association modeling when the multi-mode data retrieval system runs, and the multi-mode data retrieval system needs to be maintained and optimized in time;
When the second running state signal is obtained, a second-level accuracy early warning prompt is sent out to prompt the relevant maintainers that the accuracy of semantic association modeling fluctuates between good and poor when the multi-mode data retrieval system runs, that the running state is extremely unstable, and that the multi-mode data retrieval system needs to be maintained and optimized in time;
and when the third running state signal is acquired, no early warning prompt is sent out.
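The comparison of P1 and P2 with the thresholds K1 and K2 can be sketched as follows, purely as an illustrative example. In this method a larger coefficient value corresponds to worse modeling accuracy, which is why P1 greater than or equal to K1 produces the first-level warning. The text does not state whether the standard deviation is of the population or sample form; the population form is assumed here, and the function name, threshold values and index values are hypothetical.

```python
import statistics


def running_state_signal(indexes, k1, k2):
    """Map a data set F of accuracy evaluation indexes to a state signal (1, 2 or 3)."""
    p1 = statistics.mean(indexes)    # accuracy evaluation index average value P1
    p2 = statistics.pstdev(indexes)  # standard deviation P2 (population form assumed)
    if p1 >= k1:
        return 1  # first signal: modeling accuracy is poor
    if p2 >= k2:
        return 2  # second signal: accuracy fluctuates, running state unstable
    return 3      # third signal: no early warning prompt


PROMPTS = {
    1: "first-level accuracy early warning",
    2: "second-level accuracy early warning",
    3: None,  # no prompt is sent for the third signal
}

# Hypothetical data sets and thresholds.
signal_a = running_state_signal([0.9, 0.8, 1.0], k1=0.7, k2=0.2)
signal_b = running_state_signal([0.1, 0.9, 0.1, 0.9], k1=0.7, k2=0.2)
signal_c = running_state_signal([0.3, 0.3, 0.3], k1=0.7, k2=0.2)
print(PROMPTS[signal_a], PROMPTS[signal_b], PROMPTS[signal_c])
```

Because the decision uses the mean and spread of the whole data set, a single outlying index does not by itself trigger a prompt.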
In the technical scheme, the invention has the technical effects and advantages that:
According to the invention, through evaluating the accuracy of the multi-mode data retrieval system during semantic association modeling, when the accuracy of the multi-mode data retrieval system during semantic association modeling is reduced, the system senses in time and prompts relevant maintainers to take corresponding maintenance and optimization measures, so that the accuracy of the semantic association modeling is ensured, the semantic association among data is well captured by the model, the influence of the reduced correlation between the retrieval result returned by the system and the user query on the retrieval accuracy of the whole multi-mode data retrieval system is effectively prevented, and the misleading retrieval result is effectively prevented from being provided for the user;
According to the invention, a data set of the accuracy conditions during semantic association modeling of the multi-mode data retrieval system is established and comprehensively analyzed, so that abnormal conditions during semantic association modeling can be analyzed; this makes it convenient for the relevant maintainers to understand the abnormal conditions and to carry out targeted maintenance and optimization, improving the efficiency of maintenance and optimization;
According to the invention, because the data set of accuracy conditions during semantic association modeling is comprehensively analyzed, an early warning prompt is not sent out merely because of a sudden, isolated abnormality in accuracy during semantic association modeling; this improves the early warning accuracy, further improves the trust of the relevant maintainers in the early warning prompts, and helps ensure the stable and efficient operation of the multi-mode data retrieval system.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments described in the present application, and other drawings may be obtained according to these drawings for those skilled in the art.
FIG. 1 is a flow chart of a method of the multi-modal data retrieval method based on semantic association.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these example embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.
The invention provides a multi-mode data retrieval method based on semantic association as shown in figure 1, which comprises the following steps:
S100, acquiring a plurality of pieces of data information while the multi-mode data retrieval system based on semantic association operates, wherein the data information comprises modal data information and retrieval evaluation index information, and processing the modal data information and the retrieval evaluation index information after acquisition;
The modal data information comprises a modal sample data volume balance coefficient and a modal data volume similarity degree anomaly coefficient, and after acquisition, the modal sample data volume balance coefficient and the modal data volume similarity degree anomaly coefficient are respectively calibrated as $PH_{MT}$ and $XS_{MT}$;
In the semantic association modeling stage, the large sample data size difference between different modalities can cause the following serious influence on the accuracy of the multi-modality data retrieval system:
Unbalanced sample problem: sample data volume unbalance may cause that information of certain modes is ignored or not fully considered in the modeling process, while other modes with more samples may dominate the learning process of the whole semantic association model, which causes that the response of the system to certain modes is poor in query, thereby affecting the accuracy of the retrieval result;
Overfitting problem: modalities with smaller sample data sizes can easily cause overfitting, and particularly in the case of higher sample dimensions, the model can excessively depend on the characteristics of a small number of samples, so that more data conditions cannot be generalized, higher errors can be generated during actual retrieval, and the robustness of the system is reduced;
Cross-modality consistency is difficult to capture: in semantic association modeling, cross-modal consistency is a key problem, and modalities with a smaller data volume often cannot capture consistency information with other modalities well, so that it is difficult for the model to accurately establish semantic association between modalities, and the accuracy of the retrieval result is affected;
Information loss: modalities with a smaller data volume may not fully express their inherent semantic information, resulting in information loss, which affects the system's accurate understanding of the semantic association between modalities, thereby reducing the overall accuracy of the retrieval system;
Therefore, by monitoring the sample data volume condition of the multi-mode data retrieval system in the semantic association modeling stage, a large difference in sample data volume between different modalities, which would affect the semantic association modeling accuracy, can be sensed;
The logic for acquiring the modal sample data volume balance coefficient is as follows:
S101, acquiring the sample data amounts of the different modalities at the same moment during operation of the multi-mode data retrieval system, and calibrating the sample data amounts as $\beta_{SJ\,x}$, wherein x represents the number of the modality among the different modalities at the same moment during operation of the multi-mode data retrieval system, x = 1, 2, 3, 4, ..., m, and m is a positive integer;
It should be noted that, for a system that uses a database to store data, a database monitoring tool may be used to monitor the number of samples of the different modalities in the database in real time; such tools can provide information such as the size, the number of rows and the index usage of the database tables. For example, DataDog is a cloud monitoring platform that supports multiple databases, including MySQL, PostgreSQL, MongoDB, etc.; DataDog provides real-time monitoring and alarm functions and can monitor the performance indexes and data volume information of a database. For another example, Prometheus is an open-source system monitoring and alarm tool that supports a variety of databases, such as MySQL and PostgreSQL, and Prometheus obtains the sample data amounts and other metrics of the databases in real time through its query language PromQL;
S102, calculating the standard deviation of the sample data amounts of the different modalities at the same moment during operation of the multi-mode data retrieval system, and calibrating the standard deviation of the sample data amounts as R, wherein:
$$R = \sqrt{\frac{1}{m}\sum_{x=1}^{m}\left(\beta_{SJ\,x} - \bar{\beta}_{SJ}\right)^{2}}$$
Wherein $\bar{\beta}_{SJ}$ is the average value of the sample data amounts of the different modalities at the same moment during operation of the multi-mode data retrieval system, calculated as:
$$\bar{\beta}_{SJ} = \frac{1}{m}\sum_{x=1}^{m}\beta_{SJ\,x}$$
As can be seen from the sample data amount standard deviation R, the larger the value of the standard deviation R of the sample data amounts of the different modalities at the same moment during operation of the multi-mode data retrieval system, the worse the balance of the sample data amounts of the different modalities at that moment; conversely, the smaller the value, the better the balance;
S103, obtaining the standard deviations of the sample data amounts generated at different moments within time T while the multi-mode data retrieval system operates, and recalibrating these standard deviations as $R_y$, wherein y represents the number of the standard deviation of the sample data amounts generated at different moments within time T, y = 1, 2, 3, 4, ..., n, and n is a positive integer;
S104, establishing a data set of the sample data amount standard deviations generated within the operation time T of the multi-mode data retrieval system, sorting the sample data amount standard deviations in the data set in order, and calibrating the maximum sample data amount standard deviation in the data set as $R_{max}$;
S105, calculating the modal sample data volume balance coefficient from the maximum sample data amount standard deviation $R_{max}$ in the data set, wherein the calculated expression is as follows:
As can be seen from the calculation expression of the modal sample data volume balance coefficient, the larger the value of the modal sample data volume balance coefficient generated while the multi-modal data retrieval system operates within time T, the worse the accuracy of the multi-modal data retrieval system in semantic association modeling; conversely, the smaller the value, the better the accuracy of the multi-modal data retrieval system in semantic association modeling;
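Steps S101 to S104 can be illustrated by the following minimal sketch, which computes the standard deviation of the per-modality sample data amounts at each moment and then takes the maximum over the moments within time T. The population form of the standard deviation (division by m) is an assumption, the snapshot values are hypothetical, and the expression of step S105 is not reproduced in this text, so the sketch stops at the maximum standard deviation.

```python
import math


def sample_amount_std(amounts):
    """Standard deviation R of the sample data amounts of m modalities at one moment.

    Population form (division by m) is assumed.
    """
    m = len(amounts)
    mean = sum(amounts) / m
    return math.sqrt(sum((a - mean) ** 2 for a in amounts) / m)


def max_std_within_t(snapshots):
    """Return (R_max, [R_1, ..., R_n]) for the snapshots taken within time T."""
    r_values = [sample_amount_std(amounts) for amounts in snapshots]
    return max(r_values), r_values


# Hypothetical sample data amounts for three modalities (text, image, audio)
# observed at three moments within time T.
snapshots = [
    [1000, 1000, 1000],  # perfectly balanced
    [1200, 900, 900],    # mildly unbalanced
    [2000, 500, 500],    # strongly unbalanced
]

r_max, r_values = max_std_within_t(snapshots)
print([round(r, 2) for r in r_values], round(r_max, 2))
```

The strongly unbalanced snapshot yields the largest standard deviation and therefore determines the maximum used in step S105.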
If, after semantic association modeling, the Euclidean distance between data is large, this will have the following serious effects on the accuracy of semantic association modeling:
Low correlation matching: a larger Euclidean distance means that the vector representations of the data are far apart in the semantic space, which may result in associated data not being matched accurately; e.g., a related image and text pair should be separated by a small distance in the semantic space, but if the distance is large, the system may not be able to match them correctly;
Inaccurate search results: in a multi-mode data retrieval task, a user generally hopes that the system can return data of multiple modes related to query, and if the distance of the data in a semantic space is large, the system can return irrelevant or inaccurate data, so that the quality of a retrieval result is reduced;
misunderstanding semantic association: the larger distance of the data in the semantic space may cause a deviation of the understanding of the semantic association by the system, and the system may erroneously consider the irrelevant data as relevant or ignore some data which is actually relevant, thereby misunderstanding the real semantic association between the data;
Reducing system efficiency: a vector representation with a larger distance may increase the computational burden of the retrieval system, requiring more time and resources for distance computation of high-dimensional vectors on a large-scale dataset, which will reduce the efficiency and response speed of the system;
It should be noted that similarity measurement indexes between the data representations after semantic association modeling, such as cosine similarity, Euclidean distance and Manhattan distance, can be obtained; these indexes can be used to measure the degree of similarity of different data in the semantic space, so as to evaluate the modeling accuracy;
Therefore, by monitoring the Euclidean distance between sample data after semantic association modeling of the multi-mode data retrieval system, an abnormal Euclidean distance between sample data, which would affect the semantic association modeling accuracy, can be sensed;
The logic for acquiring the abnormal coefficient of the similarity degree of the modal data volume is as follows:
S201, converting all modal data into vector representations;
This can be achieved by various methods, such as extracting features using a pre-trained deep learning model, or converting text data into a vector representation using word embedding techniques; assuming there are n modalities, one vector representation is obtained for each modality;
S202, normalizing each vector so that the data of different modalities have the same weight in the distance calculation and the vectors have unit norm (namely, an L2 norm of 1);
By doing so, the order-of-magnitude difference between modalities can be eliminated, so that the distance calculation is more robust;
s203, for each mode, calculating the internal Euclidean distance of the mode;
For the i-th mode, assuming that its vector is expressed as A_iv, the Euclidean distance calculation formula is:
$$\mathrm{Distance} = \sqrt{\sum_{v=1}^{p}\left(A_{iv} - A'_{iv}\right)^{2}}$$
wherein A'_iv is the element of the other vector that corresponds to A_iv, v is the index of the dimension of the vector of the i-th mode, v = 1, 2, 3, 4, ..., p, and p is a positive integer.
S204, acquiring the internal Euclidean distances of each mode of the multi-mode data retrieval system at different moments within time T, and denoting them as Distance_j, wherein j is the index of the internal Euclidean distance of each mode at the different moments within time T, j = 1, 2, 3, 4, ..., q, and q is a positive integer;
S205, establishing a data set of the internal Euclidean distances of each mode of the multi-mode data retrieval system within time T, sorting the internal Euclidean distances in the data set, and denoting the maximum internal Euclidean distance in each mode as Distance_max;
S206, calculating the modal data volume similarity degree anomaly coefficient, wherein the calculated expression is as follows: wherein x represents the mode number of the multi-mode data retrieval system, x = 1, 2, 3, 4, ..., m, and m is a positive integer;
The calculation expression of the modal data volume similarity degree anomaly coefficient shows that the larger the value of this coefficient generated while the multi-mode data retrieval system operates within time T, the worse the accuracy of the multi-mode data retrieval system in semantic association modeling; conversely, the smaller the value, the better the accuracy;
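Steps S201 to S205 can be sketched as follows. This is only an illustration under stated assumptions: each sampling moment is assumed to supply a pair of vectors from the same mode, the names `l2_normalize` and `max_internal_distance` are hypothetical, and the anomaly coefficient of S206 is not computed because its expression is not reproduced in the text:

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (step S202)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors (step S203)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_internal_distance(snapshots):
    """Largest internal distance of one mode within the window T (S204 and S205).

    snapshots is a list of (vector, comparison_vector) pairs, one pair per
    sampling moment; how the pairs are chosen is an assumption of this sketch.
    """
    distances = [
        euclidean_distance(l2_normalize(a), l2_normalize(b))
        for a, b in snapshots
    ]
    return max(distances)


# Two hypothetical sampling moments for a single mode.
image_snapshots = [
    ([3.0, 4.0], [6.0, 8.0]),  # same direction, distance 0 after normalization
    ([1.0, 0.0], [0.0, 2.0]),  # orthogonal vectors
]
print(max_internal_distance(image_snapshots))
```

Because every vector is normalized first, the distance between two vectors depends only on their directions and lies between 0 and 2.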
The retrieval evaluation index information comprises a retrieval recall rate anomaly concealment coefficient, which is denoted JS_YC after acquisition;
In a multi-mode data retrieval system, the semantic retrieval stage is the stage in which, after a user inputs a query, the system performs a semantic search and retrieves related data according to the query content; if the retrieval recall rate of the system in the semantic retrieval stage is low, that is, data related to the query cannot be effectively retrieved, the accuracy of semantic association modeling may be seriously affected as follows:
Incomplete semantic association modeling: a low recall rate means that the system fails to retrieve all query-related data, so a portion of important data samples is missing during the semantic association modeling stage; the semantic association model may therefore lack a complete understanding of global data relationships, which affects modeling accuracy and comprehensiveness;
Biased semantic association model: because some relevant data samples are missed, the system is more prone to bias and error in modeling, which may lead the semantic association model to an inaccurate understanding of certain categories or topics;
Data imbalance problem: a low recall rate may unbalance the distribution of data samples of different categories or topics in semantic association modeling, so that the model performs better on some categories and worse on others, making the model performance unstable and unreliable;
Reduced accuracy of the retrieval results: the accuracy of semantic association modeling depends on the effect of the retrieval stage; if the retrieval recall rate is low, the model can only be built on limited data samples, which do not necessarily represent the overall semantic associations, so the accuracy of the final retrieval results is reduced;
Therefore, by monitoring the retrieval recall rate of the multi-mode data retrieval system after semantic association modeling, abnormal retrieval recall rates that affect the accuracy of semantic association modeling can be detected;
The logic for acquiring the retrieval recall rate anomaly concealment coefficient is as follows:
S301, acquiring the optimal retrieval recall rate range of the multi-mode data retrieval system, and denoting it as γZH_min ~ γZH_max;
It should be noted that the optimal retrieval recall rate range can be obtained through controlled experiments that compare different recall thresholds or different semantic association models and evaluate performance under different recall rate ranges across multiple data sets, query types, and models; the optimal retrieval recall rate range is not specifically limited here and is set according to actual requirements;
S302, acquiring the retrieval recall rates of the multi-mode data retrieval system in different time periods (which may be of equal or unequal length) within time T, and denoting them as γZH_r, wherein r is the index of the retrieval recall rate in the different time periods within time T, r = 1, 2, 3, 4, ..., a, and a is a positive integer;
The recall rate is calculated as follows: recall rate = number of relevant data retrieved / total number of all relevant data;
It should be noted that if a labeled data set is available that contains relevant data samples of various categories or topics and defines, for each query, a standard answer of relevant data, the total number of relevant data can be obtained directly from the labeled data set, and the number of relevant data retrieved is the number of data samples returned by the system in the semantic retrieval stage that match the standard answer;
S303, denoting the retrieval recall rates that fall below the optimal retrieval recall rate range γZH_min ~ γZH_max as γZH_u, wherein u is the index of a retrieval recall rate that falls below the optimal retrieval recall rate range, u = 1, 2, 3, 4, ..., e, and e is a positive integer;
S304, calculating the retrieval recall rate anomaly concealment coefficient, wherein the calculated expression is as follows:
The calculation expression of the retrieval recall rate anomaly concealment coefficient shows that the larger the value of this coefficient generated while the multi-mode data retrieval system operates within time T, the worse the accuracy of the multi-mode data retrieval system in semantic association modeling; conversely, the smaller the value, the better the accuracy;
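The recall rate calculation of S302 and the selection of below-range recall rates in S303 can be sketched as follows. The function names, the sample identifiers, and the lower bound of 0.7 are assumptions for illustration; the concealment coefficient of S304 is not computed because its expression is not reproduced in the text:

```python
def recall_rate(retrieved_ids, relevant_ids):
    """Recall rate = number of relevant data retrieved / total number of relevant data."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("the labeled set of relevant data must not be empty")
    hits = len(relevant & set(retrieved_ids))
    return hits / len(relevant)


def recalls_below_range(recall_rates, lower_bound):
    """Keep the recall rates that fall below the lower end of the optimal range (S303)."""
    return [r for r in recall_rates if r < lower_bound]


# One query: the system returned three samples, two of which match
# the four samples listed in the labeled standard answer.
print(recall_rate(["a", "b", "x"], ["a", "b", "c", "d"]))

# Hypothetical recall rates measured in four time periods within T,
# compared against an assumed lower bound of the optimal range.
period_recalls = [0.9, 0.6, 0.75, 0.5]
print(recalls_below_range(period_recalls, lower_bound=0.7))
```

Only the recall rates selected by the second function would feed into the anomaly concealment coefficient.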
S200, comprehensively analyzing the processed modal data information and retrieval evaluation index information generated during operation of the multi-mode data retrieval system to generate an accuracy evaluation index;
After the modal sample data volume balance coefficient PH_MT, the modal data volume similarity degree anomaly coefficient XS_MT, and the retrieval recall rate anomaly concealment coefficient JS_YC are obtained, an evaluation model is built and an accuracy evaluation index θzqd_w is generated according to the following formula:
Wherein x1, x2, and x3 are the preset proportional coefficients of the modal sample data volume balance coefficient PH_MT, the modal data volume similarity degree anomaly coefficient XS_MT, and the retrieval recall rate anomaly concealment coefficient JS_YC respectively, and x1, x2, and x3 are all greater than 0;
The calculation formula shows that the larger the modal sample data volume balance coefficient, the modal data volume similarity degree anomaly coefficient, and the retrieval recall rate anomaly concealment coefficient generated while the multi-mode data retrieval system operates within time T, the larger the value of the accuracy evaluation index θzqd_w and the worse the accuracy of the multi-mode data retrieval system in semantic association modeling; conversely, the smaller these three coefficients, the smaller the value of θzqd_w and the better the accuracy of the multi-mode data retrieval system in semantic association modeling;
S300, establishing a data set from a plurality of accuracy evaluation indexes generated while the multi-mode data retrieval system operates, and comprehensively analyzing the accuracy evaluation indexes in the data set to generate a running state signal;
The plurality of accuracy evaluation indexes generated while the multi-mode data retrieval system operates are collected into a data set denoted F, wherein F = {θzqd_w} = {θzqd_1, θzqd_2, ..., θzqd_s}, w = 1, 2, 3, 4, ..., s, and s is a positive integer;
The average value and the standard deviation of the accuracy evaluation indexes in the data set are calculated and denoted P1 and P2 respectively; the average value P1 is compared with a preset accuracy evaluation index reference threshold K1, and the standard deviation P2 is compared with a preset standard deviation reference threshold K2, giving the following cases:
If P1 is greater than or equal to K1, a first running state signal is generated, indicating that the accuracy of semantic association modeling is poor while the multi-mode data retrieval system operates;
If P1 is smaller than K1 and P2 is greater than or equal to K2, a second running state signal is generated, indicating that the accuracy of semantic association modeling fluctuates between good and poor while the multi-mode data retrieval system operates and that the running state is extremely unstable;
If P1 is smaller than K1 and P2 is smaller than K2, a third running state signal is generated, indicating that the accuracy of semantic association modeling is good while the multi-mode data retrieval system operates;
It should be noted that an accuracy evaluation index greater than or equal to the accuracy evaluation index reference threshold indicates relatively poor accuracy of the multi-mode data retrieval system in semantic association modeling, and an accuracy evaluation index smaller than the reference threshold indicates relatively good accuracy;
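The three-way decision on P1 and P2 can be sketched as follows. The function name, the signal labels, and the threshold values are assumptions for illustration, and the population standard deviation is used because the text does not say whether a population or sample standard deviation is intended:

```python
from statistics import mean, pstdev


def running_state_signal(indexes, k1, k2):
    """Map a data set F of accuracy evaluation indexes to a running state signal.

    P1 is the mean and P2 the standard deviation of F; k1 and k2 are the
    preset reference thresholds K1 and K2.
    """
    p1 = mean(indexes)
    p2 = pstdev(indexes)  # population form is an assumption of this sketch
    if p1 >= k1:
        return "first"   # accuracy is poor
    if p2 >= k2:
        return "second"  # accuracy fluctuates, running state unstable
    return "third"       # accuracy is good and stable


# Hypothetical index values and thresholds.
print(running_state_signal([0.9, 0.8, 1.0], k1=0.7, k2=0.2))
print(running_state_signal([0.1, 0.9, 0.1, 0.9], k1=0.7, k2=0.2))
print(running_state_signal([0.3, 0.3, 0.3], k1=0.7, k2=0.2))
```

Evaluating the mean and the spread of a whole data set, rather than a single index, keeps one isolated outlier from triggering a signal on its own.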
S400, issuing different prompts according to the running state signal generated while the multi-mode data retrieval system operates;
When the first running state signal is acquired, a first-level accuracy early warning prompt is issued to inform the relevant maintainers that the accuracy of semantic association modeling is poor while the multi-mode data retrieval system operates and that the system needs to be maintained and optimized in time; this ensures the accuracy of semantic association modeling so that the model captures the semantic associations among data well, prevents a reduced correlation between the returned retrieval results and the user query from degrading the retrieval accuracy of the whole multi-mode data retrieval system, and prevents misleading retrieval results from being provided to users;
When the second running state signal is acquired, a second-level accuracy early warning prompt is issued to inform the relevant maintainers that the accuracy of semantic association modeling fluctuates between good and poor while the multi-mode data retrieval system operates, that the running state is extremely unstable, and that the system needs to be maintained and optimized in time so that it can run stably and efficiently;
When the third running state signal is acquired, no early warning prompt is issued, since the accuracy of semantic association modeling is good while the multi-mode data retrieval system operates;
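The mapping from running state signal to prompt can be sketched as a simple lookup. The signal labels and the prompt wording are assumptions for illustration; how prompts are actually delivered to maintainers is not specified in the text:

```python
def early_warning_prompt(signal):
    """Return the prompt text for a running state signal, or None when no prompt is issued."""
    prompts = {
        "first": "Level-1 accuracy warning: semantic association modeling accuracy is poor; maintain and optimize the system.",
        "second": "Level-2 accuracy warning: accuracy is unstable; maintain and optimize the system.",
        "third": None,  # accuracy is good, so no prompt is issued
    }
    if signal not in prompts:
        raise ValueError(f"unknown running state signal: {signal!r}")
    return prompts[signal]


for state in ("first", "second", "third"):
    print(state, "->", early_warning_prompt(state))
```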
According to the invention, the accuracy of the multi-mode data retrieval system in semantic association modeling is evaluated, so that when this accuracy decreases the system detects the decrease in time and prompts the relevant maintainers to take corresponding maintenance and optimization measures; this ensures the accuracy of semantic association modeling so that the model captures the semantic associations among data well, prevents a reduced correlation between the returned retrieval results and the user query from degrading the retrieval accuracy of the whole multi-mode data retrieval system, and prevents misleading retrieval results from being provided to users;
According to the invention, a data set is established from the accuracy results of semantic association modeling of the multi-mode data retrieval system and is analyzed comprehensively, so that abnormal conditions in semantic association modeling can be analyzed; the relevant maintainers can thus understand these abnormal conditions and carry out targeted maintenance and optimization, which improves the efficiency of maintenance and optimization;
According to the invention, because a data set of the accuracy results of semantic association modeling is analyzed comprehensively, an early warning prompt is not issued merely because of an occasional sudden abnormality in accuracy; this improves the accuracy of the early warning, increases the trust of the relevant maintainers in the early warning prompts, and ensures stable and efficient operation of the multi-mode data retrieval system.
The above formulas are all dimensionless and operate on numerical values; they were obtained by collecting a large amount of data and performing software simulation to approximate the real situation, and the preset parameters in the formulas are set by those skilled in the art according to the actual situation.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more sets of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided by the present application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the embodiments described above are merely illustrative, e.g., the division of the elements is merely a logical functional division, and there may be additional divisions in actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (2)

1. A multi-mode data retrieval method based on semantic association, characterized by comprising the following steps:
S100, acquiring a plurality of pieces of data information based on semantic association while the multi-mode data retrieval system operates, wherein the data information comprises modal data information and retrieval evaluation index information, and processing the modal data information and the retrieval evaluation index information after acquisition;
The modal data information comprises a modal sample data volume balance coefficient and a modal data volume similarity degree anomaly coefficient, which are denoted PH_MT and XS_MT respectively after acquisition; the retrieval evaluation index information comprises a retrieval recall rate anomaly concealment coefficient, which is denoted JS_YC after acquisition;
The logic for acquiring the modal sample data volume balance coefficient is as follows:
S101, acquiring the sample data amounts in different modes at the same moment during operation of the multi-mode data retrieval system, and denoting them as βSJ_x, wherein x is the number of the mode among the different modes at the same moment during operation of the multi-mode data retrieval system, x = 1, 2, 3, 4, ..., m, and m is a positive integer;
S102, calculating the standard deviation of the sample data amounts in different modes at the same moment during operation of the multi-mode data retrieval system, and denoting the sample data volume standard deviation as R, wherein the standard deviation is as follows:
Wherein the average value of the sample data amounts in different modes at the same moment during operation of the multi-mode data retrieval system is obtained by the following calculation formula:
$$\overline{\beta SJ} = \frac{1}{m}\sum_{x=1}^{m} \beta SJ_{x}$$
S103, acquiring the sample data volume standard deviations generated while the multi-mode data retrieval system operates at different moments within time T, and denoting them as R_y, wherein y is the number of the sample data volume standard deviation generated at the different moments within time T, y = 1, 2, 3, 4, ..., n, and n is a positive integer;
S104, establishing a data set of the sample data volume standard deviations generated while the multi-mode data retrieval system operates within time T, sorting the sample data volume standard deviations in the data set, and denoting the maximum sample data volume standard deviation in the data set as R_max;
S105, calculating the modal sample data volume balance coefficient from the maximum sample data volume standard deviation R_max in the data set, wherein the calculated expression is as follows:
The logic for acquiring the modal data volume similarity degree anomaly coefficient is as follows:
S201, converting all modal data into vector representations;
S202, normalizing each vector so that data of different modes carry the same weight in the distance calculation and have unit norm;
S203, for each mode, calculating the internal Euclidean distance of the mode;
For the i-th mode, assuming that its vector is expressed as A_iv, the Euclidean distance calculation formula is:
$$\mathrm{Distance} = \sqrt{\sum_{v=1}^{p}\left(A_{iv} - A'_{iv}\right)^{2}}$$
wherein A'_iv is the element of the other vector that corresponds to A_iv, v is the index of the dimension of the vector of the i-th mode, v = 1, 2, 3, 4, ..., p, and p is a positive integer;
S204, acquiring the internal Euclidean distances of each mode of the multi-mode data retrieval system at different moments within time T, and denoting them as Distance_j, wherein j is the index of the internal Euclidean distance of each mode at the different moments within time T, j = 1, 2, 3, 4, ..., q, and q is a positive integer;
S205, establishing a data set of the internal Euclidean distances of each mode of the multi-mode data retrieval system within time T, sorting the internal Euclidean distances in the data set, and denoting the maximum internal Euclidean distance in each mode as Distance_max;
S206, calculating the modal data volume similarity degree anomaly coefficient, wherein the calculated expression is as follows: wherein x represents the mode number of the multi-mode data retrieval system, x = 1, 2, 3, 4, ..., m, and m is a positive integer;
The logic for acquiring the retrieval recall rate anomaly concealment coefficient is as follows:
S301, acquiring the optimal retrieval recall rate range of the multi-mode data retrieval system, and denoting it as γZH_min ~ γZH_max;
S302, acquiring the retrieval recall rates of the multi-mode data retrieval system in different time periods within time T, and denoting them as γZH_r, wherein r is the index of the retrieval recall rate in the different time periods within time T, r = 1, 2, 3, 4, ..., a, and a is a positive integer;
The recall rate is calculated as follows: recall rate = number of relevant data retrieved / total number of all relevant data;
S303, denoting the retrieval recall rates that fall below the optimal retrieval recall rate range γZH_min ~ γZH_max as γZH_u, wherein u is the index of a retrieval recall rate that falls below the optimal retrieval recall rate range, u = 1, 2, 3, 4, ..., e, and e is a positive integer;
S304, calculating the retrieval recall rate anomaly concealment coefficient, wherein the calculated expression is as follows:
S200, comprehensively analyzing the processed modal data information and retrieval evaluation index information generated during operation of the multi-mode data retrieval system to generate an accuracy evaluation index;
After the modal sample data volume balance coefficient PH_MT, the modal data volume similarity degree anomaly coefficient XS_MT, and the retrieval recall rate anomaly concealment coefficient JS_YC are obtained, an evaluation model is built and an accuracy evaluation index θzqd_w is generated according to the following formula:
Wherein x1, x2, and x3 are the preset proportional coefficients of the modal sample data volume balance coefficient PH_MT, the modal data volume similarity degree anomaly coefficient XS_MT, and the retrieval recall rate anomaly concealment coefficient JS_YC respectively, and x1, x2, and x3 are all greater than 0;
S300, establishing a data set from a plurality of accuracy evaluation indexes generated while the multi-mode data retrieval system operates, and comprehensively analyzing the accuracy evaluation indexes in the data set to generate a running state signal;
The plurality of accuracy evaluation indexes generated while the multi-mode data retrieval system operates are collected into a data set denoted F, wherein F = {θzqd_w} = {θzqd_1, θzqd_2, ..., θzqd_s}, w = 1, 2, 3, 4, ..., s, and s is a positive integer;
The average value and the standard deviation of the accuracy evaluation indexes in the data set are calculated and denoted P1 and P2 respectively; the average value P1 is compared with a preset accuracy evaluation index reference threshold K1, and the standard deviation P2 is compared with a preset standard deviation reference threshold K2, giving the following cases:
If P1 is greater than or equal to K1, generating a first running state signal;
If P1 is smaller than K1 and P2 is larger than or equal to K2, generating a second running state signal;
If P1 is smaller than K1 and P2 is smaller than K2, generating a third running state signal;
S400, issuing different prompts according to the running state signal generated while the multi-mode data retrieval system operates.
2. The multi-mode data retrieval method based on semantic association according to claim 1, wherein when the first running state signal is acquired, a first-level accuracy early warning prompt is issued to inform the relevant maintainers that the accuracy of semantic association modeling is poor while the multi-mode data retrieval system operates and that the multi-mode data retrieval system needs to be maintained and optimized in time;
When the second running state signal is acquired, a second-level accuracy early warning prompt is issued to inform the relevant maintainers that the accuracy of semantic association modeling fluctuates between good and poor while the multi-mode data retrieval system operates, that the running state is extremely unstable, and that the multi-mode data retrieval system needs to be maintained and optimized in time;
When the third running state signal is acquired, no early warning prompt is issued.
CN202311071657.2A 2023-08-24 2023-08-24 Multi-mode data retrieval method based on semantic association Active CN117033724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311071657.2A CN117033724B (en) 2023-08-24 2023-08-24 Multi-mode data retrieval method based on semantic association

Publications (2)

Publication Number Publication Date
CN117033724A CN117033724A (en) 2023-11-10
CN117033724B CN117033724B (en) 2024-05-03

Family

ID=88624418

Country Status (1)

Country Link
CN (1) CN117033724B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317837A (en) * 2014-10-10 2015-01-28 浙江大学 Cross-modal searching method based on topic model
CN107402993A (en) * 2017-07-17 2017-11-28 山东师范大学 The cross-module state search method for maximizing Hash is associated based on identification
CN110990597A (en) * 2019-12-19 2020-04-10 中国电子科技集团公司信息科学研究院 Cross-modal data retrieval system based on text semantic mapping and retrieval method thereof
CN112819609A (en) * 2021-02-24 2021-05-18 深圳前海微众银行股份有限公司 Risk assessment method, apparatus, computer-readable storage medium, and program product
CN113590772A (en) * 2021-02-25 2021-11-02 腾讯科技(深圳)有限公司 Abnormal score detection method, device, equipment and computer readable storage medium
CN115080778A (en) * 2022-05-31 2022-09-20 同济大学 Cross-modal three-dimensional model retrieval method based on noise data cleaning
CN116611712A (en) * 2023-07-17 2023-08-18 国网信息通信产业集团有限公司 Semantic inference-based power grid work ticket evaluation system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090287680A1 (en) * 2008-05-14 2009-11-19 Microsoft Corporation Multi-modal query refinement
US8700655B2 (en) * 2010-11-08 2014-04-15 At&T Intellectual Property I, L.P. Systems, methods, and computer program products for location salience modeling for multimodal search
US11158010B2 (en) * 2015-08-31 2021-10-26 International Business Machines Corporation Incremental search based multi-modal journey planning
US20220284321A1 (en) * 2021-03-03 2022-09-08 Adobe Inc. Visual-semantic representation learning via multi-modal contrastive training

Non-Patent Citations (5)

Title
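Performance evaluation of sensor networks by statistical modeling and euclidean model checking; Youngmin Kwon et al.; ACM Transactions on Sensor Networks; Vol. 9 (No. 4); 1–38 *
Power-Aware Design Techniques of Secure Multimode Embedded Systems; Ke Jiang et al.; ACM Transactions on Embedded Computing Systems; Vol. 15 (No. 1); 1–29 *
Cross-modal correlation learning and information retrieval with semantic consistency; 花妍; China Doctoral Dissertations Full-text Database, Information Science and Technology (No. 03); I138-164 *
Research on multimedia retrieval based on deep hashing methods; 宋歌; China Doctoral Dissertations Full-text Database, Information Science and Technology (No. 02); I138-246 *
Research on multi-modal feature selection based on swarm intelligence algorithms; 张首荣; China Master's Theses Full-text Database, Information Science and Technology (No. 03); I140-228 *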

Also Published As

Publication number Publication date
CN117033724A (en) 2023-11-10

Similar Documents

Publication Publication Date Title
Budaev Using principal components and factor analysis in animal behaviour research: caveats and guidelines
CN107168995B (en) Data processing method and server
CN111898366A (en) Document subject word aggregation method and device, computer equipment and readable storage medium
CN105518656A (en) A cognitive neuro-linguistic behavior recognition system for multi-sensor data fusion
WO2020232898A1 (en) Text classification method and apparatus, electronic device and computer non-volatile readable storage medium
CN118211882B (en) Product quality management system and method based on big data
CN117170915A (en) Data center equipment fault prediction method and device and computer equipment
CN116842053A (en) Distributed cloud data retrieval system and method
Wu et al. Construction of an intelligent processing platform for equestrian event information based on data fusion and data mining
CN114692778A (en) Multi-modal sample set generation method, training method and device for intelligent inspection
JP7081454B2 (en) Processing equipment, processing method, and processing program
Lu et al. A modified active learning intelligent fault diagnosis method for rolling bearings with unbalanced samples
CN117033724B (en) Multi-mode data retrieval method based on semantic association
CN117390170A (en) Method and device for matching data standards, electronic equipment and readable storage medium
CN115839344B (en) Wear supervision method, device, equipment and storage medium for slurry pump
CN116126807A (en) Log analysis method and related device
CN104572820A (en) Method and device for generating model and method and device for acquiring importance degree
CN110263811B (en) Equipment running state monitoring method and system based on data fusion
CN114706856A (en) Fault processing method and device, electronic equipment and computer readable storage medium
Restat et al. Towards a Holistic Data Preparation Tool.
CN116486185A (en) Oil well working condition identification method, device, equipment and storage medium
JP7081455B2 (en) Learning equipment, learning methods, and learning programs
CN114926154B (en) Protection switching method and system for multi-scene data identification
CN116451056B (en) Terminal feature insight method, device and equipment
CN117852788A (en) Digital production and information management method and system for thin film capacitor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240415

Address after: Room 301, No. 309 Tianfu Road, Tianhe District, Guangzhou City, Guangdong Province, 510000 (this residence is limited to office building functions)

Applicant after: GUANGZHOU JOYSIM TECHNOLOGY CO.,LTD.

Country or region after: China

Address before: Room 11103, 11th Floor, Building A, Qinghai University Science and Technology Park, No. 30 Haihu Avenue Extension, Chengbei District, Xining City, Qinghai Province, 810000

Applicant before: Qinghai Shengyun Information Technology Co.,Ltd.

Country or region before: China

GR01 Patent grant