CN116049454A - Intelligent searching method and system based on multi-source heterogeneous data - Google Patents

Intelligent searching method and system based on multi-source heterogeneous data

Publication number
CN116049454A
Authority
CN
China
Prior art keywords
data
modal
mode
result
heterogeneous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211356086.2A
Other languages
Chinese (zh)
Inventor
马玉辉
郭子瑜
李学伟
肖保臣
郎公福
胡怀迪
张翠翠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu Aerospace Information Research Institute
Aerospace Information Research Institute of CAS
Original Assignee
Qilu Aerospace Information Research Institute
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu Aerospace Information Research Institute and Aerospace Information Research Institute of CAS
Priority to CN202211356086.2A
Publication of CN116049454A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intelligent searching method based on multi-source heterogeneous data, comprising the following steps: performing data access and preprocessing on the multi-source heterogeneous data; performing data relationship extraction and data relationship representation on the preprocessed multi-source heterogeneous data to obtain multi-modal data; performing multi-modal association relationship mining and multi-modal semantic association fusion on the multi-modal data using a dynamic heterogeneous graph attention embedding mechanism, and constructing a multi-modal semantic graph; performing cross-modal knowledge fusion on the multi-modal semantic graph using a heterogeneous graph neural network, and constructing a heterogeneous cross-modal knowledge graph based on the result of the cross-modal knowledge fusion; performing topic analysis and element extraction on the search term for a target object, and searching for the target object using the heterogeneous cross-modal knowledge graph to obtain multi-modal associated information of the target object; and constructing a holographic portrait of the target object according to its multi-modal associated information, and outputting the intelligent sorting result and the holographic portrait of the target object.

Description

Intelligent searching method and system based on multi-source heterogeneous data
Technical Field
The invention relates to the fields of multi-modal knowledge graphs and computer vision, and in particular to an intelligent searching method and system based on multi-source heterogeneous data.
Background
In recent years, with the continuous development of artificial intelligence and big data technologies, massive heterogeneous data has become too tightly coupled, with complex architectures and siloed organization. As a result, the information resources of a system are scattered, fragmented, and isolated, the degree of intelligence is low, and data cannot be organically coordinated with business needs. Searching, correlating, and analyzing information manually takes a great deal of time, and the traditional manual processing mode can no longer meet current intelligent search requirements. Using artificial intelligence technologies such as knowledge graphs, natural language processing, and computer vision to achieve unified retrieval across heterogeneous data sources improves the spatio-temporal data association analysis capability and the intelligence of a search engine, and can effectively improve retrieval speed and working efficiency.
At present, retrieval tasks in the public safety field generally run fuzzy queries on target attributes through a relational database; the data processed is generally structured, so the implementation is relatively simple. Some service systems use natural language processing (NLP) to perform semantic analysis on text and automatically extract search elements, but the intelligence and accuracy of such automatic search are low: search is slow, incomplete, and inaccurate, and cannot meet actual service requirements in the field. For unstructured data such as images and audio, content-based retrieval is performed on media content features and the contextual semantic environment; existing multimedia search has narrow coverage and incomplete retrieval functions, and its results in this field are unsatisfactory.
The existing intelligent searching methods have the following technical defects: (1) traditional searching methods have difficulty mining potential association relationships among isolated data from different sources, which makes massive data hard to fuse and search results incomplete, so that the massive data cannot be fully and effectively governed and applied; (2) traditional keyword-based matching offers only a single search mode, returns many results of low relevance, is slow, understands natural language poorly, and does not support searching multi-source heterogeneous data; (3) for heterogeneous data such as images, audio, and spatio-temporal geographic information, existing searching methods cannot provide a unified search service and cannot perform a global search based on local attributes and features of a target.
Disclosure of Invention
In view of the above, the present invention provides an intelligent searching method and system based on multi-source heterogeneous data, so as to solve at least one of the above problems.
According to a first aspect of the present invention, there is provided an intelligent searching method based on multi-source heterogeneous data, comprising:
performing data access and preprocessing on the multi-source heterogeneous data based on a big data platform to obtain preprocessed multi-source heterogeneous data, wherein the preprocessed multi-source heterogeneous data comprises structured data, semi-structured data and unstructured data;
performing data relationship extraction and data relationship representation on the structured data, the semi-structured data and the unstructured data respectively, to obtain multi-modal data having data nodes and data node relationships, wherein the multi-modal data comprises resource description framework triples, multi-tuple events and a concept-entity relationship diagram;
performing multi-modal association relationship mining and multi-modal semantic association fusion on the multi-modal data using a dynamic heterogeneous graph attention embedding mechanism, and constructing a multi-modal semantic graph based on the result of the multi-modal semantic association fusion;
performing cross-modal knowledge fusion on the multi-modal semantic graph using a heterogeneous graph neural network, and constructing a heterogeneous cross-modal knowledge graph based on the result of the cross-modal knowledge fusion;
performing topic analysis and element extraction on the search term for a target object, and, according to the processing result, performing multi-path query and search for the target object using the heterogeneous cross-modal knowledge graph to obtain multi-modal associated information of the target object;
and intelligently sorting the multi-modal associated information of the target object, constructing a holographic portrait of the target object according to the multi-modal associated information, and outputting the intelligent sorting result and the holographic portrait of the target object.
According to an embodiment of the present invention, performing data relationship extraction and data relationship representation on the structured data, the semi-structured data and the unstructured data respectively, to obtain multi-modal data having data nodes and data node relationships, includes:
performing data relationship extraction and representation on the structured data using an ETL tool to obtain resource description framework triples of the structured data;
performing data relationship extraction and representation on the semi-structured data to obtain multi-tuple events of the semi-structured data;
performing real-time speech recognition on the audio data in the unstructured data to obtain a speech recognition result, and performing entity recognition and data relationship extraction and representation on the speech recognition result and the text data in the unstructured data to obtain a text data processing result;
and identifying the image data in the unstructured data using a target image detection model to obtain an image processing result, and obtaining a concept-entity relationship diagram of the unstructured data according to the text data processing result and the image processing result.
According to an embodiment of the present invention, identifying the image data in the unstructured data using the target image detection model to obtain an image processing result includes:
locating and detecting target images of different sizes using an image localization and detection network of the target image detection model to obtain a target image database;
extracting features from the image data in the unstructured data using a target image recognition network of the target image detection model, and applying L2 normalization to the extracted features to obtain an embedded feature vector of the image data;
calculating the Euclidean distance between the embedded feature vector of the image data and each target image in the target image database;
and obtaining, from the Euclidean distance, the similarity between the embedded feature vector of the image data and each target image in the target image database, and identifying the image data according to the similarity to obtain the image processing result.
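The normalization, distance, and similarity steps above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the linear mapping from distance to similarity, the threshold value, and the toy three-dimensional vectors are all assumptions made for illustration, and a real system would take the embeddings from the recognition network.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity_from_distance(dist):
    # Assumption: unit vectors are at most 2 apart, so map [0, 2] to [1, 0].
    return 1.0 - dist / 2.0

def identify(query_features, database, threshold=0.5):
    """Return (label, similarity) of the closest database entry.

    The label is None when even the best match falls below the
    (hypothetical) similarity threshold.
    """
    query = l2_normalize(query_features)
    best_label, best_sim = None, -1.0
    for label, features in database.items():
        dist = euclidean_distance(query, l2_normalize(features))
        sim = similarity_from_distance(dist)
        if sim > best_sim:
            best_label, best_sim = label, sim
    if best_sim < threshold:
        return None, best_sim
    return best_label, best_sim

# Hypothetical target image database with toy 3-dimensional features.
database = {"target_a": [1.0, 0.0, 0.0], "target_b": [0.0, 1.0, 0.0]}
label, sim = identify([3.0, 0.1, 0.0], database)
```

Because both vectors are normalized first, the Euclidean distance depends only on the angle between them, which is why a fixed similarity threshold can be applied across images.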
According to an embodiment of the present invention, the image localization and detection network includes a backbone network, a feature pyramid unit, a context modeling unit, and a multi-task learning unit.
According to an embodiment of the present invention, performing multi-modal association relationship mining and multi-modal semantic association fusion on the multi-modal data using the dynamic heterogeneous graph attention embedding mechanism, and constructing a multi-modal semantic graph based on the result of the multi-modal semantic association fusion, includes:
performing a consistency measurement on the multi-modal data having the same data node relationship, using a node-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain a consistency measurement result;
performing a cross-modal complementarity measurement on the multi-modal data, using an edge-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain a complementarity measurement result, wherein the complementarity measurement result comprises modality-specific information and shared multi-modal information;
and processing the consistency measurement result and the complementarity measurement result using a time-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, completing the multi-modal semantic association fusion of the multi-modal data, and constructing the multi-modal semantic graph based on the result of the multi-modal semantic association fusion.
According to an embodiment of the present invention, performing the consistency measurement on the multi-modal data having the same data node relationship, using the node-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain the consistency measurement result includes:
aggregating the neighborhood data nodes of the multi-modal data having the same data node relationship using the node-level attention mechanism, to obtain the weights of those neighborhood data nodes;
and performing data node embedding on the multi-modal data having the same data node relationship according to the weights of the neighborhood data nodes, thereby completing the consistency measurement of the multi-modal data having the same data node relationship.
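As a rough illustration of node-level attention, the sketch below weights each neighbor of a node with a softmax over attention scores and then embeds the node as the weighted sum of its neighbors. The dot-product scoring function and the toy vectors are assumptions; the text does not specify how the attention scores are computed, and a trained model would typically use learned parameters.

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def node_level_attention(node_vec, neighbor_vecs):
    """Return (neighbor weights, aggregated node embedding).

    Assumption: the attention score is a plain dot product between the
    node and each neighbor connected by the same relation type.
    """
    weights = softmax([dot(node_vec, n) for n in neighbor_vecs])
    dim = len(node_vec)
    embedding = [
        sum(w * n[i] for w, n in zip(weights, neighbor_vecs))
        for i in range(dim)
    ]
    return weights, embedding

# Toy example: one node with three neighbors under the same relation.
weights, embedding = node_level_attention(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
)
```

Neighbors whose features agree more closely with the node receive larger weights, which is one way to read the "consistency" being measured.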
According to an embodiment of the present invention, performing the cross-modal complementarity measurement on the multi-modal data, using the edge-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain the complementarity measurement result includes:
learning and normalizing over the different modalities of each data node in the multi-modal data using the edge-level attention mechanism, to obtain cross-modal weights for each data node;
and performing cross-modal aggregation on each data node in the multi-modal data according to its cross-modal weights, thereby completing the cross-modal complementarity measurement of the multi-modal data.
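The two steps above (normalized per-modality weights, then weighted aggregation) can be sketched as follows. This is an illustrative assumption rather than the disclosed model: the `query` vector stands in for parameters that would normally be learned, the tanh scoring is one common choice among several, and the modality names and values are invented.

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_modal_weights(modal_embeddings, query):
    """Score each modality of one node and normalize across modalities.

    `query` is a hypothetical stand-in for a learned attention vector.
    """
    names = sorted(modal_embeddings)
    scores = [
        sum(q * math.tanh(x) for q, x in zip(query, modal_embeddings[name]))
        for name in names
    ]
    return dict(zip(names, softmax(scores)))

def cross_modal_aggregate(modal_embeddings, weights):
    """Fuse the per-modality embeddings of one node by weighted sum."""
    dim = len(next(iter(modal_embeddings.values())))
    return [
        sum(weights[name] * vec[i] for name, vec in modal_embeddings.items())
        for i in range(dim)
    ]

# Toy node with one embedding per modality (values are illustrative).
node = {"text": [0.9, 0.1], "image": [0.2, 0.8], "audio": [0.1, 0.1]}
query = [1.0, 0.5]
weights = cross_modal_weights(node, query)
fused = cross_modal_aggregate(node, weights)
```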
According to an embodiment of the present invention, performing cross-modal knowledge fusion on the multi-modal semantic graph using the heterogeneous graph neural network, and constructing the heterogeneous cross-modal knowledge graph based on the result of the cross-modal knowledge fusion, includes:
aggregating and embedding the data nodes and data node relationships of the multi-modal semantic graph using an encoding module of the heterogeneous graph neural network;
fusing the data nodes and data node relationships of the multi-modal semantic graph in a unified feature space according to the aggregation and embedding result;
and performing cross-modal decoding on the data nodes and data node relationships fused in the unified feature space, using a decoding module of the heterogeneous graph neural network, to obtain a cross-modal knowledge fusion result, and constructing the heterogeneous cross-modal knowledge graph based on the cross-modal knowledge fusion result.
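A heavily simplified sketch of the encode, fuse, and decode idea follows. It only shows how node features of different types and dimensions could be projected into one feature space and how a decoder could score a candidate cross-modal link. The projection matrices are fixed, made-up numbers standing in for learned weights, neighbor aggregation is omitted for brevity, and the sigmoid dot-product decoder is an assumption, not the architecture disclosed here.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

# Hypothetical type-specific projections into a shared 2-dimensional space.
PROJECTIONS = {
    "text": [[0.5, 0.0, 0.5], [0.0, 1.0, 0.0]],   # 3-d text features -> 2-d
    "image": [[1.0, 0.0], [0.0, 1.0]],            # 2-d image features -> 2-d
}

def encode(node_type, features):
    """Project a node of the given type into the unified feature space."""
    return matvec(PROJECTIONS[node_type], features)

def decode_link(u, v):
    """Score a candidate link between two fused nodes (sigmoid of dot product)."""
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(u, v))))

text_node = encode("text", [1.0, 0.0, 1.0])
image_node = encode("image", [1.0, 0.0])
other_image = encode("image", [-1.0, 0.0])

score_same = decode_link(text_node, image_node)
score_other = decode_link(text_node, other_image)
```

Links whose score exceeds a chosen cutoff could then be written into the knowledge graph as cross-modal relations; the cutoff itself would be a tuning decision.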
According to a second aspect of the present invention, there is provided an intelligent search system based on multi-source heterogeneous data, comprising:
a data access and preprocessing module, configured to perform data access and preprocessing on the multi-source heterogeneous data based on a big data platform to obtain preprocessed multi-source heterogeneous data, wherein the preprocessed multi-source heterogeneous data comprises structured data, semi-structured data and unstructured data;
a multi-modal data acquisition module, configured to perform data relationship extraction and data relationship representation on the structured data, the semi-structured data and the unstructured data respectively, to obtain multi-modal data having data nodes and data node relationships, wherein the multi-modal data comprises resource description framework triples, multi-tuple events and a concept-entity relationship diagram;
a multi-modal semantic graph construction module, configured to perform multi-modal association relationship mining and multi-modal semantic association fusion on the multi-modal data using a dynamic heterogeneous graph attention embedding mechanism, and to construct a multi-modal semantic graph based on the result of the multi-modal semantic association fusion;
a heterogeneous cross-modal knowledge graph construction module, configured to perform cross-modal knowledge fusion on the multi-modal semantic graph using a heterogeneous graph neural network, and to construct a heterogeneous cross-modal knowledge graph based on the result of the cross-modal knowledge fusion;
a search request processing module, configured to perform topic analysis and element extraction on the search term for a target object, and, according to the processing result, to perform multi-path query and search for the target object using the heterogeneous cross-modal knowledge graph to obtain multi-modal associated information of the target object;
and an intelligent sorting and result output module, configured to intelligently sort the multi-modal associated information of the target object, construct a holographic portrait of the target object according to the multi-modal associated information, and output the intelligent sorting result and the holographic portrait of the target object.
According to an embodiment of the present invention, the search request processing module supports comprehensive search, batch search, association search and image search;
the comprehensive search comprises full-text search, keyword combination search, range search, fuzzy search, pinyin and initial-letter search, intelligent search suggestions and advanced search.
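Two of the listed search modes, fuzzy search and range search, can be sketched over a small in-memory record list. The records, field names, and cutoff value are hypothetical, and the fuzzy matching uses `difflib.get_close_matches` from the Python standard library as a stand-in for whatever matching the real system employs.

```python
import difflib

# Hypothetical records; a real deployment would query an index or database.
RECORDS = [
    {"id": 1, "name": "north gate", "year": 2019},
    {"id": 2, "name": "north gale", "year": 2021},
    {"id": 3, "name": "south dock", "year": 2015},
]

def fuzzy_search(term, records, cutoff=0.7):
    """Return records whose name is approximately equal to the term."""
    names = [r["name"] for r in records]
    hits = difflib.get_close_matches(
        term, names, n=max(1, len(names)), cutoff=cutoff
    )
    return [r for r in records if r["name"] in hits]

def range_search(records, field, low, high):
    """Return records whose field value lies in the closed range [low, high]."""
    return [r for r in records if low <= r[field] <= high]

fuzzy_hits = fuzzy_search("north gat", RECORDS)
range_hits = range_search(RECORDS, "year", 2018, 2022)
```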
The method uses a multi-element knowledge extraction framework to extract entities and relationships from multi-source heterogeneous data such as text, events, images, audio and spatio-temporal data, and constructs a holographic portrait of the target through deep fusion of the multi-source data, so that the attributes of the target are described in all dimensions; the method is multi-modal, spans space and time, is fully fused, and applies to general objects. Meanwhile, the intelligent searching method uses the constructed knowledge graph to achieve associative fusion of multi-modal heterogeneous information and to deeply mine potential association relationships among targets, making the search more intelligent.
Drawings
FIG. 1 is a flow chart of a method of intelligent searching based on multi-source heterogeneous data according to an embodiment of the present invention;
FIG. 2 is a flow chart of acquiring multi-modal data according to an embodiment of the present invention;
FIG. 3 is a flowchart of acquiring image processing results according to an embodiment of the present invention;
FIG. 4 is a flow chart of constructing a multi-modal semantic graph according to an embodiment of the present invention;
FIG. 5 is a flow chart of constructing a heterogeneous cross-modal knowledge-graph in accordance with an embodiment of the invention;
FIG. 6 is a schematic diagram of a structure of an intelligent search system based on multi-source heterogeneous data according to the present invention;
FIG. 7 is a process diagram of a smart search method based on multi-source heterogeneous data according to another embodiment of the present invention;
FIG. 8 is a flow chart of face target feature alignment according to another embodiment of the present invention;
FIG. 9 is a functional block diagram of an intelligent search system based on multi-source heterogeneous data according to another embodiment of the present invention.
Detailed Description
The present invention will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent.
Using artificial intelligence technologies such as knowledge graphs, natural language processing and computer vision to achieve unified retrieval across heterogeneous data sources improves the spatio-temporal data association analysis capability and the intelligence of a search engine, and can effectively improve retrieval speed and working efficiency. However, existing intelligent searching methods suffer from problems such as poor capability for processing massive multi-modal data.
To this end, the intelligent searching method and system based on multi-source heterogeneous data provided by the present invention can govern large volumes of data accumulated over a long time, fully and comprehensively mine potential relationships among targets by constructing a knowledge graph, infer unknown information, and solve problems such as low search intelligence, scattered and isolated data, and a single search mode.
It should be noted that, the image data (such as face image data) of the present invention strictly complies with the requirements of the related laws and regulations during the process of obtaining, processing, applying and storing, and meets the requirements of the related public welfare.
Fig. 1 is a flowchart of an intelligent search method based on multi-source heterogeneous data according to an embodiment of the present invention.
As shown in fig. 1, the intelligent searching method based on multi-source heterogeneous data includes operations S110 to S160.
In operation S110, data access and preprocessing are performed on the multi-source heterogeneous data based on the big data platform to obtain preprocessed multi-source heterogeneous data, where the preprocessed multi-source heterogeneous data includes structured data, semi-structured data and unstructured data.
Because the multi-source heterogeneous data is massive, on the scale of hundreds of millions, data access and data preprocessing can be performed through a big data platform.
In operation S120, data relationship extraction and data relationship representation are performed on the structured data, the semi-structured data and the unstructured data respectively, to obtain multi-modal data having data nodes and data node relationships, where the multi-modal data includes resource description framework triples, multi-tuple events and a concept-entity relationship diagram.
For structured data accessed from a unified data source, data extraction is performed using an ETL tool and Resource Description Framework (RDF) triples are constructed. RDF is a data model, expressible in XML syntax, for describing the characteristics of Web resources and the relationships between resources. For semi-structured data, multi-tuple events are constructed through related data processing operations (for example, using a wrapper). This completes the knowledge extraction and representation process for these two types of data.
ETL stands for Extract-Transform-Load: the process of extracting data from a source, transforming it, and loading it into a destination. It is a data warehouse technology.
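To make the structured-data path concrete, the sketch below turns one relational row into subject-predicate-object triples, which is the general shape of an RDF statement. The table name, column names, and the `http://example.org/` base IRI are hypothetical, and all values are emitted as plain strings for simplicity; a production pipeline would more likely use an ETL tool and an RDF library with typed literals.

```python
def row_to_triples(table, key_column, row, base="http://example.org/"):
    """Map one relational row to (subject, predicate, object) triples.

    The subject is built from the table name and the row's key value;
    every other non-null column becomes one predicate/object pair.
    """
    subject = f"{base}{table}/{row[key_column]}"
    triples = []
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # skip the key itself and missing values
        predicate = f"{base}{table}#{column}"
        triples.append((subject, predicate, str(value)))
    return triples

# Hypothetical row extracted from a structured source.
row = {"plate": "A123", "color": "white", "owner_id": 17, "note": None}
triples = row_to_triples("vehicle", "plate", row)
```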
For unstructured data such as text, audio and images, entity recognition, relationship extraction, real-time speech recognition and a target detection algorithm are used to obtain concepts, entities, and the association relationships among them, providing data support for constructing the multi-modal semantic graph.
In operation S130, multi-modal association relationship mining and multi-modal semantic association fusion are performed on the multi-modal data using a dynamic heterogeneous graph attention embedding mechanism, and a multi-modal semantic graph is constructed based on the result of the multi-modal semantic association fusion.
Correlations across different concept layers are mined in the multi-modal data, cross-modal association relationships among nodes are established, and a multi-modal semantic graph is constructed; intelligent association and multi-modal attribute alignment of targets are performed on the multi-source heterogeneous data, completing the associative fusion of multi-modal semantics.
In operation S140, cross-modal knowledge fusion is performed on the multi-modal semantic graph using the heterogeneous graph neural network, and a heterogeneous cross-modal knowledge graph is constructed based on the result of the cross-modal knowledge fusion.
The heterogeneous graph neural network is used to connect entity objects through multiple relationships, constructing a large-scale heterogeneous cross-modal knowledge graph and completing the knowledge fusion process.
In operation S150, topic analysis and element extraction are performed on the search term for the target object, and, according to the processing result, multi-path query and search for the target object are performed using the heterogeneous cross-modal knowledge graph to obtain multi-modal associated information of the target object.
By performing topic analysis and element extraction on the search term, multi-path query and search for the search target are carried out on the constructed knowledge graph.
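One possible reading of "multi-path query" is a bounded traversal of the knowledge graph that starts from the entity extracted from the search term and collects every relation path up to a hop limit. The sketch below follows that reading. The triples, entity identifiers, whitespace-based element extraction, and the two-hop limit are illustrative assumptions only.

```python
from collections import deque

# Hypothetical knowledge-graph triples (subject, predicate, object).
TRIPLES = [
    ("person_1", "owns", "vehicle_7"),
    ("vehicle_7", "seen_at", "location_3"),
    ("person_1", "appears_in", "image_42"),
    ("person_2", "owns", "vehicle_9"),
]

def build_adjacency(triples):
    """Index triples in both directions so paths can follow inverse edges."""
    adjacency = {}
    for s, p, o in triples:
        adjacency.setdefault(s, []).append((p, o))
        adjacency.setdefault(o, []).append((f"inverse_{p}", s))
    return adjacency

def extract_elements(search_term, known_entities):
    """Naive element extraction: keep tokens that name a known entity."""
    return [t for t in search_term.lower().split() if t in known_entities]

def multi_path_query(adjacency, start, max_hops=2):
    """Collect all simple paths of at most max_hops edges from start."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_hops:
            continue
        on_path = {start} | {target for _, target in path}
        for predicate, neighbor in adjacency.get(node, []):
            if neighbor in on_path:
                continue  # do not revisit nodes already on this path
            extended = path + [(predicate, neighbor)]
            paths.append(extended)
            queue.append((neighbor, extended))
    return paths

adjacency = build_adjacency(TRIPLES)
elements = extract_elements("find person_1 vehicles", set(adjacency))
paths = multi_path_query(adjacency, elements[0])
reachable = {path[-1][1] for path in paths}
```

Here `reachable` gathers the entities of different modalities (a vehicle record, a location, an image) linked to the queried target, which is the kind of multi-modal associated information the operation describes.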
In operation S160, the multi-modal associated information of the target object is intelligently sorted, a holographic portrait of the target object is constructed according to the multi-modal associated information, and the intelligent sorting result and the holographic portrait of the target object are output.
The method uses a multi-element knowledge extraction framework to extract entities and relationships from multi-source heterogeneous data such as text, events, images, audio and spatio-temporal data, and constructs a holographic portrait of the target through deep fusion of the multi-source data, so that the attributes of the target are described in all dimensions; the method is multi-modal, spans space and time, is fully fused, and applies to general objects. Meanwhile, the intelligent searching method uses the constructed knowledge graph to achieve associative fusion of multi-modal heterogeneous information and to deeply mine potential association relationships among targets, making the search more intelligent; it can effectively address practical problems in the public safety field and supports intelligent search, analysis, and assessment in a human-machine collaborative mode. In addition, unlike matching retrieval among homogeneous data, the method can establish relationships among cross-media data, perform feature analysis and matching, and achieve interactive retrieval among heterogeneous media data according to the semantics of the search terms; it is capable of organizing and associating massive multi-source data and retrieving it quickly, and provides general, basic data resource query and retrieval functions together with a unified service platform for common algorithms.
FIG. 2 is a flow chart of acquiring multi-modal data in accordance with an implementation of the invention.
As shown in fig. 2, performing data relationship extraction and data relationship representation on the structured data, the semi-structured data, and the unstructured data respectively, to obtain multi-modal data having data nodes and data node relationships, includes operations S210 to S240.
In operation S210, the structured data is extracted and represented in data relation by using the ETL tool, to obtain a resource description framework triplet of the structured data.
In operation S220, the semi-structured data is subjected to data relationship extraction and representation, resulting in a multi-tuple event of the semi-structured data.
In operation S230, voice real-time recognition is performed on the audio data in the unstructured data to obtain a voice recognition result, and entity recognition and data relationship extraction and representation are performed on the voice recognition result and the file data in the unstructured data to obtain a text data processing result.
In operation S240, the image data in the unstructured data is identified using the target image detection model to obtain an image processing result, and a concept-entity relationship diagram of the unstructured data is obtained according to the text data processing result and the image processing result.
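By way of a non-limiting illustration, operation S210 can be sketched as a mapping from relational rows to (subject, predicate, object) triples. The function name `rows_to_triples`, the `table/key` subject naming scheme, and the sample rows are hypothetical and are not taken from the disclosure; a real ETL tool would also handle typing, namespaces, and foreign keys.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

def rows_to_triples(table: str, key: str,
                    rows: Iterable[Dict[str, object]]) -> List[Triple]:
    """Turn every non-key, non-empty cell into a (subject, predicate, object) triple."""
    triples: List[Triple] = []
    for row in rows:
        subject = f"{table}/{row[key]}"  # hypothetical subject naming scheme
        for column, value in row.items():
            if column == key or value is None:
                continue  # the key forms the subject; empty cells carry no fact
            triples.append((subject, column, str(value)))
    return triples

# Hypothetical structured records.
rows = [
    {"id": 1, "name": "Zhang San", "city": "Jinan"},
    {"id": 2, "name": "Li Si", "city": None},
]
triples = rows_to_triples("person", "id", rows)
```

Each triple can then serve as one data node relationship (subject and object as nodes, predicate as the edge) in the multi-modal data.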
Fig. 3 is a flowchart of acquiring an image processing result according to an embodiment of the present invention.
As shown in fig. 3, identifying the image data in the unstructured data by using the target image detection model to obtain the image processing result includes operations S310 to S340.
In operation S310, the target images having different sizes are located and detected using the image locating and detecting network of the target image detection model, resulting in a target image database.
In operation S320, feature extraction is performed on the image data in the unstructured data by using the target image recognition network of the target image detection model, and after L2 normalization processing is performed on the extracted features, an embedded feature vector of the image data is obtained.
In operation S330, the Euclidean distance between the embedded feature vector of the image data and each target image in the target image database is calculated.
In operation S340, the similarity between the embedded feature vector of the image data and each target image in the target image database is obtained from the Euclidean distance, and the image data is identified according to the similarity to obtain the image processing result.
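Operations S320 to S340 can be sketched with NumPy as follows. This is an illustrative sketch, not the disclosed implementation: the helper names and toy feature values are hypothetical, and the mapping from distance to similarity (1 - d²/2, which equals the cosine similarity for unit-length vectors) is an assumption, since the text does not state which mapping is used.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each vector (last axis) to unit Euclidean length."""
    x = np.asarray(x, dtype=np.float64)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def match(query_feature, database_features):
    """Return (index of nearest entry, distances, similarities)."""
    q = l2_normalize(query_feature)
    db = l2_normalize(database_features)
    distances = np.linalg.norm(db - q, axis=1)
    # Assumption: for unit vectors, 1 - d^2 / 2 is the cosine similarity.
    similarities = 1.0 - distances ** 2 / 2.0
    return int(np.argmin(distances)), distances, similarities

# Toy 2-D "features"; real embeddings would have many more dimensions.
database = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
best, distances, similarities = match(np.array([0.1, 5.0]), database)
```

Because both sides are L2-normalized, ranking by smallest Euclidean distance and ranking by largest cosine similarity give the same order.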
According to an embodiment of the present invention, the image localization and detection network includes a backbone network, a feature pyramid unit, a context modeling unit, and a multi-task learning unit.
The flow shown in fig. 3 is described in further detail below in connection with specific embodiments.
The target detection model is aimed mainly at face targets and supports the portrait search function of the intelligent search system. An improved RetinaFace network (the target image localization and detection network) detects and localizes faces, providing accurate position information for face targets of all scales so that the faces can be cropped and a face database constructed. A FaceNet network (the target image recognition network) then extracts features from each face in a real-time image; after L2 normalization, an embedding vector representing the face is obtained, the face features are compared with the face database, and the Euclidean distance to each face is calculated for similarity evaluation.
The improved RetinaFace network adopts a multi-task learning strategy: it detects the face position and predicts the face score while simultaneously regressing the positions of 5 facial key points. The network structure is divided into 4 components: the backbone network, the FPN feature pyramid, context modeling, and multi-task learning. The backbone network adopts a ResNet50 network to extract features. To strengthen the network's contextual reasoning on small faces, a context module is added on the feature pyramid to enlarge the receptive field obtained from the Euclidean grid, so that faces of different sizes can be localized at the pixel level. In the multi-task learning branch, because the 3D position density corresponding to each face pixel contributes little to the multi-task loss function, while face occlusion can seriously affect recognition accuracy, the dense point regression loss is replaced by an occlusion loss. The improved multi-task loss function f(l) is represented by formula (1):
$$f(l) = L_{cls}(p_i, p_i^*) + \lambda\, p_i^*\, L_{box}(r_i, r_i^*) + \theta\, p_i^*\, L_{pts}(l_i, l_i^*) + \omega\, p_i^*\, L_{occ}(m_i, m_i^*) \qquad (1)$$
In formula (1), $L_{cls}(p_i, p_i^*)$ is the face classification loss, where $p_i$ is the predicted probability that the sample is a face and $p_i^*$ is the true probability, taking the value 0 or 1; a Softmax function performs the binary classification that separates faces from background information. $L_{box}(r_i, r_i^*)$ is the face box regression loss, where $r_i$ and $r_i^*$ respectively denote the position of the predicted box and the position of the ground-truth annotated box corresponding to a positive sample; it is calculated with the Smooth L1 robust regression function, and $\lambda$, the weight of the box regression loss, is set to 0.25. $L_{pts}(l_i, l_i^*)$ is the face key point regression loss, where $l_i$ and $l_i^*$ respectively denote the predicted and true coordinates of the 5 key points of a positive-sample face, normalized with respect to the anchor center coordinates; $\theta$, the weight of the key point regression loss, is set to 0.1. $L_{occ}(m_i, m_i^*)$ is the occlusion loss, where $m_i$ and $m_i^*$ are the predicted occlusion probability and the true occlusion label of a positive-sample face; according to the degree of occlusion, the true label is divided into no occlusion, partial occlusion, moderate occlusion, and severe occlusion. $\omega$, the weight of the occlusion loss, can be set to 1.
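The weighted combination of the four loss terms can be sketched numerically as below, using the stated weights 0.25, 0.1, and 1. This is a sketch under assumptions: the text names Smooth L1 only for the box term, so its reuse for the key points and the use of a 4-class cross-entropy for the occlusion term are illustrative choices, and all function and parameter names are hypothetical.

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    """Face/background classification loss per anchor."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def smooth_l1(pred, target):
    """Smooth L1 loss summed over the coordinates of each anchor."""
    diff = np.abs(pred - target)
    return np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5).sum(axis=-1)

def categorical_cross_entropy(probs, labels, eps=1e-7):
    """Assumed form of the occlusion loss over 4 occlusion levels."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(np.clip(picked, eps, 1.0))

def multitask_loss(p, p_true, box, box_true, pts, pts_true,
                   occ_probs, occ_labels, lam=0.25, theta=0.1, omega=1.0):
    """Mean over anchors of classification loss plus weighted positive-only terms."""
    cls = binary_cross_entropy(p, p_true)
    positive_terms = (lam * smooth_l1(box, box_true)
                      + theta * smooth_l1(pts, pts_true)
                      + omega * categorical_cross_entropy(occ_probs, occ_labels))
    # p_true is 0 for background anchors, so only faces pay the last three terms.
    return float(np.mean(cls + p_true * positive_terms))
```

Multiplying the box, key point, and occlusion terms by the true label means background anchors contribute only their classification loss.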
FIG. 4 is a flow chart of constructing a multi-modal semantic graph according to an embodiment of the present invention.
As shown in fig. 4, performing multi-modal association relation mining and multi-modal semantic association fusion on the multi-modal data by using the dynamic heterogeneous graph attention embedding mechanism, and constructing the multi-modal semantic graph based on the result of the multi-modal semantic association fusion, includes operations S410 to S430.
In operation S410, a consistency measurement is performed on the multi-modal data having the same data node relationship by using the node-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain a consistency measurement result.
In operation S420, a cross-modal complementarity measurement is performed on the multi-modal data by using the edge-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain a complementarity measurement result, wherein the complementarity measurement result comprises the unique information of a single modality and the information shared by multiple modalities.
In operation S430, the consistency measurement result and the complementarity measurement result are processed by using the time-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, so as to complete the multi-modal semantic association fusion of the multi-modal data, and the multi-modal semantic graph is constructed based on the result of the multi-modal semantic association fusion.
According to an embodiment of the present invention, performing the consistency measurement on the multi-modal data having the same data node relationship by using the node-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain the consistency measurement result, includes:
aggregating the neighborhood data nodes of the multi-modal data having the same data node relationship by using the node-level attention mechanism, to obtain the weights of the neighborhood data nodes of the multi-modal data having the same data node relationship;
and performing data node embedding on the multi-modal data having the same data node relationship according to the weights of those neighborhood data nodes, so as to complete the consistency measurement of the multi-modal data having the same data node relationship.
According to an embodiment of the present invention, performing the cross-modal complementarity measurement on the multi-modal data by using the edge-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism, to obtain the complementarity measurement result, includes:
learning and normalizing the different modalities of each data node in the multi-modal data by using the edge-level attention mechanism, to obtain the cross-modal weight of each data node;
and performing cross-modal aggregation on each data node in the multi-modal data according to the cross-modal weight of each data node, so as to complete the cross-modal complementarity measurement of the multi-modal data.
The process of constructing the multimodal semantic graph is described in further detail below in connection with the detailed description.
The invention uses a dynamic heterogeneous graph embedding method to construct the multi-modal semantic graph and thereby complete the association fusion of multi-modal semantics. Attention at different levels is used to learn sub-graph embeddings at different levels and to capture the association strength of semantic aggregation across modalities. The method comprises three parts, namely node-level attention, edge-level attention, and time-level attention, and each part uses a different attention layer to aggregate the information of different sub-graphs. Node-level attention completes the consistency measurement of different modalities between sub-graphs of the same edge type. Edge-level attention completes the cross-modal complementarity measurement between different sub-graphs, retaining both the unique information of a single modality and the information shared by multiple modalities. Time-level attention completes the general representation of the multi-modal graph, which strengthens natural language understanding and completes the interactive intelligent retrieval process.
The node-level attention layer learns the weight of each node's neighbors and completes the sub-graph embedding by aggregating the features of the important neighbors. Each time-step snapshot is divided into sub-graphs according to edge type, and a self-attention mechanism performs node embedding on each sub-graph of a single edge type. For edge type r and the t-th snapshot, the weight coefficient $\alpha_{ij}^{rt}$ of the data node pair (i, j) can be expressed by formula (2):
$$\alpha_{ij}^{rt} = \frac{\exp\left(\sigma\left(a_r^{\top}\left[W_r x_i \,\|\, W_r x_j\right]\right)\right)}{\sum_{k \in N_i^{rt}} \exp\left(\sigma\left(a_r^{\top}\left[W_r x_i \,\|\, W_r x_k\right]\right)\right)} \qquad (2)$$
In formula (2), $\sigma$ is the activation function, here the ReLU function; $x_i$ is the initial feature vector of data node i; $W_r$ is the linear transformation projection matrix of edge type r; $\|$ denotes the concatenation operation; $N_i^{rt}$ denotes the sampled neighbor nodes of data node i under edge type r in snapshot t; and $a_r$ is the parameterized weight vector of the attention function for edge type r. The calculated weight coefficients are used to aggregate the latent embeddings of the neighbors, which gives the final result for data node i under edge type r in the t-th snapshot, as shown in formula (3):
$$h_i^{rt} = \sigma\left(\sum_{j \in N_i^{rt}} \alpha_{ij}^{rt}\, W_r x_j\right) \qquad (3)$$
In formula (3), $h_i^{rt}$ is the aggregated embedding of data node i for edge type r and the t-th snapshot. To obtain stable and effective features, a node-level attention model with a multi-head mechanism is adopted: k independent node-level attention layers run in parallel, and the learned features are concatenated as the output embedding.
The node-level attention layer captures information specific to a single edge type, whereas heterogeneous multi-modal data typically contains several types of edges. To integrate the edge-specific information of each node, an edge-level attention layer learns the importance weights of the different edge types. The importance of each edge type is calculated by a single-layer multi-layer perceptron (MLP), and the type-specific information is aggregated to generate a new embedding. The normalized weight coefficient $\beta_i^{rt}$ of data node i for edge type r and the t-th snapshot can be represented by formula (4):
$$\beta_i^{rt} = \frac{\exp\left(q^{\top} \sigma\left(W_m h_i^{rt} + b_m\right)\right)}{\sum_{r'} \exp\left(q^{\top} \sigma\left(W_m h_i^{r't} + b_m\right)\right)} \qquad (4)$$
In formula (4), $q$ is the edge-level attention vector; $W_m$ and $b_m$ are the parameters of the single-layer MLP, a learnable weight matrix and a bias vector respectively, which are shared across time snapshots and edge types; the sum in the denominator runs over all edge types $r'$; and $\sigma(W_m h_i^{rt} + b_m)$ maps each edge-specific embedding into the same feature space through a nonlinear transformation function. The importance coefficient of an input edge-specific embedding is measured by the similarity between its mapped form and the edge-level attention vector $q$. These edge-specific embeddings are then aggregated to generate the final representation $h_i^{t}$ of data node i in the t-th snapshot, represented by formula (5):
$$h_i^{t} = \sum_{r} \beta_i^{rt}\, h_i^{rt} \qquad (5)$$
In formula (5), $h_i^{t} \in \mathbb{R}^{\delta}$ represents data node i after the node-level attention results have been fused, and $\delta$ is the dimension of the vector space. Once the node embedding of each time snapshot has been obtained, the time-level attention embedding of the node is calculated by aggregating the node embeddings over the series of time snapshots, processing all historical time steps of the node. Time-level attention is thus able to capture temporal evolution features.
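The node-level and edge-level attention steps described above can be sketched with NumPy for a single node and a single snapshot. This is a simplified, single-head illustration, not the disclosed implementation: the function names, shapes, and the use of ReLU inside the edge-level MLP are assumptions, and the time-level attention and multi-head concatenation are omitted.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def relu(z):
    return np.maximum(z, 0.0)

def node_level_embedding(x, i, neighbors, W_r, a_r):
    """Aggregate the neighbors of node i within one edge type and one snapshot."""
    h_i = W_r @ x[i]
    projected = np.array([W_r @ x[j] for j in neighbors])
    # Attention score of each neighbor: a_r . [W_r x_i || W_r x_j], then ReLU.
    scores = np.array([relu(a_r @ np.concatenate([h_i, h_j])) for h_j in projected])
    alpha = softmax(scores)              # normalized neighbor weights
    return relu(alpha @ projected), alpha

def edge_level_fusion(h_by_type, W_m, b_m, q):
    """Fuse the per-edge-type embeddings of one node with edge-level attention."""
    h_by_type = np.asarray(h_by_type)
    # Importance of each edge type: similarity between q and the MLP-mapped embedding.
    scores = np.array([q @ relu(W_m @ h + b_m) for h in h_by_type])
    beta = softmax(scores)               # normalized edge-type weights
    return beta @ h_by_type, beta

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))              # 4 nodes with 3-D initial features
W_r, a_r = rng.normal(size=(2, 3)), rng.normal(size=4)
h_r, alpha = node_level_embedding(x, 0, [1, 2, 3], W_r, a_r)
fused, beta = edge_level_fusion([h_r, h_r], np.eye(2), np.zeros(2), np.ones(2))
```

In a full model each edge type would have its own `W_r` and `a_r`, and the fused per-snapshot embeddings would then be passed to a temporal attention layer.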
FIG. 5 is a flow chart of constructing a heterogeneous cross-modal knowledge-graph in accordance with an embodiment of the invention.
As shown in fig. 5, the above-mentioned cross-modal knowledge fusion of the multi-modal semantic graph by using the heterogeneous graph neural network, and constructing the heterogeneous cross-modal knowledge graph based on the result of the cross-modal knowledge fusion includes operations S510 to S530.
In operation S510, the multi-modal semantic graph is embedded with aggregation of data nodes and data node relationships using the encoding module of the heterogram neural network.
In operation S520, the data nodes and the data node relationships of the multi-modal semantic graph are fused in a unified feature space according to the aggregate embedding result.
In operation S530, the data nodes and the data node relationships of the multi-modal semantic graph fused in the unified feature space are subjected to cross-modal decoding by using the decoding module of the heterogeneous graph neural network, so as to obtain a cross-modal knowledge fusion result, and a heterogeneous cross-modal knowledge graph is constructed based on the cross-modal knowledge fusion result.
The above-mentioned flow chart for constructing the heterogeneous cross-modal knowledge graph is described in further detail below with reference to specific embodiments.
The heterogeneous graph neural network used in the invention jointly learns node embeddings and relation representations in a multi-relational graph by adding an encoding-decoding module to the graph neural network. In the encoding stage, node embeddings and relation embeddings are aggregated, and the initial representations of the nodes and relations are fused in a unified feature space; in the decoding stage, the TransE method is used to decode the triples. Specifically, an edge in the multi-relational graph can be represented as (u, v, r), meaning that an edge of type r points from data node u to data node v. Each such edge has a corresponding inverse edge $(v, u, r^{-1})$, and each node has a self-connecting edge $(u, u, \top)$ that connects it to itself. Taking these 3 edge types into account on the multi-relational graph, the aggregation of the corresponding neighbors in the multi-relational graph neural network is shown in formula (6):
$$h_v = \sigma\left(\sum_{(u,r)\in\Delta(v)} W_{\lambda(r)}^{t}\, \phi\left(h_u, h_r\right)\right) \qquad (6)$$
In formula (6), $(u, r) \in \Delta(v)$ ranges over the neighbor set of data node v under the multiple relations r; $\lambda(r)$ is one of the 3 edge types described above; $W_{\lambda(r)}^{t}$ is the projection matrix corresponding to the 3 edge types at time t; $h_u$ and $h_v$ are the feature embedding representations of data nodes u and v respectively; $h_r$ is the embedding of relation r; $\sigma$ is the activation function; and the function $\phi$ takes into account the influence of both the node and the edge relation.
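The neighbor aggregation over original, inverse, and self-connecting edges can be sketched as follows. This is an illustration under stated assumptions: the composition function is taken to be subtraction and the output nonlinearity to be tanh, neither of which is specified in the text, and the names `phi`, `aggregate`, `z`, `z_inv`, and `z_self` are hypothetical.

```python
import numpy as np

def phi(h_u, z_r):
    """Assumed composition of a node embedding with a relation embedding."""
    return h_u - z_r

def aggregate(v, edges, h, z, z_inv, z_self, W):
    """New embedding of node v from original, inverse, and self-loop edges.

    edges: list of (u, w, r) meaning an edge of type r from u to w.
    W: dict of projection matrices keyed by 'orig', 'inv', 'self'.
    """
    messages = [W["self"] @ phi(h[v], z_self)]              # self-connecting edge
    for u, w, r in edges:
        if w == v:                                          # original edge u -> v
            messages.append(W["orig"] @ phi(h[u], z[r]))
        if u == v:                                          # inverse edge w -> v via r^-1
            messages.append(W["inv"] @ phi(h[w], z_inv[r]))
    return np.tanh(np.sum(messages, axis=0))

h = np.eye(3)                                               # 3 nodes, one-hot embeddings
zero = np.zeros(3)
W = {"orig": np.eye(3), "inv": np.eye(3), "self": np.eye(3)}
edges = [(0, 1, 0), (1, 2, 0)]
out = aggregate(1, edges, h, [zero], [zero], zero, W)
```

Using a separate projection matrix per edge direction lets the network treat incoming, outgoing, and self information differently while sharing parameters across relation types.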
On the basis of unified characterization and management of multi-modal, heterogeneous, and dynamic target information data, and after object-oriented information association fusion has been constructed, unified semantic expression modeling must be performed on the different types of information so that the connotation of the multi-modal information is expressed uniformly. Knowledge fusion is performed with the heterogeneous graph neural network: non-Euclidean information in virtual space and real space is embedded into Euclidean space for unified characterization, which achieves fusion at the feature and semantic levels, and the cross-modal knowledge graph is constructed by aligning the multi-modal information at the same granularity and semantics. On this basis, combined with natural language processing and other technologies, functions such as intelligent semantic analysis of search questions, intelligent recommendation of search terms, and deep association search are realized.
The invention performs topic analysis and element extraction as follows. A slot-filling algorithm analyzes the topic of the input search term and dispatches it to the corresponding predefined intelligent search engine. On this basis, a named entity recognition algorithm extracts the elements of the question, that is, entity-attribute-value triples, from the text; the entities include elements such as persons, organizations, place names, vehicles, and contact details. Finally, combined with a rule matching method, the corresponding path queries are performed around the identified main entity on the constructed knowledge graph.
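The final step, path queries around the identified main entity, can be sketched as a bounded breadth-first traversal over triples. This is only an illustrative sketch: slot filling and named entity recognition are assumed to have already produced the main entity, and the function names, the hop limit, and the sample triples are hypothetical.

```python
from collections import defaultdict, deque

def build_index(triples):
    """Index (subject, predicate, object) triples by subject."""
    adjacency = defaultdict(list)
    for s, p, o in triples:
        adjacency[s].append((p, o))
    return adjacency

def paths_from(adjacency, start, max_hops=2):
    """Enumerate all cycle-free paths of 1..max_hops edges starting at `start`."""
    results = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if path:
            results.append(path)
        if len(path) == max_hops:
            continue                      # hop limit reached, do not expand further
        visited = {start} | {o for _, _, o in path}
        for predicate, obj in adjacency.get(node, []):
            if obj not in visited:
                queue.append((obj, path + [(node, predicate, obj)]))
    return results

# Hypothetical knowledge graph fragment.
triples = [
    ("Zhang San", "works_at", "Org A"),
    ("Org A", "located_in", "Jinan"),
    ("Zhang San", "owns", "Vehicle 1"),
]
paths = paths_from(build_index(triples), "Zhang San", max_hops=2)
```

Each returned path is a list of triples, so a one-hop path answers an attribute question and a two-hop path exposes an indirect association.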
Fig. 6 is a schematic diagram of a structure of an intelligent search system based on multi-source heterogeneous data according to the present invention.
As shown in fig. 6, the intelligent search system based on multi-source heterogeneous data includes a data access and preprocessing module 610, a multi-modal data acquisition module 620, a multi-modal semantic graph construction module 630, a heterogeneous cross-modal knowledge graph construction module 640, a search request processing module 650, and an intelligent ranking and result output module 660.
The data access and preprocessing module 610 is configured to perform data access and preprocessing on the multi-source heterogeneous data based on the big data platform to obtain preprocessed multi-source heterogeneous data, where the preprocessed multi-source heterogeneous data includes structured data, semi-structured data, and unstructured data.
The multi-modal data obtaining module 620 is configured to extract and represent data relationships of the structured data, the semi-structured data, and the unstructured data, respectively, to obtain multi-modal data having data nodes and data node relationships, where the multi-modal data includes a resource description framework triplet, a multi-tuple event, and a concept-entity relationship graph.
The multi-modal semantic graph construction module 630 is configured to perform multi-modal association relation mining and multi-modal semantic association fusion on the multi-modal data by using the dynamic heterogeneous graph attention embedding mechanism, and to construct a multi-modal semantic graph based on the result of the multi-modal semantic association fusion.
The heterogeneous cross-modal knowledge graph construction module 640 is configured to perform cross-modal knowledge fusion on the multi-modal semantic graph by using the heterogeneous graph neural network, and construct a heterogeneous cross-modal knowledge graph based on the result of the cross-modal knowledge fusion.
The search request processing module 650 is configured to perform topic parsing and element extraction processing on the search term of the target object, and perform multi-path query and search on the target object by using the heterogeneous cross-modal knowledge graph according to the processing result, so as to obtain multi-modal associated information of the target object.
The intelligent ranking and result output module 660 is configured to intelligently rank the multi-modal associated information of the target object, generate a holographic portrait of the target object according to that information, and output the intelligent ranking result and the holographic portrait of the target object.
According to the embodiment of the invention, the search request processing module can complete comprehensive search, batch search, association search and image search;
the comprehensive search comprises full text search, key word combination search, range search, fuzzy search, pinyin and initial search, intelligent search prompt and advanced search.
The intelligent search system based on multi-source heterogeneous data provides comprehensive search, batch search, association search, and portrait search. Comprehensive search integrates functions such as full-text search, keyword combination search, range search, fuzzy search, pinyin and initial-letter search, intelligent search prompts, and advanced search in a single input box, and its search targets cover entities such as persons, vehicles, organizations, events, and contact details. Batch search comprises file search and oversized-file search: file search performs batch searching by parsing file content, and oversized-file search runs large-scale search tasks as offline batch jobs. Association search comprises single-relation search and combined multi-relation search, supports association search and shortest-path search among multiple entities, and displays the search results as a knowledge graph. Portrait search supports searching with an uploaded face image: a face detection and recognition algorithm computes similarity against the targets in a massive face library and matches them, so that accurate searching is achieved. The system achieves information association and fusion of heterogeneous data and uses the constructed multi-modal knowledge graph to mine potential associations among targets, making the search more intelligent.
In order to better illustrate the advantages of the intelligent searching method for multi-source heterogeneous data provided by the present invention, the present invention is further described below with reference to another embodiment and fig. 7 to 9.
Fig. 7 is a process diagram of an intelligent search method based on multi-source heterogeneous data according to another embodiment of the present invention.
Fig. 8 is a flowchart of face targeting feature alignment according to another embodiment of the present invention.
Fig. 9 is a functional block diagram of an intelligent search system based on multi-source heterogeneous data according to another embodiment of the present invention.
As shown in FIG. 7, the massive data is first accessed and preprocessed. Based on an offline data processing tool for exchanging data between heterogeneous data sources, a distributed data synchronization system with a throughput on the order of ten thousand records per second is implemented, and resource scheduling reduces the load on any single machine, which improves overall system performance. The system adapts to multiple heterogeneous databases and achieves batch synchronization of databases such as Oracle, MySQL, and Hive; by implementing its own structured query language (SQL) toolkit, it accurately executes tasks such as troubleshooting and reconciliation on massive data.
In the multi-modal data processing stage, structured data is processed based on the distributed computing framework Spark, with a full set of operators covering structured, semi-structured, and unstructured data sources. The data undergoes deduplication, filtering, association extraction, association backfilling, and similar operations; massive-data computation is realized through visual configuration, and the calculation results are stored in different data stores. For unstructured data such as text, audio, and images, entity recognition and relation extraction are performed on the text data with a BERT model, the audio data is processed with a real-time speech recognition algorithm, and for face image data an improved RetinaFace network detects and localizes the faces. When the face database is constructed, all pictures in the database are traversed, the RetinaFace network detects the face position in each picture, and the detected face region is cropped and then corrected and aligned. During alignment, the inclination angle of the line connecting the eyes relative to the horizontal and the center coordinates of the image are calculated, and the image is rotated into alignment using the coordinates of the two eyes. The face is then encoded with the FaceNet network, and the encodings of all faces are stored as a .npy file. The flow of face target feature comparison is shown in fig. 8. Each face feature in the real-time image is obtained through the FaceNet network, the distances between that feature and all targets in the face database are calculated, the similarities are compared, and the sequence number of the most similar face is obtained. Finally, a distance threshold judgment is made with the threshold set to 0.7: if the distance is smaller than the threshold, the face corresponding to that sequence number is taken as the recognition result.
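The eye-line inclination used for alignment and the final threshold judgment can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the sign convention of the angle (image coordinates with y pointing down) is an assumption, and the features are assumed to be already L2-normalized FaceNet-style embeddings.

```python
import math
import numpy as np

def eye_roll_angle(left_eye, right_eye):
    """Inclination, in degrees, of the eye line relative to the horizontal.

    Eyes are (x, y) pixel coordinates; rotating the image by this angle
    about its center would make the eye line horizontal.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def identify(query_feature, database_features, threshold=0.7):
    """Return the index of the closest database face, or None if too far."""
    distances = np.linalg.norm(database_features - query_feature, axis=1)
    best = int(np.argmin(distances))
    # Accept the most similar face only when its distance is below the threshold.
    return best if distances[best] < threshold else None

database = np.array([[1.0, 0.0], [0.0, 1.0]])   # toy unit-length encodings
hit = identify(np.array([0.9, 0.1]), database)
miss = identify(np.array([-1.0, 0.0]), database)
```

Returning `None` when no stored face is close enough keeps unknown faces from being forced onto the nearest database entry.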
In the multi-modal semantic graph construction stage, sub-graph embeddings at different levels are learned with attention at different levels, based on dynamic heterogeneous graph embedding. The specific implementation takes the usability and efficiency of the algorithm into account: a model layer based on a multi-head attention mechanism fuses multiple dimensions through its heads, and each head can generate a different attention distribution, which addresses the problem of long-term dependence in the information. First, a bidirectional gated recurrent unit (Bi-GRU) models the interdependence between semantics within each modality to obtain the intra-modal information. Second, a cross-modal attention interaction network layer combines the intra-modal information with the interactions between modalities. Finally, an attention mechanism over the contribution of each modality determines the attention weight of each modality and effectively aligns the features of the modalities.
In the cross-modal knowledge association fusion stage, the entities and relations of the heterogeneous graph neural network can be expressed as two complementary sub-graphs, and the nodes are updated iteratively according to the values of the surrounding edges. In the encoding stage, node embeddings and relation embeddings are aggregated; the aggregation adopts variants of four node aggregation techniques, namely mean, average pooling, max pooling, and LSTM. The entities and relations in the different heterogeneous graphs are mapped into a unified vector space, so that the mapping of entities and relations becomes a vector distance calculation problem. In the decoding stage, TransE, based on distributed vector representations of entities and relations, treats the relation in each triple instance as a vector addition from the head entity to the tail entity, with the goal of maximizing the distance between the closest positive and negative samples. The negative samples are constructed by replacement: an erroneous triple is formed by randomly replacing either the head entity or the tail entity. The heterogeneous graph neural network aggregates the corresponding neighbors of multi-relational nodes by sampling neighbor nodes starting from a given node and returning to that starting node with a certain probability to restart the sampling, so that all node types can be sampled, until a fixed number is reached. The sampled nodes are then grouped by type, and within each type the top-k most frequently occurring nodes are selected as the neighbor nodes.
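The TransE decoding idea, scoring a triple by the distance between head plus relation and tail and contrasting it with a corrupted triple, can be sketched as follows. This is a minimal sketch under assumptions: the margin-based ranking loss and the margin value are the standard TransE formulation rather than details given in the text, the function names are hypothetical, and the corruption step does not check whether the corrupted triple happens to be true elsewhere in the graph.

```python
import numpy as np

def transe_distance(head, relation, tail):
    """Distance between head + relation and tail; small means plausible."""
    return float(np.linalg.norm(head + relation - tail))

def corrupt(triple, num_entities, rng):
    """Build a negative sample by replacing the head or the tail at random."""
    h, r, t = triple
    while True:
        e = int(rng.integers(num_entities))
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate != triple:
            return candidate

def margin_loss(positive, negative, entities, relations, margin=1.0):
    """Hinge loss that pushes negatives at least `margin` farther than positives."""
    d_pos = transe_distance(entities[positive[0]], relations[positive[1]],
                            entities[positive[2]])
    d_neg = transe_distance(entities[negative[0]], relations[negative[1]],
                            entities[negative[2]])
    return max(0.0, margin + d_pos - d_neg)

entities = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])   # toy embeddings
relations = np.array([[1.0, 0.0]])
positive = (0, 0, 1)                                         # entity 0 + relation 0 = entity 1
negative = corrupt(positive, num_entities=3, rng=np.random.default_rng(0))
loss = margin_loss(positive, (0, 0, 2), entities, relations)
```

Training would minimize this loss over many positive and corrupted pairs so that true triples end up with smaller distances than corrupted ones.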
During intelligent searching, topic analysis and element extraction are applied to the search term to provide entity elements for identifying the search question. The triple relations include information such as person-age, person-address, person-occupation, person-birth date, person-household registration, person-birth place, person-gender, person-religious belief, person-ethnicity, person-education level, person-marital status, person-height and the like.
The invention also provides an intelligent searching system based on the multi-source heterogeneous data; fig. 9 is a functional structure diagram of this system. The system comprises comprehensive searching, batch searching, association searching and portrait searching. The comprehensive search integrates functions such as full-text search, keyword combination search, range search, fuzzy search, pinyin and initial-letter search, intelligent search prompts and advanced search in a single input box, and its search targets cover entities such as persons, vehicles, organizations, events and contact details. The batch search comprises file search and oversized-file search: the file search performs batch searching by parsing the file content, and the oversized-file search runs large-scale search tasks as offline batch jobs. The association search comprises single-relation search and multi-relation joint search, supports association search and shortest-path search among multiple entities, and displays the search results as a knowledge graph. The portrait search supports searching by an uploaded face image: a face detection and recognition algorithm computes similarity against the targets in a massive face library and matches them, so that accurate searching is achieved.
The system can realize information association fusion of heterogeneous data, and uses the constructed multi-modal knowledge graph to mine potential association relations among targets, making searching more intelligent. By analyzing and designing the system functions, a complete solution and framework are established for the accurate searching of massive multi-source heterogeneous data in a big data environment.
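As an illustration of the shortest-path search among entities mentioned above, the following minimal sketch runs a breadth-first search over knowledge-graph triples. The function name, the sample entities and the choice to traverse relations in both directions are assumptions made for illustration; the text does not specify the path algorithm.

```python
from collections import deque


def shortest_path(triples, source, target):
    """Return the shortest chain of hops (from_node, relation, to_node), or None."""
    adjacency = {}
    for head, relation, tail in triples:
        # Assumption: relations may be followed in either direction.
        adjacency.setdefault(head, []).append((relation, tail))
        adjacency.setdefault(tail, []).append((relation, head))

    previous = {source: None}  # node -> (parent, relation) used to reach it
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for relation, neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = (node, relation)
                queue.append(neighbor)

    if target not in previous:
        return None  # the two entities are not connected

    # Walk back from the target to the source, then reverse.
    path = []
    node = target
    while previous[node] is not None:
        parent, relation = previous[node]
        path.append((parent, relation, node))
        node = parent
    return list(reversed(path))


triples = [
    ("Alice", "works_at", "AcmeCo"),
    ("Bob", "works_at", "AcmeCo"),
    ("Bob", "owns", "Car42"),
]
for hop in shortest_path(triples, "Alice", "Car42"):
    print(hop)
```

Each hop is reported in traversal order, so a relation followed against its original direction appears with its endpoints swapped. The hop list maps directly onto the nodes and edges to highlight when the result is displayed as a knowledge graph.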
Finally, the coordinate information of the bounding box of the target in the image and the corresponding target category are output.
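Once a target has been located, the face image search matches it against the library by similarity. The following minimal sketch follows the steps recited in claim 3 (L2 normalization of the extracted features, Euclidean distance, similarity). The feature vectors, the linear mapping from distance to similarity, the threshold value and all names are assumptions for illustration; the text does not fix them.

```python
import math


def l2_normalize(vector):
    """Scale a feature vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Map the distance between two unit vectors (at most 2) onto [0, 1]."""
    return 1.0 - euclidean_distance(a, b) / 2.0


def identify(query_feature, library, threshold=0.7):
    """Return (label, score) of the best match, or (None, score) below the threshold."""
    query = l2_normalize(query_feature)
    best_label, best_score = None, -1.0
    for label, feature in library.items():
        score = similarity(query, l2_normalize(feature))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical two-dimensional embeddings standing in for real face features.
library = {"person_A": [1.0, 0.0], "person_B": [0.0, 1.0]}
print(identify([2.0, 0.0], library))
```

Because the vectors are normalized first, the Euclidean distance depends only on the angle between them, which is why a fixed threshold can be applied across images of different brightness or scale.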
The intelligent searching method and system based on multi-source heterogeneous data can mine potential association relations among targets, establish cross-modal data relations, and analyze and match features, so that interactive searching among heterogeneous modal data is achieved according to the semantics of the search term. The method and system can organize and associate massive multi-source data and search it quickly, solve problems such as low search intelligence, scattered and isolated data, and a single search mode, and can be applied in multiple fields.
The foregoing embodiments are provided to illustrate the general principles of the present invention and are not intended to limit the scope of the invention.

Claims (10)

1. An intelligent searching method based on multi-source heterogeneous data comprises the following steps:
carrying out data connection and preprocessing on the multi-source heterogeneous data based on a big data platform to obtain preprocessed multi-source heterogeneous data, wherein the preprocessed multi-source heterogeneous data comprises structured data, semi-structured data and unstructured data;
respectively performing data relationship extraction and data relationship representation on the structured data, the semi-structured data and the unstructured data to obtain multi-modal data having data nodes and data node relationships, wherein the multi-modal data comprises resource description framework triples, multi-tuple events and concept-entity relationship diagrams;
performing multi-modal association relation mining and multi-modal semantic association fusion on the multi-modal data by using a dynamic heterogeneous graph attention embedding mechanism, and constructing a multi-modal semantic graph based on the result of the multi-modal semantic association fusion;
utilizing a heterogeneous graph neural network to perform cross-modal knowledge fusion on the multi-modal semantic graph, and constructing a heterogeneous cross-modal knowledge graph based on a result of the cross-modal knowledge fusion;
performing topic analysis and element extraction processing on the search term of the target object, and performing multi-path query and search on the target object by utilizing the heterogeneous cross-modal knowledge graph according to the processing result to obtain multi-modal associated information of the target object;
and intelligently sorting the multi-modal associated information of the target object, carrying out holographic imaging on the target object according to the multi-modal associated information of the target object, and outputting the intelligent sorting result and the holographic imaging of the target object.
2. The method of claim 1, wherein the performing data relationship extraction and data relationship representation on the structured data, the semi-structured data, and the unstructured data, respectively, to obtain multi-modal data having data nodes and data node relationships comprises:
extracting and representing the data relation of the structured data by using an ETL tool to obtain a resource description framework triplet of the structured data;
extracting and representing the data relationship of the semi-structured data to obtain multi-tuple events of the semi-structured data;
performing real-time speech recognition on the audio data in the unstructured data to obtain a speech recognition result, and performing entity recognition, data relationship extraction and representation on the speech recognition result and the file data in the unstructured data to obtain a text data processing result;
and identifying the image data in the unstructured data by using a target image detection model to obtain an image processing result, and obtaining a concept-entity relationship diagram of the unstructured data according to the text data processing result and the image processing result.
3. The method of claim 2, wherein the identifying image data in the unstructured data using a target image detection model to obtain an image processing result comprises:
locating and detecting target images of different sizes by using the image localization and detection network of the target image detection model to obtain a target image database;
extracting features of image data in the unstructured data by utilizing a target image recognition network of the target image detection model, and carrying out L2 normalization processing on the extracted features to obtain embedded feature vectors of the image data;
calculating Euclidean distance between the embedded feature vector of the image data and each target image in the target image database;
and obtaining the similarity between the embedded feature vector of the image data and each target image in the target image database according to the Euclidean distance, and identifying the image data according to the similarity to obtain an image processing result.
4. The method of claim 3, wherein the image localization and detection network comprises a backbone network, a feature pyramid unit, a context modeling unit, and a multi-tasking learning unit.
5. The method of claim 1, wherein the performing multi-modal association relation mining and multi-modal semantic association fusion on the multi-modal data using a dynamic heterogeneous graph attention embedding mechanism, and constructing a multi-modal semantic graph based on a result of the multi-modal semantic association fusion comprises:
performing a consistency measurement on the multi-modal data having the same data node relationship by using a node-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism to obtain a consistency measurement result;
performing a cross-modal complementarity measurement on the multi-modal data by using an edge-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism to obtain a complementarity measurement result, wherein the complementarity measurement result comprises single-modal specific information and multi-modal shared information;
and processing the consistency measurement result and the complementarity measurement result by using a time-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism to complete the multi-modal semantic association fusion of the multi-modal data, and constructing a multi-modal semantic graph based on the result of the multi-modal semantic association fusion.
6. The method of claim 5, wherein the performing a consistency measurement on the multi-modal data having the same data node relationship by using a node-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism to obtain a consistency measurement result comprises:
aggregating the neighborhood data nodes of the multi-modal data having the same data node relationship by using the node-level attention mechanism to obtain the weights of the neighborhood data nodes of the multi-modal data having the same data node relationship;
and performing data node embedding on the multi-modal data having the same data node relationship according to the weights of the neighborhood data nodes of the multi-modal data having the same data node relationship, so as to complete the consistency measurement of the multi-modal data having the same data node relationship.
7. The method of claim 5, wherein the performing a cross-modal complementarity measurement on the multi-modal data by using an edge-level attention mechanism in the dynamic heterogeneous graph attention embedding mechanism to obtain a complementarity measurement result comprises:
learning and normalizing different modes of each data node in the multi-mode data by utilizing the edge level attention mechanism to obtain a cross-mode weight of each data node;
and according to the cross-modal weight of each data node, cross-modal aggregation is carried out on each data node in the multi-modal data, and the cross-modal complementarity measurement of the multi-modal data is completed.
8. The method of claim 1, wherein the cross-modal knowledge fusion of the multi-modal semantic graph using the heterogeneous graph neural network and constructing a heterogeneous cross-modal knowledge graph based on a result of the cross-modal knowledge fusion comprises:
aggregating and embedding the data nodes and data node relationships of the multi-modal semantic graph by using an encoding module of the heterogeneous graph neural network;
fusing the data nodes and data node relationships of the multi-modal semantic graph in a unified feature space according to the aggregation embedding result;
and performing cross-modal decoding on the data nodes and the data node relations of the multi-modal semantic graph fused in the unified feature space by utilizing a decoding module of the heterogeneous graph neural network to obtain a cross-modal knowledge fusion result, and constructing a heterogeneous cross-modal knowledge graph based on the cross-modal knowledge fusion result.
9. An intelligent search system based on multi-source heterogeneous data, comprising:
the data connection and preprocessing module is used for carrying out data connection and preprocessing on the multi-source heterogeneous data based on the big data platform to obtain preprocessed multi-source heterogeneous data, wherein the preprocessed multi-source heterogeneous data comprises structured data, semi-structured data and unstructured data;
the multi-modal data acquisition module is used for respectively extracting data relationship and representing data relationship of the structured data, the semi-structured data and the unstructured data to obtain multi-modal data with data nodes and data node relationships, wherein the multi-modal data comprises a resource description framework triplet, a multi-tuple event and a concept-entity relationship diagram;
the multi-modal semantic graph construction module is used for performing multi-modal association relation mining and multi-modal semantic association fusion on the multi-modal data by using a dynamic heterogeneous graph attention embedding mechanism, and constructing a multi-modal semantic graph based on the result of the multi-modal semantic association fusion;
the heterogeneous cross-modal knowledge graph construction module is used for carrying out cross-modal knowledge fusion on the multi-modal semantic graph by utilizing a heterogeneous graph neural network and constructing a heterogeneous cross-modal knowledge graph based on a result of the cross-modal knowledge fusion;
the search request processing module is used for carrying out topic analysis and element extraction processing on the search words of the target object, and carrying out multi-path query and search on the target object by utilizing the heterogeneous cross-modal knowledge graph according to the processing result to obtain multi-modal associated information of the target object;
and the intelligent sorting and result output module is used for intelligently sorting the multi-modal associated information of the target object, carrying out holographic imaging on the target object according to the multi-modal associated information of the target object, and outputting the intelligent sorting result and the holographic imaging of the target object.
10. The system of claim 9, wherein the search request processing module is capable of performing comprehensive search, batch search, association search and portrait search;
the comprehensive search comprises full-text search, keyword combination search, range search, fuzzy search, pinyin and initial-letter search, intelligent search prompts and advanced search.
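The node-level and edge-level attention steps recited in claims 6 and 7 both amount to turning scores into normalized weights and taking a weighted sum. The following minimal sketch shows that pattern under stated assumptions: dot-product scoring and softmax normalization are illustrative choices, since the claims do not fix the scoring function, and all names and vectors are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_aggregate(query, candidates):
    """Weight each candidate vector by softmax(query . candidate) and sum them."""
    weights = softmax([dot(query, c) for c in candidates])
    fused = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(query))
    ]
    return weights, fused


# Node level: candidates are the embeddings of a node's neighbors.
node = [1.0, 0.0]
neighbors = [[2.0, 0.0], [0.0, 2.0]]
print(attention_aggregate(node, neighbors))
```

For the edge-level step the same function could be called with one embedding per modality of a single data node as the candidates, so that the returned weights play the role of the cross-modal weights.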
CN202211356086.2A 2022-11-01 2022-11-01 Intelligent searching method and system based on multi-source heterogeneous data Pending CN116049454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211356086.2A CN116049454A (en) 2022-11-01 2022-11-01 Intelligent searching method and system based on multi-source heterogeneous data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211356086.2A CN116049454A (en) 2022-11-01 2022-11-01 Intelligent searching method and system based on multi-source heterogeneous data

Publications (1)

Publication Number Publication Date
CN116049454A true CN116049454A (en) 2023-05-02

Family

ID=86130089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211356086.2A Pending CN116049454A (en) 2022-11-01 2022-11-01 Intelligent searching method and system based on multi-source heterogeneous data

Country Status (1)

Country Link
CN (1) CN116049454A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116245260A (en) * 2023-05-12 2023-06-09 国网吉林省电力有限公司信息通信公司 Optimization method for deploying 5G base station based on substation resources
CN116756375A (en) * 2023-05-09 2023-09-15 中电科大数据研究院有限公司 Processing system of heterogeneous data based on atlas
CN116881482A (en) * 2023-06-27 2023-10-13 四川九洲视讯科技有限责任公司 Cross-media intelligent sensing and analyzing processing method for public safety data
CN117493490A (en) * 2023-11-17 2024-02-02 南京信息工程大学 Topic detection method, device, equipment and medium based on heterogeneous multi-relation graph
US11907339B1 (en) * 2018-12-13 2024-02-20 Amazon Technologies, Inc. Re-identification of agents using image analysis and machine learning

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11907339B1 (en) * 2018-12-13 2024-02-20 Amazon Technologies, Inc. Re-identification of agents using image analysis and machine learning
CN116756375A (en) * 2023-05-09 2023-09-15 中电科大数据研究院有限公司 Processing system of heterogeneous data based on atlas
CN116756375B (en) * 2023-05-09 2024-05-07 中电科大数据研究院有限公司 Processing system of heterogeneous data based on atlas
CN116245260A (en) * 2023-05-12 2023-06-09 国网吉林省电力有限公司信息通信公司 Optimization method for deploying 5G base station based on substation resources
CN116245260B (en) * 2023-05-12 2023-08-08 国网吉林省电力有限公司信息通信公司 Optimization method for deploying 5G base station based on substation resources
CN116881482A (en) * 2023-06-27 2023-10-13 四川九洲视讯科技有限责任公司 Cross-media intelligent sensing and analyzing processing method for public safety data
CN117493490A (en) * 2023-11-17 2024-02-02 南京信息工程大学 Topic detection method, device, equipment and medium based on heterogeneous multi-relation graph
CN117493490B (en) * 2023-11-17 2024-05-14 南京信息工程大学 Topic detection method, device, equipment and medium based on heterogeneous multi-relation graph

Similar Documents

Publication Publication Date Title
CN116049454A (en) Intelligent searching method and system based on multi-source heterogeneous data
WO2022068196A1 (en) Cross-modal data processing method and device, storage medium, and electronic device
CN104899253B (en) Towards the society image across modality images-label degree of correlation learning method
CN111914156B (en) Cross-modal retrieval method and system for self-adaptive label perception graph convolution network
CN107133569B (en) Monitoring video multi-granularity labeling method based on generalized multi-label learning
CN112612902A (en) Knowledge graph construction method and device for power grid main device
CN112528519A (en) Method, system, readable medium and electronic device for engine quality early warning service
CN116049397B (en) Sensitive information discovery and automatic classification method based on multi-mode fusion
CN113761219A (en) Knowledge graph-based retrieval method and device, electronic equipment and storage medium
CN113761208A (en) Scientific and technological innovation information classification method and storage device based on knowledge graph
CN117390299A (en) Interpretable false news detection method based on graph evidence
CN117221087A (en) Alarm root cause positioning method, device and medium
CN116108363A (en) Incomplete multi-view multi-label classification method and system based on label guidance
CN114579769B (en) Small sample knowledge graph completion method, system, equipment and storage medium
CN116304252A (en) Communication network fraud prevention method based on graph structure clustering
Wu et al. Dmtmv: a unified learning framework for deep multi-task multi-view learning
CN114925210A (en) Knowledge graph construction method, device, medium and equipment
CN113516118A (en) Image and text combined embedded multi-mode culture resource processing method
CN111611981A (en) Information identification method and device and information identification neural network training method and device
CN117708746B (en) Risk prediction method based on multi-mode data fusion
Zhang et al. A review of data fusion techniques for government big data
CN114880588B (en) News heat prediction method based on knowledge graph
CN117911662B (en) Digital twin scene semantic segmentation method and system based on depth hough voting
CN114691888A (en) Target association identification method and system based on capability data base map
Ding Efficient Redundancy Processing Framework of Association Rules Model based on Hypergraph in Information Pattern Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination