CN116756375B - Processing system of heterogeneous data based on atlas - Google Patents

Processing system of heterogeneous data based on atlas

Info

Publication number
CN116756375B
Authority
CN
China
Prior art keywords
module
data
user
topic
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310517761.3A
Other languages
Chinese (zh)
Other versions
CN116756375A (en)
Inventor
黄海峰
韩国权
仲恺
熊子奇
李响
李华峰
韩伟
祁纲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiji Computer Corp Ltd
CETC Big Data Research Institute Co Ltd
Original Assignee
Taiji Computer Corp Ltd
CETC Big Data Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taiji Computer Corp Ltd, CETC Big Data Research Institute Co Ltd
Priority to CN202310517761.3A
Publication of CN116756375A
Application granted
Publication of CN116756375B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of information processing and discloses an atlas-based system for processing heterogeneous data. A user module manages user identities and provides an interactive interface. An application module provides application services and is configured with a load-balancing operation; when a user accesses the system through the user module, the corresponding application service instance is queried and called. A data module provides centralized storage for all data resources and comprises one or more relational databases (RDBMS). A topic map index module is configured between the data module and the application module; it extracts data from each RDBMS and uses a topic map to form a structured semantic index layer, thereby enabling semantic retrieval processing of retrieval statements.

Description

Processing system of heterogeneous data based on atlas
Technical Field
The invention relates to the field of information processing, and in particular to an atlas-based system for processing heterogeneous data.
Background
In a distributed system, data is spread across multiple data sources, each of which is read, used, updated, maintained and analyzed by different components and service instances. The information on the Web can be regarded as a huge, complex, distributed database: each site is an independent data source, and the organization and structure of data differ from site to site. The Web can therefore be treated as a heterogeneous database environment. Mining this data requires solving problems such as the compatibility and utilization of heterogeneous data across web sites, while the efficiency of access processing is also a concern for users.
Disclosure of Invention
In order to solve one of the above problems, the present invention provides an atlas-based system for processing heterogeneous data, which comprises a user module, an application module, a data module and a topic map index module, wherein:
the user module is used for managing user identities and providing an interactive interface for user access;
the application module is used for providing application services, and the corresponding application service instance is queried when the user accesses the system through the user module;
the data module is used for providing centralized storage for all data resources and comprises one or more relational databases (RDBMS);
the topic map index module is configured between the data module and the application module, and is used for extracting data from each RDBMS and forming a structured semantic index layer by using topic maps. The application module sends a retrieval request for data according to the operation of the user module; the topic map index module receives the request, extracts keywords from it, analyzes its sentences, and performs preliminary semantic retrieval processing on the retrieval statement. It then matches the sentence class, verb vocabulary and key nouns identified from the semantic retrieval processing result against the retrieval intention of the natural sentence to obtain a semantic network subgraph, determines a formal description of the natural retrieval statement, and returns the retrieval result to the user module. The semantic network subgraph comprises the key concepts and the inter-concept semantic association attributes that meet the user's requirements.
Preferably, the topics in the topic map index module are dynamic, structured indexes independent of specific resources; the user module retrieves the corresponding actual resources by accessing the topic relation application service instance, and directs the user to the corresponding network address to obtain information.
Preferably, the topic map index module is further configured to integrate data of different web site systems by using topic maps, to map and navigate the data, and to organize abstract and isolated data into a structured semantic network.
Preferably, the topic map in the topic map index module configures a topic index of the data resources according to the three elements of a topic map, namely topic (Topic), association (Association) and occurrence (Occurrence); the topics, associations and occurrences in the data information are identified according to these three elements; the identified information is described by the element nodes specified by XTM, XTM documents are generated, and three sub topic maps are formed.
Preferably, the topic map index module is configured to perform similarity analysis on the established sub topic maps under the global schema of the data, merge topics with high similarity or consistency according to a given rule, and merge the sub topic maps in a bottom-up manner to form a global topic map.
Preferably, the topic map index module is further configured to represent the concepts in the knowledge structure of the information resources, and the interrelationships between those concepts, based on ontology knowledge and the topic map organization system.
Preferably, a semantic mapping between the global schema and the data sources is established through the topic map index module, and when a user submits a query through the user module, the query, which is expressed against the global schema, is rewritten by means of the semantic mapping into executable queries for each data source.
Preferably, the data module integrates the data of each data source through the global schema; the integrated data remains stored in each local data source and is converted to the global schema through the wrapper of each data source.
Preferably, the query in the user module is a global query; it is rewritten into a syntactic form acceptable to the data source through the mapping relationship between the data source schema and the global schema, and is transmitted to the server where the data source is located so that the query is executed on the data source.
Preferably, the sentence classes identified from the semantic retrieval processing result include: bearing sentences, active reaction sentences and passive reaction sentences.
In the atlas-based system for processing heterogeneous data provided by the invention, the user module manages user identities and provides an interactive interface; the application module provides application services and is configured with a load-balancing operation, and the corresponding application service instance is queried and called when the user accesses the system through the user module; the data module provides centralized storage for all data resources and comprises one or more relational databases (RDBMS); the topic map index module is configured between the data module and the application module, extracts data from each RDBMS and forms a structured semantic index layer by using the topic map, thereby enabling semantic retrieval processing of retrieval statements.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are schematic and should not be interpreted as limiting the invention in any way.
FIG. 1 is a schematic diagram of a system framework of the present invention.
Detailed Description
These and other features and characteristics of the present invention, as well as the methods of operation and functions of the related elements of structure, the combination of parts and economies of manufacture, may be better understood with reference to the following description and the accompanying drawings, all of which form a part of this specification. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the invention. It will be understood that the figures are not drawn to scale. Various block diagrams are used in the description of the various embodiments according to the present invention.
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In this context, "/" means "or"; for example, A/B may mean A or B. "And/or" herein merely describes an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone.
It should be noted that, in order to clearly describe the technical solution of the embodiments of the present application, in the embodiments of the present application, the terms "first", "second", and the like are used to distinguish the same item or similar items having substantially the same function or effect, and those skilled in the art will understand that the terms "first", "second", and the like do not limit the number and execution order. For example, the first information and the second information are used to distinguish between different information, and not to describe a particular order of information.
It should be noted that, in the embodiments of the present invention, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g." in an embodiment should not be taken as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
Example 1
As shown in FIG. 1, an atlas-based system for processing heterogeneous data is disclosed. The system comprises a user module, an application module, a data module and a topic map index module. The user module manages user identities and provides an interactive interface; the application module provides application services, and the corresponding application service instance is queried when the user accesses the system through the user module.
The application module can be configured to comprise a plurality of servers on which application service instances are stored. The application module is also provided with a load-balancing unit, which performs a load-balancing operation over the servers hosting the application service instances when the user module queries an application service instance. In practice it has been found that atlas-based query operations on language text require a significant amount of parallel computing resources. For this reason, multiple server access addresses (IP addresses) are configured to execute application calls and operations, and load balancing is applied across the servers.
The load-balancing operation may specifically include: assigning a key to each server hosting an application service instance; triggering the load-balancing operation when a user queries and accesses a service instance through the user module; hashing each server key and mapping it onto a hash ring, the integer computed from the key being the position on the ring of the service instance hosted by that server; applying the same hash operation to the key associated with the requested service instance to compute its position on the ring; and, moving clockwise from that position, finding the first server key whose ring value is greater than or equal to it, thereby obtaining the server to which the request is dispatched. The server key is a uniform identifying parameter configured for the application service instance, such as an IP address, a MAC address or a user name.
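The hash-ring lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the use of MD5 as the hash operation, the class name HashRing, the function ring_position and the example IP addresses are all assumptions made for the sketch. The wrap-around to the first server when no larger ring value exists is the usual convention for a ring and is likewise assumed.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a key to an integer position on the hash ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical helper: places server keys on a ring and routes requests."""

    def __init__(self, server_keys):
        # Sort servers by ring position so the clockwise walk becomes a binary search.
        self._ring = sorted((ring_position(k), k) for k in server_keys)
        self._positions = [pos for pos, _ in self._ring]

    def locate(self, request_key: str) -> str:
        """Return the first server whose ring position is >= the request's position."""
        pos = ring_position(request_key)
        idx = bisect.bisect_left(self._positions, pos)
        if idx == len(self._ring):
            idx = 0  # passed the largest position: wrap around the ring
        return self._ring[idx][1]


# Server keys here are IP addresses; a MAC address or user name would work the same way.
ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
chosen = ring.locate("user-42")
```

Because only positions on the ring matter, adding or removing one server reassigns only the requests that fall in the affected arc, which is the usual reason for preferring this scheme over a plain modulo of the hash.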
The data module is used for providing centralized storage for all data resources and comprises one or more relational databases (RDBMS). The topic map index module is configured between the data module and the application module, and is used for extracting data from each RDBMS and forming a structured semantic index layer by using topic maps. The application module sends a retrieval request for data according to the operation of the user module; the topic map index module extracts keywords from the request, analyzes its sentences, and performs preliminary semantic retrieval processing on the retrieval statement. It then matches the sentence class, verb vocabulary and key nouns identified from the semantic retrieval processing result against the retrieval intention of the natural sentence, obtains the key concepts and the inter-concept semantic association attributes in the semantic network subgraph that meet the user's requirements, determines a formal description of the natural retrieval statement, and returns the result to the user module.
Illustratively, Web data mining should provide at least two functions: querying network information and data, and analysis processing and knowledge discovery on Web data. First, a user module is set up to manage user identities and provide an interactive interface. The application module provides application services; when a user accesses the system through the user module, it queries the corresponding application service instance and constructs and forwards the access request. The data module is configured as the set of all data resources, optionally a set of one or more relational databases (RDBMS). Because Web data is complex and heterogeneous in composition and is organized by topic, a topic map index module is configured between the data module and the application module to make access to Web data convenient. The topic map index module first extracts data from each RDBMS and forms a structured semantic index layer by using the topic map.
The application module sends a retrieval request for data according to the operation of the user module. The semantic index layer of the topic map index module responds first and performs preliminary semantic retrieval processing on the request; it then either points to the actual data according to the retrieval result or provides the RDBMS with retrieval keywords that have undergone data arrangement and semantic processing, and finally the retrieval result is returned to the user module. It will be understood that the application module may integrate some or all of the functions of the topic map index module when running the computations of a service instance.
The topics in the topic map index module are dynamic, structured indexes independent of specific resources. The related actual resources can be retrieved by accessing topic relation instances, so that the user is guided to a specific address to obtain information. The topic map index module uses topic maps to integrate data of different web site systems, to map and navigate the data, and to organize abstract and isolated data into a structured semantic network.
Illustratively, the topic map implements topic indexing of data resources according to the three TAO elements of a topic map: topic (Topic), association (Association) and occurrence (Occurrence). The topics, associations and occurrences in the data information are identified according to these three elements. Taking company information as an example, topics include staff, position, school and the like, where staff and the like are topic types; association types include tenure, leadership, working relationship and the like; occurrences include legal person, name, code and the like. The identified information is described by the element nodes specified by XTM, XTM documents are generated, and three sub topic maps are formed.
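As a rough illustration of how identified topics, associations and occurrences might be held in memory and serialized, consider the sketch below. It is an assumption-laden simplification: the element and attribute names (topicMap, topic, association, occurrence, member, topicRef) only resemble XTM and are not claimed to conform to the XTM schema, and the classes Topic and Association, the function to_xtm_like and the sample staff records are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Topic:
    topic_id: str
    topic_type: str                                   # e.g. "staff"
    occurrences: dict = field(default_factory=dict)   # occurrence type -> value


@dataclass
class Association:
    assoc_type: str   # e.g. "leader", "tenure"
    members: tuple    # ids of the topics taking part in the association


def to_xtm_like(topics, associations) -> str:
    """Serialize to a simplified, XTM-like XML string (not schema-conformant XTM)."""
    root = ET.Element("topicMap")
    for t in topics:
        node = ET.SubElement(root, "topic", id=t.topic_id, type=t.topic_type)
        for occ_type, value in t.occurrences.items():
            occ = ET.SubElement(node, "occurrence", type=occ_type)
            occ.text = value
    for a in associations:
        node = ET.SubElement(root, "association", type=a.assoc_type)
        for member in a.members:
            ET.SubElement(node, "member", topicRef=member)
    return ET.tostring(root, encoding="unicode")


# Hypothetical company data following the example in the text.
topics = [
    Topic("emp-001", "staff", {"name": "Zhang San", "code": "E001"}),
    Topic("emp-002", "staff", {"name": "Li Si", "code": "E002"}),
]
associations = [Association("leader", ("emp-001", "emp-002"))]
xml_text = to_xtm_like(topics, associations)
```

A production system would emit documents that validate against the XTM specification; the point here is only the separation of topics, the associations that connect them, and the occurrences that tie each topic to concrete data.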
Because topic maps are readily extensible and mergeable, the topic map index module can, under the guidance of the global schema, perform similarity analysis on the three established sub topic maps, merge highly similar or consistent topics according to a given rule, and merge the sub topic maps in a bottom-up manner to form a global topic map. This approach makes it convenient to add, delete and modify the underlying data.
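The merge rule is not specified in the text, so the following sketch simply assumes one: two topics are merged when the string similarity of their names reaches a threshold. The function names similarity and merge_sub_maps, the threshold of 0.8, the representation of a sub topic map as a dictionary from topic name to resource addresses, and the sample data are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive name similarity in [0, 1] (an assumed stand-in for the rule)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_sub_maps(sub_maps, threshold=0.8):
    """Merge sub topic maps bottom-up into one global map.

    Each sub map is a dict: topic name -> set of resource addresses.
    A topic is folded into the first existing global topic whose name is
    similar enough; otherwise it becomes a new global topic.
    """
    global_map = {}
    for sub_map in sub_maps:
        for name, resources in sub_map.items():
            match = next(
                (g for g in global_map if similarity(g, name) >= threshold),
                None,
            )
            if match is None:
                global_map[name] = set(resources)
            else:
                global_map[match] |= set(resources)
    return global_map


# Hypothetical sub topic maps extracted from two data sources.
sub_maps = [
    {"employee": {"db1.staff"}},
    {"Employee": {"db2.emp"}, "school": {"db2.school"}},
]
merged = merge_sub_maps(sub_maps)
```

Name similarity is the weakest plausible criterion; a fuller rule would also compare topic types, associations and occurrences before declaring two topics consistent.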
The topic map index module is further configured to represent the concepts in the knowledge structure of the information resources, and the interrelationships between those concepts, based on a knowledge organization system that combines ontology knowledge and topic maps. The knowledge organization system mainly comprises two layers: a concept aggregation system and a concept association system.
The concept aggregation system emphasizes hierarchical clustering and category clustering among concepts, for example by setting up classification tables and topic word tables. The concept association system reveals and organizes concepts and the interrelationships between them, for example through topic maps and ontologies. The topic map is used for reflecting the interrelationships among knowledge concepts, organizing abstract knowledge content into a knowledge map with coordinate concepts, forming a structured semantic network, and guiding the user to related information resources by means of linking technology.
The topic map index module constructs a concept network from the topics, the associations among topics, and the relations between topics and information resource objects; the user module uses this concept network for retrieval. An ontology is a conceptual system that reflects the knowledge structure of a specific domain and a method for explicitly defining, standardizing and sharing the knowledge of that domain. Ontologies are divided into domain ontologies and general ontologies.
A domain ontology describes the concepts, concept attributes, relationships among concepts and rules to be followed in a specific domain; a general ontology is an ontology shared across multiple domains and is a combination of concepts with common meaning.
Through the topic map index module, a semantic mapping between the global schema and the data sources is established. When a user submits a query through the user module, the query, which is expressed against the global schema, is rewritten by means of the semantic mapping into a series of executable queries for each data source.
Optionally, the data integration subsystem is configured to integrate the data of each data source through the global schema; the integrated data nevertheless remains stored in each local data source and is converted to the global schema through the wrapper of each data source. The user's query is a global query and requires no knowledge of the schema of each data source; that is, the schema of each data source is transparent to the user. When an underlying data source changes, only the virtual logical view of the global schema needs to be modified, which significantly reduces the maintenance cost of the system.
Optionally, the user submits a query based on the global schema; the data integration system rewrites the query into a syntactic form acceptable to the data source through the mapping relationship between the data source schema and the global schema and transmits it to the data source, where the source-level query is optimized and executed.
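The rewriting step can be illustrated with a small sketch, given under several assumptions: the global query is modelled as a dictionary of requested fields and equality conditions, each data source's mapping is a dictionary from global field names to local column names, the output is a parameterized SQL string using "?" placeholders, and a source that lacks a needed field is skipped. The function rewrite_query and the source, table and column names are hypothetical.

```python
def rewrite_query(global_query, mappings):
    """Rewrite a query over the global schema into one SQL query per data source.

    global_query: {"fields": [global field, ...], "where": {global field: value}}
    mappings:     source name -> {"table": str, "columns": {global field: local column}}
    Returns:      source name -> (sql text, parameter list)
    """
    rewritten = {}
    for source, mapping in mappings.items():
        cols = mapping["columns"]
        needed = list(global_query["fields"]) + list(global_query["where"])
        if not all(f in cols for f in needed):
            continue  # this source cannot answer the query
        select = ", ".join(f"{cols[f]} AS {f}" for f in global_query["fields"])
        sql = f"SELECT {select} FROM {mapping['table']}"
        params = []
        if global_query["where"]:
            conds = " AND ".join(f"{cols[f]} = ?" for f in global_query["where"])
            sql += f" WHERE {conds}"
            params = list(global_query["where"].values())
        rewritten[source] = (sql, params)
    return rewritten


# Hypothetical mappings for two data sources with different local schemas.
mappings = {
    "hr_db": {"table": "staff", "columns": {"name": "emp_name", "region": "area"}},
    "census_db": {"table": "person", "columns": {"name": "full_name"}},
}
query = {"fields": ["name"], "where": {"region": "Beijing"}}
per_source = rewrite_query(query, mappings)
```

Aliasing each local column back to its global field name lets a wrapper return rows that are already in the global schema, so the caller never sees the local column names.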
For example, when a query request from a user is received, the noun terms are first analyzed and matched against the concept knowledge space in the ontology knowledge base. Suppose 4 concepts are matched, such as year, region, topic and location, and that hierarchical relations exist among them. Multiple semantic relations also exist among the 4 matched concepts, such as mapping relations that semantically associate different concepts with one another, for example the relation between a topic and the location where it occurs. All semantic connections associated with the concepts are searched, and all resources that are related to the concepts through these various semantic relations are retrieved.
Using the ontology's knowledge descriptions of verb clusters and verb relations, together with a natural sentence semantic analysis method based on basic sentence class analysis, the meaning of the verb in the user's retrieval statement is preliminarily understood, and the semantic relation corresponding to that verb meaning is selected from the multiple attributes of the terms.
Verb vocabulary analysis and matching is performed using the ontology for natural retrieval statement understanding. Take a retrieval statement containing the word "influence" as an example. The sentence classes that may contain this verb include bearing sentences, active reaction sentences and passive reaction sentences, and a hierarchical relation diagram of these sentence classes is constructed.
The verb "influence" matched in the natural retrieval statement can appear in 3 sentence classes, each with a different sentence class structure. To identify the exact meaning of "influence" in the statement, the exact sentence class of the statement is further identified from its sentence class structure.
For example, the structure of a bearing sentence is "who bears whose influence"; the structure of an active reaction sentence is "who influences whom"; and the structure of a passive reaction sentence is "who is influenced by whom" or "who receives whose influence". Through this comparative analysis, the sentence class that matches the retrieval statement entered by the user is determined.
The vocabulary set corresponding to "influence" and the concept subset C adjacent to the verb are then collected. According to the verb vocabulary set S obtained after verb matching and sentence class recognition, the concept association attribute set L in the concept set C is further analyzed, the semantic association set L1 associated with the verbs in the natural retrieval statement is screened out, and the semantic relation set L2 related to the key nouns is obtained from the semantic relations associated with the concepts contained in the concept subset C.
According to the concept hierarchical relations in the ontology, the obtained semantic association set L1 related to the core verb is combined with the semantic relation set L2 related to the key nouns to obtain the semantic network subgraph resulting from analysis of the natural sentence entered by the user.
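The set operations above can be made concrete with a short sketch. It rests on assumptions that go beyond the text: a semantic relation is modelled as a (source, label, target) triple, a lookup table says which relation labels each verb can express, and "combining" L1 and L2 is read as a set union. The names Relation and semantic_subgraph, and the sample ontology fragment, are hypothetical.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    source: str   # concept at the start of the relation
    label: str    # semantic association attribute, e.g. "changes_in"
    target: str   # concept at the end of the relation


def semantic_subgraph(relations, verb_labels, concepts, verbs, key_nouns):
    """Return (nodes, edges) of the semantic network subgraph.

    relations:   all relations known to the ontology
    verb_labels: verb -> set of relation labels that the verb can express
    concepts:    concept subset C adjacent to the verb
    verbs:       verb vocabulary set S
    key_nouns:   key nouns found in the retrieval statement
    """
    # L: association attributes whose endpoints both lie in concept set C.
    L = {r for r in relations if r.source in concepts and r.target in concepts}
    # L1: relations in L that one of the verbs in S can express.
    labels = set().union(*(verb_labels.get(v, set()) for v in verbs))
    L1 = {r for r in L if r.label in labels}
    # L2: relations in L that touch a key noun.
    L2 = {r for r in L if r.source in key_nouns or r.target in key_nouns}
    edges = L1 | L2  # "combining" is interpreted here as set union
    nodes = {c for r in edges for c in (r.source, r.target)}
    return nodes, edges


# Hypothetical ontology fragment.
relations = {
    Relation("population", "changes_in", "year"),
    Relation("population", "located_in", "place"),
    Relation("place", "part_of", "country"),
}
nodes, edges = semantic_subgraph(
    relations,
    verb_labels={"change": {"changes_in"}},
    concepts={"population", "year", "place"},
    verbs={"change"},
    key_nouns={"place"},
)
```

In this example the relation to "country" is excluded because "country" is not in the concept subset C, so the subgraph keeps only the concepts and associations that the statement actually touches.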
Finally, the recognized sentence class, verb vocabulary and key nouns are matched against the retrieval intention of the natural sentence, which yields the key concepts and the inter-concept semantic association attributes in the semantic network subgraph that meet the user's requirements, together with a formal description of the natural retrieval statement. In this way the concept network, which the topic map index module builds from the topics, the associations among topics and the relations between topics and information resource objects, is made available to the user module for retrieval.
Take the retrieval statement "population birth changes in a place in recent years" as an example. First, one of "birth" and "change" is extracted as the core verb, which determines the type of the sentence. Based on the knowledge in the ontology knowledge base of the topic map index module, the 4 concepts of year, place, number and population, together with their corresponding concept hierarchical relation diagrams, can be retrieved. Verb vocabulary analysis and matching is then performed using the ontology for natural retrieval statement understanding, yielding the verb set S = {birth, change}, and the sentence classes of the different verbs are analyzed. For example, the sentence classes of "change" include general effect sentences and basic effect sentences, and the hierarchical relation diagram of these sentence classes can be obtained.
As described above, two verbs appear in the natural sentence. The core verb is analyzed first to determine the basic type of the sentence; the contexts of the verbs are obtained at the same time, the corresponding verb vocabulary is obtained, and the matching sentence class and the concept subset adjacent to the verbs are deduced. It is thus determined that "change" is the core word and "birth" is an auxiliary word. A verb vocabulary set is obtained from verb matching and sentence class recognition, and the concept association attribute set within the concept set is further analyzed, so that the semantic association set related to the verbs in the natural retrieval statement can be screened out. An accurate retrieval result is then obtained from the semantic relations associated with the concepts and the hierarchical relations among concepts in the ontology.
It will be appreciated by those skilled in the art that all or part of the method of the above embodiments may be implemented by a computer program instructing the related hardware. The program may be stored in a computer readable storage medium and, when executed, may carry out the method of the above embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kinds described above.
As used in this disclosure, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, the components may be, but are not limited to: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers. Furthermore, these components can execute from various computer readable media having various data structures thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered in the scope of the claims of the present invention.

Claims (7)

1. A system for processing heterogeneous data based on a map, characterized in that:
The system comprises a user module, an application module, a data module, and a topic map index module, wherein:
The user module is used for managing user identities and providing an interactive interface for user access;
The application module is used for providing application services, and a user accessing the system through the user module queries the corresponding application service instance;
The data module is used for providing centralized storage for all data resources and comprises one or more relational databases (RDBMS);
The topic map index module is configured between the data module and the application module, and is used for extracting data from each RDBMS and forming a structured semantic index layer by means of the topic map; the topic map index module receives a retrieval request for data that the application module sends in response to an operation in the user module, extracts keywords from the retrieval request and parses its sentences, performs preliminary semantic retrieval processing on the retrieval sentence, then matches the sentence class, verbs, and key nouns identified from the semantic retrieval processing result against the retrieval intent of the natural-language sentence to obtain a semantic network subgraph, determines a formal description of the natural-language retrieval sentence, and returns the retrieval result to the user module; the semantic network subgraph comprises the key concepts and the semantic association attributes between concepts that meet the user's requirements;
A topic in the topic map index module is a dynamic, structured index that is independent of any specific resource; the user module retrieves the corresponding actual resources by accessing the topic-relation application service instance, and guides the user to obtain information from the corresponding network address;
The topic map index module is also used for integrating data from different website systems by means of topic maps, mapping and navigating the data, and organizing abstract and isolated data into a structured semantic network;
The topic map in the topic map index module configures topic indexes of the data resources according to the three elements of a topic map, namely topics, associations (connections), and occurrences (events); the topics, associations, and occurrences in the data information are identified according to the corresponding principles of the topic map; the identified information is described using the element nodes specified by XTM, an XTM document is generated, and three sub topic maps are formed respectively.
2. The system of claim 1, wherein: the topic map index module is used for performing similarity analysis on the established sub topic maps under the global schema of the data, merging topics with high similarity or consistency according to certain rules, and merging the sub topic maps bottom-up to form a global topic map.
3. The system according to claim 2, wherein: the topic map index module is also used for representing the concepts in the knowledge structure of the information resources, and the interrelationships among those concepts, based on ontology knowledge and a topic map organization system.
4. The system according to claim 3, wherein: semantic mappings between the global schema and the data sources are established by the topic map index module, and when a user submits a query through the user module, the query, which is expressed against the global schema, is rewritten by means of the semantic mappings and converted into queries executable on each data source.
5. The system according to claim 4, wherein: the data module integrates the data of each data source through the global schema; the integrated data remains stored in the respective local data sources and is converted into the global schema by the wrapper of each data source.
6. The system according to claim 5, wherein: a query in the user module is a global query; through the mapping relation between the data source schema and the global schema, the query is rewritten into a syntactic form acceptable to the data source and transmitted to the server where the data source is located, so as to execute the query on the data source.
7. The system of claim 6, wherein: the sentence classes identified according to the semantic retrieval processing result comprise: bearing sentences, active reaction sentences, and passive reaction sentences.
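
As a non-normative illustration of the query rewriting recited in claims 4 to 6, the following minimal Python sketch shows how a query expressed in global terms might be rewritten into one executable query per data source through a per-source mapping. Everything named here is an assumption made for illustration and is not taken from the patent: the `SourceMapping` class, the `rewrite_query` function, the source, table, and column names, and the restriction to single-table projections with equality filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceMapping:
    """Hypothetical semantic mapping from global attributes to one data source."""
    source: str     # name of the local RDBMS (assumed)
    table: str      # local table that holds the global entity (assumed)
    columns: dict   # global attribute name -> local column name


def rewrite_query(select, where, mappings):
    """Rewrite a global query into one parameterized SQL query per source.

    select: list of global attribute names to return
    where:  dict of global attribute name -> required value (equality only)
    Sources that cannot express every requested attribute are skipped.
    """
    rewritten = {}
    for m in mappings:
        needed = list(select) + list(where)
        if any(attr not in m.columns for attr in needed):
            continue  # this source has no mapping for part of the query
        # Alias local columns back to global names, which plays the role
        # of the per-source wrapper that returns data in global form.
        cols = ", ".join(f"{m.columns[a]} AS {a}" for a in select)
        sql = f"SELECT {cols} FROM {m.table}"
        params = []
        if where:
            conds = " AND ".join(f"{m.columns[a]} = ?" for a in where)
            sql += f" WHERE {conds}"
            params = list(where.values())
        rewritten[m.source] = (sql, params)
    return rewritten


# Illustrative mappings only; none of these names come from the patent.
MAPPINGS = [
    SourceMapping("hr_db", "employee", {"name": "emp_name", "dept": "dept_code"}),
    SourceMapping("crm_db", "contacts", {"name": "full_name", "dept": "division"}),
    SourceMapping("legacy_db", "staff", {"name": "nm"}),  # no "dept" mapping
]

if __name__ == "__main__":
    queries = rewrite_query(["name"], {"dept": "R&D"}, MAPPINGS)
    for source, (sql, params) in queries.items():
        print(source, sql, params)
```

In this sketch the data stays in each local source, and only the query text is translated. Filter values are passed as parameters rather than interpolated into the SQL string, which is a design choice of the sketch and not something the claims specify.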
CN202310517761.3A 2023-05-09 2023-05-09 Processing system of heterogeneous data based on atlas Active CN116756375B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310517761.3A CN116756375B (en) 2023-05-09 2023-05-09 Processing system of heterogeneous data based on atlas

Publications (2)

Publication Number Publication Date
CN116756375A (en) 2023-09-15
CN116756375B (en) 2024-05-07

Family

ID=87954088

Country Status (1)

Country Link
CN (1) CN116756375B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201842A (en) * 2007-10-30 2008-06-18 北京航空航天大学 Digital museum gridding and construction method thereof
CN101246492A (en) * 2008-02-26 2008-08-20 华中科技大学 Full text retrieval system based on natural language
CN102236664A (en) * 2010-04-28 2011-11-09 百度在线网络技术(北京)有限公司 Retrieval system, retrieval method and information processing method based on semantic normalization
CN103838833A (en) * 2014-02-24 2014-06-04 华中师范大学 Full-text retrieval system based on semantic analysis of relevant words
CN109783067A (en) * 2018-11-30 2019-05-21 复旦大学 Intelligent knowledge integration and searching system and method based on ontology CallCenter platform
CN111061828A (en) * 2019-11-29 2020-04-24 华中师范大学 Digital library knowledge retrieval method and device
CN113505234A (en) * 2021-06-07 2021-10-15 中国科学院地理科学与资源研究所 Construction method of ecological civilization geographical knowledge map
CN115510214A (en) * 2022-10-21 2022-12-23 国泰君安证券股份有限公司 A stock index knowledge graph construction processing system, method, device, processor and storage medium for intelligent question answering in financial field
CN116049454A (en) * 2022-11-01 2023-05-02 齐鲁空天信息研究院 Intelligent searching method and system based on multi-source heterogeneous data

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7685118B2 (en) * 2004-08-12 2010-03-23 Iwint International Holdings Inc. Method using ontology and user query processing to solve inventor problems and user problems
US8335754B2 (en) * 2009-03-06 2012-12-18 Tagged, Inc. Representing a document using a semantic structure
JP2020514935A (en) * 2017-03-15 2020-05-21 ファウナ, インク.Fauna, Inc. Method and system for a database
US11036768B2 (en) * 2018-06-21 2021-06-15 LeapAnalysis Inc. Scalable capturing, modeling and reasoning over complex types of data for high level analysis applications

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"国内外知识图谱发展趋势和研究热点演变分析";张玉柳等;《图书馆理论与实践》;20210131;121-128 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant