CN111177244A - Data association analysis method for multiple heterogeneous databases - Google Patents

Publication number
CN111177244A
Authority
CN
China
Prior art keywords
data
data source
source
heterogeneous databases
association
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911352580.XA
Other languages
Chinese (zh)
Inventor
肖明
黄冠铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Winshare Education Science & Technology Co ltd
Original Assignee
Sichuan Winshare Education Science & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Winshare Education Science & Technology Co ltd
Priority to CN201911352580.XA
Publication of CN111177244A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data

Abstract

The invention provides a data association analysis method for multiple heterogeneous databases. The method adds support for associated queries across the data of multiple data sources and helps users analyze and discover associations between data held in heterogeneous data sources. It offers better extensibility, so that new types of data source and new data that an enterprise adopts in the future can be connected. It also offers better fault tolerance and analysis performance: synchronizing the data of multiple data sources to a target data source avoids single points of failure, and the target data source can be used to scale performance. The heterogeneous databases are accessed through a global view based on a mediator model, which hides the distribution and heterogeneity of the databases, provides a unified data view, resolves schema conflicts among the heterogeneous databases, and enables efficient access to them.

Description

Data association analysis method for multiple heterogeneous databases
Technical Field
The invention relates in particular to a data association analysis method for multiple heterogeneous databases.
Background
Heterogeneous data is a collection of related pieces of data. A heterogeneous database system is a collection of related database systems that allows data to be shared and accessed transparently; the member database systems already exist before they join the heterogeneous database system. Each member has its own database management system and remains autonomous: while sharing data, each database system keeps its own application characteristics, integrity control, and security control.
For a heterogeneous database system, data sharing requires two things: first, database conversion; second, transparent access to data. DM3, a commercial database management system developed by Huazhong University of Science and Technology with independent intellectual property rights, achieves both through the database conversion tool and API it provides. The DM3 conversion tool converts a schema defined in one database system into a schema defined in another database system and then reloads the data as needed, so that users can share data while working with the database system and query language they already know.
At present, many large enterprises store different business data in heterogeneous databases. As their businesses develop, new business models and the new storage modes these bring make it increasingly likely that enterprises will store their data in heterogeneous data sources. Most existing data analysis tools extract, analyze, and display data from a single independent data source. When an enterprise has many kinds of data sources and needs to analyze the data of several of them together, existing tools cannot meet the business requirements for association analysis.
Disclosure of Invention
The present invention aims to provide a data association analysis method for multiple heterogeneous databases that solves the above problems.
To this end, the invention adopts the following technical solution. The data association analysis method for multiple heterogeneous databases comprises the following steps:
S1: creating a data access layer and a data access program;
S2: creating a dynamic link library plug-in for each type of data source;
S3: for each type of data source, creating a corresponding data access service in the data access layer;
S4: accessing and parsing the data of a data source, and sending the parsed data to the service in the upper-layer business system that has been dynamically configured in advance for that type of data source;
S5: setting a corresponding data source policy class for each of a plurality of preset existing data sources;
S6: when the connection of a plurality of new data sources is detected, setting a corresponding data source policy class for each new data source;
S7: running the data source policy classes to isolate each new data source, and parsing the structure of every table in each data source;
S8: generating, in a single target data source, a uniquely corresponding table structure for each table of each data source, and migrating the table structures of each data source to the single target data source system;
S9: performing modeling and generating the query SQL;
S10: performing modeling on a data analysis platform, where the system parses the list of data sources corresponding to the model and loads the data of the specified tables in the specified data sources into the target data source;
S11: after the query statement is executed against the target data source, returning the data in JSON format and displaying the data with a visualization tool.
The data association analysis method for multiple heterogeneous databases adds support for associated queries across the data of multiple data sources and helps users analyze and discover associations between data held in heterogeneous data sources. It offers better extensibility, so that new types of data source and new data that an enterprise adopts in the future can be connected. It also offers better fault tolerance and analysis performance: synchronizing the data of multiple data sources to a target data source avoids single points of failure, and the target data source can be used to scale performance. The heterogeneous databases are accessed through a global view based on a mediator model, which hides the distribution and heterogeneity of the databases, provides a unified data view, resolves schema conflicts among the heterogeneous databases, and enables efficient access to them.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 schematically shows a flowchart of a data association analysis method for multiple heterogeneous databases according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail with reference to the accompanying drawings and specific embodiments.
In the following description, references to "one embodiment," "an embodiment," "one example," "an example," etc., indicate that the embodiment or example so described may include a particular feature, structure, characteristic, property, element, or limitation, but every embodiment or example does not necessarily include the particular feature, structure, characteristic, property, element, or limitation. Moreover, repeated use of the phrase "in accordance with an embodiment of the present application" although it may possibly refer to the same embodiment, does not necessarily refer to the same embodiment.
Certain features that are well known to those skilled in the art have been omitted from the following description for the sake of simplicity.
According to an embodiment of the present application, a data association analysis method for multiple heterogeneous databases is provided. As shown in Fig. 1, the method includes the following steps:
S1: creating a data access layer and a data access program;
S2: creating a dynamic link library plug-in for each type of data source;
S3: for each type of data source, creating a corresponding data access service in the data access layer;
S4: accessing and parsing the data of a data source, and sending the parsed data to the service in the upper-layer business system that has been dynamically configured in advance for that type of data source;
S5: setting a corresponding data source policy class for each of a plurality of preset existing data sources;
S6: when the connection of a plurality of new data sources is detected, setting a corresponding data source policy class for each new data source;
S7: running the data source policy classes to isolate each new data source, and parsing the structure of every table in each data source;
S8: generating, in a single target data source, a uniquely corresponding table structure for each table of each data source, and migrating the table structures of each data source to the single target data source system;
S9: performing modeling and generating the query SQL;
S10: performing modeling on a data analysis platform, where the system parses the list of data sources corresponding to the model and loads the data of the specified tables in the specified data sources into the target data source;
S11: after the query statement is executed against the target data source, returning the data in JSON format and displaying the data with a visualization tool.
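The migration, association query, and JSON return described in the steps above can be illustrated with a minimal sketch. It assumes in-memory SQLite databases stand in for the heterogeneous sources and for the single target; the source names, the `source__table` naming rule, and the helper functions are illustrative assumptions and are not taken from the application.

```python
import json
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


def migrate(source_name, source, target):
    """Copy every table of `source` into `target` under a source-prefixed name."""
    tables = source.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for name, ddl in tables:
        target_name = f"{source_name}__{name}"
        # Simplification: reuse the source DDL and only rename the table,
        # so each source table gets a unique structure in the target.
        target.execute(
            ddl.replace(f"CREATE TABLE {name}", f"CREATE TABLE {target_name}", 1)
        )
        rows = source.execute(f"SELECT * FROM {name}").fetchall()
        if rows:
            placeholders = ", ".join("?" for _ in rows[0])
            target.executemany(
                f"INSERT INTO {target_name} VALUES ({placeholders})", rows
            )


def query_json(target, sql):
    """Run a query against the target and return the rows as a JSON string."""
    cursor = target.execute(sql)
    columns = [d[0] for d in cursor.description]
    return json.dumps([dict(zip(columns, row)) for row in cursor.fetchall()])


# Two hypothetical sources holding related data.
crm = make_source(
    "CREATE TABLE customers (id INTEGER, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Lin")],
)
shop = make_source(
    "CREATE TABLE orders (customer_id INTEGER, amount REAL)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 30.0), (1, 12.5), (2, 8.0)],
)

target = sqlite3.connect(":memory:")
migrate("crm", crm, target)
migrate("shop", shop, target)

# Association query across data that originally lived in different sources.
result = query_json(
    target,
    """
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM crm__customers AS c
    JOIN shop__orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
    """,
)
print(result)
```

Prefixing each migrated table with its source name is one simple way to keep table structures unique in the target; a real system would also need to translate column types between database products.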
According to an embodiment of the present application, the data association analysis method for multiple heterogeneous databases further includes the following step:
S12: after the query statement is executed against the target data source, returning the data in JSON format and performing subsequent analysis based on the data.
According to an embodiment of the application, data synchronization in the data association analysis method for multiple heterogeneous databases is performed by the following specific step:
S14: establishing a first connection between a data platform and at least one data source, and establishing a second connection between the data platform and an application server.
According to an embodiment of the present application, the data association analysis method for multiple heterogeneous databases further includes the following steps:
S15: receiving the source tables and the data source type that the user has selected for synchronization;
S16: generating a table creation task and a data synchronization task for each data source according to the source table structures of that data source obtained by parsing in advance.
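As a rough illustration of S15 and S16, the sketch below derives one table creation task and one data synchronization task from a pre-parsed source table structure. The `TYPE_MAP` contents, the `Task` record, and the `source__table` naming rule are hypothetical; the application does not specify them.

```python
from dataclasses import dataclass

# Hypothetical mapping from source column types to target column types.
TYPE_MAP = {
    "mysql": {"INT": "BIGINT", "VARCHAR": "TEXT", "DATETIME": "TIMESTAMP"},
    "oracle": {"NUMBER": "NUMERIC", "VARCHAR2": "TEXT", "DATE": "TIMESTAMP"},
}


@dataclass(frozen=True)
class Task:
    kind: str       # "create_table" or "sync_data"
    source: str     # name of the data source the task belongs to
    table: str      # name of the table in the source
    statement: str  # SQL the task would run


def build_tasks(source_name, source_type, table, columns):
    """Build the table creation and data synchronization tasks for one source table.

    `columns` is the pre-parsed structure: a list of (column name, source type).
    """
    mapping = TYPE_MAP[source_type]
    target_table = f"{source_name}__{table}"
    # Unknown source types fall back to TEXT in this sketch.
    column_sql = ", ".join(
        f"{name} {mapping.get(ctype, 'TEXT')}" for name, ctype in columns
    )
    create = Task(
        "create_table", source_name, table,
        f"CREATE TABLE {target_table} ({column_sql})",
    )
    # The synchronization task carries the extraction query to run on the source.
    names = ", ".join(name for name, _ in columns)
    sync = Task("sync_data", source_name, table, f"SELECT {names} FROM {table}")
    return [create, sync]


tasks = build_tasks("crm", "mysql", "students", [("id", "INT"), ("name", "VARCHAR")])
for task in tasks:
    print(task.kind, "->", task.statement)
```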
According to an embodiment of the present application, the data association analysis method for multiple heterogeneous databases further includes the following steps:
S17: issuing the generated table creation task and data synchronization task of each data source to a workflow scheduling engine system;
S18: executing the table creation task and the data synchronization task of each data source through the preset workflow scheduling engine system.
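The application does not name a particular workflow scheduling engine. As a stand-in, the sketch below uses Python's standard `graphlib.TopologicalSorter` to run each table creation task before the synchronization task that depends on it; the task names and the dictionary used as a target are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run tasks in an order that respects their dependencies.

    tasks: task name -> zero-argument callable
    dependencies: task name -> set of task names that must finish first
    """
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        executed.append(name)
    return executed


# A dictionary stands in for the target data source: table name -> rows.
target = {}


def create(table):
    return lambda: target.setdefault(table, [])


def sync(table, rows):
    # Raises KeyError if the table has not been created yet.
    return lambda: target[table].extend(rows)


tasks = {
    "create:crm__students": create("crm__students"),
    "sync:crm__students": sync("crm__students", [(1, "Ada")]),
    "create:lms__scores": create("lms__scores"),
    "sync:lms__scores": sync("lms__scores", [(1, 95)]),
}
dependencies = {
    "create:crm__students": set(),
    "sync:crm__students": {"create:crm__students"},
    "create:lms__scores": set(),
    "sync:lms__scores": {"create:lms__scores"},
}

executed = run_workflow(tasks, dependencies)
print(executed)
print(target)
```

Tasks of different data sources have no dependencies on one another here, so a production scheduler could run them in parallel.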
According to one embodiment of the application, the data access service in the data association analysis method for multiple heterogeneous databases calls, through its main program, the dynamic link library plug-in corresponding to the given type of data source.
According to an embodiment of the application, the accessed data source in the data association analysis method for multiple heterogeneous databases is a database based on one of MySQL, Oracle, and SQL Server.
According to an embodiment of the application, in the data association analysis method for multiple heterogeneous databases, the dynamic link library plug-in for each type of data source is created according to the preset access mode and the preset data protocol format of that type of data source.
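The relationship between a data access service and its per-type plug-in can be sketched as a registry keyed by data source type. In this sketch, Python classes stand in for the dynamic link library plug-ins, and the two wire formats being parsed are invented purely for illustration; neither is described in the application.

```python
from abc import ABC, abstractmethod


class DataSourcePlugin(ABC):
    """Interface that every per-type plug-in implements."""

    @abstractmethod
    def parse(self, raw):
        """Turn raw data in the source's protocol format into a dict."""


PLUGINS = {}  # data source type -> plug-in class


def register(source_type):
    """Class decorator that registers a plug-in for one data source type."""
    def decorator(cls):
        PLUGINS[source_type] = cls
        return cls
    return decorator


@register("mysql")
class MySqlPlugin(DataSourcePlugin):
    def parse(self, raw):
        # Hypothetical protocol format: "key=value;key=value"
        return dict(pair.split("=", 1) for pair in raw.split(";"))


@register("oracle")
class OraclePlugin(DataSourcePlugin):
    def parse(self, raw):
        # Hypothetical protocol format: "key:value|key:value"
        return dict(pair.split(":", 1) for pair in raw.split("|"))


class DataAccessService:
    """Main program of a data access service; delegates to the matching plug-in."""

    def __init__(self, source_type):
        if source_type not in PLUGINS:
            raise ValueError(f"no plug-in registered for {source_type!r}")
        self.plugin = PLUGINS[source_type]()

    def access(self, raw):
        return self.plugin.parse(raw)


print(DataAccessService("mysql").access("id=1;name=Ada"))
print(DataAccessService("oracle").access("id:2|name:Lin"))
```

Supporting a new type of data source then means registering one more plug-in, without changing the service itself.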
According to one embodiment of the application, the data source policy class in the data association analysis method for multiple heterogeneous databases is used to encapsulate the event and node data information of the new data source.
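The application gives no detail on what the event and node data information contains. The sketch below is one possible reading, in which the data source policy category is treated as a strategy class: each data source gets its own object holding its node information and an event log, which keeps the sources isolated from one another. All field and method names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DataSourcePolicy:
    """Encapsulates node information and events for exactly one data source."""
    source_id: str
    source_type: str
    node: dict                                   # e.g. host and port of the source
    events: list = field(default_factory=list)   # per-source event log

    def record(self, event):
        self.events.append(event)


class PolicyRegistry:
    """Assigns one policy object to every detected data source."""

    def __init__(self):
        self._policies = {}

    def on_source_detected(self, source_id, source_type, node):
        # A source that is already known keeps its existing policy.
        if source_id not in self._policies:
            policy = DataSourcePolicy(source_id, source_type, node)
            policy.record("connected")
            self._policies[source_id] = policy
        return self._policies[source_id]


registry = PolicyRegistry()
crm = registry.on_source_detected("crm", "mysql", {"host": "10.0.0.1"})
lms = registry.on_source_detected("lms", "oracle", {"host": "10.0.0.2"})
crm.record("schema_parsed")
print(crm.events, lms.events)
```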
According to an embodiment of the application, the step of displaying the data with a visualization tool in the data association analysis method for multiple heterogeneous databases specifically includes:
S19: acquiring the description information corresponding to the visual display area;
S20: determining the data parameters of the object to be displayed according to the object to be displayed;
S21: determining the display attribute of the visual display area indicated by the attribute description content;
S22: setting the display attribute according to the value of the data parameter, so that the display attribute matches the value.
According to one embodiment of the application, the description information in the data association analysis method for multiple heterogeneous databases comprises parameter description content and attribute description content.
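One way to read S19 to S22 is as a mapping from a data value to a display attribute, driven by the description information. The sketch below assumes the parameter description names a data field and its value range, and the attribute description names a display attribute and its range; this structure and the linear scaling are illustrative assumptions.

```python
def resolve_display(description, obj):
    """Set a display attribute so that it matches the value of a data parameter."""
    parameter = description["parameter"]   # parameter description content
    attribute = description["attribute"]   # attribute description content

    # S20: determine the data parameter of the object to be displayed.
    value = obj[parameter["name"]]

    # S22: scale the value linearly into the attribute's range, clamped to [0, 1].
    low, high = parameter["range"]
    ratio = min(max((value - low) / (high - low), 0.0), 1.0)
    span = attribute["max"] - attribute["min"]
    return {attribute["name"]: attribute["min"] + ratio * span}


# S19: description information for one (hypothetical) display area.
description = {
    "parameter": {"name": "score", "range": (0, 100)},
    "attribute": {"name": "bar_height", "min": 0, "max": 200},
}

print(resolve_display(description, {"score": 75}))
```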
The above embodiments present only some implementations of the present invention. Although they are described specifically and in detail, they should not be construed as limiting the scope of the invention. A person skilled in the art can make variations and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention. The scope of protection of the invention is therefore defined by the claims.

Claims (10)

1. A data association analysis method for multiple heterogeneous databases, characterized by comprising the following steps:
S1: creating a data access layer and a data access program;
S2: creating a dynamic link library plug-in for each type of data source;
S3: for each type of data source, creating a corresponding data access service in the data access layer;
S4: accessing and parsing the data of a data source, and sending the parsed data to the service in the upper-layer business system that has been dynamically configured in advance for that type of data source;
S5: setting a corresponding data source policy class for each of a plurality of preset existing data sources;
S6: when the connection of a plurality of new data sources is detected, setting a corresponding data source policy class for each new data source;
S7: running the data source policy classes to isolate each new data source, and parsing the structure of every table in each data source;
S8: generating, in a single target data source, a uniquely corresponding table structure for each table of each data source, and migrating the table structures of each data source to the single target data source system;
S9: performing modeling and generating the query SQL;
S10: performing modeling on a data analysis platform, where the system parses the list of data sources corresponding to the model and loads the data of the specified tables in the specified data sources into the target data source;
S11: after the query statement is executed against the target data source, returning the data in JSON format and displaying the data with a visualization tool.
2. The data association analysis method for multiple heterogeneous databases according to claim 1, further comprising the following step:
S12: after the query statement is executed against the target data source, returning the data in JSON format and performing subsequent analysis based on the data.
3. The data association analysis method for multiple heterogeneous databases according to claim 1, wherein data synchronization is performed by the following specific step:
S14: establishing a first connection between a data platform and at least one data source, and establishing a second connection between the data platform and an application server.
4. The data association analysis method for multiple heterogeneous databases according to claim 2, further comprising the following steps:
S15: receiving the source tables and the data source type that the user has selected for synchronization;
S16: generating a table creation task and a data synchronization task for each data source according to the source table structures of that data source obtained by parsing in advance.
5. The data association analysis method for multiple heterogeneous databases according to claim 3, further comprising the following steps:
S17: issuing the generated table creation task and data synchronization task of each data source to a workflow scheduling engine system;
S18: executing the table creation task and the data synchronization task of each data source through the preset workflow scheduling engine system.
6. The data association analysis method for multiple heterogeneous databases according to claim 1, wherein the data access service calls, through its main program, the dynamic link library plug-in corresponding to the given type of data source.
7. The data association analysis method for multiple heterogeneous databases according to claim 1, wherein the accessed data source is a database based on MySQL, Oracle, or SQL Server.
8. The data association analysis method for multiple heterogeneous databases according to claim 1, wherein the dynamic link library plug-in for each type of data source is created according to the preset access mode and data protocol format of that type of data source.
9. The data association analysis method for multiple heterogeneous databases according to claim 1, wherein the data source policy class is used to encapsulate the event and node data information of the new data source.
10. The data association analysis method for multiple heterogeneous databases according to claim 1, wherein the step of displaying the data with a visualization tool specifically comprises:
S19: acquiring the description information corresponding to the visual display area;
S20: determining the data parameters of the object to be displayed according to the object to be displayed;
S21: determining the display attribute of the visual display area indicated by the attribute description content;
S22: setting the display attribute according to the value of the data parameter, so that the display attribute matches the value.
CN201911352580.XA 2019-12-24 2019-12-24 Data association analysis method for multiple heterogeneous databases Pending CN111177244A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911352580.XA CN111177244A (en) 2019-12-24 2019-12-24 Data association analysis method for multiple heterogeneous databases

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911352580.XA CN111177244A (en) 2019-12-24 2019-12-24 Data association analysis method for multiple heterogeneous databases

Publications (1)

Publication Number Publication Date
CN111177244A (en) 2020-05-19

Family

ID=70655676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911352580.XA Pending CN111177244A (en) 2019-12-24 2019-12-24 Data association analysis method for multiple heterogeneous databases

Country Status (1)

Country Link
CN (1) CN111177244A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120310899A1 (en) * 2011-06-03 2012-12-06 Scott Lawrence Wasserman System and method for efficient data exchange in a multi-platform network of heterogeneous devices
CN104899295A (en) * 2015-06-09 2015-09-09 苏州国云数据科技有限公司 Heterogeneous data source data association analysis method
CN107528864A (en) * 2016-06-20 2017-12-29 中国科学院微电子研究所 Heterogeneous network data processing method and system
CN109299068A (en) * 2018-08-31 2019-02-01 安徽四创电子股份有限公司 From relevant database to the data flow migration method of HBase database
CN109829009A (en) * 2018-12-28 2019-05-31 北京邮电大学 Configurable isomeric data real-time synchronization and visual system and method
US20190188308A1 (en) * 2017-12-20 2019-06-20 Sap Se Computing data lineage across a network of heterogeneous systems
US20190197174A1 (en) * 2017-12-22 2019-06-27 Warevalley Co., Ltd. Method and system for replicating data to heterogeneous database and detecting synchronization error of heterogeneous database through sql packet analysis

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100261A (en) * 2020-09-14 2020-12-18 南京国睿信维软件有限公司 Object model modeling method based on heterogeneous data source connection
CN112100261B (en) * 2020-09-14 2024-04-09 南京国睿信维软件有限公司 Object model modeling method based on heterogeneous data source connection
CN112540975A (en) * 2020-12-29 2021-03-23 中科院计算技术研究所大数据研究院 Multi-source heterogeneous data quality detection method based on petri net
CN113688288A (en) * 2021-09-02 2021-11-23 广州广电运通金融电子股份有限公司 Data association analysis method and device, computer equipment and storage medium
CN113688288B (en) * 2021-09-02 2023-09-29 广州广电运通金融电子股份有限公司 Data association analysis method, device, computer equipment and storage medium
CN113722600A (en) * 2021-09-06 2021-11-30 阿波罗智联(北京)科技有限公司 Data query method, device, equipment and product applied to big data
CN113722600B (en) * 2021-09-06 2024-04-26 阿波罗智联(北京)科技有限公司 Data query method, device, equipment and product applied to big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination