CN112445867A - Intelligent analysis method and system for data relationship - Google Patents


Info

Publication number
CN112445867A
Authority
CN
China
Prior art keywords
database
data
independent
relations
semantic analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910761620.XA
Other languages
Chinese (zh)
Inventor
张颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Hantong New Energy Co ltd
Original Assignee
Chengdu Hantong New Energy Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Hantong New Energy Co ltd
Priority to CN201910761620.XA
Publication of CN112445867A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database. The method comprises the following steps: collecting the redo logs of a relational database contained in a single independent application system and automatically extracting SQL statements from them in batches; performing semantic analysis on the extracted SQL statements to form an independent abstract syntax tree (AST) for each SQL statement; importing the information of all the independent abstract syntax trees into a graph database; and establishing association relationships among, de-duplicating and merging the information of all the independent abstract syntax trees in the graph database. The result is the set of data association relationships within the independent application system and between different heterogeneous application systems. This helps a user understand and analyze, accurately and intuitively, the data relationships within an independent application system and between different application systems, including database field mapping relationships, constraint relationships and implicit intermediate-process data relationships, which form the core foundation of all data integration work.

Description

Intelligent analysis method and system for data relationship
Technical Field
The invention relates to the technical fields of database redo logs, semantic analysis and graph databases, and in particular to an intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database.
Background
Database redo log: to ensure data security, mainstream relational database vendors such as Oracle, MySQL, MS SQL Server, Sybase and DB2 provide a redo log function for each database instance, which records the change vector of every change made to the database data. The database redo log is also called a redo thread, and the redo log file is filled with redo records. A redo record, also called a redo entry, is made up of a set of change vectors, each describing the changes made to a single block in the database. The redo records contain the SQL commands, which capture every original operation that the application system performed on the database.
SQL: Structured Query Language (SQL) is a database query and programming language used to access data and to query, update and manage relational database systems.
Semantic analysis: semantic analysis is a logical phase of the compilation process whose task is to check the context-dependent properties, such as types, of a structurally correct source program. It examines whether the source program contains semantic errors, collects type information for the code generation phase, and performs semantic checks and processing according to the syntactic categories identified by the parser, in order to generate the corresponding intermediate code or target code.
ANTLR: ANTLR (ANother Tool for Language Recognition) is an open-source parser generator that can produce abstract syntax trees. From a grammar description, it provides a framework for automatically constructing recognizers, compilers and interpreters for custom languages, with support for a variety of languages including Java, SQL, C++ and C#.
Graph database: a graph database is a non-relational database that uses graph theory to store relationship information between entities. The most common example is the interpersonal relationships in a social network. Relational databases are poorly suited to storing such "relationship" data, because the queries become complex, slow and costlier than expected, and the design of graph databases remedies this deficiency. Commonly used graph databases include Neo4j, FlockDB, GraphDB and InfiniteGraph.
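As a rough illustration of why a graph model suits relationship data, the following minimal sketch stores relationships as an adjacency list and follows them with a breadth-first walk. The column names and the EDGES mapping are hypothetical and are not taken from the invention; a real graph database would run such a traversal natively.

```python
from collections import deque

# Hypothetical edges: each key feeds data into the columns listed for it.
EDGES = {
    "staging.cust_id": ["orders.customer_id"],
    "orders.customer_id": ["invoices.customer_id", "report.customer"],
    "invoices.customer_id": [],
}


def downstream(start):
    """Breadth-first walk returning every node reachable from start."""
    seen, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


reachable = downstream("staging.cust_id")
print(reachable)
```

In a relational store the same question would typically need a recursive query or one self-join per hop, which is the cost the paragraph above alludes to.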
In recent years, with the rapid development of computer technology, and of mobile internet technology in particular, demand for unified, integrated big data platforms that let departments share data has grown explosively. In the past, the units in each sector developed and launched their own application systems one by one to meet their own business requirements, and because technical means, technical levels and construction standards were not uniform, isolated information islands gradually formed. At present, the mainstream approach to data integration fuses the data of different systems by developing unified data interfaces, but this work depends on support from the developers of the original application systems and on an accurate grasp of their data dictionaries. With the present method, even without support from the application system developers or without the technical documentation of the relevant application systems, the internal data structure of each application system can be fully understood, the operation and maintenance difficulty of a single application system is reduced, and the data association relationships within a single application system and between different application systems are accurately described. The most difficult tasks in big data platform construction and data governance, namely database field relationship mapping and the analysis of implicit intermediate-process data, can be completed intelligently, providing real-time and effective data support for later business intelligence work such as data extraction, data warehouse construction, statistical reports and decision analysis.
Disclosure of Invention
The invention aims to provide an intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database.
The intelligent data relationship analysis method based on database redo logs, semantic analysis and a graph database comprises the following steps:
collecting the redo logs of a relational database contained in a single independent application system;
automatically extracting, in batches, the SQL statements related to the business operation of the application system from the database redo logs;
performing semantic analysis on the extracted SQL statements by using the ANTLR semantic analysis tool to form an independent abstract syntax tree (AST) for each SQL statement;
importing the information of all the independent abstract syntax trees (ASTs) into a graph database;
establishing association relationships among, de-duplicating, merging and performing similar operations on the information of all the independent abstract syntax trees (ASTs) in the graph database;
by this method, the data association relationships within the independent application system and between different heterogeneous application systems are finally obtained, including the mapping relationships and constraint relationships between database tables and fields, and the implicit intermediate-process data relationships that carry the business logic necessary for the operation of the independent application system.
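The steps above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the sample statements, the MAPS_TO relation name and the regular expression are hypothetical, the regular expression is a toy stand-in for the ANTLR parse, and a Counter stands in for the graph database.

```python
import re
from collections import Counter

# Hypothetical SQL text already recovered from a redo log export.
LOG_LINES = [
    "INSERT INTO orders (id, customer_id) SELECT id, cust_id FROM staging_orders;",
    "UPDATE orders SET status = 'paid' WHERE id = 7;",
    "INSERT INTO orders (id, customer_id) SELECT id, cust_id FROM staging_orders;",
]

# Toy pattern for INSERT ... SELECT; a real system would use a full SQL grammar.
INSERT_SELECT = re.compile(
    r"INSERT\s+INTO\s+(\w+)\s*\(([^)]*)\)\s*SELECT\s+(.*?)\s+FROM\s+(\w+)",
    re.IGNORECASE,
)


def parse_statement(sql):
    """Stand-in for the per-statement parse: returns field-mapping edges."""
    match = INSERT_SELECT.search(sql)
    if not match:
        return []
    target, target_cols, source_cols, source = match.groups()
    targets = [c.strip() for c in target_cols.split(",")]
    sources = [c.strip() for c in source_cols.split(",")]
    return [
        (f"{source}.{s}", "MAPS_TO", f"{target}.{t}")
        for s, t in zip(sources, targets)
    ]


def build_graph(lines):
    """Merge the edges of all statements; repeated edges collapse into a count."""
    edges = Counter()
    for line in lines:
        for edge in parse_statement(line):
            edges[edge] += 1
    return edges


graph = build_graph(LOG_LINES)
for edge, count in sorted(graph.items()):
    print(edge, count)
```

Keeping a count per edge is one possible merge rule; it records how often a mapping was observed while still storing each relationship once.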
Wherein collecting the redo logs of the relational database contained in the single independent application system comprises: acquiring the redo logs of the relational database by exporting them with the database's built-in management tool or command line, by third-party collection software, or by similar means;
wherein extracting the SQL statements related to the business operation of the application system from the database redo logs comprises: according to the settings of the individual application system, extracting from the acquired database redo logs the SQL statements related to the business operation of the application system, including complete SQL operation statements such as insert, delete, update and query statements;
wherein performing semantic analysis on the extracted SQL statements by using the ANTLR semantic analysis tool to form an independent abstract syntax tree (AST) for each SQL statement comprises: according to the configured grammar, performing lexical analysis and syntactic analysis on each single SQL statement by using the ANTLR semantic analysis tool, and decomposing it into the data operation object, operation process and operation result that the SQL statement represents;
wherein importing the information of all the independent abstract syntax trees (ASTs) into the graph database comprises: converting the results of the lexical and syntactic analysis of the SQL statements into the syntax format specified by the graph database in use, and storing them in the graph database via the command line, the graph database's built-in management tool, a third-party graph database interface, or similar means;
wherein establishing association relationships among, de-duplicating, merging and performing similar operations on the information of all the independent abstract syntax trees (ASTs) in the graph database comprises: in the graph database, associating the results generated by the semantic analysis of the individual SQL statements according to related elements such as names, attributes, attribute values and literal values, merging repeated content of the same kind according to the corresponding rules, and removing duplicate items;
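As one possible reading of "converting the results into the syntax format specified by the graph database", the sketch below turns a single relationship into a Cypher statement of the kind accepted by Neo4j, one of the graph databases named in the Background. The Column label, the name property and the MAPS_TO relation are assumptions made for this illustration only.

```python
def cypher_string(value):
    """Quote a value as a Cypher string literal, escaping backslashes and quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_cypher(source, relation, target):
    """Build one Cypher statement for a column-to-column relationship.

    MERGE creates a node or relationship only if it does not exist yet,
    so replaying the same edge does not create duplicates.
    """
    if not relation.isidentifier():
        # Relationship types cannot be passed as parameters, so validate them.
        raise ValueError(f"unsafe relationship type: {relation!r}")
    return (
        f"MERGE (s:Column {{name: {cypher_string(source)}}}) "
        f"MERGE (t:Column {{name: {cypher_string(target)}}}) "
        f"MERGE (s)-[:{relation}]->(t)"
    )


stmt = to_cypher("staging_orders.cust_id", "MAPS_TO", "orders.customer_id")
print(stmt)
```

Using MERGE rather than CREATE pushes part of the de-duplication into the import step itself; the generated text could then be submitted through a command-line shell or a client driver.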
The invention also provides an intelligent data relationship analysis system based on database redo logs, semantic analysis and a graph database, comprising: a database redo log SQL acquisition module, a database redo log SQL filtering module, a single SQL statement semantic analysis module, a single semantic analysis result import module, a multiple semantic analysis result de-duplication module, a multiple semantic analysis result association module and a data relationship display module;
wherein the database redo log SQL acquisition module acquires the redo logs of the relational database by exporting them with the database's built-in management tool or command line, by third-party collection software, or by similar means;
wherein the database redo log SQL filtering module, according to the settings of the individual application system, extracts from the acquired database redo logs the SQL statements related to system operation, including complete SQL operation statements such as insert, delete, update and query statements;
wherein the single SQL statement semantic analysis module, according to the configured grammar, performs lexical analysis and syntactic analysis on each single SQL statement by using the ANTLR semantic analysis tool, and decomposes it into the data operation object, operation process and operation result that the SQL statement represents;
wherein the single semantic analysis result import module converts the results of the lexical and syntactic analysis of the SQL statements into the syntax format specified by the graph database, and stores them in the graph database via the command line, the graph database's built-in management tool, a third-party graph database interface, or similar means;
wherein the multiple semantic analysis result association module, in the graph database, associates the results generated by the semantic analysis of the individual SQL statements according to related elements such as names, attributes, attribute values and literal values;
wherein the multiple semantic analysis result de-duplication module merges repeated content of the same kind according to the corresponding rules and removes duplicate items;
wherein the data relationship display module, after association, de-duplication and similar operations have been performed in the graph database on the results of the semantic analysis of the SQL statements, generates the data association relationship file required by the user and presents it in various forms such as a file, a graph or a database;
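For the display module, one simple way to present the relationships "as a graph" is to emit Graphviz DOT text. The sketch below assumes the relationships are available as (source, relation, target) triples whose names contain no double quotes; the triples shown are hypothetical.

```python
def to_dot(edges):
    """Render (source, relation, target) triples as Graphviz DOT text."""
    lines = ["digraph data_relations {"]
    # set() drops repeated triples; sorted() keeps the output stable.
    for source, relation, target in sorted(set(edges)):
        lines.append(f'  "{source}" -> "{target}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical relationships, including one repeated triple.
EDGES = [
    ("staging_orders.id", "MAPS_TO", "orders.id"),
    ("staging_orders.cust_id", "MAPS_TO", "orders.customer_id"),
    ("staging_orders.id", "MAPS_TO", "orders.id"),
]

dot = to_dot(EDGES)
print(dot)
```

The resulting text can be saved to a file and rendered with any DOT-compatible viewer, which covers both the "file" and "graph" forms of output mentioned above.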
Drawings
FIG. 1 is a schematic diagram of the intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database according to the present invention;
Detailed Description
The invention provides an intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database, comprising: collecting the redo logs of a relational database contained in a single independent application system, and automatically extracting, in batches, the SQL statements related to the business operation of the application system from the database redo logs; performing semantic analysis on the extracted SQL statements by using the ANTLR semantic analysis tool to form an independent abstract syntax tree (AST) for each SQL statement; importing the information of all the independent abstract syntax trees (ASTs) into a graph database; and establishing association relationships among, de-duplicating, merging and performing similar operations on the information of all the independent abstract syntax trees (ASTs) in the graph database, finally forming the data association relationships within the independent application system and between different heterogeneous application systems, including the mapping relationships and constraint relationships between database tables and fields, and the implicit intermediate-process data relationships that carry the business logic necessary for the operation of the independent application system.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to FIG. 1, FIG. 1 is a schematic diagram of the intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database according to the present invention, comprising:
S1, a database redo log SQL acquisition module;
S2, a database redo log SQL filtering module;
S3, a single SQL statement semantic analysis module;
S4, a single semantic analysis result import module;
S5, a multiple semantic analysis result de-duplication module;
S6, a multiple semantic analysis result association module;
S7, a data relationship display module.
The database redo log SQL acquisition module acquires the redo logs of the relational database by exporting them with the database's built-in management tool or command line, by third-party collection software, or by similar means;
wherein the database redo log SQL filtering module, according to the settings of the individual application system, extracts from the acquired database redo logs the SQL statements related to system operation, including complete SQL operation statements such as insert, delete, update and query statements;
wherein the single SQL statement semantic analysis module, according to the configured grammar, performs lexical analysis and syntactic analysis on each single SQL statement by using the ANTLR semantic analysis tool, and decomposes it into the data operation object, operation process and operation result that the SQL statement represents;
wherein the single semantic analysis result import module converts the results of the lexical and syntactic analysis of the SQL statements into the syntax format specified by the graph database, and stores them in the graph database via the command line, the graph database's built-in management tool, a third-party graph database interface, or similar means;
wherein the multiple semantic analysis result association module, in the graph database, associates the results generated by the semantic analysis of the individual SQL statements according to related elements such as names, attributes, attribute values and literal values;
wherein the multiple semantic analysis result de-duplication module merges repeated content of the same kind according to the corresponding rules and removes duplicate items;
wherein the data relationship display module, after association, de-duplication and similar operations have been performed in the graph database on the results of the semantic analysis of the SQL statements, generates the data association relationship file required by the user and presents it in various forms such as a file, a graph or a database;
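The filtering module (S2) could, for example, keep only data manipulation statements that touch the application's own schema. The sketch below is one such rule under stated assumptions: the statements have already been recovered as text, the application's "setting" is a schema name, and the sample statements and the APP schema are hypothetical.

```python
import re

# Statement kinds treated as business operations in this sketch.
DML = re.compile(r"^\s*(INSERT|UPDATE|DELETE|SELECT)\b", re.IGNORECASE)


def filter_statements(statements, schema):
    """Keep DML statements that reference an object qualified by the schema."""
    qualified = re.compile(rf"\b{re.escape(schema)}\.\w+", re.IGNORECASE)
    return [s for s in statements if DML.match(s) and qualified.search(s)]


# Hypothetical statements recovered from a redo log.
MINED = [
    "INSERT INTO APP.ORDERS (ID) VALUES (1)",
    "UPDATE SYS.SEQ$ SET HIGHWATER = 20 WHERE OBJ# = 5",
    "ALTER TABLE APP.ORDERS ADD (NOTE VARCHAR2(10))",
    "delete from app.orders where id = 1",
]

kept = filter_statements(MINED, "APP")
print(kept)
```

Here the internal UPDATE on a system schema and the ALTER TABLE statement are dropped, while the two statements that operate on application data are passed on to the semantic analysis module.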
The intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database can be implemented in various forms. The foregoing description discloses and describes merely exemplary embodiments of the invention. One skilled in the art will readily recognize from this discussion and from the accompanying drawings and claims that various changes, modifications and variations can be made without departing from the spirit and scope of the invention as defined in the following claims. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (2)

1. An intelligent data relationship analysis method and system based on database redo logs, semantic analysis and a graph database, characterized by comprising the following steps:
1) automatically extracting, in batches, the SQL statements related to the business operation of the application system from the database redo logs;
2) performing semantic analysis on the extracted SQL statements by using the ANTLR semantic analysis tool to form an independent abstract syntax tree (AST) for each SQL statement;
3) importing the information of all the independent abstract syntax trees (ASTs) into a graph database;
4) establishing association relationships among, de-duplicating, merging and performing similar operations on the information of all the independent abstract syntax trees (ASTs) in the graph database.
2. Finally forming the data association relationships within the independent application system and between different heterogeneous application systems, including the mapping relationships and constraint relationships between database tables and fields, and the implicit intermediate-process data relationships that carry the business logic necessary for the operation of the independent application system.
CN201910761620.XA 2019-08-16 2019-08-16 Intelligent analysis method and system for data relationship Pending CN112445867A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910761620.XA CN112445867A (en) 2019-08-16 2019-08-16 Intelligent analysis method and system for data relationship

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910761620.XA CN112445867A (en) 2019-08-16 2019-08-16 Intelligent analysis method and system for data relationship

Publications (1)

Publication Number Publication Date
CN112445867A true CN112445867A (en) 2021-03-05

Family

ID=74741440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910761620.XA Pending CN112445867A (en) 2019-08-16 2019-08-16 Intelligent analysis method and system for data relationship

Country Status (1)

Country Link
CN (1) CN112445867A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113467785A (en) * 2021-07-19 2021-10-01 上海红阵信息科技有限公司 SQL translation method and system for mimicry database
CN113626423A (en) * 2021-06-29 2021-11-09 欧电云信息科技(江苏)有限公司 Log management method, device and system of service database

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113626423A (en) * 2021-06-29 2021-11-09 欧电云信息科技(江苏)有限公司 Log management method, device and system of service database
CN113626423B (en) * 2021-06-29 2024-01-30 欧电云信息科技(江苏)有限公司 Log management method, device and system of business database
CN113467785A (en) * 2021-07-19 2021-10-01 上海红阵信息科技有限公司 SQL translation method and system for mimicry database
CN113467785B (en) * 2021-07-19 2023-02-28 上海红阵信息科技有限公司 SQL translation method and system for mimicry database

Similar Documents

Publication Publication Date Title
US11461294B2 (en) System for importing data into a data repository
US11360950B2 (en) System for analysing data relationships to support data query execution
US11409764B2 (en) System for data management in a large scale data repository
US6253200B1 (en) Structured query language to IMS transaction mapper
US8943059B2 (en) Systems and methods for merging source records in accordance with survivorship rules
CN109614432B (en) System and method for acquiring data blood relationship based on syntactic analysis
US20040167884A1 (en) Methods and products for producing role related information from free text sources
CN110555032A (en) Data blood relationship analysis method and system based on metadata
US20070143321A1 (en) Converting recursive hierarchical data to relational data
CN107291471B (en) Meta-model framework system supporting customizable data acquisition
CN110674229A (en) AST-based relational database SQL table relational analysis and display method
CN113934750A (en) Data blood relationship analysis method based on compiling mode
CN117093599A (en) Unified SQL query method for heterogeneous data sources
CN112445867A (en) Intelligent analysis method and system for data relationship
CN108255852B (en) SQL execution method and device
CN115357678A (en) GIS automatic examination method and system based on structured natural language rule
CN110633290A (en) SQL statement analysis method and analysis device
WO2014125430A1 (en) Method for creating specifications of software systems, in particular of the oltp-app type, and device thereof
CN113221528B (en) Automatic generation and execution method of clinical data quality evaluation rule based on openEHR model
Thiran et al. Updating legacy databases through wrappers: Data consistency management
CN117131027A (en) Data quality detection method, device, terminal equipment and storage medium
CN113190573A (en) Data file analysis processing method and device based on SQL-like and electronic equipment
CN118093580A (en) Method and device for discovering data blood edges by using large model knowledge base
CN114942766A (en) Excel function conversion method based on SQL database and related device
Jin et al. Effective spatial database support for acquiring spatial information from remote sensing images

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210305