CN112035508A - SQL (structured query language) -based online metadata analysis method, system and equipment - Google Patents

Info

Publication number
CN112035508A
Authority
CN
China
Prior art keywords
engine
user data
sql
parsing
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010876759.1A
Other languages
Chinese (zh)
Inventor
罗赞
陈友
王志
黄强
范成其
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tydic Information Technology Co ltd
Original Assignee
Shenzhen Tydic Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tydic Information Technology Co ltd
Priority to CN202010876759.1A
Publication of CN112035508A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an SQL-based online metadata parsing method, system, device and storage medium. The method comprises: constructing an engine for a database; and parsing user data based on the engine to obtain a parsing result for the user data. The method parses SQL scripts automatically and can extract table, field and data lineage information.

Description

SQL (structured query language) -based online metadata analysis method, system and equipment
Technical Field
The invention belongs to the technical field of databases, and in particular relates to a method, system, device and storage medium for SQL (Structured Query Language)-based online metadata parsing.
Background
In the big data era, data is recognized as an important asset. Metadata management is a core function of a data management framework, and demand for automatically parsing the metadata contained in SQL scripts is growing. Data security requirements are also tightening: the permissions of SQL scripts developed by different users to access and operate on different data tables must be controlled. In day-to-day operation and maintenance, staff need to know the lineage and the impact analysis of each data table so that data can be traced forward and backward when problems occur.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the problems in the prior art, the invention provides a method that automatically parses SQL scripts and obtains table, field and data lineage relationships.
In a first aspect, an embodiment of the present application provides a method for SQL-based online metadata parsing, where the method includes:
constructing an engine for the database;
and analyzing the user data based on the engine to obtain an analysis result of the user data.
In a second aspect, an embodiment of the present application further provides a system for SQL-based online metadata parsing, where the system includes:
a construction module, configured to construct an engine for a database;
a parsing module, configured to parse user data based on the engine to obtain a parsing result for the user data.
In a third aspect, an embodiment of the present application further provides an apparatus for SQL-based online metadata parsing, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements each step in the method for SQL-based online metadata parsing according to the first aspect.
In a fourth aspect, embodiments of the present application further provide a storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the steps in the method based on SQL online metadata parsing according to the first aspect.
The SQL-based online metadata parsing method provided by the embodiments of the application comprises: constructing an engine for a database; and parsing user data based on the engine to obtain a parsing result for the user data. The method parses SQL scripts automatically and can obtain table, field and data lineage information.
Drawings
The structure of the invention is described in detail below with reference to the accompanying drawings:
FIG. 1 is a flow chart of the SQL-based online metadata parsing method of the invention;
FIG. 2 is a sub-flow diagram of the SQL-based online metadata parsing method of the invention;
FIG. 3 is another sub-flow diagram of the SQL-based online metadata parsing method of the invention;
FIG. 4 is a further sub-flow diagram of the SQL-based online metadata parsing method of the invention;
FIG. 5 is a schematic diagram of the program modules for the SQL-based online metadata parsing method of the invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic flow chart of a method for SQL-based online metadata parsing in an embodiment of the present application, where the method for SQL-based online metadata parsing in the embodiment includes:
Step 101, constructing an engine for a database.
In this embodiment, an engine is the component that extracts key information from the data. Different engines are constructed for different script sources (executed SQL, stored SQL scripts, and SQL embedded in shell files) and for different application scenarios. Each scenario is handled flexibly by developing a plug-in for it, so that different business requirements can be met. Plug-ins are implemented with the Java reflection mechanism: a plug-in is invoked by configuring its name in the configuration file.
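The configuration-driven plug-in call described above can be sketched as follows. This is a Python analogue that uses `importlib` in place of Java reflection; the module name `shell_script_plugin`, the class `ShellScriptPlugin`, its `extract_sql` method and the `config` dictionary are all hypothetical and do not come from the patent.

```python
import importlib
import re
import sys
import types


class ShellScriptPlugin:
    """Hypothetical plug-in: pulls SQL text out of `-e "..."` arguments in a shell file."""

    def extract_sql(self, text):
        return re.findall(r'-e\s+"([^"]*)"', text)


# Register the plug-in under a module name in memory so the sketch is self-contained.
# In a real deployment the plug-in would live in its own importable module.
plugin_module = types.ModuleType("shell_script_plugin")
plugin_module.ShellScriptPlugin = ShellScriptPlugin
sys.modules["shell_script_plugin"] = plugin_module


def load_plugin(qualified_name):
    """Instantiate a plug-in from a 'module.ClassName' string read from configuration."""
    module_name, _, class_name = qualified_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)()


# Assumed configuration entry: only the plug-in name is configured.
config = {"plugin": "shell_script_plugin.ShellScriptPlugin"}
plugin = load_plugin(config["plugin"])
print(plugin.extract_sql('hive -e "select id from b"\necho done'))
```

Because the caller only ever sees the configured name, swapping in another plug-in is a configuration change rather than a code change, which is the property the description attributes to the reflection-based design.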
Step 102, parsing the user data based on the engine to obtain a parsing result for the user data.
Different engines are constructed for different databases. The type of database to which the user data belongs determines which engine is used, and that engine parses the user data online to obtain the parsing result.
Parsing process:
1. Grammar rule definition: the grammar symbols and grammar rules of the target language are defined using ANTLR grammar, and SQL statements are parsed according to these definitions.
2. SQL statement parsing: lexical analysis, syntax analysis and abstract syntax tree generation are performed on the SQL statement according to the defined grammar rules, and finally the field and model mapping relationships are resolved.
For example:
insert into a (id, p_date, cnt, flag) select b.id, c.p_date, e.cnt,
case b.flag when 1 then c.flag when 2 then b.flag end from b, c, (select count(1) from d where date_id='20180401') e limit 100;
2.1 Lexical analysis:
Characters are grouped into words or symbols (lexical tokens), keywords are distinguished from non-keywords, and the meaning of each word is then determined:
insert|into|a|(|id|,|p_date|,|cnt|,|flag|)|select|b.id|,|c.p_date|...;
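The token stream shown above can be reproduced with a small regular-expression lexer. This is a minimal sketch in Python, not the ANTLR-generated lexer the description refers to; the keyword set, the token kinds and the `tokenize` function are assumptions chosen to cover only the example statement.

```python
import re

# Assumed keyword subset, just large enough for the example statement.
KEYWORDS = {"insert", "into", "select", "from", "where",
            "case", "when", "then", "end", "limit"}

TOKEN_PATTERN = re.compile(r"""
    '[^']*'             # single-quoted string literal
  | [A-Za-z_][\w.]*     # identifier, optionally qualified (b.id)
  | \d+                 # integer literal
  | [(),;=*]            # punctuation and operators
""", re.VERBOSE)


def tokenize(sql):
    """Split SQL text into (kind, text) pairs; unmatched characters such as whitespace are skipped."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(sql):
        text = match.group()
        if text.lower() in KEYWORDS:
            kind = "KEYWORD"
        elif text[0] == "'":
            kind = "STRING"
        elif text[0].isdigit():
            kind = "NUMBER"
        elif text[0].isalpha() or text[0] == "_":
            kind = "IDENT"
        else:
            kind = "SYMBOL"
        tokens.append((kind, text))
    return tokens


tokens = tokenize("insert into a (id, p_date) select b.id, c.p_date from b, c;")
print("|".join(text for _, text in tokens))
```

A production lexer would report characters it cannot match instead of skipping them; the sketch omits that to stay short.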
2.2 Syntax analysis:
The output of lexical analysis is the input to syntax analysis. Building on the lexical analysis, the syntax analyzer checks whether the words entered by the user conform to the grammar by validating them against the predefined grammar rules; if validation fails, the syntax analyzer reports an error.
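The validate-or-report-an-error behaviour can be illustrated with a hand-written recursive-descent check. This is only a sketch: the patent relies on ANTLR-defined grammar rules, whereas the single rule below (`'select' name (',' name)* 'from' name (',' name)*`) and the names `parse_select` and `RESERVED` are assumptions made for the example.

```python
RESERVED = {"select", "from", ","}


def parse_select(tokens):
    """Validate tokens against: 'select' name (',' name)* 'from' name (',' name)*"""
    pos = 0

    def found():
        return tokens[pos] if pos < len(tokens) else "end of input"

    def expect(word):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != word:
            raise SyntaxError(f"expected {word!r} at token {pos}, found {found()!r}")
        pos += 1

    def name_list():
        nonlocal pos
        names = []
        while True:
            if pos >= len(tokens) or tokens[pos] in RESERVED:
                raise SyntaxError(f"expected a name at token {pos}, found {found()!r}")
            names.append(tokens[pos])
            pos += 1
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1          # consume the comma and read the next name
                continue
            return names

    expect("select")
    columns = name_list()
    expect("from")
    tables = name_list()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return {"columns": columns, "tables": tables}


print(parse_select(["select", "b.id", ",", "c.p_date", "from", "b", ",", "c"]))

try:
    parse_select(["select", "from", "b"])   # column list is missing
except SyntaxError as error:
    print("syntax error:", error)
```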
2.3 Abstract syntax tree:
An abstract syntax tree is a tree-structured representation of the statement entered by the user, and it reflects the statement's grammatical structure. The tree is built as syntax analysis proceeds; when syntax analysis completes normally, the syntax analyzer outputs the abstract syntax tree.
The hierarchy of the tree mirrors the hierarchy of the SQL statement. When the statement contains an insert_close or select_close node, the syntax keywords under that node are shown as attributes of the node, and the node's leaves are sub-query statements, table names or fields. Sub-query statements are decomposed recursively following the same display rule as select_close.
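The tree shape described above can be sketched with a small node type and a recursive renderer. The `Node` class, its field names and the `render` function are hypothetical; the tree is built by hand for a simplified version of the example statement (the CASE expression is omitted) rather than produced by a parser.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                      # "insert_close", "select_close", "table" or "field"
    text: str = ""                                 # table or field name for leaf nodes
    keywords: list = field(default_factory=list)   # syntax keywords shown as node attributes
    children: list = field(default_factory=list)   # leaves and nested sub-queries


def render(node, depth=0):
    """Return one indented line per node, recursing into sub-queries."""
    label = node.kind
    if node.keywords:
        label += " [" + " ".join(node.keywords) + "]"
    if node.text:
        label += " " + node.text
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Sub-query (select count(1) from d where date_id = ...) as a nested select_close node.
subquery = Node("select_close", keywords=["select", "from", "where"],
                children=[Node("table", "d"), Node("field", "date_id")])

tree = Node("insert_close", keywords=["insert", "into"], children=[
    Node("table", "a"),
    Node("select_close", keywords=["select", "from", "limit"], children=[
        Node("field", "b.id"), Node("field", "c.p_date"), Node("field", "e.cnt"),
        Node("table", "b"), Node("table", "c"), subquery,
    ]),
])

print("\n".join(render(tree)))
```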
2.4 Model and field resolution:
The program walks the syntax tree and resolves the content of the SQL script from it:
Input models: b, c, d;
Output model: a;
The field mapping table is as follows:
[Figure BDA0002652818880000041: field mapping table]
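How input models, the output model and field mappings might be read off a parsed statement can be sketched as follows. The `statement` dictionary is a hand-built, hypothetical stand-in for the syntax tree of the example statement, and the mapping it yields is derived from that statement only; it does not reproduce the contents of the patent's figure.

```python
# Hypothetical parsed form of the example INSERT ... SELECT statement.
statement = {
    "target": "a",
    "target_columns": ["id", "p_date", "cnt", "flag"],
    "select_items": [
        ["b.id"],
        ["c.p_date"],
        ["e.cnt"],
        ["b.flag", "c.flag"],     # fields referenced by the CASE expression
    ],
    "sources": [
        {"alias": "b", "table": "b"},
        {"alias": "c", "table": "c"},
        {"alias": "e", "subquery": {"sources": [{"alias": "d", "table": "d"}]}},
    ],
}


def input_models(query):
    """Collect source tables, descending recursively into sub-queries."""
    tables = []
    for source in query["sources"]:
        if "subquery" in source:
            tables.extend(input_models(source["subquery"]))
        else:
            tables.append(source["table"])
    return tables


def field_mapping(stmt):
    """Pair each target column with the source fields of the select item in the same position."""
    mapping = {}
    for column, fields in zip(stmt["target_columns"], stmt["select_items"]):
        mapping[f'{stmt["target"]}.{column}'] = sorted(fields)
    return mapping


print("input models:", input_models(statement))
print("output model:", statement["target"])
for target, sources in field_mapping(statement).items():
    print(target, "<-", ", ".join(sources))
```

The sketch leaves `e.cnt` expressed through the sub-query alias; resolving it further to table d would need the sub-query's own select list.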
The SQL-based online metadata parsing method provided by the embodiments of the application comprises: constructing different engines for different databases; and parsing user data based on the engine to obtain a parsing result for the user data. The method parses SQL scripts automatically and can obtain table, field and data lineage information.
Further, the databases in this embodiment include a big data type and a relational database type. Big data refers to data sets that cannot be captured, managed and processed with conventional software tools within an acceptable time; it is a massive, fast-growing and diverse information asset that requires new processing modes to deliver stronger decision-making power, insight and process optimization. A relational database organizes data according to the relational model and stores it in rows and columns so that it is easy for users to understand; a series of rows and columns is called a table, and a set of tables makes up the database. Users retrieve data through queries, a query being executable code that delimits certain regions of the database. The relational model can be understood simply as a two-dimensional table model, so a relational database is a data organization made up of two-dimensional tables and the relationships between them.
Further, based on the foregoing embodiment, referring to fig. 2, fig. 2 is a sub-flow diagram of the SQL-based online metadata parsing method in this embodiment. In this embodiment, the step of parsing for the big data type includes:
parsing user data based on the engine;
for standard HQL scripts of the big data types Hive and Spark, parsing the user data with ANTLR 4.
A parsing result is obtained once the user data has been parsed.
Further, based on the foregoing embodiment, referring to fig. 3, fig. 3 is another sub-flow diagram of the SQL-based online metadata parsing method in this embodiment. In this embodiment, the step of parsing for the relational database type includes:
parsing user data based on the engine;
for standard SQL scripts of the relational database types Oracle and MySQL, parsing the user data with a pipeline.
A parsing result is obtained once the user data has been parsed.
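The choice of parser by database type in the two sub-flows can be sketched as a lookup table. The dictionary `ENGINE_BY_DATABASE`, the function `select_engine` and the string labels are hypothetical; they only mirror the pairing stated in the text (Hive and Spark with ANTLR 4, Oracle and MySQL with the pipeline) and do not call any real parser.

```python
# Assumed mapping from database type to the parsing approach named in the description.
ENGINE_BY_DATABASE = {
    "hive": "antlr4",
    "spark": "antlr4",
    "oracle": "pipeline",
    "mysql": "pipeline",
}


def select_engine(database):
    """Return the parsing approach for a database type, or raise for unsupported types."""
    key = database.strip().lower()
    if key not in ENGINE_BY_DATABASE:
        raise ValueError(f"no engine constructed for database type {database!r}")
    return ENGINE_BY_DATABASE[key]


for name in ("Hive", "Spark", "Oracle", "MySQL"):
    print(name, "->", select_engine(name))
```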
Further, in this embodiment, the parsing result obtained from the user data includes the tables of the user data, the field mapping relationships, and the data lineage relationships.
Further, based on the above embodiment, referring to fig. 4, fig. 4 is a flow diagram of the steps performed after the parsing result of the user data is obtained, which further include:
if the parsing result meets the user's requirements, the user can use the engine directly;
if the parsing result does not meet the user's requirements, the user can develop the required components to form a new engine.
In this embodiment, engine development supports hot plugging: plug-ins can be added or removed dynamically.
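Adding and removing plug-ins at run time can be sketched with a small registry. The `PluginRegistry` class and the two sample plug-ins are hypothetical and only illustrate the hot-plug idea; the patent does not specify how its engine tracks plug-ins.

```python
class PluginRegistry:
    """Holds named plug-ins (callables) that can be added or removed while the engine runs."""

    def __init__(self):
        self._plugins = {}

    def register(self, name, plugin):
        self._plugins[name] = plugin

    def unregister(self, name):
        self._plugins.pop(name, None)    # removing an unknown plug-in is a no-op

    def names(self):
        return sorted(self._plugins)

    def run(self, script):
        """Apply every currently registered plug-in to the script."""
        return {name: plugin(script) for name, plugin in self._plugins.items()}


registry = PluginRegistry()
registry.register("statement_count", lambda script: script.count(";"))
registry.register("uses_insert", lambda script: "insert" in script.lower())

script = "insert into a select id from b; select 1;"
print(registry.run(script))

registry.unregister("uses_insert")       # plug-in removed without restarting anything
print(registry.run(script))
```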
Further, an embodiment of the present application also provides an apparatus 500 for SQL-based online metadata parsing. Referring to fig. 5, the apparatus 500 includes:
a construction module 501, configured to construct an engine for a database;
a parsing module 502, configured to parse user data based on the engine to obtain a parsing result for the user data.
The apparatus 500 for SQL-based online metadata parsing provided by the embodiment of the present application can: construct an engine for the database; and parse user data based on the engine to obtain a parsing result for the user data. It parses SQL scripts automatically and can obtain table, field and data lineage information.
Further, an embodiment of the present application further provides an apparatus for SQL-based online metadata parsing, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where when the processor executes the computer program, each step in the above method for SQL-based online metadata parsing is implemented.
Further, the present application also provides a storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the steps of the SQL-based online metadata parsing method as described above.
Each functional module in the embodiments of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that, for simplicity of description, the above method embodiments are presented as a series of combined acts, but those skilled in the art should understand that the present invention is not limited by the described order of acts, since some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The above describes the method, system, device and storage medium for SQL-based online metadata parsing provided by the present invention. Those skilled in the art may, following the ideas of the embodiments of the present invention, vary the specific implementation and the scope of application.

Claims (10)

1. A method for SQL-based online metadata parsing, characterized by comprising:
constructing an engine for a database;
parsing user data based on the engine to obtain a parsing result for the user data.
2. The method of claim 1, wherein the database comprises a big data type and a relational database type.
3. The method of claim 2, wherein the parsing user data based on the engine comprises:
for standard HQL scripts of the big data types Hive and Spark, parsing the user data with ANTLR 4.
4. The method of claim 2, wherein the parsing user data based on the engine comprises:
for standard SQL scripts of the relational database types Oracle and MySQL, parsing the user data with a pipeline.
5. The method of claim 4, wherein the parsing result includes the tables of the user data, the field mapping relationships, and the data lineage relationships.
6. The method of claim 5, wherein after parsing user data based on the engine to obtain a parsing result for the user data, the method further comprises:
if the parsing result meets the user's requirements, the user uses the engine directly;
if the parsing result does not meet the user's requirements, the user develops the required components to form a new engine.
7. The method of claim 6, wherein the engine is developed in a plug-in mode and supports adding and removing plug-ins by hot plugging.
8. A system for SQL-based online metadata parsing, the system comprising:
a construction module, configured to construct an engine for a database;
a parsing module, configured to parse user data based on the engine to obtain a parsing result for the user data.
9. An apparatus for SQL-based online metadata parsing, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for SQL-based online metadata parsing according to any of claims 1 to 7.
10. A storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the SQL-based online metadata parsing method according to any of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010876759.1A CN112035508A (en) 2020-08-27 2020-08-27 SQL (structured query language) -based online metadata analysis method, system and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010876759.1A CN112035508A (en) 2020-08-27 2020-08-27 SQL (structured query language) -based online metadata analysis method, system and equipment

Publications (1)

Publication Number Publication Date
CN112035508A true CN112035508A (en) 2020-12-04

Family

ID=73580887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010876759.1A Pending CN112035508A (en) 2020-08-27 2020-08-27 SQL (structured query language) -based online metadata analysis method, system and equipment

Country Status (1)

Country Link
CN (1) CN112035508A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709043A (en) * 2016-12-30 2017-05-24 江苏瑞中数据股份有限公司 Data synchronous loading method based on database log
CN108446289A (en) * 2017-09-26 2018-08-24 北京中安智达科技有限公司 A kind of data retrieval method for supporting heterogeneous database
CN108536853A (en) * 2018-04-11 2018-09-14 上海驰骛信息科技有限公司 A kind of automatic routing database inquiry system and method based on query resource and accuracy
CN109325078A (en) * 2018-09-18 2019-02-12 拉扎斯网络科技(上海)有限公司 Method and device is determined based on the data blood relationship of structured data
CN109582731A (en) * 2018-10-18 2019-04-05 恒峰信息技术有限公司 A kind of real time data synchronization method and system
CN110232056A (en) * 2019-05-21 2019-09-13 苏宁云计算有限公司 A kind of the blood relationship analytic method and its tool of structured query language
CN110704479A (en) * 2019-09-12 2020-01-17 新华三大数据技术有限公司 Task processing method and device, electronic equipment and storage medium
CN110837515A (en) * 2019-11-06 2020-02-25 北京天融信网络安全技术有限公司 Database-based data processing method and electronic equipment
CN111522816A (en) * 2020-04-16 2020-08-11 云和恩墨(北京)信息技术有限公司 Data processing method, device, terminal and medium based on database engine

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112202822A (en) * 2020-12-07 2021-01-08 中国人民解放军国防科技大学 Database injection detection method and device, electronic equipment and storage medium
CN112434046A (en) * 2020-12-16 2021-03-02 杭州天均科技有限公司 Data blood margin analysis method, device, equipment and storage medium
CN112434046B (en) * 2020-12-16 2021-09-17 杭州天均科技有限公司 Data blood margin analysis method, device, equipment and storage medium
CN112818015A (en) * 2021-01-21 2021-05-18 广州汇通国信科技有限公司 Data tracking method, system and storage medium based on data blood margin analysis
CN113420097A (en) * 2021-06-23 2021-09-21 网易(杭州)网络有限公司 Data analysis method and device, storage medium and server
CN113468873A (en) * 2021-07-09 2021-10-01 北京东方国信科技股份有限公司 Syntax analysis method and device of PL/SQL language
CN113468873B (en) * 2021-07-09 2024-04-16 北京东方国信科技股份有限公司 Syntax analysis method and device of PL/SQL language

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination