CN117312369A - Data specification checking method and system based on data dictionary - Google Patents

Publication number
CN117312369A
Authority
CN
China
Prior art keywords
database
data
information
modeling
checking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311513143.8A
Other languages
Chinese (zh)
Inventor
于腾辉
刘小成
马康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beiyin Financial Technology Co ltd
Original Assignee
Beiyin Financial Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-11-14
Filing date
2023-11-14
Publication date
2023-12-29
Application filed by Beiyin Financial Technology Co ltd
Priority to CN202311513143.8A
Publication of CN117312369A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a data specification checking method and system based on a data dictionary. The checking method comprises the following steps: adding database connection information; querying metadata of a database to obtain a database modeling statement; parsing the database modeling statement by using ANTLR to obtain attribute information corresponding to the database; entering characteristic information of data elements to obtain file information; and querying the file information, comparing it with the field name, type and length information parsed from the modeling statement, and outputting a comparison result. The invention provides a more intelligent, efficient and low-cost checking method for database field design, and improves the quality and efficiency of informatization construction.

Description

Data specification checking method and system based on data dictionary
Technical Field
The invention relates to the field of enterprise information system construction, in particular to a data specification checking method and system based on a data dictionary.
Background
In building enterprise information systems, field naming, types and usage scenarios differ from one business system to another, and database modeling generally lacks a unified field design standard, even though the compliance of data fields is one of the key factors in guaranteeing data quality and system stability. A data dictionary is a data management tool that records the definition, format, source, use and limitations of data elements; by uniformly defining the Chinese name, English name, type, meaning and usage scenario of each dictionary item, it reduces non-standard practice in data modeling. On this basis, ANTLR (ANother Tool for Language Recognition) is used to process structured text: with grammar rules written for the purpose, it parses either text submitted by a user or database modeling statements obtained by querying the database metadata, extracts the field names, types and lengths in those statements, and checks whether they match the structure and rules expected for the corresponding data elements in the data dictionary. Using ANTLR to check against the data dictionary makes full use of ANTLR's parsing and generation capabilities and effectively meets the checking requirements of the data dictionary.
At present, database field designs are checked in two ways:
1. Business personnel review the database field design manually.
2. The names, types and lengths in the database metadata are queried and checked against the names, types, lengths, etc. defined in the data dictionary.
1. Problems of manually checking database field designs
1.1 Human omissions lead to incomplete inspection
Manual inspection by business personnel is subject to human subjectivity and may miss items, so the inspection results for the database field design are neither comprehensive nor accurate enough.
1.2 High cost and time consumption
Manual inspection requires a large investment of manpower and time. During informatization construction, this puts considerable cost pressure on a project and affects its progress and efficiency.
2. Problems of querying database system metadata
2.1 Dependence on database metadata tables
Verification by querying the database metadata tables depends on that metadata. The metadata table structures of different database systems differ, so additional manpower and time are required for adaptation, which increases operational complexity and cost.
2.2 Verification cannot intercept statements before they are executed
This method can only check by querying the metadata of the database system; it cannot intercept and check modeling statements written by system developers before they are submitted to the database for execution. As a result, potential field design problems cannot be discovered and corrected before the database statement is actually executed.
Disclosure of Invention
In view of the foregoing, the present invention has been developed to provide a data dictionary-based data specification checking method and system that overcome, or at least partially solve, the foregoing problems.
According to an aspect of the present invention, there is provided a data specification checking method based on a data dictionary, the checking method comprising:
adding database connection information;
querying metadata of a database to obtain a database modeling statement;
parsing the database modeling statement by using ANTLR to obtain attribute information corresponding to the database;
entering characteristic information of data elements to obtain file information;
and querying the file information, comparing it with the field name, type and length information parsed from the modeling statement, and outputting a comparison result.
Optionally, the database connection information includes: connection name, connection address, port number, user name and password.
Optionally, the adding database connection information further includes: viewing the table structure and table data in the database on the IDE page, and writing SQL statements in the SQL editing interface.
Optionally, the querying the database metadata to obtain the database modeling statement specifically includes:
querying metadata of the database to obtain the database modeling statement;
receiving a database modeling statement to be executed through an HTTP interface;
intercepting an SQL statement written in the SQL editing interface.
Optionally, the attribute information corresponding to the database specifically includes: table name, field name, and type.
Optionally, the characteristic information of the data element specifically includes: definition, format, source, use, and limitation.
Optionally, the file information specifically includes: Chinese name, English name, type and length information.
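As a minimal sketch of how one data dictionary item might be represented, the record below combines the characteristic information (definition, format, source, use, limitation) with the file information (Chinese name, English name, type, length). The class name, attribute names and sample entries are illustrative assumptions, not taken from the patent; the format is assumed to be carried by the type and length.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class DataElement:
    """One data dictionary item (all attribute names are hypothetical)."""
    chinese_name: str
    english_name: str
    data_type: str
    length: Optional[int] = None  # None for types that carry no length
    definition: str = ""
    source: str = ""
    usage: str = ""
    limitation: str = ""


def index_by_english_name(elements: Iterable[DataElement]) -> Dict[str, DataElement]:
    """Build a case-insensitive lookup keyed by English name."""
    return {e.english_name.lower(): e for e in elements}


# Sample entries invented for illustration only.
dictionary = index_by_english_name([
    DataElement("客户编号", "cust_id", "VARCHAR", 32,
                definition="Unique customer identifier"),
    DataElement("开户日期", "open_date", "DATE"),
])
```

Keying the index by English name reflects the later comparison, in which parsed field names are looked up among the English names of the dictionary.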
The invention also provides a data specification checking system based on a data dictionary, which applies the above data specification checking method, the checking system comprising:
an IDE module, used for adding database connection information;
a database modeling statement acquisition module, used for querying metadata of the database to obtain database modeling statements;
a grammar parsing module, used for parsing the database modeling statements by using ANTLR to obtain attribute information corresponding to the database;
a data dictionary module, used for entering characteristic information of data elements and obtaining file information;
and a checking module, used for querying the file information, comparing it with the field name, type and length information parsed from the modeling statement, and outputting a comparison result.
The invention provides a data specification checking method and system based on a data dictionary. The checking method comprises the following steps: adding database connection information; querying metadata of a database to obtain a database modeling statement; parsing the database modeling statement by using ANTLR to obtain attribute information corresponding to the database; entering characteristic information of data elements to obtain file information; and querying the file information, comparing it with the field name, type and length information parsed from the modeling statement, and outputting a comparison result. The invention provides a more intelligent, efficient and low-cost checking method for database field design, and improves the quality and efficiency of informatization construction.
The foregoing description is only an overview of the technical solution of the present invention. In order that the technical means of the invention may be understood more clearly and implemented in accordance with the description, and in order to make the above and other objects, features and advantages of the invention more readily apparent, specific embodiments are set out below.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a data specification checking method based on a data dictionary according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terms "comprising" and "having", and any variations thereof, in the description, claims and drawings of the invention are intended to cover a non-exclusive inclusion, for example of a series of steps or elements.
The technical scheme of the invention is further described in detail below with reference to the accompanying drawings and the examples.
A data specification checking system based on ANTLR and a data dictionary includes the following modules:
IDE module: adds database connection information, including connection name, connection address, port number, user name and password; allows the table structure and table data in the database to be viewed on the IDE page and SQL statements to be written in the SQL editing interface.
Database modeling statement acquisition module: obtains database modeling statements by querying the database metadata, by receiving database modeling statements to be executed through an HTTP interface, or by intercepting SQL statements written in the SQL editing interface.
Grammar parsing module: parses the database modeling statements by using ANTLR to obtain the table name, field name and type information corresponding to the database.
Data dictionary module: records the definition, format, source, use and limitation information of the data elements.
Checking module: queries the data dictionary module for the Chinese name, English name, type and length information, compares it with the field name, type and length information that the grammar parsing module parsed from the modeling statement, and outputs the comparison result.
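The metadata path of the statement acquisition module can be sketched as a small helper that builds the query used to fetch a table's creation statement. The embodiment names Oracle-style `DBMS_METADATA.GET_DDL` and MySQL-style `SHOW CREATE TABLE`; the function name, the `dialect` labels and the identifier validation below are assumptions made for illustration.

```python
import re

# Conservative whitelist for unquoted identifiers (an assumption of this
# sketch), so table and schema names can be embedded in the query text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$#]*$")


def build_ddl_query(dialect: str, table: str, schema: str = "") -> str:
    """Return the statement that retrieves a table's creation DDL."""
    for name in filter(None, (table, schema)):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if dialect == "oracle":
        # Unquoted Oracle identifiers are stored in upper case.
        args = f"'TABLE', '{table.upper()}'"
        if schema:
            args += f", '{schema.upper()}'"
        return f"SELECT DBMS_METADATA.GET_DDL({args}) FROM DUAL"
    if dialect == "mysql":
        qualified = f"`{schema}`.`{table}`" if schema else f"`{table}`"
        return f"SHOW CREATE TABLE {qualified}"
    raise ValueError(f"unsupported dialect: {dialect!r}")
```

Because every dialect ends up returning a creation statement, the downstream parser only has to understand DDL text rather than each database's metadata table layout.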
The invention provides a data specification checking method based on ANTLR and a data dictionary, which comprises the following steps:
Fig. 1 is a flowchart of an implementation of the data specification checking method based on ANTLR and a data dictionary according to an embodiment of the present invention. Referring to Fig. 1, the method includes:
Step 1: In the IDE module, fill in the connection address, port number, user name, password and other information of the database, which is stored in the OceanBase database.
Step 2: Obtain the SQL statement in one of three ways. In the first way, an SQL statement to be executed is written in the IDE interface, the execute button on the interface is clicked, and the program captures the statement being executed. In the second way, business personnel click the check button, and the system uses the connection information stored in OceanBase in step 1 to obtain the table creation statement from the target database through the DBMS_METADATA.GET_DDL method or a SHOW CREATE TABLE statement. In the third way, business personnel submit SQL statements they have written to the system through a REST interface.
Step 3: Call the SQL parsing method on the table creation statement obtained in the previous step, and parse the target Structured Query Language (SQL) statement through ANTLR to obtain a syntax tree (ParseTree); traverse the ParseTree to obtain the table name, field names, types and length information.
Step 4: Query all data element information in the data dictionary to obtain the Chinese name, English name, type and length information of each data field.
Step 5: Search for each field name obtained in step 3 among the English names of the data dictionary information obtained in step 4; if it cannot be found, mark the field as non-compliant. If a matching entry is found, compare the type and then the length of the field in turn; if either is inconsistent, mark the field as non-compliant.
Step 6: Output all fields in the modeling statement, showing for each field name whether the field complies with the rules.
Beneficial effects:
Automated inspection mechanism:
An automatic checking mechanism is introduced that reduces dependence on business personnel and checks the database field design comprehensively through algorithms and rules, avoiding human omissions.
Interception-type checking:
The method can check the existing database fields of a system by querying the database metadata, and can also perform an interception-type check before modeling statements written by system developers are submitted to the database for execution. Detecting and correcting field design problems in advance reduces the chance that they enter the database, improving system stability and quality.
High efficiency and low cost:
The automated, intelligent inspection reduces labor input and improves inspection efficiency, thereby reducing cost pressure during informatization construction.
The invention provides a more intelligent, efficient and low-cost inspection method for database field design, and improves the quality and efficiency of informatization construction.
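The interception-type check can be illustrated with a small gate that sits between the SQL editor and the database. The sketch assumes a pluggable `check` callable that returns a list of violations and an `execute` callable that forwards the statement; both, along with the exception type and the sample rule, are hypothetical names introduced here.

```python
import re
from typing import Callable, List

_CREATE_TABLE = re.compile(r"^\s*CREATE\s+TABLE\b", re.IGNORECASE)


class FieldDesignError(Exception):
    """Raised when a modeling statement violates the data dictionary."""

    def __init__(self, violations: List[str]):
        super().__init__("; ".join(violations))
        self.violations = violations


def execute_if_compliant(sql: str,
                         check: Callable[[str], List[str]],
                         execute: Callable[[str], None]) -> None:
    """Run `check` on modeling statements and forward only compliant ones."""
    if _CREATE_TABLE.match(sql):
        violations = check(sql)
        if violations:
            # Block the statement before it reaches the database.
            raise FieldDesignError(violations)
    execute(sql)


# Stand-ins for the real checker and database connection.
executed: List[str] = []


def demo_check(sql: str) -> List[str]:
    return ["cust_nm: name not in data dictionary"] if "cust_nm" in sql else []


execute_if_compliant("SELECT 1", demo_check, executed.append)
execute_if_compliant("CREATE TABLE t (cust_id VARCHAR(32))", demo_check, executed.append)
```

Statements that are not table creation statements pass straight through, so the gate adds checking only where field design is at stake.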
The foregoing detailed description of the invention has been presented for purposes of illustration and description, and it should be understood that the invention is not limited to the particular embodiments disclosed, but is intended to cover all modifications, equivalents, alternatives, and improvements within the spirit and principles of the invention.

Claims (8)

1. A data specification checking method based on a data dictionary, the checking method comprising:
adding database connection information;
querying metadata of a database to obtain a database modeling statement;
parsing the database modeling statement by using ANTLR to obtain attribute information corresponding to the database;
entering characteristic information of data elements to obtain file information;
and querying the file information, comparing it with the field name, type and length information parsed from the modeling statement, and outputting a comparison result.
2. The data specification checking method based on a data dictionary according to claim 1, wherein the database connection information comprises: connection name, connection address, port number, user name and password.
3. The data specification checking method based on a data dictionary according to claim 1, wherein the adding database connection information further comprises: viewing the table structure and table data in the database on the IDE page, and writing SQL statements in the SQL editing interface.
4. The data specification checking method based on a data dictionary according to claim 1, wherein the querying the database metadata to obtain the database modeling statement specifically comprises:
querying metadata of the database to obtain the database modeling statement;
receiving a database modeling statement to be executed through an HTTP interface;
intercepting an SQL statement written in the SQL editing interface.
5. The data specification checking method based on a data dictionary according to claim 1, wherein the attribute information corresponding to the database specifically comprises: table name, field name and type.
6. The data specification checking method based on a data dictionary according to claim 1, wherein the characteristic information of the data elements specifically comprises: definition, format, source, use and limitation.
7. The data specification checking method based on a data dictionary according to claim 1, wherein the file information specifically comprises: Chinese name, English name, type and length information.
8. A data specification checking system based on a data dictionary, applying the data specification checking method based on a data dictionary as claimed in any one of claims 1 to 7, the checking system comprising:
an IDE module, used for adding database connection information;
a database modeling statement acquisition module, used for querying metadata of the database to obtain database modeling statements;
a grammar parsing module, used for parsing the database modeling statements by using ANTLR to obtain attribute information corresponding to the database;
a data dictionary module, used for entering characteristic information of data elements and obtaining file information;
and a checking module, used for querying the file information, comparing it with the field name, type and length information parsed from the modeling statement, and outputting a comparison result.
CN202311513143.8A 2023-11-14 2023-11-14 Data specification checking method and system based on data dictionary Pending CN117312369A (en)

Publications (1)

Publication Number Publication Date
CN117312369A (en) 2023-12-29

Family

ID=89297447

Country Status (1)

Country Link
CN (1) CN117312369A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination