CN115878176A - Intelligent data relation construction method and system based on storage process and storage medium - Google Patents

Info

Publication number
CN115878176A
Authority
CN
China
Prior art keywords
file
data
statement
dml
intelligent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211349369.4A
Other languages
Chinese (zh)
Inventor
张宇
周宇
张嘉禹
何冬临
舒望
张硕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Saxo Financial Technology Co ltd
Original Assignee
Saxo Financial Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Saxo Financial Technology Co ltd
Priority to CN202211349369.4A
Publication of CN115878176A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data processing, and in particular to a method, a system and a storage medium for intelligently constructing data relationships based on stored procedures. The method comprises the following steps: S1, acquiring a file to be parsed, wherein the file comprises one or more of a stored procedure and a script file; S2, judging whether the syntax of the file is correct and, if so, executing S3; S3, parsing the file with preset rules to form a syntax tree; S4, processing the syntax tree and extracting the DML statements from it; S5, parsing the DML statements and constructing the dependency relationships among the statements in the DML statements; S6, parsing each DML statement for which dependency relationships have been constructed, intelligently searching for and completing unknown information in the statement, and acquiring structure-related information carrying the dependency relationships. The scheme assists the processing of complex DML statements, thereby reducing memory requirements and improving processing speed.

Description

Method, system and storage medium for intelligent construction of data relationships based on stored procedures
Technical Field
The invention relates to the technical field of data processing, and in particular to a method, a system and a storage medium for intelligently constructing data relationships based on stored procedures.
Background
A stored procedure is a set of SQL statements in a large database system that accomplishes a specific function. It is an important database object, and with very large data volumes, using stored procedures can improve efficiency many times over.
When a stored procedure is written, its content depends to some extent on the business processing and the functional design. The DML statements (one kind of SQL statement) contained in a stored procedure may be simple or complex, and the space complexity and time complexity of processing a DML statement depend on the complexity of that statement.
Processing a simple DML statement requires little memory. Processing a complex DML statement, however, must be done in several batches, and the data parsed first must be cached, so the memory demand multiplies. In stored procedures with more complex functions in particular, the DML statements are more complex, the memory requirement is higher, and memory resources are excessively occupied.
Therefore, a method for intelligently constructing data relationships based on stored procedures is urgently needed, one that can assist the processing of complex DML statements and reduce the memory requirement.
Disclosure of Invention
One of the purposes of the present invention is to provide a method for intelligently constructing data relationships based on stored procedures, which can assist the processing of complex DML statements so as to reduce memory requirements.
The invention provides a first basic scheme: the method for intelligently constructing data relationships based on stored procedures comprises the following steps:
S1, acquiring a file to be parsed, wherein the file comprises one or more of a stored procedure and a script file;
S2, judging whether the syntax of the file is correct and, if so, executing S3;
S3, parsing the file with preset rules to form a syntax tree;
S4, processing the syntax tree and extracting the DML statements from it;
S5, parsing the DML statements and constructing the dependency relationships among the statements in the DML statements;
S6, parsing each DML statement for which dependency relationships have been constructed, intelligently searching for and completing unknown information in the statement, and acquiring structure-related information carrying the dependency relationships.
The beneficial effects of the first basic scheme are as follows. A stored procedure and/or script file to be parsed is acquired. First, the syntax and file type of the file are judged, which ensures that only correct files are processed subsequently and prevents an erroneous file from consuming memory or even affecting normal operation. Next, the correct file is parsed with preset rules to form a syntax tree. A syntax tree is a set of data with a tree structure; it can be accessed by tree traversal, and relationships exist among its nodes. The syntax tree is then processed and the DML statements are extracted from it. Because the scheme is directed at the processing of complex DML statements, only the DML statements in the syntax tree are extracted and processed, which reduces both the amount of processing and the memory occupied by subsequent processing. The extracted DML statements are then parsed to construct the dependency relationships among their statements. Within the syntax tree the statements of each DML statement have dependency relationships, but these relationships may no longer exist directly once the DML statements have been extracted, so they must be constructed to ensure that the dependency relationships among the statements of each DML statement are accurate and continuous. Finally, each DML statement for which dependency relationships have been constructed is parsed, and unknown information in the statement is intelligently searched for and completed, which prevents the structure-related information obtained subsequently from being incomplete because of missing information and thereby impairing processing accuracy. The structure-related information carrying the dependency relationships is thus acquired. In this way, data relationships are intelligently constructed based on stored procedures and the dependencies between items of business data are determined. When DML statements are processed subsequently, however complex they are, they need not be processed in several batches and parsed one by one; the flow of business data is analyzed directly from the dependency relationships without parsing the DML statements again, which reduces the memory requirement. Because the batch processing and one-by-one parsing are avoided, the processing speed is also improved to a certain extent.
The scheme can assist the processing of complex DML statements so as to reduce memory requirements and improve processing speed.
Further, acquiring the file to be parsed includes: inputting the file to be parsed through a tracing service.
Beneficial effects: the file to be parsed is acquired as input to the tracing service, so that a user can supply specific input as required in order to carry out the intelligent construction of data relationships based on stored procedures.
Further, judging whether the syntax of the file is correct includes:
judging whether the file is an SQL statement and/or script file that can be executed correctly in a database; if so, judging that the syntax of the file is correct and executing S3; if not, judging that the syntax of the file is wrong.
Beneficial effects: the syntax judgment mainly determines whether the corresponding statement and/or script file can be executed correctly in the database. This ensures that the file processed subsequently is executable, ensures that the input data is processed accurately, and thereby improves the accuracy of the output.
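The executability check described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database stands in for the target database (the source does not name an engine), and the names `is_executable`, `schema_ddl` and `SCHEMA` are hypothetical. The statement is compiled with `EXPLAIN` rather than run, so no data is modified.

```python
import sqlite3

def is_executable(sql: str, schema_ddl: str = "") -> bool:
    """Return True if SQLite can compile the single statement `sql`."""
    conn = sqlite3.connect(":memory:")
    try:
        if schema_ddl:
            # Create the tables the statement refers to (assumed to be known).
            conn.executescript(schema_ddl)
        # EXPLAIN compiles the statement and lists its bytecode without running it.
        conn.execute("EXPLAIN " + sql)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # Syntax errors and unknown tables or columns both end up here.
        return False
    finally:
        conn.close()

# Hypothetical schema used only for this illustration.
SCHEMA = "CREATE TABLE orders (id INTEGER, amount REAL);"

print(is_executable("SELECT id FROM orders", SCHEMA))
print(is_executable("SELEC id FROM orders", SCHEMA))
```

A check of this kind is dialect-specific: a statement that is valid in the production database may be rejected by SQLite, so in practice the check would be run against the same engine that executes the stored procedure.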
Further, S3 includes: performing, through the tracing service, lexical analysis and syntactic analysis on the file and its content using preset lexical rules and preset syntactic rules to form a syntax tree.
Beneficial effects: the preset rules comprise lexical rules and syntactic rules, so lexical analysis and syntactic analysis are performed on the file and its content to form a syntax tree. The resulting syntax tree conforms to the syntactic rules, and every node in it conforms to the lexical rules, which guarantees the accuracy of the syntax tree.
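As an illustration of how lexical rules and syntactic rules combine to produce a syntax tree, the following is a minimal hand-written sketch for a tiny subset of SQL (`SELECT column, ... FROM table`). It is not the grammar used by the invention; the names `tokenize` and `parse_select` and the tree layout (dictionaries with `type`, `name` and `children`) are assumptions made for this example.

```python
import re

# Lexical rules: identifiers/keywords, plus the punctuation ',' and '*'.
TOKEN_RE = re.compile(r"\s*(?:(?P<word>[A-Za-z_][A-Za-z_0-9]*)|(?P<punct>[,*]))")
KEYWORDS = {"SELECT", "FROM"}

def tokenize(sql):
    """Split the text into (kind, value) tokens."""
    tokens, pos = [], 0
    sql = sql.strip()
    while pos < len(sql):
        m = TOKEN_RE.match(sql, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        if m.group("word"):
            word = m.group("word")
            if word.upper() in KEYWORDS:
                tokens.append(("KEYWORD", word.upper()))
            else:
                tokens.append(("IDENT", word))
        else:
            tokens.append(("PUNCT", m.group("punct")))
        pos = m.end()
    return tokens

def parse_select(tokens):
    """Syntactic rule: select := SELECT column (',' column)* FROM table."""
    pos = 0

    def expect(kind, value=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        pos += 1
        return tok[1]

    def column():
        if pos < len(tokens) and tokens[pos] == ("PUNCT", "*"):
            return {"type": "column", "name": expect("PUNCT", "*")}
        return {"type": "column", "name": expect("IDENT")}

    expect("KEYWORD", "SELECT")
    columns = [column()]
    while pos < len(tokens) and tokens[pos] == ("PUNCT", ","):
        expect("PUNCT", ",")
        columns.append(column())
    expect("KEYWORD", "FROM")
    table = {"type": "table", "name": expect("IDENT")}
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after table name")
    return {"type": "select", "children": columns + [table]}

tree = parse_select(tokenize("select id, amount from orders"))
print(tree)
```

A parser generator derives both stages from a grammar file instead of hand-written code, but the output has the same shape: a tree whose nodes can be visited by traversal.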
Further, the DML statements include: SELECT statements, INSERT statements, UPDATE statements and MERGE statements.
Beneficial effects: the DML statements include SELECT, INSERT, UPDATE and MERGE statements, which ensures that the extracted DML statements are comprehensive.
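Extracting the DML statements from a syntax tree amounts to a traversal that keeps only nodes of the listed statement types. The sketch below assumes a hypothetical tree representation (nested dictionaries with a `type` key and an optional `children` list); the node names and the `extract_dml` helper are illustrative and do not come from the source.

```python
DML_TYPES = {"select", "insert", "update", "merge"}

def extract_dml(node):
    """Depth-first walk returning the outermost DML nodes in document order."""
    if node["type"] in DML_TYPES:
        # Keep the whole subtree, including any nested queries.
        return [node]
    found = []
    for child in node.get("children", []):
        found.extend(extract_dml(child))
    return found

# Hypothetical syntax tree of a stored procedure mixing DDL, DML and DCL.
procedure = {
    "type": "procedure", "name": "load_sales",
    "children": [
        {"type": "create_table", "name": "sales_tmp"},          # DDL, skipped
        {"type": "insert", "target": "sales_tmp", "children": [
            {"type": "select", "source": "orders"}]},           # DML, kept
        {"type": "grant", "name": "reader"},                    # DCL, skipped
        {"type": "update", "target": "sales_tmp"},              # DML, kept
    ],
}

print([n["type"] for n in extract_dml(procedure)])  # ['insert', 'update']
```

The nested SELECT stays inside the INSERT node rather than being returned separately, so that the later dependency step can still see how the two are related.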
Further, the dependency relationships include: hierarchical relationships and parent-child relationships.
Beneficial effects: the dependency relationships comprise both hierarchical relationships and parent-child relationships rather than a single kind of relationship, so relationships can be determined from several aspects, which widens the range of application.
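The source does not define the two kinds of relationship precisely. One plausible reading, assumed here, is that the hierarchical relationship is the nesting depth of a statement and the parent-child relationship links a nested statement to the statement that directly encloses it. The sketch below records both for every node of a hypothetical DML subtree; `build_dependencies` and the row layout are illustrative names.

```python
def build_dependencies(node, parent_id=None, level=0, out=None):
    """Pre-order walk recording each statement's level and parent id."""
    if out is None:
        out = []
    node_id = len(out)  # ids follow pre-order position
    out.append({"id": node_id, "type": node["type"],
                "level": level, "parent": parent_id})
    for child in node.get("children", []):
        build_dependencies(child, node_id, level + 1, out)
    return out

# Hypothetical INSERT ... SELECT with two nested subqueries.
statement = {"type": "insert", "children": [
    {"type": "select", "children": [
        {"type": "select"},   # e.g. a subquery in FROM
        {"type": "select"},   # e.g. a subquery in WHERE
    ]},
]}

for row in build_dependencies(statement):
    print(row)
```

Storing the relationships as flat rows means they survive after the statements are detached from the syntax tree, which is the situation the dependency step is meant to handle.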
Further, the intelligent search and completion is performed by parsing the DDL statements to form a set of data with a data structure and storing that data in memory.
Beneficial effects: because the intelligent search and completion draws on a set of structured data that is formed by parsing the DDL statements and held in memory, the integrity of the data is guaranteed.
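A common case of unknown information is `SELECT *`, where the field list is not written out. The following sketch shows one way such completion could work: simple `CREATE TABLE` statements are parsed into an in-memory mapping, which is then used to expand the asterisk. The regular expression handles only flat column lists without parenthesised types such as `DECIMAL(10,2)`, and the names `load_metadata`, `complete_columns` and `DDL` are hypothetical.

```python
import re

CREATE_RE = re.compile(
    r"CREATE\s+TABLE\s+(\w+)\s*\((.*?)\)\s*;", re.IGNORECASE | re.DOTALL
)

def load_metadata(ddl):
    """Build an in-memory {table: [column, ...]} map from CREATE TABLE statements."""
    metadata = {}
    for table, body in CREATE_RE.findall(ddl):
        # The first word of each comma-separated item is taken as the column name.
        metadata[table.lower()] = [
            item.split()[0] for item in body.split(",") if item.strip()
        ]
    return metadata

def complete_columns(columns, table, metadata):
    """Replace '*' in a column list with the table's columns from the metadata."""
    completed = []
    for col in columns:
        if col == "*":
            completed.extend(metadata[table.lower()])
        else:
            completed.append(col)
    return completed

# Hypothetical DDL used only for this illustration.
DDL = """
CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
CREATE TABLE customers (id INTEGER, name TEXT);
"""

metadata = load_metadata(DDL)
print(complete_columns(["*"], "orders", metadata))
```

Keeping the metadata in a dictionary means each lookup during DML parsing is a constant-time operation and the DDL is parsed only once.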
Further, the structure-related information includes: field information, table information and association conditions;
the method further comprises: S7, sending the table information and the field information to a front end for display; obtaining the upstream and downstream of a field's source according to the dependency relationships; and assembling the table information and field information that carry dependency relationships to generate a data relationship map.
Beneficial effects: assembling the table information and field information that carry dependency relationships generates a data relationship map, which helps a user quickly construct such a map, analyze the flow of business data, analyze the dependencies of data in ETL, analyze downstream impact, and so on. Sending the table information and field information to the front end for display gives a visual presentation of the dependency relationships of the nodes in the SQL. Obtaining the upstream and downstream of a field's source from the dependency relationships makes it possible to efficiently investigate data errors in ETL that are caused by errors in upstream or downstream data, which helps the user locate and correct a data problem in the shortest time.
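Once field-level dependencies are known, the data relationship map can be represented as a directed graph, and the upstream or downstream of a field is the set of nodes reachable along the edges in one direction. The sketch below assumes dependencies are given as hypothetical `(source_field, target_field)` pairs; `build_map`, `reachable` and the field names are illustrative.

```python
from collections import defaultdict, deque

def build_map(edges):
    """Return (upstream, downstream) adjacency maps from (source, target) pairs."""
    upstream, downstream = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return upstream, downstream

def reachable(start, graph):
    """Breadth-first search: every node reachable from `start` in `graph`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical field-level dependencies extracted from DML statements.
EDGES = [
    ("orders.amount", "sales_tmp.amount"),
    ("sales_tmp.amount", "report.total"),
    ("customers.name", "report.customer"),
]

upstream, downstream = build_map(EDGES)
print(sorted(reachable("report.total", upstream)))
print(sorted(reachable("orders.amount", downstream)))
```

Tracing upstream from a wrong value in `report.total` lists every field that could have introduced the error, while tracing downstream from `orders.amount` lists every field affected by a change to it.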
A second purpose of the present invention is to provide a system for intelligently constructing data relationships based on stored procedures, which can assist the processing of complex DML statements so as to reduce memory requirements.
The invention provides a second basic scheme: the system for intelligently constructing data relationships based on stored procedures adopts the above method for intelligently constructing data relationships based on stored procedures.
The beneficial effects of the second basic scheme are as follows: by adopting the above method, data relationships can be intelligently constructed based on stored procedures and the dependencies between items of business data are determined. When DML statements are processed subsequently, however complex they are, they need not be processed in several batches and parsed one by one; the flow of business data is analyzed directly from the dependency relationships without parsing the DML statements again, which reduces the memory requirement. Because the batch processing and one-by-one parsing are avoided, the processing speed is also improved to a certain extent.
The scheme can assist the processing of complex DML statements so as to reduce memory requirements and improve processing speed.
A third purpose of the present invention is to provide a storage medium for intelligently constructing data relationships based on stored procedures, which can assist the processing of complex DML statements so as to reduce memory requirements.
The invention provides a third basic scheme: a storage medium for intelligently constructing data relationships based on stored procedures, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any one of the above methods for intelligently constructing data relationships based on stored procedures.
The beneficial effects of the third basic scheme are as follows: a computer program is stored on the storage medium and, when executed by a processor, implements the steps of any one of the above methods for intelligently constructing data relationships based on stored procedures, which facilitates application of the method.
Drawings
FIG. 1 is a flowchart of an embodiment of the method for intelligently constructing data relationships based on stored procedures according to the present invention.
Detailed Description
The invention is described in further detail below by way of specific embodiments.
Explanation of terms: SQL statement: a statement in the language used to operate a database;
DDL statement: the data definition language within SQL;
DML statement: the data manipulation language within SQL;
DCL statement: the data control language within SQL.
The embodiment is basically as shown in FIG. 1: the method for intelligently constructing data relationships based on stored procedures comprises the following steps:
S1, acquiring a file to be parsed, wherein the file comprises one or more of a stored procedure and a script file; a stored procedure is a named script file;
specifically, the file to be parsed is acquired as input to the tracing service: through the input service of the tracing analysis, a user inputs a specific SQL statement, or the absolute or relative path of a file containing SQL content;
S2, judging whether the syntax of the file is correct; if so, executing S3; if not, executing S8;
specifically, judging whether the file is an SQL statement and/or script file that can be executed correctly in the database; if so, judging that the syntax of the file is correct and executing S3; if not, judging that the syntax of the file is wrong. SQL statements comprise DML statements, DCL statements and DDL statements; during tracing, all of them undergo a syntax check and a syntax tree is constructed for them. A script file is another carrier of SQL statements. The purpose of S2 is to ensure that the syntax of the stored procedure and/or script file is correct and that it consists of supported SQL statements.
S3, parsing the file with preset rules to form a syntax tree;
specifically, the tracing service performs lexical analysis and syntactic analysis on the file and its content using preset lexical rules and preset syntactic rules to form a syntax tree. The lexical rules and syntactic rules are grammar code defined with ANTLR4, and their specific content is set according to that grammar code. For example, several class files may be written in the Java language, each containing Java code; the class files are equivalent to predefined grammar code files, and a rule is a code block within a class file. The file is parsed by ANTLR4, and the visitor mode and listener mode provided by ANTLR4 complete the parsing of the syntactic structure. The syntax tree formed by parsing is a set of data with a tree structure and can be accessed by traversing the tree. Lexical and syntactic analysis of the file and its content forms corresponding syntax trees for all SQL statements, including DML, DCL and DDL statements. The present service needs only the parsing of DML statements; the other statements can also be parsed, but they serve different business scenarios;
S4, processing the syntax tree and extracting the DML statements from it; wherein the DML statements include but are not limited to SELECT, INSERT, UPDATE and MERGE statements, and the processing of the syntax tree is also completed by ANTLR4;
S5, parsing the DML statements and constructing the dependency relationships among the statements in the DML statements; wherein the dependency relationships include hierarchical relationships and parent-child relationships;
according to the grammar code file, the rules in it match the script file and the SQL statements in accordance with the ANTLR4 grammar definition, which completes the parsing of the DML statements. The ANTLR4 grammar definition is a program defined in advance by the user; it matches the script file and the SQL statements and dynamically adapts to and parses each kind of SQL statement: DML statements, DDL statements and DCL statements;
S6, parsing each DML statement for which dependency relationships have been constructed, intelligently searching for and completing unknown information in the statement, and acquiring structure-related information carrying the dependency relationships; wherein the structure-related information includes field information, table information and association conditions. The intelligent search and completion of unknown information is achieved by parsing the DDL statements to form a set of data with a data structure and storing that data in memory. Specifically, if an asterisk appears in a DML statement, the fields are incomplete during parsing and syntax completion is needed; the DDL statements serve as the metadata for the whole DML parsing;
S7, sending the table information and the field information to a front end for display; obtaining the upstream and downstream of a field's source according to the dependency relationships; and assembling the table information and field information that carry dependency relationships to generate a data relationship map. S6 returns the structure-related information carrying the dependency relationships, which is then used in S7. Assembling the table information and field information that carry dependency relationships generates a data relationship map, which helps a user quickly construct such a map, analyze the flow of business data, analyze the dependencies of data in ETL, analyze downstream impact, and so on. Sending the table information and field information to the front end for display gives a visual presentation of the dependency relationships of the nodes in the SQL. Obtaining the upstream and downstream of a field's source from the dependency relationships makes it possible to efficiently investigate data errors in ETL that are caused by errors in upstream or downstream data, which helps the user locate and correct a data problem in the shortest time. In this embodiment, the front end is a web front end displayed on a terminal (such as a computer), that is, the information is shown in a front-end page displayed on the terminal;
S8, ending.
With this scheme, data relationships are intelligently constructed based on stored procedures and the dependencies between items of business data are determined. When DML statements are processed subsequently, however complex they are, they need not be processed in several batches and parsed one by one; the flow of business data is analyzed directly from the dependency relationships without parsing the DML statements again, which reduces the memory requirement. Because the batch processing and one-by-one parsing are avoided, the processing speed is also improved to a certain extent.
This embodiment also provides a system for intelligently constructing data relationships based on stored procedures, which uses the above method for intelligently constructing data relationships based on stored procedures.
If the method for intelligently constructing data relationships based on stored procedures is implemented in the form of a software functional unit and is sold or used as an independent product, it can be stored in a storage medium. Based on this understanding, all or part of the flow of the method in the above embodiments may be implemented by a computer program, which may be stored in a readable storage medium and which, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
The foregoing is merely an embodiment of the present invention. Common general knowledge, such as well-known specific structures and characteristics, is not described here in detail. A person skilled in the art knows all the ordinary technical knowledge in the technical field of the invention before the filing date or the priority date, has access to all the prior art in this field, and is capable of applying the routine experimental means of that time; such a person can, in light of the teaching given in this application, complete and implement the scheme in combination with his or her own abilities, and some typical well-known structures or methods should not become obstacles to implementing the application. It should be noted that a person skilled in the art may make several changes and improvements without departing from the structure of the invention; these shall also be regarded as falling within the protection scope of the invention and will not affect the effect of implementing the invention or the practicability of the patent. The protection scope claimed by this application shall be determined by the content of its claims, and the detailed description and other records in the specification may be used to interpret the content of the claims.

Claims (10)

1. A method for intelligently constructing data relationships based on stored procedures, characterized by comprising the following steps:
S1, acquiring a file to be parsed, wherein the file comprises one or more of a stored procedure and a script file;
S2, judging whether the syntax of the file is correct and, if so, executing S3;
S3, parsing the file with preset rules to form a syntax tree;
S4, processing the syntax tree and extracting the DML statements from it;
S5, parsing the DML statements and constructing the dependency relationships among the statements in the DML statements;
S6, parsing each DML statement for which dependency relationships have been constructed, intelligently searching for and completing unknown information in the statement, and acquiring structure-related information carrying the dependency relationships.
2. The method for intelligently constructing data relationships based on stored procedures according to claim 1, wherein acquiring the file to be parsed comprises: inputting the file to be parsed through a tracing service.
3. The method for intelligently constructing data relationships based on stored procedures according to claim 1, wherein judging whether the syntax of the file is correct comprises:
judging whether the file is an SQL statement and/or script file that can be executed correctly in a database; if so, judging that the syntax of the file is correct and executing S3; if not, judging that the syntax of the file is wrong.
4. The method for intelligently constructing data relationships based on stored procedures according to claim 2, wherein S3 comprises: performing, through the tracing service, lexical analysis and syntactic analysis on the file and its content using preset lexical rules and preset syntactic rules to form a syntax tree.
5. The method for intelligently constructing data relationships based on stored procedures according to claim 1, wherein the DML statements comprise: SELECT statements, INSERT statements, UPDATE statements and MERGE statements.
6. The method for intelligently constructing data relationships based on stored procedures according to claim 1, wherein the dependency relationships comprise: hierarchical relationships and parent-child relationships.
7. The method for intelligently constructing data relationships based on stored procedures according to claim 1, wherein the intelligent search and completion is performed by parsing DDL statements to form a set of data with a data structure and storing that data in memory.
8. The method for intelligently constructing data relationships based on stored procedures according to claim 1, wherein the structure-related information comprises: field information, table information and association conditions;
the method further comprises: S7, sending the table information and the field information to a front end for display; obtaining the upstream and downstream of a field's source according to the dependency relationships; and assembling the table information and field information that carry dependency relationships to generate a data relationship map.
9. A system for intelligently constructing data relationships based on stored procedures, characterized in that it adopts the method for intelligently constructing data relationships based on stored procedures according to any one of claims 1 to 8.
10. A storage medium for intelligently constructing data relationships based on stored procedures, the storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method for intelligently constructing data relationships based on stored procedures according to any one of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211349369.4A CN115878176A (en) 2022-10-31 2022-10-31 Intelligent data relation construction method and system based on storage process and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211349369.4A CN115878176A (en) 2022-10-31 2022-10-31 Intelligent data relation construction method and system based on storage process and storage medium

Publications (1)

Publication Number Publication Date
CN115878176A true CN115878176A (en) 2023-03-31

Family

ID=85759239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211349369.4A Pending CN115878176A (en) 2022-10-31 2022-10-31 Intelligent data relation construction method and system based on storage process and storage medium

Country Status (1)

Country Link
CN (1) CN115878176A (en)

Similar Documents

Publication Publication Date Title
CN111061757B (en) Language conversion method and device of database, electronic equipment and storage medium
CN110502227B (en) Code complement method and device, storage medium and electronic equipment
CN107798123B (en) Knowledge base and establishing, modifying and intelligent question and answer methods, devices and equipment thereof
CN109710220B (en) Relational database query method, relational database query device, relational database query equipment and storage medium
CN111831384A (en) Language switching method and device, equipment and storage medium
CN112988163B (en) Intelligent adaptation method, intelligent adaptation device, intelligent adaptation electronic equipment and intelligent adaptation medium for programming language
CN109299289B (en) Query graph construction method and device, electronic equipment and computer storage medium
WO2021259290A1 (en) Stored procedure conversion method and apparatus, and device and storage medium
CN111427784B (en) Data acquisition method, device, equipment and storage medium
CN113901083A (en) Heterogeneous data source operation resource analysis positioning method and equipment based on multiple analyzers
CN113468204A (en) Data query method, device, equipment and medium
CN116521621A (en) Data processing method and device, electronic equipment and storage medium
CN108008947B (en) Intelligent prompting method and device for programming statement, server and storage medium
CN112948419A (en) Query statement processing method and device
CN117971860A (en) Method and device for generating SQL (structured query language) sentences based on large language model and terminal equipment
CN111401034A (en) Text semantic analysis method, semantic analysis device and terminal
WO2020024778A1 (en) Method, system and device for modifying xml file in batch and computer-readable storage medium
CN116775488A (en) Abnormal data determination method, device, equipment, medium and product
CN115878176A (en) Intelligent data relation construction method and system based on storage process and storage medium
CN116955393A (en) Data processing method and device, electronic equipment and storage medium
CN116010461A (en) Data blood relationship analysis method and device, storage medium and electronic equipment
CN112799638B (en) Non-invasive rapid development method, platform, terminal and storage medium
CN114416107A (en) Method, device, storage medium and equipment for translating logic
CN113821533A (en) Data query method, device, equipment and storage medium
CN110263055B (en) Parameter prompting method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination