CN110781181B - Data extraction method, device, equipment and storage medium based on MDL file - Google Patents

Data extraction method, device, equipment and storage medium based on MDL file

Info

Publication number
CN110781181B
Authority
CN
China
Prior art keywords
target
script
data extraction
data
structured query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910843447.8A
Other languages
Chinese (zh)
Other versions
CN110781181A (en)
Inventor
任世民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN201910843447.8A
Publication of CN110781181A
Application granted
Publication of CN110781181B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation

Abstract

The invention is applicable to the technical field of computers and provides a data extraction method, device, computer equipment and storage medium based on an MDL file. In the method, target data is extracted in batches from a target MDL file by a precompiled, lightweight target data extraction script. This avoids the long time consumption and complicated steps involved in opening the MDL file through software such as Cognos or MATLAB, and makes data extraction more convenient, faster and more efficient.

Description

Data extraction method, device, equipment and storage medium based on MDL file
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a data extraction method, device and equipment based on an MDL file and a storage medium.
Background
As business intelligence (BI) technology continues to spread, many organizations have begun to build business intelligence systems suited to their own business analysis. IBM's Cognos is one of the tools most widely adopted in BI projects in recent years. Reports in Cognos are built on a unified metadata model, which gives applications a unified and consistent view, so that a user can customize reports in a browser with flexible formats and rich elements, and can run ad hoc queries through Query Studio. Cognos also offers functions such as drill through, slice, dice and pivot, so that analysts, managers and executives can access information rapidly, consistently and interactively from multiple angles. They can thereby gain a deeper understanding of the data, associate related information effectively, and go into the detail data of interest while analyzing summarized data, so as to understand the situation more fully and make correct decisions. Cognos is widely used in the industry by virtue of its high-performance data access and analysis for large numbers of users and large data volumes, its flexible and easy-to-use report authoring, and its high security, high scalability, ease of use and low deployment cost.
In practical applications, a multidimensional data model (namely an MDL file) is built with Transformer, the multidimensional data design tool in Cognos, and data from different data sources can be stored through the MDL file, which meets users' needs to query and analyze data from multiple angles and at multiple levels. At present, however, to query or extract data from an MDL file, the file can only be opened through Cognos; the data source (Data Source) of each part must then be opened manually and the corresponding SQL code copied in order to query or extract the corresponding data.
Disclosure of Invention
The invention aims to provide a data extraction method, device, equipment and storage medium based on an MDL file, and aims to solve the problems that the steps of inquiring or extracting data from the MDL file are complicated, the time consumption is long and the efficiency is low because the corresponding MDL file can only be opened through software such as Cognos in the prior art.
In one aspect, the present invention provides a data extraction method based on an MDL file, the method including the steps of:
running a precompiled target data extraction script according to the received script starting instruction;
when the target data extraction script receives a data extraction request sent by a user, a target multidimensional data model is obtained according to the data extraction request, wherein the target multidimensional data model is a target MDL file;
and according to the obtained target MDL file, the target data extraction script extracts target data from the target MDL file in batches.
Preferably, the step of extracting target data from the target MDL file by the target data extraction script in batches includes:
the target data extraction script receives dimension table information and measurement information of a preset data source input by the user according to the target multi-dimension data model;
the target data extraction script extracts a corresponding structured query language script from the target multidimensional data model according to the received dimension table information and the received measurement information;
and the target data extraction script extracts corresponding target data from the data source according to the structured query language script.
Further preferably, the step of extracting, by the target data extraction script, a corresponding structured query language script from the target multidimensional data model according to the received dimension table information and the metric information includes:
the target data extraction script parses, according to the dimension table information and the measurement information, the mapping relations between dimension tables and between dimension tables and the fact table stored in the target MDL file, and extracts the structured query language script according to the mapping relations.
Further preferably, the step of extracting the structured query language script according to the mapping relation includes:
when the target data extraction script extracts a plurality of different structured query language scripts according to the mapping relations, synthesizing the plurality of different structured query language scripts into one structured query language script, and performing a statement check on the synthesized structured query language script.
In another aspect, the present invention provides an MDL file-based data extraction apparatus, the apparatus comprising:
the target script starting unit is used for running a precompiled target data extraction script according to the received script starting instruction;
the data model obtaining unit is used for obtaining a target multidimensional data model according to the data extraction request when the target data extraction script receives the data extraction request sent by a user, wherein the target multidimensional data model is a target MDL file; and
the target data extraction unit is used for extracting, by the target data extraction script, target data in batches from the obtained target MDL file.
Preferably, the target data extraction unit includes:
the source information receiving unit is used for receiving, by the target data extraction script, the dimension table information and the measurement information of the preset data source input by the user according to the target multidimensional data model;
the language script extraction unit is used for extracting, by the target data extraction script, a corresponding structured query language script from the target multidimensional data model according to the received dimension table information and the received measurement information; and
the data extraction subunit is used for extracting, by the target data extraction script, corresponding target data from the data source according to the structured query language script.
Further preferably, the language script extraction unit includes:
the script extraction subunit is used for parsing, by the target data extraction script and according to the dimension table information and the measurement information, the mapping relations between dimension tables and between dimension tables and the fact table stored in the target MDL file, and extracting the structured query language script according to the mapping relations.
Further preferably, the script extraction subunit includes:
the language script merging unit is used for synthesizing, when the target data extraction script extracts a plurality of different structured query language scripts according to the mapping relations, the plurality of different structured query language scripts into one structured query language script, and performing a statement check on the synthesized structured query language script.
In another aspect, the present invention also provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps described in the MDL file-based data extraction method above when the computer program is executed.
In another aspect, the present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the MDL file based data extraction method described above.
According to the invention, target data is extracted in batches from the target MDL file by a precompiled, lightweight target data extraction script. This avoids the long time consumption and complicated steps involved in opening the MDL file through software such as Cognos or MATLAB, and makes data extraction more convenient, faster and more efficient.
Drawings
FIG. 1 is a flowchart of an implementation of the MDL file-based data extraction method according to the first embodiment of the present invention;
FIG. 2 is a flowchart of an implementation of step S103 of the first embodiment, according to the second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of the MDL file-based data extraction apparatus according to the third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the target data extraction unit in the MDL file-based data extraction apparatus according to the fourth embodiment of the present invention; and
FIG. 5 is a schematic structural diagram of a computer device according to the fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The following describes in detail the implementation of the present invention in connection with specific embodiments:
Embodiment one:
FIG. 1 shows the implementation flow of the MDL file-based data extraction method according to the first embodiment of the present invention. For convenience of explanation, only the parts relevant to the embodiment are shown, as detailed below:
In step S101, a precompiled target data extraction script is run according to the received script start instruction.
The embodiment of the invention is applicable to a data processing platform, device or system, such as a personal computer or a server. In the embodiment of the invention, a script is usually invoked and executed on demand by an application program, which may be a system application or a third-party application. When the data processing platform, device or system receives a script start instruction sent by a user, it executes the target data extraction script according to that instruction, where the script start instruction includes the name of the target data extraction script. The target data extraction script is an executable file written in a specific descriptive language according to a certain format, and is also called a macro or a batch file.
Before the precompiled target data extraction script is run according to the received script start instruction, a Python environment is preferably set up and the target data extraction script is written in Python, which reduces system cost and improves the readability, maintainability and extensibility of the target data extraction script.
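As an illustration of step S101, the following is a minimal sketch of how a start instruction carrying a script name might be resolved and run in a Python environment. The dictionary key `script_name`, the `scripts` directory and both function names are assumptions made for this example; the source only states that the instruction includes the name of the script.

```python
import runpy
from pathlib import Path

# Hypothetical directory holding the extraction scripts (not from the source).
SCRIPT_DIR = Path("scripts")


def resolve_script(start_instruction: dict, script_dir: Path = SCRIPT_DIR) -> Path:
    """Return the path of the script named in the start instruction."""
    name = start_instruction["script_name"]  # assumed key name
    # Reject names with path components so only files in script_dir can run.
    if Path(name).name != name:
        raise ValueError(f"invalid script name: {name!r}")
    path = script_dir / name
    if not path.is_file():
        raise FileNotFoundError(path)
    return path


def run_script(start_instruction: dict, script_dir: Path = SCRIPT_DIR) -> dict:
    """Run the named script as __main__ and return its module globals."""
    path = resolve_script(start_instruction, script_dir)
    return runpy.run_path(str(path), run_name="__main__")
```

Running the script in-process with `runpy` keeps the sketch short; a separate interpreter process would be an equally plausible choice.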
In step S102, when the target data extraction script receives a data extraction request sent by a user, a target multidimensional data model is obtained according to the data extraction request, where the target multidimensional data model is a target MDL file.
In the embodiment of the invention, a model definition language (Model Definition Language, MDL) file is stored as an American Standard Code for Information Interchange (ASCII) file. PowerPlay, the Cognos reporting tool, can use the MDL file to generate a corresponding cube in an online analytical processing (On-Line Analytical Processing, OLAP) system, allowing a user to explore and analyze a data set from multiple angles. When the target data extraction script receives a data extraction request sent by a user through a command prompt (cmd) or an interface (API) provided by the script, it parses the request to obtain the storage path, contained in the request, of the target multidimensional data model corresponding to the data to be extracted. The storage path may be a local storage path or a cloud storage path. The target data extraction script then obtains the target multidimensional data model according to the storage path; the target multidimensional data model is stored in Cognos as a target model definition language (MDL) file, namely the target MDL file.
Before the target multidimensional data model is obtained according to the data extraction request, the target multidimensional data model is preferably generated in MDL format by Transformer, the multidimensional data design tool in Cognos 8.4, which improves the compatibility and portability of the target multidimensional data model.
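Step S102 can be sketched as follows, assuming the request arrives as command-line arguments and the storage path is local. The `--mdl-path` flag and the function names are hypothetical; a cloud storage path would need a download step that is not shown here.

```python
import argparse
from pathlib import Path


def parse_request(argv: list[str]) -> Path:
    """Parse a command-line extraction request and return the MDL storage path."""
    parser = argparse.ArgumentParser(description="Extract data from an MDL file")
    # Hypothetical flag name; the source only says the request contains a storage path.
    parser.add_argument("--mdl-path", required=True,
                        help="storage path of the target MDL file")
    args = parser.parse_args(argv)
    return Path(args.mdl_path)


def load_mdl_text(path: Path) -> str:
    """Read the target MDL file as text (the source describes it as an ASCII file)."""
    if path.suffix.lower() != ".mdl":
        raise ValueError(f"not an MDL file: {path}")
    # Replace any non-ASCII bytes instead of failing on them.
    return path.read_text(encoding="ascii", errors="replace")
```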
In step S103, the target data extraction script extracts target data in batch from the target MDL file according to the obtained target MDL file.
In the embodiment of the invention, the target data extraction script extracts or queries the corresponding target data in batches from the target MDL file according to the user's data extraction requirement.
According to the embodiment of the invention, target data is extracted in batches from the target MDL file by a precompiled, lightweight target data extraction script. This avoids the long time consumption and complicated steps involved in opening the MDL file through software such as Cognos or MATLAB, and makes data extraction more convenient, faster and more efficient.
Embodiment two:
FIG. 2 shows the implementation flow of step S103 of the first embodiment, according to the second embodiment of the present invention. For convenience of explanation, only the parts relevant to the embodiment are shown, as detailed below:
In step S201, the target data extraction script receives dimension table information and metric information of a preset data source input by a user according to the target multidimensional data model.
In the embodiment of the invention, the target multidimensional data model is a freely drillable model composed of a plurality of dimensions and indicators (i.e. metrics); that is, the source data is divided into dimensions and metrics for storage. The dimensions correspond to different aspects of each piece of data. For example, a piece of sales data may correspond to sales time, sales location, product and other aspects. Each dimension may include a plurality of sub-dimensions, which may belong to different levels. For example, the sales time dimension may include a plurality of time sub-dimensions, each of which belongs to one of the levels year, quarter and month. In a data warehouse, dimension members are stored in dimension tables. A metric is the actual meaning represented by a piece of data, for example sales, inventory or orders, and is stored in the fact table.
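The dimension, level and metric structure described above can be pictured with a small data structure. This is only an illustrative sketch: the class names, fields and the `drill_down` helper are assumptions for the example and do not reflect the internal layout of an MDL file.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dimension:
    name: str
    levels: list[str]  # ordered from coarsest to finest, e.g. year, quarter, month


@dataclass
class Metric:
    name: str
    fact_column: str  # column of the fact table that holds the value


@dataclass
class CubeModel:
    dimensions: dict[str, Dimension] = field(default_factory=dict)
    metrics: dict[str, Metric] = field(default_factory=dict)

    def drill_down(self, dimension: str, level: str) -> Optional[str]:
        """Return the next finer level, or None if already at the finest level."""
        levels = self.dimensions[dimension].levels
        i = levels.index(level)
        return levels[i + 1] if i + 1 < len(levels) else None


# Example mirroring the sales data described in the text.
model = CubeModel(
    dimensions={"sales_time": Dimension("sales_time", ["year", "quarter", "month"])},
    metrics={"sales": Metric("sales", "sales_amount")},
)
```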
In the embodiment of the invention, after the target data extraction script obtains the target multidimensional data model according to the storage path, it reads the content of the model and displays it. It then receives the dimension table information and metric information of the preset data source that the user inputs according to the displayed content, for use in extracting the desired data from the data source. The preset data source includes an internal data source and an external data source. The internal data source stores the full volume of data accumulated by the enterprise over a long period of daily business activity and held in GOSL and/or GORT databases (for example, SQL Server, Oracle or DB2). The external data source stores data from third-party platforms outside the enterprise, for example data of interest to the enterprise that is collected from the network with a tool such as a web crawler.
Preferably, the target multidimensional data model can connect data from a plurality of different data sources, where each data source may be stored in a different data warehouse, or the data in each data source may be stored in a different storage format. Using multiple data sources reduces the total amount of data that Transformer has to process, which improves the performance of the target multidimensional data model and in turn the manageability of the data in the multiple data sources. For example, one data source in the target multidimensional data model may be an Excel table that extracts data from one data warehouse, while another is an IQD file that extracts data from a different data warehouse.
In step S202, the target data extraction script extracts a corresponding structured query language script from the target multidimensional data model according to the received dimension table information and the metric information.
In an embodiment of the invention, the target data extraction script extracts a corresponding structured query language (Structured Query Language, SQL) script from the target multidimensional data model according to the received dimension table information and the received measurement information.
When the target data extraction script extracts the corresponding structured query language script from the target multidimensional data model according to the received dimension table information and metric information, it preferably parses the mapping relations, stored in the MDL file, between dimension tables and between dimension tables and the fact table, and extracts the corresponding SQL script according to those mapping relations.
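The source does not specify how the mapping relations are represented or how the SQL script is derived from them. As one possible reading, the sketch below assumes the mapping between each dimension table and the fact table has already been parsed into simple join records, and assembles a query from them rather than copying one out of the file. `TableJoin`, `build_sql` and all table and column names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableJoin:
    """Assumed form of one dimension-to-fact mapping relation."""
    dim_table: str
    dim_key: str
    fact_key: str


def build_sql(fact_table: str,
              metric_columns: list[str],
              dim_columns: list[tuple[str, str]],
              joins: list[TableJoin]) -> str:
    """Assemble a SELECT that aggregates metrics over the chosen dimension columns.

    Identifiers are interpolated as-is, so they must come from a trusted model.
    """
    select_dims = [f"{table}.{column}" for table, column in dim_columns]
    select_metrics = [f"SUM({fact_table}.{m}) AS {m}" for m in metric_columns]
    lines = ["SELECT " + ", ".join(select_dims + select_metrics),
             f"FROM {fact_table}"]
    for j in joins:
        lines.append(f"JOIN {j.dim_table} "
                     f"ON {fact_table}.{j.fact_key} = {j.dim_table}.{j.dim_key}")
    if select_dims:
        lines.append("GROUP BY " + ", ".join(select_dims))
    return "\n".join(lines)
```

Called with a fact table `sales_fact`, the metric column `amount`, the dimension column `time_dim.year` and a join on `time_id`, the function returns a four-line query that sums `amount` per year.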
Further preferably, the mapping relations include the mapping relations of tables across different data cubes and the association relations between MDL files created by different Cognos versions, which improves the completeness of the subsequently extracted data.
Further preferably, when a plurality of different SQL scripts are extracted according to the mapping relations, they are synthesized into one SQL script, and a statement check is performed on the synthesized SQL script to determine whether it is correct, which improves the completeness and correctness of the subsequently extracted data.
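A minimal sketch of merging several SQL scripts and checking the result is given below. It assumes "synthesizing" means concatenating the statements into one script, and it uses an in-memory SQLite database with `EXPLAIN` as a stand-in checker; the source names neither a merge strategy nor a checking tool, and SQLite's dialect differs from that of Oracle, DB2 or SQL Server.

```python
import sqlite3


def merge_sql_scripts(scripts: list[str]) -> str:
    """Concatenate scripts into one, each statement ending in a single semicolon."""
    statements = []
    for script in scripts:
        stmt = script.strip().rstrip(";").strip()
        if stmt:
            statements.append(stmt + ";")
    return "\n".join(statements)


def check_sql_script(script: str, schema_sql: str) -> list[str]:
    """Return one error message per statement that SQLite cannot compile."""
    errors = []
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)  # create empty tables to check against
        # Naive split: assumes no semicolons inside string literals.
        for stmt in (s.strip() for s in script.split(";")):
            if not stmt:
                continue
            try:
                conn.execute("EXPLAIN " + stmt)  # compiles without running the query
            except sqlite3.Error as exc:
                errors.append(f"{stmt!r}: {exc}")
    finally:
        conn.close()
    return errors
```

Checking against an empty copy of the schema catches syntax errors and references to missing tables or columns without touching the real data source.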
In step S203, the target data extraction script extracts corresponding target data from the data source according to the structured query language script.
In the embodiment of the invention, the target data extraction script extracts the corresponding data from the corresponding internal and external data sources according to the table information (for example, table name, table type, and the association between the table and the fact table) contained in the parsed SQL script, so that the user can modify, delete, add to or otherwise operate on the data.
In the embodiment of the invention, the target data extraction script first receives the dimension table information and metric information of the preset data source input by the user according to the target multidimensional data model, then extracts the corresponding structured query language script from the target multidimensional data model according to that information, and finally extracts the corresponding target data from the data source according to the structured query language script. Data is thus extracted in batches from the MDL file by a lightweight target data extraction script, which avoids the long time consumption and complicated steps involved in opening the MDL file through software such as Cognos or MATLAB and makes data extraction more convenient, faster and more efficient.
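Step S203 can be sketched as running the SQL script against the data source and reading the result in fixed-size batches. The example uses Python's built-in `sqlite3` module as a stand-in for the real data source, and interprets "in batches" as chunked fetching; both are assumptions, as are the function name and the default batch size.

```python
import sqlite3
from typing import Iterator


def extract_in_batches(conn: sqlite3.Connection, sql: str,
                       batch_size: int = 1000) -> Iterator[list[tuple]]:
    """Run the query and yield its rows in lists of at most batch_size."""
    cursor = conn.execute(sql)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Usage with a small in-memory table standing in for the data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (time_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales_fact VALUES (?, ?)",
                 [(i, float(i)) for i in range(5)])
batches = list(extract_in_batches(
    conn, "SELECT time_id, amount FROM sales_fact ORDER BY time_id", batch_size=2))
```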
Embodiment three:
FIG. 3 shows the structure of the MDL file-based data extraction apparatus according to the third embodiment of the present invention. For convenience of explanation, only the parts relevant to the embodiment are shown, including:
a target script starting unit 31, configured to run a precompiled target data extraction script according to the received script start instruction.
The embodiment of the invention is applicable to a data processing platform, device or system, such as a personal computer or a server. In the embodiment of the invention, a script is usually invoked and executed on demand by an application program, which may be a system application or a third-party application. When the data processing platform, device or system receives a script start instruction sent by a user, it executes the target data extraction script according to that instruction, where the script start instruction includes the name of the target data extraction script. The target data extraction script is an executable file written in a specific descriptive language according to a certain format, and is also called a macro or a batch file.
Before the precompiled target data extraction script is run according to the received script start instruction, a Python environment is preferably set up and the target data extraction script is written in Python, which reduces system cost and improves the readability, maintainability and extensibility of the target data extraction script.
a data model obtaining unit 32, configured to obtain a target multidimensional data model according to the data extraction request when the target data extraction script receives the data extraction request sent by the user, where the target multidimensional data model is a target MDL file.
In the embodiment of the invention, a model definition language (Model Definition Language, MDL) file is stored as an American Standard Code for Information Interchange (ASCII) file. PowerPlay, the Cognos reporting tool, can use the MDL file to generate a corresponding cube in an online analytical processing (On-Line Analytical Processing, OLAP) system, allowing a user to explore and analyze a data set from multiple angles. When the target data extraction script receives a data extraction request sent by a user through a command prompt (cmd) or an interface (API) provided by the script, it parses the request to obtain the storage path, contained in the request, of the target multidimensional data model corresponding to the data to be extracted. The storage path may be a local storage path or a cloud storage path. The target data extraction script then obtains the target multidimensional data model according to the storage path; the target multidimensional data model is stored in Cognos as a target model definition language (MDL) file, namely the target MDL file.
Before the target multidimensional data model is obtained according to the data extraction request, the target multidimensional data model is preferably generated in MDL format by Transformer, the multidimensional data design tool in Cognos 8.4, which improves the compatibility and portability of the target multidimensional data model.
a target data extraction unit 33, configured to extract target data in batches from the obtained target MDL file.
In the embodiment of the invention, the target data extraction script extracts or queries the corresponding target data in batches from the target MDL file according to the user's data extraction requirement.
In the embodiment of the present invention, each unit of the MDL file-based data extraction apparatus may be implemented by a corresponding hardware or software unit. Each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit; this is not intended to limit the present invention.
Embodiment four:
fig. 4 shows the structure of the target data extraction unit 33 provided in the fourth embodiment of the present invention, and for convenience of explanation, only the portions related to the embodiment of the present invention are shown, including:
a source information receiving unit 41 for receiving the dimension table information and the metric information of the preset data source input by the user according to the target multi-dimensional data model.
In the embodiment of the invention, the target multidimensional data model is a freely drillable model composed of a plurality of dimensions and indexes (i.e. metrics), namely, the source data is divided into different dimensions and data of different metrics for storage, wherein the dimensions correspond to a plurality of aspects of each data, for example, one sales data can correspond to a plurality of aspects of sales time, sales place, products and the like, each dimension can comprise a plurality of sub-dimensions, the plurality of sub-dimensions can belong to different layers, for example, the sales time dimension can comprise a plurality of time sub-dimensions, each time sub-dimension can belong to any of three layers of year, quarter, month and the like, and in the data warehouse, the dimension members are stored in the dimension table. The metrics are the actual meaning represented by each data, e.g., one data may represent sales, inventory, order, etc., as it exists in a list of facts. 
In the embodiment of the invention, after the target data extraction script obtains the target multidimensional data model according to the storage path, the content of the obtained target multidimensional data model is read, the output is displayed, the dimension table information and the measurement information of the preset data source are input by a receiving user according to the content of the target multidimensional data model which is displayed and output, so as to be used for extracting the wanted data from the data source, wherein the preset data source comprises an internal data source and an external data source, the internal data source stores the total data which is accumulated for a long time in daily business activities and stored in a GOSL and/or GORT (for example, SQL Server, oracle, DB2 and the like) database from the enterprise, and the external data source stores the data of a third party platform except the enterprise itself, for example, the data which is interested in the enterprise comparison is grabbed on a network by using a tool such as a web crawler.
Preferably, the target multidimensional data model can connect data from a plurality of different data sources, where each data source may be stored in a different data warehouse, or the data in each data source may be stored in a different storage format. Using multiple data sources reduces the total amount of data processed by the Transformer, which improves the performance of the target multidimensional data model and the manageability of the data in the multiple data sources. For example, one data source in the target multidimensional data model is an Excel table whose data is extracted from one data warehouse, while another data source is an iqd file whose data is extracted from a different data warehouse.
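The following minimal sketch illustrates the idea of connecting data sources that use different storage formats. A CSV text stands in for a spreadsheet export and an in-memory SQLite table stands in for a second data warehouse; both stand-ins, the column names and the helper functions are hypothetical and do not reproduce the format of an Excel or iqd source.

```python
import csv
import io
import sqlite3

# Source 1: CSV text standing in for a spreadsheet export (made-up data).
CSV_TEXT = "region,amount\nnorth,100\nsouth,80\n"

# Source 2: an in-memory SQLite table standing in for a second warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("east", 60.0), ("west", 40.0)]
)


def rows_from_csv(text):
    """Yield rows from the CSV source in a common dictionary shape."""
    for record in csv.DictReader(io.StringIO(text)):
        yield {"region": record["region"], "amount": float(record["amount"])}


def rows_from_sqlite(connection):
    """Yield rows from the SQLite source in the same dictionary shape."""
    query = "SELECT region, amount FROM sales ORDER BY region"
    for region, amount in connection.execute(query):
        yield {"region": region, "amount": amount}


# Both sources now share one row format and can be processed together.
combined = list(rows_from_csv(CSV_TEXT)) + list(rows_from_sqlite(conn))
total = sum(row["amount"] for row in combined)
print(len(combined), total)
```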
The language script extraction unit 42 is configured to extract, from the target multidimensional data model, a corresponding structured query language script according to the received dimension table information and metric information.
In an embodiment of the invention, the target data extraction script extracts a corresponding Structured Query Language (SQL) script from the target multidimensional data model according to the received dimension table information and metric information.
Preferably, when extracting the corresponding SQL script from the target multidimensional data model, the target data extraction script analyzes, according to the received dimension table information and metric information, the mapping relationships stored in the MDL file between dimension tables and between the dimension tables and the fact table, and extracts the corresponding SQL script according to these mapping relationships.
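One possible way to turn such mapping relationships into an SQL statement is sketched below. The `MAPPING` dictionary is a hypothetical in-memory form of the table associations; how they are actually encoded in an MDL file is not described here, and the table and column names are made up. The sketch assumes at least one dimension is requested and that all identifiers come from the trusted model rather than from free user input.

```python
# Hypothetical mapping: dimension table -> (fact-table key, dimension-table key).
MAPPING = {
    "dim_time": ("time_id", "id"),
    "dim_product": ("product_id", "id"),
}


def build_sql(fact_table, measure_column, dimension_columns, mapping=MAPPING):
    """Build a SELECT that joins the fact table to each requested dimension table.

    dimension_columns maps a dimension table name to the column to group by.
    """
    select_cols = [f"{table}.{col}" for table, col in dimension_columns.items()]
    joins = []
    for table in dimension_columns:
        # A KeyError here means the model holds no mapping for that table.
        fact_key, dim_key = mapping[table]
        joins.append(
            f"JOIN {table} ON {fact_table}.{fact_key} = {table}.{dim_key}"
        )
    group_by = ", ".join(select_cols)
    return (
        f"SELECT {group_by}, SUM({fact_table}.{measure_column}) AS {measure_column} "
        f"FROM {fact_table} " + " ".join(joins) + f" GROUP BY {group_by}"
    )


sql = build_sql("fact_sales", "amount", {"dim_time": "year", "dim_product": "name"})
print(sql)
```

The dimension table information selects which joins are emitted, and the metric information selects the aggregated fact-table column.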
Further preferably, the mapping relationships include the mapping relationships of tables among different data cubes and the association relationships among MDL files produced by different Cognos versions, which improves the completeness of the subsequently extracted data.
Further preferably, when a plurality of different SQL scripts are extracted according to the mapping relationships, they are merged into one SQL script, and a statement check is performed on the merged script to verify its correctness, which improves the completeness and correctness of the subsequently extracted data.
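The merging and checking step could look like the following sketch. The embodiment does not specify how the check is carried out; here Python's built-in `sqlite3` module is used as a stand-in checker that compiles each statement with `EXPLAIN` against an empty copy of the schema. The schema, the sample statements and the function names are assumptions, and splitting on semicolons is a simplification that would break on string literals containing `;`.

```python
import sqlite3


def merge_scripts(scripts):
    """Concatenate individual statements into one script, one per line."""
    cleaned = [s.strip().rstrip(";") for s in scripts if s.strip()]
    return ";\n".join(cleaned) + ";"


def check_script(script, schema_sql):
    """Return (statement, error) pairs; an empty list means all statements compiled."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema_sql)  # empty tables are enough for compilation
    errors = []
    for stmt in filter(None, (part.strip() for part in script.split(";"))):
        try:
            conn.execute("EXPLAIN " + stmt)  # compiles without running the query
        except sqlite3.Error as exc:
            errors.append((stmt, str(exc)))
    conn.close()
    return errors


SCHEMA = """
CREATE TABLE dim_time (id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (time_id INTEGER, amount REAL);
"""

scripts = [
    "SELECT year FROM dim_time",
    "SELECT SUM(amount) FROM fact_sales;",
    "SELEC amount FROM fact_sales",  # deliberately malformed
]

merged = merge_scripts(scripts)
errors = check_script(merged, SCHEMA)
for stmt, message in errors:
    print("rejected:", stmt, "->", message)
```

A check of this kind catches syntax errors and references to unknown tables or columns, but only for the SQL dialect of the checker, which may differ from that of the target database.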
The data extraction subunit 43 is configured to extract, by the target data extraction script, corresponding target data from the data source according to the structured query language script.
In the embodiment of the invention, the target data extraction script extracts the corresponding data from the corresponding internal and external data sources according to the table information contained in the parsed SQL script (for example, table names, table types, and the association information between tables and the fact table), so that the user can then modify, delete or add data.
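A minimal sketch of this extraction step is given below, assuming the data source is reachable through a Python DB-API connection. An in-memory SQLite database with made-up tables and rows stands in for the real internal or external data source, and the `extract` helper is hypothetical.

```python
import sqlite3

# Stand-in data source with one dimension table and one fact table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_time (id INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales (time_id INTEGER, amount REAL);
    INSERT INTO dim_time VALUES (1, 2018), (2, 2019);
    INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 70.0);
    """
)


def extract(connection, sql):
    """Run the SQL script and return the result rows as dictionaries."""
    cursor = connection.execute(sql)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


SQL = (
    "SELECT dim_time.year AS year, SUM(fact_sales.amount) AS sales "
    "FROM fact_sales JOIN dim_time ON fact_sales.time_id = dim_time.id "
    "GROUP BY dim_time.year ORDER BY dim_time.year"
)

rows = extract(conn, SQL)
print(rows)
```

Returning plain dictionaries keeps the extracted data easy to modify, delete or extend in later steps.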
Preferably, the language script extraction unit 42 includes:
the script extraction subunit 421 is configured to analyze, by the target data extraction script and according to the dimension table information and the metric information, the mapping relationships stored in the target MDL file between dimension tables and between the dimension tables and the fact table, and to extract the structured query language script according to the mapping relationships.
Further preferably, the script extraction subunit 421 includes:
the language script merging unit 4211 is configured to, when the target data extraction script extracts a plurality of different structured query language scripts according to the mapping relationships, merge the plurality of different structured query language scripts into one structured query language script, and perform a statement check on the merged structured query language script.
In the embodiment of the present invention, each unit of the data extraction device based on the MDL file may be implemented by a corresponding hardware or software unit, and each unit may be an independent software or hardware unit or may be integrated into one software or hardware unit; this is not intended to limit the present invention.
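When the units are realized in software, one possible arrangement is a small pipeline object that chains them, as sketched below. The class name `ExtractionPipeline` and the three stub functions are hypothetical; they only stand in for the source information receiving unit, the language script extraction unit 42 and the data extraction subunit 43, and the stubs return fixed values instead of reading an MDL file or a database.

```python
class ExtractionPipeline:
    """Chains three callables that stand in for the units of the device."""

    def __init__(self, receive_info, extract_sql, extract_data):
        self.receive_info = receive_info    # source information receiving unit
        self.extract_sql = extract_sql      # language script extraction unit
        self.extract_data = extract_data    # data extraction subunit

    def run(self, model):
        info = self.receive_info(model)
        sql = self.extract_sql(model, info)
        return self.extract_data(sql)


def receive_info(model):
    # Stub: a real unit would collect this from the user.
    return {"dimensions": ["year"], "measure": "amount"}


def extract_sql(model, info):
    # Stub: a real unit would analyze the mapping relationships in the model.
    dims = ", ".join(info["dimensions"])
    return (
        f"SELECT {dims}, SUM({info['measure']}) "
        f"FROM {model['fact_table']} GROUP BY {dims}"
    )


def extract_data(sql):
    # Stub: a real unit would run the script against the data source.
    return [("executed", sql)]


pipeline = ExtractionPipeline(receive_info, extract_sql, extract_data)
result = pipeline.run({"fact_table": "fact_sales"})
print(result)
```

Because each unit is passed in as a callable, the same pipeline works whether the units are kept independent or integrated into one module.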
Fifth embodiment:
Fig. 5 shows the structure of a computer device provided in the fifth embodiment of the present invention; for convenience of explanation, only the portions relevant to the embodiment of the present invention are shown.
The computer device 5 of an embodiment of the invention comprises a processor 50, a memory 51 and a computer program 52 stored in the memory 51 and executable on the processor 50. The processor 50, when executing the computer program 52, performs the steps of the above-described embodiment of the MDL file-based data extraction method, such as steps S101 to S103 shown in fig. 1, and steps S201 to S203 shown in fig. 2. Alternatively, the processor 50, when executing the computer program 52, implements the functions of the units in the above-described device embodiments, such as the functions of the units 31 to 33 shown in fig. 3, and the functions of the units 41 to 43 shown in fig. 4.
According to the embodiment of the invention, target data is extracted in batches from the target MDL file by a precompiled, lightweight target data extraction script. This avoids the long time consumption and cumbersome steps involved in opening an MDL file with software such as Cognos or MATLAB, and makes data extraction more convenient, faster and more efficient.
The computer equipment of the embodiment of the invention can be a personal computer or a notebook computer. The steps implemented when the processor 50 executes the computer program 52 in the computer device 5 to implement the MDL file-based data extraction method may refer to the description of the foregoing method embodiments, which are not repeated herein.
Sixth embodiment:
In an embodiment of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in the above-described MDL file-based data extraction method embodiment, for example, step S101 to step S103 shown in fig. 1, and step S201 to step S203 shown in fig. 2. Alternatively, the computer program, when executed by a processor, implements the functions of the units in the above-described embodiments of the apparatus, such as the functions of the units 31 to 33 shown in fig. 3, and the functions of the units 41 to 45 shown in fig. 4.
According to the embodiment of the invention, target data is extracted in batches from the target MDL file by a precompiled, lightweight target data extraction script. This avoids the long time consumption and cumbersome steps involved in opening an MDL file with software such as Cognos or MATLAB, and makes data extraction more convenient, faster and more efficient.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the program may carry out the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory can include Read-Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (6)

1. A data extraction method based on an MDL file, the method comprising the steps of:
running a precompiled target data extraction script according to the received script starting instruction;
when the target data extraction script receives a data extraction request sent by a user, a target multidimensional data model is obtained according to the data extraction request, wherein the target multidimensional data model is a target MDL file;
according to the obtained target MDL file, the target data extraction script extracts target data from the target MDL file in batches;
the step of extracting target data from the target MDL file in batches by the target data extraction script comprises the following steps:
the target data extraction script receives dimension table information and measurement information of a preset data source input by the user according to the target multi-dimension data model;
the target data extraction script extracts a corresponding structured query language script from the target multidimensional data model according to the received dimension table information and the received measurement information;
the target data extraction script extracts corresponding target data from the data source according to the structured query language script;
the step of extracting the corresponding structured query language script from the target multidimensional data model by the target data extraction script according to the received dimension table information and the measurement information comprises the following steps:
and the target data extraction script analyzes, according to the dimension table information and the measurement information, the mapping relationships stored in the target MDL file between dimension tables and between the dimension tables and the fact table, and extracts the structured query language script according to the mapping relationships.
2. The data extraction method of claim 1, wherein the step of extracting the structured query language script according to the mapping relationship comprises:
when the target data extraction script extracts a plurality of different structured query language scripts according to the mapping relationship, merging the plurality of different structured query language scripts into one structured query language script, and performing a statement check on the merged structured query language script.
3. An MDL file based data extraction apparatus, the apparatus comprising:
the target script starting unit is used for running a precompiled target data extraction script according to the received script starting instruction;
the data model obtaining unit is used for obtaining a target multidimensional data model according to the data extraction request when the target data extraction script receives the data extraction request sent by a user, wherein the target multidimensional data model is a target MDL file; and
the target data extraction unit is used for extracting target data in batches from the target MDL file according to the obtained target MDL file;
the source information receiving unit is used for receiving the dimension table information and the measurement information of the preset data source input by the user according to the target multi-dimension data model by the target data extraction script;
the language script extraction unit is used for extracting a corresponding structured query language script from the target multidimensional data model according to the received dimension table information and the received measurement information by the target data extraction script; and
the data extraction subunit is used for extracting corresponding target data from the data source according to the structured query language script by the target data extraction script;
and the script extraction subunit is used for analyzing, by the target data extraction script and according to the dimension table information and the measurement information, the mapping relationships stored in the target MDL file between dimension tables and between the dimension tables and the fact table, and extracting the structured query language script according to the mapping relationships.
4. The data extraction apparatus of claim 3, wherein the script extraction subunit comprises:
and the language script merging unit is used for merging the plurality of different structured query language scripts into one structured query language script when the target data extraction script extracts the plurality of different structured query language scripts according to the mapping relationship, and performing a statement check on the merged structured query language script.
5. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the MDL file based data extraction method according to any one of claims 1 to 2.
6. A computer-readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the MDL file based data extraction method according to any one of claims 1 to 2.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910843447.8A CN110781181B (en) 2019-09-06 2019-09-06 Data extraction method, device, equipment and storage medium based on MDL file

Publications (2)

Publication Number Publication Date
CN110781181A (en) 2020-02-11
CN110781181B (en) 2024-02-02

Family

ID=69384025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910843447.8A Active CN110781181B (en) 2019-09-06 2019-09-06 Data extraction method, device, equipment and storage medium based on MDL file

Country Status (1)

Country Link
CN (1) CN110781181B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361140A (en) * 2014-12-10 2015-02-18 用友软件股份有限公司 Dynamically generated data model configuration device and method
CN108763240A (en) * 2018-03-22 2018-11-06 五八有限公司 Data query method, apparatus, equipment and storage medium based on OLAP
CN108833482A (en) * 2018-05-21 2018-11-16 平安科技(深圳)有限公司 MDL file automatic downloading method, system, computer equipment and storage medium
CN109448859A (en) * 2018-11-09 2019-03-08 贵州医渡云技术有限公司 Data processing method and device, electronic equipment, storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180039527A1 (en) * 2016-08-04 2018-02-08 Veeva Systems Inc. Configuring Content Management Systems
US10606857B2 (en) * 2016-09-26 2020-03-31 Splunk Inc. In-memory metrics catalog

Also Published As

Publication number Publication date
CN110781181A (en) 2020-02-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant