CN112685325B - ETL software research and development test management method and system - Google Patents

Publication number
CN112685325B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110090036.3A
Other languages
Chinese (zh)
Other versions
CN112685325A (en
Inventor
汪名森
徐迎田
王琴
席艳秋
肖威
信勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Citic Bank Corp Ltd
Original Assignee
China Citic Bank Corp Ltd
Application filed by China Citic Bank Corp Ltd
Priority to CN202110090036.3A
Publication of CN112685325A
Application granted
Publication of CN112685325B

Abstract

The invention discloses an ETL software research and development test management method and system. The method comprises: obtaining field information according to data table metadata information; obtaining test data according to the field information; obtaining a data type according to the data information of the data table; obtaining a check script according to the data type; obtaining first input information; obtaining a first program script according to the first input information; obtaining, according to the first program script, first extraction data that includes source table information and target table information; obtaining first package information according to the test data, the check script and the first extraction data; obtaining test case information according to the first package information; and obtaining a regression test case library according to the test case information, wherein the regression test case library supports continuous integration of programs. This solves the technical problem that a test management platform has a single function and cannot meet the various requirements of research and development testing. Test data, program scripts and check scripts are integrated, which achieves the technical effect of reducing manual test workload.

Description

ETL software research and development test management method and system
Technical Field
The invention relates to the technical field of computers, in particular to an ETL software research and development test management method and system.
Background
ETL is an abbreviation for Extract-Transform-Load and describes the process of extracting (Extract), transforming (Transform) and loading (Load) data from a source to a destination. In practice the term covers both ETL and ELT (Extract-Load-Transform), and the two are often used in combination. The larger the data volume, the more complex the transformation logic and the greater the computing power of the destination database, the more ELT is preferred, because it exploits the parallel processing capability of the destination database.
Current technical schemes for ETL research and development test management mainly work as follows. In one approach, the target data table is processed by calling an ETL smoke test stored procedure, and a check report is produced by checking the data distribution, null proportion, maximum value, minimum value and so on of the target data table, thereby checking the quality of the ETL program. This approach is highly automated and checks the data quality of the target table well. However, smoke testing alone is insufficient for ETL script testing: system testing, performance testing, business testing and regression testing all require a large amount of test data and test case preparation, so the range that approach covers is too narrow. In another approach, the data quality of the test result table is checked by methods such as item-by-item comparison of an expected result table with the test result table, thereby checking the quality of the ETL script. Although this approach is highly accurate, it requires a large amount of data preparation; when large data volumes are processed, preparing the expected result table (i.e. the theoretical values) takes a great deal of work, and neither continuous accumulation of test cases nor automated testing can be achieved.
However, in the process of implementing the technical scheme of the embodiments of the application, the inventors found that the above technology has at least the following technical problem:
in the prior art, the test management platform has a single function and cannot meet the various requirements of research and development testing.
Disclosure of Invention
By providing an ETL software research and development test management method and system, the embodiments of the application solve the technical problem that a test management platform in the prior art has a single function and cannot meet the various requirements of research and development testing. Test data, program scripts and check scripts are integrated, which effectively reduces test complexity and achieves the technical effect of reducing manual test workload.
In view of the above problems, an embodiment of the present application provides a method and a system for managing ETL software development and testing.
In a first aspect, an embodiment of the present application provides an ETL software development test management method, the method comprising: obtaining data table metadata information; obtaining field information according to the data table metadata information; obtaining test data according to the field information; obtaining a data type according to the data information of the data table; obtaining a check script according to the data type; obtaining first input information, wherein the first input information comprises source table information, target table information and the field mapping relation between the source table and the target table; obtaining a first program script according to the first input information; obtaining first extraction data according to the first program script, wherein the first extraction data comprises the source table information and the target table information; obtaining first package information according to the test data, the check script and the first extraction data; obtaining test case information according to the first package information; and obtaining a regression test case library according to the test case information, wherein the regression test case library supports continuous integration of programs.
Preferably, obtaining the test data according to the field information includes: obtaining a preset rule; obtaining a preset algorithm; obtaining various types of data corresponding to the field according to the field information, the preset rule and the preset algorithm; obtaining a date field according to the field information; obtaining data date information according to the date field; and obtaining the test data according to the various types of data and the data date information.
Preferably, the method comprises: obtaining first test data, wherein the first test data is the test data generated by a first field; obtaining second test data, wherein the second test data is the test data generated by a second field, and the first field is different from the second field; wherein the various types of data in the first test data are associated with the various types of data in the second test data; and obtaining a test database according to the first test data and the second test data.
Preferably, after obtaining the check script according to the data type, the method includes: obtaining a first check script, wherein the first check script is the check script generated at a first time; obtaining a second check script, wherein the second check script is the check script generated at a second time; and so on, obtaining an Nth check script, wherein the Nth check script is the check script generated at an Nth time, and N is a natural number greater than 1; and obtaining a check script set according to the first check script, the second check script and the Nth check script.
Preferably, after obtaining the check script according to the data type, the method includes: obtaining a checking personnel information list according to the check script; obtaining checking personnel information according to the checking personnel information list; obtaining a checking date according to the checking personnel information; and obtaining a checking result according to the checking date and the checking personnel information.
Preferably, the method comprises: obtaining a first test requirement; obtaining data to be tested from the test database according to the first test requirement; obtaining a first test date; obtaining first replacement information according to the first test date and the data to be tested, wherein the first replacement information is used for replacing the date in the data to be tested according to the first test date; obtaining first test information according to the data to be tested, wherein the first test information comprises a first test result and the data to be tested; and obtaining first recovery information according to the first test information, wherein the first recovery information is used for recovering the date in the data to be tested and storing the date in the test database.
Preferably, the method comprises: obtaining the test case information according to the source table information and the target table information; obtaining a data table check script according to the test case information; obtaining a first binding instruction according to the source table information, the target table information, the test case information and the data table check script, wherein the first binding instruction is used for binding the source table information and the target table information with the test case information and the data table check script; obtaining a test data date, a test data file and a check script according to the test case information and the data table check script; obtaining a data test script according to the test data date, the test data file and the check script, wherein the data test script is a repeatedly executable shell-like script; and obtaining a first execution instruction according to the data test script, wherein the first execution instruction is used for issuing the data test script to a test environment for execution.
In a second aspect, the present application further provides an ETL software development test management system, where the system includes:
a first obtaining unit, configured to obtain data table metadata information;
a second obtaining unit, configured to obtain field information according to the data table metadata information;
a third obtaining unit, configured to obtain test data according to the field information;
a fourth obtaining unit, configured to obtain a data type according to the data information of the data table;
a fifth obtaining unit, configured to obtain a check script according to the data type;
a sixth obtaining unit, configured to obtain first input information, where the first input information includes source table information, target table information, and a field mapping relationship between the source table and the target table;
a seventh obtaining unit, configured to obtain a first program script according to the first input information;
an eighth obtaining unit, configured to obtain first extraction data according to the first program script, where the first extraction data includes the source table information and the target table information;
a ninth obtaining unit, configured to obtain first package information according to the test data, the check script, and the first extraction data;
a tenth obtaining unit, configured to obtain test case information according to the first package information;
an eleventh obtaining unit, configured to obtain a regression test case library according to the test case information, where the regression test case library supports continuous integration of programs.
In a third aspect, the present invention provides an ETL software development test management system comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any implementation of the first aspect.
One or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
The embodiments of the application provide an ETL software research and development test management method and system that operate by: obtaining data table metadata information; obtaining field information according to the data table metadata information; obtaining test data according to the field information; obtaining a data type according to the data information of the data table; obtaining a check script according to the data type; obtaining first input information, wherein the first input information comprises source table information, target table information and the field mapping relation between the source table and the target table; obtaining a first program script according to the first input information; obtaining first extraction data according to the first program script, wherein the first extraction data comprises the source table information and the target table information; obtaining first package information according to the test data, the check script and the first extraction data; obtaining test case information according to the first package information; and obtaining a regression test case library according to the test case information, wherein the regression test case library supports continuous integration of programs. A unified ETL research and development and test management platform is thereby built, with unified management of data tables, batch processing scripts, test data, test scripts, quality check scripts and test suites. By combining the three elements of test data, program script and check script, scripting the full flow of data loading, program running, data checking, data cleaning and so on, and accumulating the test scripts, the technical effect of continuous accumulation and effective management of test suites is achieved.
Therefore, the technical problem that the test management platform in the prior art has single function and cannot meet various requirements of research and development tests is solved.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the present application are easier to understand, specific embodiments of the present application are set out below.
Drawings
Fig. 1 is a flow chart of an ETL software development test management method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an ETL software development test management system according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an exemplary electronic device according to an embodiment of the present application.
Reference numerals illustrate: the first obtaining unit 11, the second obtaining unit 12, the third obtaining unit 13, the fourth obtaining unit 14, the fifth obtaining unit 15, the sixth obtaining unit 16, the seventh obtaining unit 17, the eighth obtaining unit 18, the ninth obtaining unit 19, the tenth obtaining unit 20, the eleventh obtaining unit 21, the bus 300, the receiver 301, the processor 302, the transmitter 303, the memory 304, the bus interface 306.
Detailed Description
By providing an ETL software research and development test management method and system, the embodiments of the application solve the technical problem that a test management platform in the prior art has a single function and cannot meet the various requirements of research and development testing.
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application and not all of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
Summary of the application
ETL is an abbreviation for Extract-Transform-Load and describes the process of extracting (Extract), transforming (Transform) and loading (Load) data from a source to a destination. In practice the term covers both ETL and ELT (Extract-Load-Transform), and the two are often used in combination. The larger the data volume, the more complex the transformation logic and the greater the computing power of the destination database, the more ELT is preferred, because it exploits the parallel processing capability of the destination database. However, in the prior art, the test management platform has a single function and cannot meet the various requirements of research and development testing.
Aiming at the technical problems, the technical scheme provided by the application has the following overall thought:
The embodiment of the application provides an ETL software research and development test management method, which comprises the following steps: obtaining data table metadata information; obtaining field information according to the data table metadata information; obtaining test data according to the field information; obtaining a data type according to the data information of the data table; obtaining a check script according to the data type; obtaining first input information, wherein the first input information comprises source table information, target table information and the field mapping relation between the source table and the target table; obtaining a first program script according to the first input information; obtaining first extraction data according to the first program script, wherein the first extraction data comprises the source table information and the target table information; obtaining first package information according to the test data, the check script and the first extraction data; obtaining test case information according to the first package information; and obtaining a regression test case library according to the test case information, wherein the regression test case library supports continuous integration of programs. A unified ETL research and development and test management platform is thereby built, with unified management of data tables, batch processing scripts, test data, test scripts, quality check scripts and test suites. By combining the three elements of test data, program script and check script, scripting the full flow of data loading, program running, data checking, data cleaning and so on, and accumulating the test scripts, the technical effect of continuous accumulation and effective management of test suites is achieved.
Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
Example 1
Fig. 1 is a flow chart of an ETL software development test management method according to an embodiment of the present application. As shown in Fig. 1, the embodiment of the present application provides an ETL software development test management method, where the method includes:
Step S100: obtaining data table metadata information.
Step S200: obtaining field information according to the data table metadata information.
Step S300: obtaining test data according to the field information.
Further, obtaining the test data according to the field information includes: obtaining a preset rule; obtaining a preset algorithm; obtaining various types of data corresponding to the field according to the field information, the preset rule and the preset algorithm; obtaining a date field according to the field information; obtaining data date information according to the date field; and obtaining the test data according to the various types of data and the data date information.
Further, the method comprises the steps of: obtaining first test data, wherein the first test data is the test data generated by a first field; obtaining second test data, wherein the second test data is the test data generated by a second field, and the first field is different from the second field; wherein the various types of data in the first test data are associated with the various types of data in the second test data; and obtaining a test database according to the first test data and the second test data.
Specifically, the test data is generated automatically. Test data for each field is generated from the recorded data table metadata information, such as the primary key, data type and dictionary enumeration values, and the various types of test data are combined by the Cartesian product method. For example, data such as "111, 112, … 119, 11a … z" is automatically generated for a char(3) field, and data such as "1, 11, 111, 1111, 11111, … zzzz" for a varchar(5) field. The data generation rules and algorithms are the same for every table, and values are generated in the order 1-9-a-z, so that data in different data tables can be correctly associated. Because different test scenarios require different volumes of test data, unit test cases, integration test cases and so on select different numbers of test cases. Because the system contains data date fields, these fields must be specially marked when the data table metadata is maintained, so that they can be replaced with the designated data date information. In actual test scenarios, part of the test data is obtained by desensitizing production data, so the platform also provides a convenient data import and export function, supports replacing the data date with a designated date, saves multiple test data versions and enables reuse of test cases. In the embodiment of the application, the table structure stored for the test file has the columns field, field name, type and remarks. The platform supports management of test data for unit testing, system testing, performance testing, business testing and so on, allows multiple sets of data to be reused, and keeps the test data of different team members from interfering with one another.
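The generation rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `limit` parameter, the field names and the sample data date are all hypothetical, and only the fixed 1-9-a-z ordering and the Cartesian product come from the description.

```python
from itertools import product

# Fixed character order "1-9-a-z" from the description; sharing it across
# fields keeps generated values consistent between tables.
ALPHABET = "123456789abcdefghijklmnopqrstuvwxyz"


def char_values(length, limit):
    """Yield up to `limit` fixed-length values; the last character walks the alphabet."""
    prefix = ALPHABET[0] * (length - 1)
    for ch in ALPHABET[:limit]:
        yield prefix + ch


def varchar_values(max_length, limit):
    """Yield values of growing length: "1", "11", "111", ... (assumed reading of the example)."""
    for n in range(1, min(max_length, limit) + 1):
        yield ALPHABET[0] * n


def generate_rows(fields, limit, data_date):
    """Combine per-field values by Cartesian product and stamp each row with data_date.

    `fields` maps a field name to (kind, length); both names are hypothetical.
    """
    columns = []
    for name, (kind, length) in fields.items():
        maker = char_values if kind == "char" else varchar_values
        columns.append([(name, value) for value in maker(length, limit)])
    rows = []
    for combo in product(*columns):
        row = dict(combo)
        row["data_date"] = data_date  # specially marked date field
        rows.append(row)
    return rows


# Hypothetical two-field table; `limit` controls the test data volume.
rows = generate_rows({"id": ("char", 3), "code": ("varchar", 5)},
                     limit=3, data_date="20210122")
```

A small `limit` would suit a unit test case and a larger one an integration test case, which is one way to read the remark that different test scenarios need different data volumes.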
Step S400: obtaining a data type according to the data information of the data table.
Step S500: obtaining a check script according to the data type.
Further, after obtaining the check script according to the data type, the method includes: obtaining a first check script, wherein the first check script is the check script generated at a first time; obtaining a second check script, wherein the second check script is the check script generated at a second time; and so on, obtaining an Nth check script, wherein the Nth check script is the check script generated at an Nth time, and N is a natural number greater than 1; and obtaining a check script set according to the first check script, the second check script and the Nth check script.
Further, after obtaining the check script according to the data type, the method includes: obtaining a checking personnel information list according to the check script; obtaining checking personnel information according to the checking personnel information list; obtaining a checking date according to the checking personnel information; and obtaining a checking result according to the checking date and the checking personnel information.
Further, the method comprises the steps of: obtaining a first test requirement; obtaining data to be tested from the test database according to the first test requirement; obtaining a first test date; obtaining first replacement information according to the first test date and the data to be tested, wherein the first replacement information is used for replacing the date in the data to be tested according to the first test date; obtaining first test information according to the data to be tested, wherein the first test information comprises a first test result and the data to be tested; and obtaining first recovery information according to the first test information, wherein the first recovery information is used for recovering the date in the data to be tested and storing the date in the test database.
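The date replacement and recovery described above might look like the following Python sketch. It is illustrative only: the in-memory dictionary standing in for the test database, the `data_date` key, the function name and the sample dates are assumptions, not details given in the application.

```python
import copy


def run_with_test_date(test_database, table, test_date, run_test):
    """Extract data, replace its date with test_date, run the test, then restore.

    `test_database` is a hypothetical dict of table name -> list of row dicts.
    """
    stored = test_database[table]
    original_dates = [row["data_date"] for row in stored]
    data_under_test = copy.deepcopy(stored)  # leave the stored copy untouched
    for row in data_under_test:
        row["data_date"] = test_date  # first replacement information
    try:
        result = run_test(data_under_test)
    finally:
        # First recovery information: restore the dates and store the data back,
        # so the next user extracts the same baseline.
        for row, original in zip(data_under_test, original_dates):
            row["data_date"] = original
        test_database[table] = data_under_test
    return result


# Hypothetical usage: the "test" simply reports the dates it saw.
database = {"tableA": [{"id": "111", "data_date": "20200101"}]}
seen = run_with_test_date(database, "tableA", "20210122",
                          lambda rows: [row["data_date"] for row in rows])
```

Restoring inside `finally` is a design choice of this sketch: the stored data returns to its baseline even when the test itself fails.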
Specifically, the check script is generated automatically: an SQL check script is generated from the entered data table metadata information, such as primary key, non-null and enumeration value constraints. For example, for a primary key check on ID, the check script is
select count(1) from (select ID, count(1) from tableA where data_date = $datadate$ group by ID having count(1) > 1);
for an enumeration value check on type, the check script is
select count(1) from tableA where data_date = $datadate$ and type not in (select type from dic).
Besides the automatically generated checks, check SQL can be used for other types of checks, and manual maintenance is supported so that check scripts accumulate continuously, for example checks on processing duration or on reconciliation relationships between data tables. The data quality check table is designed to support dynamic replacement of the data date and testing in different environments, by different personnel and on different dates. Batch processing scripts are also generated automatically: a framework program template based on the input data table and the output data table unifies coding style, log specifications and so on, and embedded version management and release management functions provide version management of the program scripts. Different personnel can extract the data they need from the test database according to their own checking requirements, and the data can be reused: after a first user finishes, the test data is restored to the test database and a second user can extract and use it. The date is replaced automatically and restored after the test, so that different personnel do not affect one another. The platform also manages the relationship between personnel and data dates: maintaining the correspondence between maintainers and data dates avoids data date conflicts within the team, and data files overwriting one another, when test data is later issued.
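The two generated checks can be tried out with a short Python sketch that uses the standard `sqlite3` module as a stand-in for the real database. The function names and the sample rows are hypothetical, and the `$datadate$` placeholder is swapped for a bound parameter here purely as an illustration of dynamic date replacement; the application does not say how the substitution is performed.

```python
import sqlite3


def primary_key_check(table, key):
    """Build a check that counts key values occurring more than once on a data date."""
    return (f"select count(1) from (select {key}, count(1) from {table} "
            f"where data_date = $datadate$ group by {key} having count(1) > 1)")


def enum_check(table, column, dict_table):
    """Build a check that counts rows whose value is missing from the dictionary table."""
    return (f"select count(1) from {table} where data_date = $datadate$ "
            f"and {column} not in (select {column} from {dict_table})")


def run_check(conn, check_sql, data_date):
    """Replace the date placeholder with a bound parameter and return the violation count."""
    query = check_sql.replace("$datadate$", "?")
    return conn.execute(query, (data_date,)).fetchone()[0]


# Hypothetical sample data: one duplicated ID and one unknown type on 20210122.
conn = sqlite3.connect(":memory:")
conn.execute("create table tableA (ID text, type text, data_date text)")
conn.execute("create table dic (type text)")
conn.executemany("insert into dic values (?)", [("A",), ("B",)])
conn.executemany("insert into tableA values (?, ?, ?)", [
    ("111", "A", "20210122"),
    ("111", "B", "20210122"),
    ("112", "X", "20210122"),
    ("113", "A", "20210123"),
])

pk_violations = run_check(conn, primary_key_check("tableA", "ID"), "20210122")
enum_violations = run_check(conn, enum_check("tableA", "type", "dic"), "20210122")
```

A result of zero means the check passes; any other count is the number of offending keys or rows for that data date.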
Step S600: obtaining first input information, wherein the first input information comprises source table information, target table information and the field mapping relation between the source table and the target table.
Step S700: obtaining a first program script according to the first input information.
Step S800: obtaining first extraction data according to the first program script, wherein the first extraction data comprises the source table information and the target table information.
Specifically, the program script is generated automatically from the input table information and output table information entered by the user. The system provides two ways of editing the program script. In the first, the field mapping relation between the source table and the target table is entered in Excel through an embedded Excel editing module, and a MySQL, Teradata or other type of program script is generated automatically. In the second, a developer selects the input source table and the target table, and a program script template is generated automatically to support secondary editing of the program. In general, a script can move from the first way to the second, but not the reverse. The system extracts the source table and the target table from the final program script so that they can be associated with the test data and the check script.
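As a rough Python illustration of the first editing way and of the table extraction step, the sketch below turns a field mapping into an insert-select statement and then recovers the source and target tables from the generated text. Everything here is an assumption made for illustration: the function names, the mapping format, the generated SQL shape and the regular expressions are not taken from the application, and a real script would need a proper SQL parser.

```python
import re


def build_program_script(source_table, target_table, field_mapping):
    """Generate an insert-select script from (target_field, source_expression) pairs."""
    target_columns = ", ".join(target for target, _ in field_mapping)
    source_expressions = ", ".join(source for _, source in field_mapping)
    return (f"insert into {target_table} ({target_columns})\n"
            f"select {source_expressions}\n"
            f"from {source_table}\n"
            f"where data_date = $datadate$;")


def extract_tables(script):
    """Pull target and source table names out of a script with simple patterns."""
    target = re.search(r"insert\s+into\s+(\w+)", script, re.IGNORECASE)
    sources = re.findall(r"\b(?:from|join)\s+(\w+)", script, re.IGNORECASE)
    return {"source": sources, "target": [target.group(1)] if target else []}


# Hypothetical mapping from tableA to tableC.
script = build_program_script("tableA", "tableC",
                              [("ID", "ID"), ("type_name", "upper(type)")])
tables = extract_tables(script)
```

The extracted names are what would later be bound to test data files and check scripts.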
The program metadata table is designed as follows:
Field         Field name          Type         Remarks
PROG_ID       Self-increment ID   Int
PROG_NAME     Program name        varchar(50)
PROG_TAB_TYP  Program table type  Char(1)      1-source table, 2-target table
TAB_NAME      Data table name     varchar(50)
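The program metadata table can be exercised with Python's standard `sqlite3` module, as in the sketch below. The column names and types follow the table above; the table name `prog_meta`, the helper functions and the sample program name are hypothetical.

```python
import sqlite3


def create_program_metadata(conn):
    """Create the program metadata table (the table name prog_meta is assumed)."""
    conn.execute(
        """
        create table prog_meta (
            PROG_ID      integer primary key autoincrement,
            PROG_NAME    varchar(50),
            PROG_TAB_TYP char(1),      -- 1-source table, 2-target table
            TAB_NAME     varchar(50)
        )
        """
    )


def register_program(conn, prog_name, source_tables, target_tables):
    """Record one row per source table (type 1) and per target table (type 2)."""
    rows = [(prog_name, "1", table) for table in source_tables]
    rows += [(prog_name, "2", table) for table in target_tables]
    conn.executemany(
        "insert into prog_meta (PROG_NAME, PROG_TAB_TYP, TAB_NAME) values (?, ?, ?)",
        rows,
    )


def tables_of(conn, prog_name, table_type):
    """Return the table names of the given type registered for a program."""
    cursor = conn.execute(
        "select TAB_NAME from prog_meta "
        "where PROG_NAME = ? and PROG_TAB_TYP = ? order by PROG_ID",
        (prog_name, table_type),
    )
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
create_program_metadata(conn)
register_program(conn, "load_c", ["tableA", "tableB"], ["tableC"])
```

Looking up the rows by `PROG_NAME` is how a program script would be related to the test data and check scripts of its tables in this sketch.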
Step S900: obtaining first package information according to the test data, the check script and the first extraction data.
Step S1000: obtaining test case information according to the first package information.
Step S1100: obtaining a regression test case library according to the test case information, wherein the regression test case library supports continuous integration of programs.
Specifically, combining test data, test scripts, check scripts and so on enables the continuous accumulation of test cases and addresses problems in ETL research and development such as high test cost, incomplete test coverage and high regression test cost. The source table and target table information of the program script is bound to the data table test cases and the data table check scripts, and the test data date, test data file, check method and so on are selected to form a repeatedly executable shell-like script, which is issued to the designated MySQL, td or other data environment to carry out the ETL processing and finally check the accuracy of the result. Different test scenarios require different volumes of test data, so test cases can be divided into levels, such as unit test cases and integration test cases, that select different numbers of test cases, and the test data is selected accordingly. Because the system contains data date fields, these fields must be specially marked when the data table metadata is maintained, so that they can be replaced with the designated data date information. In actual test scenarios, part of the test data is obtained by desensitizing production data, so the embodiment of the application also provides a convenient data import and export function, supports replacing the data date in the test data with a designated date, saves multiple test data versions and enables reuse of test cases.
A unified ETL research and development and test management platform is thereby built, with unified management of data tables, batch processing scripts, test data, test scripts, quality check scripts and test suites. By combining the three elements of test data, program script and check script, the full flow of data loading, program running, data checking, data cleaning and so on is scripted, and by accumulating the test scripts, continuous accumulation and effective management of test suites are achieved. This solves the technical problem that a test management platform in the prior art has a single function and cannot meet the various requirements of research and development testing.
The test case table is designed as follows:
Field     | Field name   | Type        | Remarks
TEST_ID   | Test case ID | Int         |
PROG_NAME | Program name | varchar(50) |
Data_DATE | Date of data |             |
The test case library detail information table is designed as follows:
Further, the method comprises: obtaining the test case information according to the source table information and the target table information; obtaining a data table check script according to the test case information; obtaining a first binding instruction according to the source table information, the target table information, the test case information, and the data table check script, wherein the first binding instruction is used for binding the source table information and the target table information with the test case information and the data table check script; obtaining a test data date, a test data file, and a check script according to the test case information and the data table check script; obtaining a data test script according to the test data date, the test data file, and the check script, wherein the data test script is a repeatably executable shell-like script; and obtaining a first execution instruction according to the data test script, wherein the first execution instruction is used for issuing the data test script to a test environment for execution.
Specifically, the system looks up the test data file to load in the test file storage table by data table name and file type, and looks up the corresponding check SQL in the data quality check table by data table name. A data test shell script of the following form is then assembled and issued to the test environment for execution:
Load A.dat into temporary table A
Load B.dat into temporary table B
Update the date in temporary table A to the specified date
Update the date in temporary table B to the specified date
Insert the contents of temporary table A and temporary table B into table A and table B, respectively
Run the specified program to produce the result in table C from tables A and B
Run the check script to check the data quality of table C, and insert the result into the quality check result table
Clear the data in temporary table A, temporary table B, table A, table B, and table C (optional)
Exit the program
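The steps listed above can be sketched as a small generator. This is only an illustrative sketch under stated assumptions, not the platform's actual implementation: the function name `build_test_steps`, the `_tmp` suffix for temporary tables, the `data_date` column name, and the textual step format are all hypothetical and introduced here for illustration.

```python
from typing import List


def build_test_steps(source_tables: List[str], target_table: str,
                     program: str, data_date: str,
                     cleanup: bool = True) -> List[str]:
    """Return the ordered steps of a repeatable test script.

    The step syntax is a made-up, shell-like notation (an assumption),
    not the syntax of any real loader or database client.
    """
    steps: List[str] = []
    # 1. Load each source table's data file into its temporary table.
    for table in source_tables:
        steps.append(f"LOAD {table}.dat INTO {table}_tmp")
    # 2. Overwrite the data date field with the specified date.
    for table in source_tables:
        steps.append(f"UPDATE {table}_tmp SET data_date = '{data_date}'")
    # 3. Copy the temporary tables into the real source tables.
    for table in source_tables:
        steps.append(f"INSERT INTO {table} SELECT * FROM {table}_tmp")
    # 4. Run the program under test to produce the target table.
    steps.append(f"RUN {program}")
    # 5. Check the data quality of the target table.
    steps.append(f"CHECK {target_table}")
    # 6. Optionally clear temporary, source, and target tables.
    if cleanup:
        to_clear = ([f"{t}_tmp" for t in source_tables]
                    + list(source_tables) + [target_table])
        for table in to_clear:
            steps.append(f"DELETE FROM {table}")
    steps.append("EXIT")
    return steps
```

Because the steps depend only on the test case's own inputs, the same case can be regenerated and re-run for any data date, which is what makes the script repeatable.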
In summary, the embodiment of the application has the following service modules, which combine to form a complete batch script management platform.
1. Database management. Maintains the basic information of the local, test, and production databases, which is used to connect to a database automatically when a test is later issued, and then to initialize test data, issue program scripts, run check programs, and so on.
2. Data table management. Maintains the metadata of the data tables, including the table creation statement, primary key fields, non-null fields, enumeration fields, the data date field, and so on, which is later used to generate test cases and check scripts automatically. Metadata can be synchronized in both directions: it can be read from the test database, and the metadata maintained on the platform can be synchronized into a designated test database.
3. Personnel and data date relationship management. Maintains the correspondence between personnel and data dates, so that when test data is later issued, data dates do not conflict within a team and data files do not overwrite one another.
4. Test file management. Automatically generates test case data for a table from the database and data table information maintained in the earlier steps. Unit test data, system test data, performance test data, and so on can be generated, exported, and stored according to the needs of the test scenario, which facilitates accumulating a case set.
5. Check rule management. Automatically generates check rules for a table from its metadata, such as primary key checks, non-null checks, and enumerated value checks, and also lets testers manually maintain check information such as check relations and fluctuation rates. The various checks can be configured and issued to different test scenarios such as unit test, system test, and performance test, so as to monitor the running status and results of the ETL scripts.
6. Batch script management. Supports selecting information such as the source data tables and the target data table, automatically generates a program template, and supports manual editing and release of the program script.
7. Test case management. Combines key elements such as batch scripts, test files, data dates, and check rules to form a test case, with automatic cleanup. Multiple different combinations are supported to generate a test case library.
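As an illustration of the check rule management module, the following sketch derives primary key, non-null, and enumerated value checks from table metadata and runs them against an in-memory SQLite database. It is a minimal sketch under stated assumptions: the function names, the `acct` table, and the convention that each check query returns a violation count are hypothetical and not taken from the patent.

```python
import sqlite3
from typing import Dict, List


def generate_check_sql(table: str, primary_key: List[str],
                       not_null: List[str],
                       enums: Dict[str, List[str]]) -> Dict[str, str]:
    """Build one query per rule; each query returns a violation count."""
    checks: Dict[str, str] = {}
    if primary_key:
        cols = ", ".join(primary_key)
        # Count key values that occur more than once.
        checks["primary_key"] = (
            f"SELECT COUNT(*) FROM (SELECT {cols} FROM {table} "
            f"GROUP BY {cols} HAVING COUNT(*) > 1)"
        )
    for col in not_null:
        checks[f"not_null:{col}"] = (
            f"SELECT COUNT(*) FROM {table} WHERE {col} IS NULL"
        )
    for col, allowed in enums.items():
        values = ", ".join("'" + v.replace("'", "''") + "'" for v in allowed)
        # NULLs are not flagged here; the non-null check covers them.
        checks[f"enum:{col}"] = (
            f"SELECT COUNT(*) FROM {table} WHERE {col} NOT IN ({values})"
        )
    return checks


def run_checks(conn: sqlite3.Connection,
               checks: Dict[str, str]) -> Dict[str, int]:
    """Execute every check and collect its violation count."""
    return {name: conn.execute(sql).fetchone()[0]
            for name, sql in checks.items()}


def demo() -> Dict[str, int]:
    """Run the generated checks on a tiny hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE acct (id INTEGER, status TEXT, data_date TEXT)")
    conn.executemany("INSERT INTO acct VALUES (?, ?, ?)", [
        (1, "OPEN", "2021-01-22"),
        (1, "CLOSED", "2021-01-22"),   # duplicate primary key
        (2, None, "2021-01-22"),       # null in a non-null field
        (3, "UNKNOWN", "2021-01-22"),  # value outside the enumeration
    ])
    checks = generate_check_sql("acct", ["id"], ["status"],
                                {"status": ["OPEN", "CLOSED"]})
    return run_checks(conn, checks)
```

In a platform of the kind described, the resulting counts would presumably be written to a quality check result table; here they are simply returned as a dictionary.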
Furthermore, by building a whole-flow management platform for batch processing scripts, the embodiment of the application integrates test data, program scripts, and check scripts, effectively reducing test complexity and manual test workload. Test data is generated automatically, which addresses the complex preparation and incomplete coverage of test data and reduces the complexity of preparing test cases. Batch scripts are managed under continuous integration: accumulated test cases allow the batch scripts to be checked continuously, which helps ensure the robustness of the ETL program scripts. The embodiment also has the following characteristics:
1. By building the ETL research and development and test platform, the invention combines test data, test scripts, check scripts, and the like, achieves continuous accumulation of test cases, and addresses the high test cost, incomplete test coverage, and high regression test cost of ETL research and development.
2. The system of the invention can be implemented with either a B/S (browser/server) or a C/S (client/server) architecture, that is, in the form of a browser or a client.
3. Other Web servers, database servers, network deployments, etc. may be extended according to team size and number of users, such as using distributed extensions to meet team usage needs, but the application architecture remains unchanged.
Further, the embodiment of the present application has been put into practice in part of a financial institution:
1. At present, the financial management domain of the management information development department of the software development center is responsible for research and development of finance-related systems. The financial analysis system has a large number of analytical reports and ETL scripts. In line with the bank's internal research and development requirements, and to lighten the burden on developers, the data quality check module was rolled out first: the data quality check SQL is maintained in the system in advance and the check results are communicated every day, which enables rapid iteration of team delivery.
2. The system has brought the following benefits since going live:
1) Unified coding style and reduced operation and maintenance cost. Automatic code generation unifies the coding conventions, improves script readability, and reduces operation and maintenance cost.
2) Improved research and development efficiency and less test data preparation. Automatically generating different types of test cases meets the needs of rapid development, and checking the data quality of the target table provides data checks at the syntax and basic semantic level.
3) Improved code iteration quality. Test cases are built by combining test data, program scripts, check scripts, cleanup scripts, and the like; with the platform, test cases accumulate continuously, which supports automating subsequent work such as regression testing and improves code quality.
Example two
Based on the same inventive concept as the ETL software development test management method in the foregoing embodiment, the present invention also provides an ETL software development test management system, as shown in FIG. 2, which includes:
a first obtaining unit 11, the first obtaining unit 11 being configured to obtain data table metadata information;
a second obtaining unit 12, where the second obtaining unit 12 is configured to obtain field information according to the data table metadata information;
a third obtaining unit 13, where the third obtaining unit 13 is configured to obtain test data according to the field information;
a fourth obtaining unit 14, where the fourth obtaining unit 14 is configured to obtain a data type according to the data information of the data table;
a fifth obtaining unit 15, where the fifth obtaining unit 15 is configured to obtain a checking script according to the data type;
a sixth obtaining unit 16, where the sixth obtaining unit 16 is configured to obtain first input information, where the first input information includes source table information, target table information, and a field mapping relationship between the source table and the target table;
a seventh obtaining unit 17, where the seventh obtaining unit 17 is configured to obtain a first program script according to the first input information;
an eighth obtaining unit 18, where the eighth obtaining unit 18 is configured to obtain first extraction data according to the first program script, where the first extraction data includes the source table information and the target table information;
a ninth obtaining unit 19, where the ninth obtaining unit 19 is configured to obtain first package information according to the test data, the check script, and the first extraction data;
a tenth obtaining unit 20, where the tenth obtaining unit 20 is configured to obtain test case information according to the first package information;
an eleventh obtaining unit 21, where the eleventh obtaining unit 21 is configured to obtain a regression test case library according to the test case information, where the regression test case library supports continuous integration of programs.
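The relationship between the packaged elements, the test case, and the regression test case library can be pictured with a small data model. This is a sketch under stated assumptions: the class names `EtlTestCase` and `RegressionLibrary` and all of their fields are hypothetical, chosen only to mirror the elements named in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EtlTestCase:
    """One packaged test case (all field names are assumptions)."""
    program: str                       # program script under test
    source_tables: Tuple[str, ...]     # source table information
    target_table: str                  # target table information
    test_data_files: Tuple[str, ...]   # test data
    check_scripts: Tuple[str, ...]     # check scripts
    data_date: str                     # data date used for the run


@dataclass
class RegressionLibrary:
    """Accumulates test cases per program so they can be re-run later."""
    cases: Dict[str, List[EtlTestCase]] = field(default_factory=dict)

    def add(self, case: EtlTestCase) -> None:
        # Group cases by the program they exercise.
        self.cases.setdefault(case.program, []).append(case)

    def cases_for(self, program: str) -> List[EtlTestCase]:
        # Return a copy so callers cannot mutate the library by accident.
        return list(self.cases.get(program, []))
```

A continuous integration job could then call `cases_for` for each changed program and re-run every accumulated case, which is one plausible reading of how the library "supports continuous integration".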
Further, the system further comprises:
a twelfth obtaining unit for obtaining a preset rule;
a thirteenth obtaining unit for obtaining a preset algorithm;
a fourteenth obtaining unit, configured to obtain various types of data corresponding to a field according to the field information, the preset rule, and the preset algorithm;
a fifteenth obtaining unit configured to obtain a date field according to the field information;
a sixteenth obtaining unit configured to obtain the data date information according to the date field;
a seventeenth obtaining unit, configured to obtain the test data according to the various types of data and the data date information.
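The generation of test data from field information, a preset rule, and a data date can be sketched as follows. The patent does not disclose the preset rule or algorithm, so the per-type rules below (sequential integers, seeded random decimals, name-based strings) are assumptions made purely for illustration, as are the function name and the type labels.

```python
import random
from typing import Dict, List


def generate_rows(fields: Dict[str, str], date_field: str, data_date: str,
                  count: int, seed: int = 0) -> List[Dict[str, object]]:
    """Generate `count` rows; `fields` maps field name to a type label."""
    rng = random.Random(seed)  # seeded so the data set is reproducible
    rows: List[Dict[str, object]] = []
    for i in range(count):
        row: Dict[str, object] = {}
        for name, ftype in fields.items():
            if name == date_field:
                # The date field always carries the specified data date.
                row[name] = data_date
            elif ftype == "int":
                row[name] = i + 1
            elif ftype == "decimal":
                row[name] = round(rng.uniform(0, 1000), 2)
            elif ftype == "varchar":
                row[name] = f"{name}_{i + 1}"
            else:
                raise ValueError(f"no rule for field type: {ftype}")
        rows.append(row)
    return rows
```

Handling the date field separately from the type rules reflects the point made earlier that the data date field is specially marked in the metadata so that it can be replaced.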
Further, the system further comprises:
an eighteenth obtaining unit, configured to obtain first test data, where the first test data is the test data generated by the first field;
a nineteenth obtaining unit, configured to obtain second test data, where the second test data is test data generated by a second field, and the first field is different from the second field; wherein the various types of data in the first test data are associated with the various types of data in the second test data;
a twentieth obtaining unit, configured to obtain a test database according to the first test data and the second test data.
Further, the system further comprises:
a twenty-first obtaining unit, configured to obtain a first check script, where the first check script is the check script generated at a first time;
a twenty-second obtaining unit, configured to obtain a second check script, where the second check script is the check script generated at a second time; and so on, until an Nth check script is obtained, where the Nth check script is the check script generated at an Nth time and N is a natural number larger than 1;
a twenty-third obtaining unit, configured to obtain a check script set according to the first check script, the second check script, and the Nth check script.
Further, the system further comprises:
a twenty-fourth obtaining unit, configured to obtain a checking information list according to the checking script;
a twenty-fifth obtaining unit, configured to obtain checking personnel information according to the checking information list;
a twenty-sixth obtaining unit, configured to obtain a check date according to the checking personnel information;
a twenty-seventh obtaining unit, configured to obtain a checking result according to the check date and the checking personnel information.
Further, the system further comprises:
a twenty-eighth obtaining unit configured to obtain a first test requirement;
a twenty-ninth obtaining unit, configured to obtain data to be tested from the test database according to the first test requirement;
a thirtieth obtaining unit for obtaining a first test date;
a thirty-first obtaining unit, configured to obtain first replacement information according to the first test date and data to be tested, where the first replacement information is used to replace a date in the data to be tested according to the first test date;
a thirty-second obtaining unit, configured to obtain first test information according to the data to be tested, where the first test information includes a first test result and the data to be tested;
a thirty-third obtaining unit, configured to obtain first recovery information according to the first test information, where the first recovery information is used to recover a date in the data to be tested and store the date in the test database.
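The replace, test, and recover sequence described by these units can be sketched as follows. The function name `run_with_test_date`, the row representation as dictionaries, and the `run_test` callback are assumptions introduced for illustration; the patent does not specify how the replacement or recovery is carried out.

```python
import copy
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def run_with_test_date(rows: List[Row], date_field: str, test_date: str,
                       run_test: Callable[[List[Row]], object]
                       ) -> Tuple[object, List[Row]]:
    """Replace dates, run the test, then restore the original dates."""
    # Remember the original dates so they can be recovered afterwards.
    original_dates = [row[date_field] for row in rows]
    # Work on a copy so the stored test data is never modified in place.
    working = copy.deepcopy(rows)
    for row in working:
        row[date_field] = test_date
    result = run_test(working)
    # Recover the original dates before the data goes back to storage.
    for row, original in zip(working, original_dates):
        row[date_field] = original
    return result, working
```

Restoring the dates after each run keeps the stored data set reusable with a different test date, which is the multiplexing of test data the description refers to.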
Further, the system further comprises:
a thirty-fourth obtaining unit, configured to obtain the test case information according to the source table information and the target table information;
a thirty-fifth obtaining unit, configured to obtain a data table check script according to the test case information;
a thirty-sixth obtaining unit, configured to obtain a first binding instruction according to the source table information, the target table information, the test case information, and the data table checking script, where the first binding instruction is configured to bind the source table information and the target table information with the test case information and the data table checking script;
a thirty-seventh obtaining unit, configured to obtain a test data date, a test data file, and a check script according to the test case information and the data table check script;
a thirty-eighth obtaining unit, configured to obtain a data test script according to the test data date, the test data file, and the check script, where the data test script is a repeatably executable shell-like script;
a thirty-ninth obtaining unit, configured to obtain a first execution instruction according to the data test script, where the first execution instruction is used for issuing the data test script to a test environment for execution.
The variations and specific examples of the ETL software development test management method described in the first embodiment with reference to FIG. 1 apply equally to the ETL software development test management system of this embodiment. From the foregoing detailed description of the method, those skilled in the art can clearly understand how the system of this embodiment is implemented, so the details are omitted here for brevity.
Exemplary electronic device
An electronic device of an embodiment of the present application is described below with reference to fig. 3.
Fig. 3 illustrates a schematic structural diagram of an electronic device according to an embodiment of the present application.
Based on the inventive concept of the ETL software development test management method according to the foregoing embodiments, the present invention further provides an ETL software development test management system, on which a computer program is stored, which when executed by a processor, implements the steps of any one of the foregoing ETL software development test management methods.
In FIG. 3, a bus architecture is represented by bus 300. Bus 300 may comprise any number of interconnected buses and bridges and links together various circuits, including one or more processors, represented by processor 302, and memory, represented by memory 304. Bus 300 may also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. Bus interface 306 provides an interface between bus 300 and receiver 301 and transmitter 303. The receiver 301 and the transmitter 303 may be the same element, i.e. a transceiver, providing a means for communicating with various other systems over a transmission medium.
The processor 302 is responsible for managing the bus 300 and general processing, while the memory 304 may be used to store data used by the processor 302 in performing operations.
The above-mentioned one or more technical solutions in the embodiments of the present application at least have one or more of the following technical effects:
the embodiment of the application provides an ETL software research and development test management method and system, which are implemented by acquiring metadata information of a data table; obtaining field information according to the data table metadata information; obtaining test data according to the field information; obtaining a data type according to the data information of the data table; obtaining a checking script according to the data type; obtaining first input information, wherein the first input information comprises source table information, target table information and field mapping relations between the source table and the target table; obtaining a first program script according to the first input information; obtaining first extraction data according to the first program script, wherein the first extraction data comprises the source table information and the target table information; obtaining first package information according to the test data, the check script and the first extraction data; obtaining test case information according to the first package information; and obtaining a regression test case library according to the test case information, wherein the regression test case library supports continuous integration of programs. The method and the system realize the construction of a unified ETL research and development and test management platform, realize the unified management of a data table, a batch processing script, test data, a test script, a quality checking script and a test suite, and realize the technical effects of continuous accumulation and effective management of the test suite by combining three elements of the test data, a program script and the checking script, carrying out full-flow scripting on data loading, program running, data checking, data cleaning and the like and accumulating the test script. 
Therefore, the technical problem that the test management platform in the prior art has single function and cannot meet various requirements of research and development tests is solved.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (8)

1. An ETL software development test management method, wherein the method comprises:
obtaining data table metadata information;
obtaining field information according to the data table metadata information;
obtaining test data according to the field information;
obtaining a data type according to the data information of the data table;
obtaining a checking script according to the data type;
obtaining first input information, wherein the first input information comprises source table information, target table information and field mapping relations between the source table and the target table;
obtaining a first program script according to the first input information;
obtaining first extraction data according to the first program script, wherein the first extraction data comprises the source table information and the target table information;
obtaining test case information according to the source table information and the target table information;
obtaining a data table checking script according to the test case information;
obtaining a first binding instruction according to the source table information, the target table information, the test case information and the data table checking script, wherein the first binding instruction is used for binding the source table information and the target table information with the test case information and the data table checking script;
obtaining test data date, test data file and checking script according to the test case information and the data table checking script;
obtaining a data test script according to the test data date, the test data file and the check script, wherein the data test script is a repeatable execution shell-like script;
according to the data test script, a first execution instruction is obtained, and the first execution instruction is used for issuing the data test script to a test environment for execution;
obtaining first package information according to the test data, the check script and the first extraction data;
obtaining test case information according to the first package information;
and obtaining a regression test case library according to the test case information, wherein the regression test case library supports continuous integration of programs.
2. The method of claim 1, wherein the obtaining test data from the field information comprises:
obtaining a preset rule;
obtaining a preset algorithm;
obtaining various data corresponding to the field according to the field information, the preset rule and the preset algorithm;
according to the field information, a date field is obtained;
obtaining data date information according to the date field;
and obtaining the test data according to the various data and the data date information.
3. The method of claim 2, wherein the method comprises:
obtaining first test data, wherein the first test data is the test data generated by a first field;
obtaining second test data, wherein the second test data is the test data generated by a second field, and the first field is different from the second field; wherein the various types of data in the first test data are associated with the various types of data in the second test data;
and obtaining a test database according to the first test data and the second test data.
4. The method of claim 1, wherein the obtaining the check script according to the data type comprises:
obtaining a first check script, wherein the first check script is generated at a first time;
obtaining a second check script, wherein the second check script is generated at a second time;
and analogically, obtaining an Nth checking script, wherein the Nth checking script is generated at an Nth time, and N is a natural number larger than 1;
and obtaining a checking script set according to the first checking script, the second checking script and the Nth checking script.
5. The method of claim 1, wherein the obtaining the check script according to the data type comprises:
obtaining a checking information list according to the checking script;
acquiring checking personnel information according to the checking information list;
acquiring a checking date according to the checking personnel information;
and obtaining a checking result according to the checking date and the checking personnel information.
6. The method of claim 3, wherein the method comprises:
obtaining a first test requirement;
obtaining data to be tested from the test database according to the first test requirement;
obtaining a first test date;
obtaining first replacement information according to the first test date and the data to be tested, wherein the first replacement information is used for replacing the date in the data to be tested according to the first test date;
obtaining first test information according to the data to be tested, wherein the first test information comprises a first test result and the data to be tested;
and obtaining first recovery information according to the first test information, wherein the first recovery information is used for recovering the date in the data to be tested and storing the date in the test database.
7. An ETL software development test management system, wherein the system comprises:
a first obtaining unit configured to obtain data table metadata information;
the second obtaining unit is used for obtaining field information according to the data table metadata information;
the third obtaining unit is used for obtaining test data according to the field information;
a fourth obtaining unit, configured to obtain a data type according to data information of the data table;
a fifth obtaining unit, configured to obtain a check script according to the data type;
a sixth obtaining unit, configured to obtain first input information, where the first input information includes source table information, target table information, and a field mapping relationship between the source table and the target table;
a seventh obtaining unit, configured to obtain a first program script according to the first input information;
an eighth obtaining unit, configured to obtain first extraction data according to the first program script, where the first extraction data includes the source table information and the target table information;
a ninth obtaining unit, configured to obtain first package information according to the test data, the check script, and the first extraction data;
a tenth obtaining unit, configured to obtain test case information according to the first package information;
an eleventh obtaining unit, configured to obtain a regression test case library according to the test case information, where the regression test case library supports continuous integration of programs;
a thirty-fourth obtaining unit, configured to obtain the test case information according to the source table information and the target table information;
a thirty-fifth obtaining unit, configured to obtain a data table check script according to the test case information;
a thirty-sixth obtaining unit, configured to obtain a first binding instruction according to the source table information, the target table information, the test case information, and the data table checking script, where the first binding instruction is configured to bind the source table information and the target table information with the test case information and the data table checking script;
a thirty-seventh obtaining unit, configured to obtain a test data date, a test data file, and a check script according to the test case information and the data table check script;
a thirty-eighth obtaining unit, configured to obtain a data test script according to the test data date, the test data file, and the check script, where the data test script is a repeatable execution shell-like script;
and the thirty-ninth obtaining unit is used for obtaining a first execution instruction according to the data test script, and the first execution instruction is used for issuing the data test script to a test environment for execution.
8. An ETL software development test management system comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1-6 when the program is executed by the processor.
CN202110090036.3A 2021-01-22 2021-01-22 ETL software research and development test management method and system Active CN112685325B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110090036.3A CN112685325B (en) 2021-01-22 2021-01-22 ETL software research and development test management method and system

Publications (2)

Publication Number Publication Date
CN112685325A (en) 2021-04-20
CN112685325B (en) 2023-07-28

Family

ID=75458947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110090036.3A Active CN112685325B (en) 2021-01-22 2021-01-22 ETL software research and development test management method and system

Country Status (1)

Country Link
CN (1) CN112685325B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1601483A (en) * 2004-10-22 2005-03-30 中国工商银行 Automation software testing system based on script explanatory tool
CN105589874A (en) * 2014-10-22 2016-05-18 阿里巴巴集团控股有限公司 ETL task dependence relationship detecting method and device and ETL tool
CN109308258A (en) * 2018-08-21 2019-02-05 中国平安人寿保险股份有限公司 Building method, device, computer equipment and the storage medium of test data
CN109474488A (en) * 2018-10-31 2019-03-15 中国银行股份有限公司 Interface test method, device and computer equipment
CN109634846A (en) * 2018-11-16 2019-04-16 武汉达梦数据库有限公司 A kind of ETL method for testing software and device
CN110704475A (en) * 2019-09-29 2020-01-17 中国银行股份有限公司 Method and system for comparing ETL loading table structures
CN111597243A (en) * 2020-05-15 2020-08-28 中国工商银行股份有限公司 Data warehouse-based abstract data loading method and system
CN111930617A (en) * 2020-07-31 2020-11-13 中国工商银行股份有限公司 Automatic testing method and device based on data objectification

Also Published As

Publication number Publication date
CN112685325A (en) 2021-04-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant