CN112597221A - Test environment data extraction optimization execution method based on cross section data - Google Patents

Publication number
CN112597221A
Authority
CN
China
Prior art keywords
data
section
library
test
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011492240.XA
Other languages
Chinese (zh)
Other versions
CN112597221B (en)
Inventor
张瀚
邓海霞
黄小丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan XW Bank Co Ltd
Original Assignee
Sichuan XW Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-17
Filing date
2020-12-17
Publication date
2021-04-02
Application filed by Sichuan XW Bank Co Ltd
Priority to CN202011492240.XA
Publication of CN112597221A
Application granted
Publication of CN112597221B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275 Synchronous replication

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a test environment data extraction optimization execution method based on cross-section data. It belongs to the technical field of big data and software testing and addresses three problems in the prior art: after an upstream service test finishes, the next batch of testing can continue only once ETL test data extraction is complete; ETL data extraction must be executed immediately after each batch of upstream system testing; and, if a problem is found after several batches of ETL testing, tracing it to its source is difficult. The system comprises a master library, a slave library and a target library, where the master library and the slave library are synchronized in real time and the target library extracts data from the slave library. In the method, a plurality of section database environments are added between the slave library and the target library; data in the slave library is synchronized to the section databases according to custom rules; and the target library extracts data from the section databases. The invention is used for data extraction.

Description

Test environment data extraction optimization execution method based on cross section data
Technical Field
The invention belongs to the technical field of big data and software testing, and particularly relates to a test environment data extraction optimization execution method based on cross section data.
Background
With the rapid development of the big data industry, more and more enterprises have created dedicated ETL testing positions, and ETL testing has become a distinct branch of software testing. The invention provides a test environment data extraction optimization execution method based on cross-section data. It aims to reduce the dependence of ETL testing on upstream service system testing, improve the execution efficiency of ETL testing, avoid ETL test failures caused by dynamic changes in upstream source data, and make test problems easier to trace.
ETL: an abbreviation of Extract-Transform-Load, used to describe the process of extracting, transforming and loading data from a source end to a destination end.
Data extraction: the process of extracting part or all of the data from a source system into a target system so that it can be processed and used again there; it is the first step of ETL.
ETL test: testing of the ETL process to ensure that data extraction, transformation and loading are handled correctly.
Section data: section data (cross-section data) is data about different subjects at the same point in time or over the same period, and is also called static data. In this document, section data refers to the static data of different tables at the same time point.
In the prior art, as shown in FIG. 1 and FIG. 2, ETL is usually a continuous process, for example processing the previous day's data each day, i.e. one batch per day. In the current scheme, data is extracted directly from the master library or the slave library of the source system. The upstream service system stops testing after completing the current batch and resumes with the next batch only after the ETL test has finished extracting data; this cycle repeats, with each side waiting for the other in turn.
The prior art scheme has several problems:
1. After an upstream service test finishes, the next batch of testing can continue only once the ETL test data has been extracted; otherwise the extracted data may not match expectations because the previous batch's data has been updated.
2. ETL data extraction must be executed immediately after each batch of upstream system testing finishes; otherwise the waiting time of the upstream system testers is prolonged.
3. If a problem is found after several batches of ETL testing, tracing it to its source is difficult because the source data has already changed over the multiple upstream test batches.
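The first and third problems can be illustrated with a minimal sketch. Everything in it is hypothetical: the `accounts` table, its values and the helper names are not from the patent, and a plain dictionary stands in for the source database.

```python
import copy

# Hypothetical stand-in for the source database: one table keyed by account id.
source_db = {"accounts": {1: {"balance": 100}, 2: {"balance": 250}}}


def run_upstream_batch(db, delta):
    """Simulate one upstream test batch that updates every balance in place."""
    for row in db["accounts"].values():
        row["balance"] += delta


def extract(db):
    """Simulate ETL extraction: copy whatever the source holds right now."""
    return copy.deepcopy(db["accounts"])


# Batch 1 finishes; this is the state the ETL tester expects to extract.
run_upstream_batch(source_db, 10)
expected_batch1 = extract(source_db)

# The upstream tester does not wait and starts batch 2 before extraction runs.
run_upstream_batch(source_db, 5)
late_extract = extract(source_db)

# The late extraction no longer matches batch 1, and the batch-1 state
# cannot be recovered from the live source afterwards.
print(late_extract == expected_batch1)   # False
print(late_extract[1]["balance"])        # 115
```

In this toy setting the only ways to get the batch-1 state are to extract before batch 2 starts, or to keep a copy of it somewhere else.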
Disclosure of Invention
Aiming at the above problems in the prior art, namely that the next batch of upstream testing can continue only after ETL test data extraction is complete, that ETL data extraction must be executed immediately after each batch of upstream system testing, and that source tracing is difficult when problems are found after several batches of ETL testing, the invention provides a test environment data extraction optimization execution method based on cross-section data. Its purpose is to remove the need for upstream testing to wait for ETL test data extraction and to make problems found after several batches of ETL testing easier to trace.
To achieve this purpose, the invention adopts the following technical scheme: a test environment data extraction optimization execution method based on cross-section data, applied to a system comprising a master library, a slave library and a target library, in which the master library and the slave library are synchronized in real time and the target library extracts data from the slave library. The method comprises the following steps:
Step A: adding a plurality of section database environments between the slave library and the target library;
Step B: synchronizing data in the slave library to the section databases according to custom rules;
Step C: the target library extracting data from the section databases.
Because data is extracted from the section library, and the section library retains the section data as of the data extraction time point, real-time updates to the master library or the slave library do not affect the data extraction test.
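Steps A to C can be sketched end to end with three in-memory SQLite databases. This is an illustration under stated assumptions, not the patent's implementation: SQLite, the `orders` table and the `copy_table` helper are all hypothetical, and master-to-slave replication is not modelled.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical `orders` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    return conn


def copy_table(src, dst, table):
    """Replace the contents of `table` in dst with its current contents in src."""
    rows = src.execute(f"SELECT id, amount FROM {table}").fetchall()
    dst.execute(f"DELETE FROM {table}")
    dst.executemany(f"INSERT INTO {table} (id, amount) VALUES (?, ?)", rows)
    dst.commit()


slave = make_db()     # kept in sync with the master library (not modelled here)
section = make_db()   # Step A: the added section library
target = make_db()    # ETL target library

slave.execute("INSERT INTO orders VALUES (1, 100)")
slave.commit()

# Step B: synchronize the slave library into the section library.
copy_table(slave, section, "orders")

# The upstream system keeps changing the slave library afterwards.
slave.execute("UPDATE orders SET amount = 999 WHERE id = 1")
slave.commit()

# Step C: the target library extracts from the section library, not the slave.
copy_table(section, target, "orders")

print(target.execute("SELECT amount FROM orders").fetchall())  # [(100,)]
```

The target ends up with the value captured at synchronization time, even though the slave library has since moved on.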
Further, Step A specifically includes:
Step A1: applying for section database environment resources, where the storage requirement of the section database is evaluated according to the volume of data extracted by the ETL test;
Step A2: building the section database service to correspond to the database services of the master library and the slave library, with the section database instance consistent with those of the master library and the slave library.
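Step A1 sizes the section database from the volume of data the ETL test extracts. The patent gives no formula, so the following is only one plausible way to do the arithmetic; the headroom percentage, the number of retained snapshots and the table sizes are all assumed values.

```python
def estimate_section_storage_bytes(table_sizes_bytes, retained_snapshots,
                                   headroom_percent=120):
    """Estimate storage for a section library.

    table_sizes_bytes: size of each table the ETL test extracts, in bytes.
    retained_snapshots: how many extraction time points are kept at once.
    headroom_percent: allowance for indexes and growth (an assumed value).
    """
    if retained_snapshots < 1:
        raise ValueError("retained_snapshots must be at least 1")
    per_snapshot = sum(table_sizes_bytes.values())
    # Integer arithmetic avoids floating-point rounding in the result.
    return per_snapshot * retained_snapshots * headroom_percent // 100


# Hypothetical table sizes for one ETL test scope.
sizes = {"orders": 2_000_000, "customers": 500_000}
print(estimate_section_storage_bytes(sizes, retained_snapshots=3))  # 9000000
```

Restricting the estimate to the tables the ETL test actually extracts is what keeps the section library smaller than a full copy of the slave library.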
Further, Step B specifically includes:
Step B1: developing and setting up a section data synchronization tool; the tool supports synchronizing data from the slave library to the section library at any time point according to independently configured rules, and stores the section data as of the data extraction time point;
Step B2: the upstream system tester completes the current batch of testing and completes the check of the source data;
Step B3: in parallel with Step B2, while the upstream system test is being performed, the ETL tester prepares the section library synchronization rules and configures the tables whose section data must be retained according to the test requirements;
Step B4: after the source data has been checked, section data synchronization is executed, and once it finishes, the section library is checked to confirm that the section data as of the data extraction time point was stored successfully.
By extracting data from the section library, the invention decouples the upstream service system from the ETL test. The upstream system tester can proceed to the next batch of testing without waiting for data extraction to finish, and the ETL tester need not extract data after every upstream batch: the data can instead be extracted from the section libraries in one pass after the upstream service system has tested all batches, which makes work scheduling more flexible and saves time.
Further, Step B1 includes: the section synchronization tool supports setting the rules that must be configured for section data synchronization, supports rapid configuration by importing rules, displays each rule's state after configuration is complete, and supports operations including editing, transmission and deletion.
With this scheme, section data synchronization can be flexibly customized at table level, so the section data as of the data extraction time point is retained while storage consumption is kept under control, which makes the approach practical to put into use.
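A table-level rule store of the kind described could look roughly like the sketch below. The class names, the JSON import format and the status values are assumptions made for illustration; the patent does not specify them, and the "transmission" operation is left out because its meaning is not defined in the text.

```python
import json
from dataclasses import dataclass


@dataclass
class SyncRule:
    """One table-level rule: which table to keep section data for."""
    table: str
    enabled: bool = True
    status: str = "configured"   # assumed status value


class RuleStore:
    """Hypothetical in-memory store for section data synchronization rules."""

    def __init__(self):
        self._rules = {}

    def import_rules(self, text):
        """Bulk-configure rules from a JSON list such as [{"table": "orders"}]."""
        for item in json.loads(text):
            self._rules[item["table"]] = SyncRule(
                table=item["table"], enabled=item.get("enabled", True)
            )

    def edit(self, table, **changes):
        """Change fields of an existing rule."""
        rule = self._rules[table]
        for name, value in changes.items():
            if not hasattr(rule, name):
                raise AttributeError(f"unknown rule field: {name}")
            setattr(rule, name, value)

    def delete(self, table):
        """Remove a rule so its table is no longer synchronized."""
        del self._rules[table]

    def statuses(self):
        """Return {table: status} for display."""
        return {t: r.status for t, r in self._rules.items()}

    def tables_to_sync(self):
        """List the tables whose section data should be retained."""
        return sorted(t for t, r in self._rules.items() if r.enabled)


store = RuleStore()
store.import_rules(
    '[{"table": "orders"}, {"table": "customers"}, {"table": "audit_log"}]'
)
store.edit("customers", enabled=False)
store.delete("audit_log")
print(store.tables_to_sync())  # ['orders']
print(store.statuses())        # {'orders': 'configured', 'customers': 'configured'}
```

Only the tables returned by `tables_to_sync` would be copied at synchronization time, which is how table-level rules limit storage use.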
Further, Step C specifically includes:
Step C1: modifying the ETL configuration so that the target library connects to the section library;
Step C2: executing the data extraction script to extract the section data from the section library to the target library.
The section library stores the section data as of the data extraction time point, which makes problems found in the ETL test easier to trace and analyze.
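Step C1 amounts to re-pointing the ETL job's source connection. A minimal sketch, assuming the ETL configuration is a nested dictionary; the keys, host names and port are placeholders, not values from the patent.

```python
import copy

# Hypothetical ETL job configuration; the keys and host names are placeholders.
etl_config = {
    "source": {"host": "slave-db.test.example", "port": 3306, "database": "core"},
    "target": {"host": "target-db.test.example", "port": 3306, "database": "dw"},
    "tables": ["orders", "customers"],
}


def point_source_at_section_library(config, section_host, section_port=3306):
    """Return a copy of the config whose source is the section library (Step C1)."""
    updated = copy.deepcopy(config)
    updated["source"]["host"] = section_host
    updated["source"]["port"] = section_port
    return updated


section_config = point_source_at_section_library(
    etl_config, "section-db-1.test.example"
)
print(section_config["source"]["host"])  # section-db-1.test.example
print(etl_config["source"]["host"])      # slave-db.test.example
```

The database name and table list are left unchanged here, on the assumption that the section database instance mirrors the master and slave libraries as Step A2 requires, so the extraction script of Step C2 can run without modification.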
In summary, by adopting the above technical scheme, the invention has the following beneficial effects:
First, section library synchronization is rule-driven and supports flexible configuration on demand, which saves environment resources.
Second, extracting data from the section library decouples the upstream service system from the ETL test. The upstream system tester can proceed to the next batch of testing without waiting for data extraction to finish, and the ETL tester need not extract data after every upstream batch: the data can instead be extracted from the section libraries in one pass after the upstream service system has tested all batches, which makes work scheduling more flexible and saves time.
Third, the section library stores the section data as of the data extraction time point, so problems found in the ETL test can be traced conveniently.
Drawings
FIG. 1 is a prior art schematic of the present invention;
FIG. 2 is a prior art schematic of the present invention;
FIG. 3 is a schematic diagram of an embodiment of the present invention;
FIG. 4 is a schematic view of an embodiment of the present invention;
FIG. 5 is a schematic view of an embodiment of the present invention;
FIG. 6 is a schematic diagram of an embodiment of the present invention;
FIG. 7 is a schematic view of an embodiment of the present invention;
FIG. 8 is a schematic diagram of an embodiment of the present invention.
Detailed Description
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
The invention will be further described with reference to the accompanying drawings and specific embodiments.
As shown in the figures, the test environment data extraction optimization execution method based on cross-section data is applied to a system comprising a master library, a slave library and a target library, in which the master library and the slave library are synchronized in real time and the target library extracts data from the slave library. The method comprises the following steps:
Step A: adding a plurality of section database environments between the slave library and the target library;
Step A specifically includes:
Step A1: applying for section database environment resources, where the storage requirement of the section database is evaluated according to the volume of data extracted by the ETL test;
Step A2: building the section database service to correspond to the database services of the master library and the slave library, with the section database instance consistent with those of the master library and the slave library.
Step B: synchronizing data in the slave library to the section databases according to custom rules;
Step B specifically includes:
Step B1: developing and setting up a section data synchronization tool; the tool supports synchronizing data from the slave library to the section library at any time point according to independently configured rules, and stores the section data as of the data extraction time point;
Step B1 includes: as shown in FIG. 5, the section synchronization tool supports setting the rules that must be configured for section data synchronization; as shown in FIG. 6, it supports rapid configuration by importing rules; and as shown in FIG. 7, it displays each rule's state after configuration is complete and supports operations including editing, transmission and deletion.
Section data synchronization can be flexibly customized at table level, so the section data as of the data extraction time point is retained while storage consumption is kept under control, which makes the approach practical to put into use.
Step B2: the upstream system tester completes the current batch of testing and completes the check of the source data;
Step B3: in parallel with Step B2, while the upstream system test is being performed, the ETL tester prepares the section library synchronization rules and configures the tables whose section data must be retained according to the test requirements;
Step B4: after the source data has been checked, section data synchronization is executed, and once it finishes, the section library is checked to confirm that the section data as of the data extraction time point was stored successfully. A schematic of the section library is shown in FIG. 8.
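The synchronize-then-check sequence of Step B4 might be sketched as follows. SQLite, the `loans` table and the `snapshot_and_verify` helper are assumptions made for illustration; the patent does not say how the check is performed, and a row-by-row comparison is only one option.

```python
import sqlite3


def snapshot_and_verify(slave, section, table, columns):
    """Copy `table` from slave to section, then verify the stored copy.

    Returns True when the section library holds exactly the rows that were
    read from the slave library at the extraction time point.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = slave.execute(
        f"SELECT {col_list} FROM {table} ORDER BY {columns[0]}"
    ).fetchall()

    # Overwrite any earlier copy of this table in the section library.
    section.execute(f"DELETE FROM {table}")
    section.executemany(
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", rows
    )
    section.commit()

    # Read the copy back and compare it with what was read from the slave.
    stored = section.execute(
        f"SELECT {col_list} FROM {table} ORDER BY {columns[0]}"
    ).fetchall()
    return stored == rows


slave = sqlite3.connect(":memory:")
section = sqlite3.connect(":memory:")
for conn in (slave, section):
    conn.execute("CREATE TABLE loans (id INTEGER PRIMARY KEY, status TEXT)")

slave.executemany("INSERT INTO loans VALUES (?, ?)", [(1, "open"), (2, "closed")])
slave.commit()

print(snapshot_and_verify(slave, section, "loans", ["id", "status"]))  # True
```

For large tables a row count or checksum comparison would be cheaper than comparing full row lists; that trade-off is outside what the patent describes.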
By extracting data from the section library, the invention decouples the upstream service system from the ETL test. The upstream system tester can proceed to the next batch of testing without waiting for data extraction to finish, and the ETL tester need not extract data after every upstream batch: the data can instead be extracted from the section libraries in one pass after the upstream service system has tested all batches, which makes work scheduling more flexible and saves time.
Step C: the target library extracting data from the section databases.
Step C specifically includes:
Step C1: modifying the ETL configuration so that the target library connects to the section library;
Step C2: executing the data extraction script to extract the section data from the section library to the target library. ETL testers test against the section data as of the data extraction time point, so any problems found can be traced conveniently.
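Why retained section data helps with tracing can be shown with a small simulation. It assumes one retained section copy per upstream batch, which is one possible reading of "a plurality of section databases"; the account key, the deltas and the helper names are hypothetical.

```python
import copy

# Hypothetical source table and per-batch section copies (one per batch).
source = {"acct-1": 100}
section_libraries = {}


def run_batch(batch_id, delta):
    """An upstream batch changes the source; then a section copy is retained."""
    source["acct-1"] += delta
    section_libraries[batch_id] = copy.deepcopy(source)


for batch_id, delta in [("batch-1", 10), ("batch-2", -30), ("batch-3", 5)]:
    run_batch(batch_id, delta)


def batches_matching(value, key="acct-1"):
    """Find which retained section copies contain the value seen in the ETL result."""
    return [b for b, snap in section_libraries.items() if snap[key] == value]


# A suspicious value of 80 turns up in the ETL output; trace it back.
print(batches_matching(80))   # ['batch-2']
print(source["acct-1"])       # 85
```

The live source shows only the final state, while the retained copies identify the batch in which the value of interest existed.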
The above description is only a representative embodiment among the many specific applications of the invention and does not limit the protection scope of the invention in any way. All technical schemes formed by transformation or equivalent replacement fall within the scope of the claims.

Claims (5)

1. A test environment data extraction optimization execution method based on cross-section data, characterized by comprising the following steps:
Step A: in a system comprising a master library, a slave library and a target library, newly adding a section database environment between the slave library and the target library and establishing a plurality of section databases;
Step B: synchronizing data in the slave library to the section databases through a section data synchronization tool, and storing the section data as of the data extraction time point;
Step C: the target library extracting data from the section databases.
2. The test environment data extraction optimization execution method based on cross-section data according to claim 1, wherein Step A specifically comprises:
Step A1: applying for section database environment resources, where the storage requirement of the section database is evaluated according to the volume of data extracted by the ETL test;
Step A2: building the section database service to correspond to the database services of the master library and the slave library, with the section database instance consistent with those of the master library and the slave library.
3. The test environment data extraction optimization execution method based on cross-section data according to claim 1, wherein Step B specifically comprises:
Step B1: developing a section data synchronization tool; the tool supports synchronizing data from the slave library to the section database at any time point according to independently configured section data synchronization rules, and stores the section data as of the data extraction time point;
Step B2: the upstream system tester completes the current batch of testing and completes the check of the source data;
Step B3: in parallel with Step B2, while the upstream system tester performs the test, the ETL tester prepares the section database synchronization rules and configures the tables whose section data must be retained according to the ETL test requirements;
Step B4: after completing the current batch of testing and checking the source data, the upstream system tester executes section data synchronization according to the configuration of Step B3 and, once synchronization finishes, checks whether the section data as of the data extraction time point was stored successfully in the section library;
Step B5: having completed the current batch of testing and the check of the source data, the upstream system tester continues with the next batch of testing.
4. The test environment data extraction optimization execution method based on cross-section data according to claim 3, wherein Step B1 comprises:
the section data synchronization tool supports configuring the section data synchronization rules required for section data synchronization, supports rapid configuration of those rules by importing rules, displays the state of the section data synchronization rules after they are configured, and supports operations including editing, transmission and deletion.
5. The test environment data extraction optimization execution method based on cross-section data according to claim 1, wherein Step C specifically comprises:
Step C1: modifying the ETL configuration so that the target library connects to the section library;
Step C2: executing the data extraction script to extract the section data from the section library to the target library.
CN202011492240.XA 2020-12-17 2020-12-17 Test environment data extraction optimization execution method based on cross section data Active CN112597221B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011492240.XA CN112597221B (en) 2020-12-17 2020-12-17 Test environment data extraction optimization execution method based on cross section data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011492240.XA CN112597221B (en) 2020-12-17 2020-12-17 Test environment data extraction optimization execution method based on cross section data

Publications (2)

Publication Number Publication Date
CN112597221A CN112597221A (en) 2021-04-02
CN112597221B CN112597221B (en) 2023-04-11

Family

ID=75196896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011492240.XA Active CN112597221B (en) 2020-12-17 2020-12-17 Test environment data extraction optimization execution method based on cross section data

Country Status (1)

Country Link
CN (1) CN112597221B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050187974A1 (en) * 2004-02-20 2005-08-25 Oracle International Corporation Modularized extraction, transformation, and loading for a database
US20090177671A1 (en) * 2008-01-03 2009-07-09 Accenture Global Services Gmbh System and method for automating etl application
CN102004744A (en) * 2009-09-02 2011-04-06 中国银联股份有限公司 Data extraction system and method from one source table to table of at least one object database
US20110213756A1 (en) * 2010-03-01 2011-09-01 International Business Machines Corporation Concurrency control for extraction, transform, load processes
CN102930393A (en) * 2012-10-25 2013-02-13 海南电网公司 Comprehensive power grid information display visualization system
CN104317893A (en) * 2014-10-23 2015-01-28 国家电网公司 Abbreviation type data snapshot implementing method based on mobile security storage medium
CN105740462A (en) * 2016-03-02 2016-07-06 上海新炬网络信息技术有限公司 Method for supporting data migration between different environments
CN107357940A (en) * 2017-08-28 2017-11-17 中煤航测遥感集团有限公司 A kind of method and apparatus of real estate Data Integration
CN109634846A (en) * 2018-11-16 2019-04-16 武汉达梦数据库有限公司 A kind of ETL method for testing software and device
CN109657114A (en) * 2018-08-21 2019-04-19 国家计算机网络与信息安全管理中心 A method of extracting webpage semi-structured data
CN109669983A (en) * 2018-12-27 2019-04-23 杭州火树科技有限公司 Visualize multi-data source ETL tool
US20200334271A1 (en) * 2019-04-18 2020-10-22 Oracle International Corporation System and method for determining an amount of virtual machines for use with extract, transform, load (etl) processes

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MIROSLAV DZAKOVIC: "Industrial Application of Automated Regression Testing in Test-Driven ETL Development", 《2016 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME)》 *
刘珊艳等: "大数据软件测试技术研究", 《湖北工业大学学报》 *
安轲 等: "面向电信网数据的ETL系统的设计与实现", 《信息工程大学学报》 *

Also Published As

Publication number Publication date
CN112597221B (en) 2023-04-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant