CN109634846A - ETL software testing method and device - Google Patents


Publication number
CN109634846A
Authority
CN
China
Prior art keywords
data
test result
expected results
matching
row
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811366641.3A
Other languages
Chinese (zh)
Other versions
CN109634846B (en)
Inventor
付博文
余院兰
冯源
付铨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dameng Database Co Ltd
Original Assignee
Wuhan Dameng Database Co Ltd
Application filed by Wuhan Dameng Database Co Ltd
Priority to CN201811366641.3A
Publication of CN109634846A
Application granted
Publication of CN109634846B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3684 Test management for test design, e.g. generating new test cases
    • G06F11/3692 Test management for test results analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention relates to the technical field of software testing and provides an ETL software testing method and device. The method includes: importing source data into a preset test flow for processing, the test flow including data reading, data interaction and conversion, and data loading; obtaining the processed test result, matching it against expected results, and recording the matching result; and feeding the recorded matching result back to the tester. The matching process is specifically: first, the data structure of the test result is matched against that of the expected results; after the data structures match, the row count of the test result is matched against that of the expected results; after the row counts match, each row of data in the test result is matched. The invention automates data source acquisition, data conversion design, and result matching and verification, thereby realizing automated testing of ETL software. Matching proceeds step by step from simple to difficult, which greatly reduces test run time and improves matching efficiency.

Description

ETL software testing method and device
[Technical Field]
The present invention relates to the technical field of software testing, and in particular to an ETL software testing method and device.
[Background Art]
ETL (Extract-Transform-Load) is the process of extracting data from an online transaction database, transforming it, and loading it into a data warehouse. It carries data from the data source to the target data warehouse and is an important step in implementing a data warehouse.
When testing ETL, the data produced by the test must be matched against expected results to verify the outcome. The test is considered successful only if the data match exactly; a failed match indicates a problem in the test process, which must then be adjusted and improved. However, ETL test data volumes are usually very large. Matching the entire data set item by item makes the test process complex, the workload heavy, the matching inefficient, and the test run long. Moreover, the differing data produced by a failed match are not put to use when making adjustments, so the adjustments lack focus and purpose and are time-consuming.
In view of this, overcoming the above defects of the prior art is a problem that urgently needs to be solved in this technical field.
[Summary of the Invention]
The technical problem to be solved by the present invention is:
In ETL testing, the data volume is usually very large. Matching the entire data set item by item makes the workload heavy, the matching inefficient, and the test run long. Moreover, adjustments to the test process lack focus and purpose and are time-consuming.
The present invention achieves the above objective through the following technical solutions:
In a first aspect, the present invention provides an ETL software testing method in which expected results are generated in advance for the source data. The method includes:
importing the source data into a preset test flow for processing, wherein the test flow includes data reading, data interaction and conversion, and data loading;
obtaining the processed test result, matching the test result against the expected results step by step, and recording the matching result;
feeding the recorded matching result back to the tester;
wherein the test result and the expected results are each a two-dimensional table, and the step-by-step matching is specifically: matching the data structure of the test result against the data structure of the expected results; after the data structures match, matching the row count of the test result against the row count of the expected results; and after the row counts match, matching each row of data in the test result against the corresponding row in the expected results, row by row.
Preferably, matching the data structure of the test result against the data structure of the expected results is specifically: matching the column count of the test result against the column count of the expected results; and after the column counts match, matching the data definition of each column of the test result against the data definition of the corresponding column of the expected results.
Preferably, after the row counts of the test result and the expected results match, the method further includes: counting the total data volume of the test result and matching it against the total data volume of the expected results; and after the total data volumes match, proceeding to match each row of data in the test result against the corresponding row in the expected results.
Preferably, matching each row of data in the test result against the corresponding row in the expected results is specifically:
counting the data volume of every row in the test result;
matching, row by row, the data volume of each row in the test result against the data volume of the corresponding row in the expected results;
after the data volumes of all rows match, proceeding to match, row by row, the specific data of each row in the test result against the specific data of the corresponding row in the expected results.
Preferably, matching each row of data in the test result against the corresponding row in the expected results is specifically:
counting the data volume of every row in the test result, and arranging the rows of the test result in the two-dimensional table in ascending order of data volume;
matching, row by row, the data volume of each row in the test result against the data volume of the corresponding row in the expected results;
after the data volumes of all rows match, matching, row by row in the sorted order, the specific data of each row in the test result against the specific data of the corresponding row in the expected results.
Preferably, when the data structures or the row counts of the test result and the expected results do not match, or when the data of any row do not match between the test result and the expected results, the method further includes:
outputting the differing data between the test result and the expected results according to the data matching result;
importing the source data into the preset test flow again for processing, and recording the intermediate data generated by each stage of the processing;
matching the differing data against the intermediate data generated by each stage, thereby determining the stage of the test flow in which the differing data appeared, and generating a test report that is fed back to the tester.
Preferably, while each row of data in the test result is being matched against the corresponding row in the expected results, data matching stops when any row in the test result first fails to match the corresponding row in the expected results; alternatively, data matching stops when the proportion of differing data between the test result and the expected results reaches a preset threshold.
Preferably, the source data are obtained as follows:
connecting to the data source of the software under test, obtaining the relevant information of the data source tables by reading the system tables of the data source, and writing the relevant information into the system tables of the ETL software; wherein the relevant information includes one or more of data structure, field type, and primary key information.
Preferably, importing the source data into the preset test flow for processing specifically includes:
creating a data synchronization test flow, adding a data reading component, a data cleaning and conversion component, and a data loading component to the test flow, and setting the source table to be synchronized; wherein the source table stores the source data to be synchronized;
applying different conversion designs to the test flow according to the function under test, the conversion designs including one or more of an incremental data synchronization design, a data filtering design, and a data cleaning and conversion design;
using the added data components to synchronize the source data according to the designed test flow.
Preferably, after the recorded matching result is fed back to the tester, the method further includes writing and executing an Ant script. Writing the Ant script is specifically:
writing the script for the preconditions of the test flow;
calling the code written for importing the source data into the test flow and for obtaining and matching the test result, to complete the test flow and the data matching;
calling the code written for feeding back the matching result, to complete the feedback of the matching result;
writing a restore script that returns the test source data and the expected results to their initial state.
Executing the Ant script is specifically:
executing the Ant script at a preset interval, thereby performing the ETL software test periodically.
In a second aspect, the present invention further provides an ETL software testing device including at least one processor and a memory connected by a data bus. The memory stores instructions executable by the at least one processor, and the instructions, when executed by the processor, carry out the ETL software testing method of the first aspect.
Compared with the prior art, the beneficial effects of the present invention are:
The ETL software testing method provided by the invention realizes automated testing of ETL software by automating data source acquisition, data interaction and conversion, and result matching and verification. During result matching and verification, matching can proceed step by step from simple to difficult, which greatly reduces test run time and improves matching efficiency. The unmatched differing data are also put to use: by matching the differing data against the intermediate data, the stage that caused the failed match can be determined. This helps the tester locate the problem in the test quickly and accurately and make targeted adjustments, improving debugging efficiency.
[Brief Description of the Drawings]
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of an ETL software testing method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the two-dimensional tables of the test result and the expected results provided by an embodiment of the present invention;
Fig. 3 is a flow chart of step-by-step matching of the test result against the expected results provided by an embodiment of the present invention;
Fig. 4 is a flow chart of one implementation of the second-step data matching in Fig. 3;
Fig. 5 is a flow chart of another implementation of the second-step data matching in Fig. 3;
Fig. 6 is a detailed flow chart of step 10 in Fig. 1;
Fig. 7 is an architecture diagram of an ETL software testing device provided by an embodiment of the present invention.
[Detailed Description of the Embodiments]
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
In addition, the technical features of the embodiments described below may be combined with one another as long as they do not conflict. The invention is described in detail below with reference to the drawings and embodiments.
Embodiment 1:
An embodiment of the present invention provides an ETL software testing method. As shown in Fig. 1, the method includes the following steps:
Step 10: import the source data into a preset test flow for processing, wherein the test flow includes data reading, data interaction and conversion, and data loading. The test flow here is customized by developers for a specific scenario and/or customer requirement, and the invention tests such customized, differentiated ETL software. Taking DMETL of the Dameng database as an example, the test flow can be created in advance by Java code that calls the DMETL API (Application Programming Interface). Before step 10 is executed, the source data are also obtained automatically: the data source of the software under test is connected through JDBC, the relevant information of the source tables is obtained by reading the system tables of the data source, and that information is written into the system tables of the ETL software, so that the source data for testing can be obtained from it. The relevant information includes one or more of data structure, field type, and primary key information. The embodiments of the invention use DMETL of the Dameng database to illustrate the ETL software testing method, but this does not limit the invention.
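The metadata lookup described in step 10 can be illustrated with a minimal sketch. It is written in Python rather than the Java/JDBC of the embodiment, and it uses SQLite's PRAGMA table_info as a stand-in for the data source's system tables; the function name read_table_metadata and the person table are hypothetical and do not come from DMETL.

```python
import sqlite3

def read_table_metadata(conn, table):
    """Return column names, declared types and primary-key columns of `table`.

    Sketch only: `table` is interpolated directly, so it must be a trusted name.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {
        "columns": [r[1] for r in rows],
        "types": {r[1]: r[2] for r in rows},
        # pk is 0 for non-key columns, otherwise the 1-based position in the key.
        "primary_key": [r[1] for r in sorted(rows, key=lambda r: r[5]) if r[5] > 0],
    }

# Hypothetical source table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
meta = read_table_metadata(conn, "person")
```

In a real setup the returned structure, field types, and key information would be what gets written into the ETL software's own system tables.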
Step 20: obtain the processed test result, match the test result against the expected results, and record the matching result.
The expected results must be generated in advance for the source data and serve as the matching standard. In this embodiment, the test result and the expected results are each a two-dimensional table, as shown in Fig. 2: columns represent attribute items that are determined in advance, such as "name", "gender", "age", "phone", and "address" in Fig. 2, while rows represent result items, that is, the specific data under each attribute. In this step, Java code can be written to obtain the processed test result (including the data structure and the result data set), read the expected results, and match the two; the matching result can be written to result.xml. Both successful and failed matches can be recorded here; typically, it is also acceptable to record only the failed matches for later reference.
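Recording the matching results in an XML file such as result.xml might look like the following sketch. The element and attribute names (results, match, step, status) are assumptions made for illustration; the source does not specify the layout of result.xml, and the embodiment uses Java rather than Python.

```python
import xml.etree.ElementTree as ET

def record_results(results, path=None):
    """Serialize match results to XML; also write them to `path` if one is given."""
    root = ET.Element("results")
    for r in results:
        entry = ET.SubElement(root, "match", step=r["step"], status=r["status"])
        entry.text = r.get("detail", "")
    if path is not None:
        ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    return ET.tostring(root, encoding="unicode")

# Hypothetical results: one passed check and one failed check.
xml_text = record_results([
    {"step": "structure", "status": "passed"},
    {"step": "row_count", "status": "failed", "detail": "expected 5 rows, got 4"},
])
```

Passing path="result.xml" would write the file; to record only failures, the list can be filtered on status before the call.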
During data matching, the test flow can be considered successful only when the test result and the expected results match exactly. Assume the tester has configured matching to stop as soon as a failure is found. To improve matching efficiency, the matching process, shown in Fig. 3, can be divided into three steps: first, the data structure of the test result is matched against the data structure of the expected results; second, the row count of the test result is matched against the row count of the expected results; third, each row of data in the test result is matched against the corresponding row in the expected results. The data structure matching is specifically: the column count of the test result is first matched against the column count of the expected results; after the column counts match, the data definition of each column of the test result is matched against that of the corresponding column of the expected results. Taking Fig. 2 as an example, this checks whether the columns are defined as name, gender, age, and so on, respectively. Matching the data structure and the row count is much simpler, and much faster, than comparing the specific data of every row. The second step is therefore performed only after the first step succeeds; if the first step fails, the test flow has a problem, the second step is unnecessary, and the matching process stops immediately. Similarly, if the first step succeeds but the second step fails, the third step is skipped and the matching process stops. Matching step by step from simple to difficult in this way greatly improves matching efficiency and saves test run time.
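The three-step, simple-to-difficult matching can be sketched as follows. This is a minimal illustration in Python, assuming each table is represented as a dictionary with a "columns" list of (name, type) pairs and a "rows" list of tuples; that representation and the function name match_stepwise are assumptions, not part of the source.

```python
def match_stepwise(actual, expected):
    """Compare two tables from the cheapest check to the most expensive one.

    Returns (matched, stage), where stage names the first check that failed.
    """
    # Step 1a: column count.
    if len(actual["columns"]) != len(expected["columns"]):
        return False, "column_count"
    # Step 1b: definition (name and type) of each column.
    if actual["columns"] != expected["columns"]:
        return False, "column_definition"
    # Step 2: row count.
    if len(actual["rows"]) != len(expected["rows"]):
        return False, "row_count"
    # Step 3: row-by-row data, reached only if everything above passed.
    for i, (a, e) in enumerate(zip(actual["rows"], expected["rows"])):
        if a != e:
            return False, f"row_{i}"
    return True, "matched"

# Hypothetical tables for illustration.
expected = {"columns": [("name", "TEXT"), ("age", "INT")],
            "rows": [("Ann", 30), ("Bob", 41)]}
actual = {"columns": [("name", "TEXT"), ("age", "INT")],
          "rows": [("Ann", 30), ("Bob", 42)]}
outcome = match_stepwise(actual, expected)
```

Each early return corresponds to stopping the matching process without running the later, more expensive steps.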
Step 30: feed the recorded matching result back to the tester. In this step, Java code can read the results recorded in result.xml and then call a mail server interface to compose an email and send it to the tester, so that the tester receives the record of failed matches and can investigate the problem, find the cause, and adjust the test flow.
The ETL software testing method provided by the invention realizes automated testing of ETL software by automating data acquisition, data interaction and conversion, and result matching and verification. During result matching and verification, matching can proceed step by step from simple to difficult: if the first step fails, the matching process stops without further matching, which greatly reduces test run time and improves matching efficiency.
After step 30, an Ant script must also be written and executed so that steps 10 to 30 are chained together. Writing the Ant script is specifically: 1) write the script for the preconditions of the test flow; 2) call the code written for importing the source data into the test flow and for obtaining and matching the test result, to complete the test flow and the data matching, that is, call the entry point of the code written in steps 10 and 20 to run the automated test; 3) call the code written for feeding back the matching result, that is, call the main program written in step 30 so that the tester receives the matching result; 4) write a restore script that returns the test source data and the expected results to their initial state. Executing the Ant script is specifically: execute the Ant script at a preset interval, thereby performing the ETL software test periodically. For example, if the ETL software must be tested once a day, a timer and a bat script can be used to execute the Ant script once a day, so that the ETL test is completed automatically each day.
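The order of the four script parts can be illustrated with a small driver. This sketch is in Python rather than Ant, and the four callables are hypothetical placeholders for the setup script, the test-and-match code, the feedback code, and the restore script.

```python
def run_cycle(setup, run_and_match, feedback, restore):
    """Run one test cycle: preconditions, test and match, feedback, restore."""
    setup()
    try:
        result = run_and_match()
        feedback(result)
        return result
    finally:
        # Restoring in a finally block is an assumption of this sketch, so that
        # the next scheduled run starts from the initial state even after an error.
        restore()

# Hypothetical stand-ins that only log the order in which they are called.
log = []
outcome = run_cycle(
    setup=lambda: log.append("setup"),
    run_and_match=lambda: log.append("run") or "matched",
    feedback=lambda result: log.append(f"feedback:{result}"),
    restore=lambda: log.append("restore"),
)
```

A scheduler (a timer plus a bat script in the embodiment) would then invoke one such cycle per preset interval.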
In combination with this embodiment, there is a further preferred implementation. In step 20, after the second-step row-count matching succeeds and before the third-step row-by-row matching is executed, the method further includes: counting the total data volume of the test result and matching it against the total data volume of the expected results; after the total data volumes match, each row of data in the test result is matched against the corresponding row in the expected results, that is, the third-step matching is executed. If the total data volumes do not match, the test flow has a problem, and the row-by-row matching of the third step need not be carried out. Matching total data volume is also fast, so adding this check before the row-by-row matching can also improve matching efficiency to some extent and save test run time.
In combination with this embodiment, there is a further preferred implementation. The third-step matching process in step 20, shown in Fig. 4, includes the following steps:
Step 201: count the data volume of every row in the test result. The data volume of every row in the expected results has likewise been counted in advance.
Step 202: match, row by row, the data volume of each row in the test result against the data volume of the corresponding row in the expected results.
Step 203: after the data volumes of all rows match, proceed to match, row by row, the specific data of each row in the test result against the specific data of the corresponding row in the expected results. Compared with matching each item of specific data, matching the data volume of each row is simple and fast. If the data volume of any row does not match, the matching process can stop immediately without matching the specific data of each row, which also improves matching efficiency to some extent and saves test run time.
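Steps 201 to 203 can be sketched as two passes over the rows. The source does not say how the data volume of a row is measured; the row_size function below, which sums the UTF-8 byte length of each value's string form, is an assumption made purely for illustration, as is the name match_rows. The sketch also assumes the row counts have already been matched.

```python
def row_size(row):
    """Hypothetical size measure: UTF-8 byte length of the row's values."""
    return sum(len(str(v).encode("utf-8")) for v in row)

def match_rows(actual_rows, expected_rows, expected_sizes=None):
    """Match row sizes first, then row contents; return (matched, failure)."""
    # The sizes of the expected rows may have been counted in advance.
    if expected_sizes is None:
        expected_sizes = [row_size(r) for r in expected_rows]
    # Pass 1 (steps 201 and 202): cheap size comparison, row by row.
    for i, (row, size) in enumerate(zip(actual_rows, expected_sizes)):
        if row_size(row) != size:
            return False, ("size", i)
    # Pass 2 (step 203): full comparison, reached only if every size matched.
    for i, (a, e) in enumerate(zip(actual_rows, expected_rows)):
        if a != e:
            return False, ("data", i)
    return True, None
```

A size mismatch is reported without comparing any row contents, which is the saving the preferred implementation aims for.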
In a more preferred scheme, the third-step matching process in step 20, shown in Fig. 5, may include the following steps:
Step 201': count the data volume of every row in the test result, and arrange the rows of the test result in the two-dimensional table in ascending order of data volume. Compared with step 201, this adds a step that sorts the rows by data volume. For example, the data volumes of the first to fifth rows are 10M, 12M, 13M, 15M, and 20M, in ascending order. The expected results can likewise be arranged in advance in ascending order of row data volume so that they correspond row by row to the test result.
Step 202': match, row by row, the data volume of each row in the test result against the data volume of the corresponding row in the expected results. Taking the above five rows as an example, the row data volumes of the test result form the sequence 10, 12, 13, 15, 20. The data volumes of the rows match only if the row data volumes of the expected results form the same sequence; if any value differs, the match fails.
Step 203': after the data volumes of all rows match, match, row by row in the sorted order, the specific data of each row in the test result against the specific data of the corresponding row in the expected results. Continuing the five-row example, after the data volumes of all rows match, the first row, which has the smallest data volume, is matched first, followed by the second, third, fourth, and fifth rows. The smaller the data volume, the fewer bytes the row occupies and the faster it is matched. Matching in ascending order therefore completes as many rows as possible in a given time, which makes it easier to find unmatched rows early, improves matching speed, and saves test run time.
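The sorted variant of steps 201' to 203' can be sketched as follows. As before, row_size is a hypothetical size measure that the source does not define. The tie-breaker on the row's string form is also an assumption of this sketch: without it, rows of equal size could be ordered differently in the two tables and cause a false mismatch.

```python
def row_size(row):
    """Hypothetical size measure: UTF-8 byte length of the row's values."""
    return sum(len(str(v).encode("utf-8")) for v in row)

def sort_key(row):
    # Sort by size first; break ties by content so both tables order equal-sized
    # rows the same way (an assumption, not stated in the source).
    return (row_size(row), tuple(str(v) for v in row))

def match_rows_sorted(actual_rows, expected_rows):
    """Sort both tables by row size, compare size sequences, then compare rows."""
    actual = sorted(actual_rows, key=sort_key)
    expected = sorted(expected_rows, key=sort_key)
    # Step 202': the two size sequences must be identical.
    if [row_size(r) for r in actual] != [row_size(r) for r in expected]:
        return False, "size_sequence"
    # Step 203': compare contents, smallest rows first.
    for i, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            return False, f"sorted_row_{i}"
    return True, "matched"
```

Because the smallest rows are compared first, many rows are checked early in the run, so a mismatch tends to surface sooner.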
In combination with this embodiment, there is a further preferred implementation. When any of the three matching steps in step 20 fails, that is, when the data structures or the row counts of the test result and the expected results do not match, or the data of any row do not match, the unmatched differing data are used to find where the failure originated. The specific method is as follows:
First, according to the data matching result, the differing data between the test result and the expected results are output and fed back to the tester.
Then, the source data are imported into the preset test flow again for processing, that is, step 10 is re-executed. The difference is that, during the rerun, the intermediate data generated by each stage are recorded, such as the data after data reading, the data after data filtering, and the data after each data cleaning or data conversion operation. All of these are intermediate data that need to be recorded.
Finally, the differing data are matched against the intermediate data generated by each stage, thereby determining the stage of the test flow in which the differing data appeared, and a test report is generated and fed back to the tester. For example, if the differing data match the data after data cleaning, the differing data appeared during data cleaning, which indicates a problem in the data cleaning design. Generating the corresponding test report and feeding it back lets the tester identify the faulty stage promptly and adjust the data cleaning process specifically, rather than blindly adjusting the entire test flow.
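Locating the stage in which the differing data appeared can be sketched as a search over the recorded intermediate data. The sketch assumes rows are hashable tuples and that the stages are listed in processing order; the function name locate_stage and the stage names are hypothetical. It returns the earliest stage whose output already contains every differing row.

```python
def locate_stage(diff_rows, stage_outputs):
    """Return the first stage whose output contains all differing rows, else None.

    `stage_outputs` is a list of (stage_name, rows) pairs in processing order.
    """
    if not diff_rows:
        return None
    for name, rows in stage_outputs:
        present = set(rows)
        if all(row in present for row in diff_rows):
            return name
    return None

# Hypothetical intermediate data: the cleaning stage corrupts the date field.
stages = [
    ("read", [("Ann", "1990/01/05")]),
    ("filter", [("Ann", "1990/01/05")]),
    ("clean", [("Ann", "1990-01-O5")]),
    ("load", [("Ann", "1990-01-O5")]),
]
stage = locate_stage([("Ann", "1990-01-O5")], stages)
```

Here the differing row first shows up after cleaning, so the report would point the tester at the data cleaning design.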
In the above method, the differing data are used to determine the stage that caused the failed match, and this is fed back to the tester. This helps the tester locate the problem in the test flow quickly and accurately and make targeted adjustments, improving debugging efficiency.
In step 20, during the row-by-row matching of the third step, data matching stops when any row in the test result first fails to match the corresponding row in the expected results, and the corresponding failure is recorded. Alternatively, data matching stops when the proportion of differing data between the test result and the expected results reaches a preset threshold, and the corresponding failure is recorded. The preset threshold can be adjusted by the tester as needed. For example, when the accuracy requirement is low and small errors are tolerated, the preset threshold can be set to 2%: matching continues as long as the differing data stay within 2%, and once they exceed 2% the match is considered failed and the test can end. As another example, the tester can configure matching to continue after a failed match and to end only when the preset threshold is reached, with the failed data fed back to the tester for analysis.
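Both stopping policies can be expressed with one threshold parameter, as in the sketch below. Following the 2% example in the text, matching stops once the proportion of differing rows exceeds the threshold; a threshold of 0.0 therefore stops at the first mismatch. The function name match_with_threshold and the choice to measure the proportion against the number of expected rows are assumptions of this sketch.

```python
def match_with_threshold(actual_rows, expected_rows, max_diff_ratio=0.0):
    """Match row by row; stop once the share of differing rows exceeds the threshold.

    Returns (matched, indices_of_differing_rows_seen_so_far).
    """
    total = len(expected_rows)
    diffs = []
    for i, (a, e) in enumerate(zip(actual_rows, expected_rows)):
        if a != e:
            diffs.append(i)
            if len(diffs) / total > max_diff_ratio:
                return False, diffs  # stop early; later rows are not compared
    return True, diffs

# Hypothetical data: 100 rows with three differing rows.
expected = [(i,) for i in range(100)]
actual = list(expected)
actual[5], actual[50], actual[70] = (-1,), (-2,), (-3,)
strict = match_with_threshold(actual, expected)          # stop at first mismatch
tolerant = match_with_threshold(actual, expected, 0.02)  # tolerate up to 2%
```

The returned indices are the differing data that would be recorded and fed back to the tester.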
Step 10 is described in further detail below with reference to Fig. 6. It includes the following steps:
Step 101: create a data synchronization test flow, add a data reading component, a data cleaning and conversion component, and a data loading component to the test flow, and set the source table to be synchronized, wherein the source table stores the source data to be synchronized. The data reading component extracts data from the data source into the source table, the data cleaning and conversion component cleans and converts the data, and the data loading component loads the data into the target table. Each data component is a functional component of DMETL and can be called directly.
Before the test flow is created, the connection to the DMETL server usually needs to be tested first; if the connection fails, exception information is thrown. The corresponding project and conversion flow are then created: for example, a project named "automatic test engineering" can be created, with a conversion flow named "test data is synchronous" under it. The DMETL data reading component, data cleaning and conversion component, and data loading component are added to the "test data is synchronous" conversion flow. The source table to be synchronized is set according to the relevant information written into the ETL system tables. Preferably, a custom batch cache size is also set for the source data to be synchronized in the source table, which improves synchronization efficiency. For example, if 1G of data must be synchronized, it can be set to complete in 4 batches of 256M each, which is faster than synchronizing 1G of data at once.
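The batch arithmetic in the 1G / 256M example can be sketched as a simple batch planner. The function name plan_batches is hypothetical, and the sketch only computes (offset, length) pairs; how DMETL actually applies its batch cache size is not described in the source.

```python
def plan_batches(total_bytes, batch_bytes):
    """Yield (offset, length) pairs covering total_bytes in chunks of at most batch_bytes."""
    if batch_bytes <= 0:
        raise ValueError("batch_bytes must be positive")
    offset = 0
    while offset < total_bytes:
        length = min(batch_bytes, total_bytes - offset)
        yield offset, length
        offset += length

MB = 1024 * 1024
# 1G of data synchronized in batches of 256M, as in the example above.
plan = list(plan_batches(1024 * MB, 256 * MB))
```

When the total is not a multiple of the batch size, the last batch is simply shorter.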
Step 102, different conversion designs are carried out to testing process according to test function, the conversion designs include increment It is one or more in data Synchronization Design, data filtering design and data cleansing conversion designs.
Depending on how the incremental data is obtained, the incremental data synchronization design in turn includes the data synchronization designs of trigger increment, MD5 increment, shadow table increment, and the incremental comparison component. The trigger increment design is: create triggers on the data source to capture incremental data changes and operations, and record them in the DMETL system tables, so that the incremental data can be synchronized. The MD5 increment design is: compute the MD5 value of each row of data and record it in an MD5 table created by DMETL, obtain the incremental data and operations by primary key matching, and record them in the DMETL system tables, so that the incremental data can be synchronized. The shadow table increment design is: copy the source table data into a shadow table created by DMETL, obtain the incremental data and operations by primary key matching, and record them in the DMETL system tables, so that the incremental data can be synchronized. The incremental comparison component design is: sort the source table and the target table at the database level, compare them under configured conditions such as unique matching columns and comparison columns, obtain the incremental data and operations, convert them in code into SQL that is sent to the target database, and thereby complete the incremental data synchronization.
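The MD5 increment idea (hash every row, then match by primary key against the hashes recorded at the previous synchronization) can be sketched in Python. This is an illustration under stated assumptions, not DMETL's implementation: the MD5 table is modeled as a dictionary from primary key to digest, and the names `row_md5` and `md5_increment` are hypothetical.

```python
import hashlib
from typing import Dict, Hashable, List, Sequence, Tuple


def row_md5(row: Sequence[object]) -> str:
    """MD5 digest of a row; fields are joined with a unit separator."""
    joined = "\x1f".join("" if value is None else str(value) for value in row)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def md5_increment(
    source_rows: Dict[Hashable, Sequence[object]],
    md5_table: Dict[Hashable, str],
) -> Tuple[List[Tuple[str, Hashable]], Dict[Hashable, str]]:
    """Return the detected (operation, primary key) pairs and the new MD5 table."""
    changes: List[Tuple[str, Hashable]] = []
    new_table: Dict[Hashable, str] = {}
    for key, row in source_rows.items():
        digest = row_md5(row)
        new_table[key] = digest
        if key not in md5_table:
            changes.append(("INSERT", key))   # key unseen at the last sync
        elif md5_table[key] != digest:
            changes.append(("UPDATE", key))   # same key, different content
    for key in md5_table:
        if key not in source_rows:
            changes.append(("DELETE", key))   # key vanished from the source
    return changes, new_table
```

The shadow table design follows the same primary key matching, but compares against a full copy of the rows instead of their digests.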
The data filtering design is: an if condition can be used to filter the source data in the source table according to the filter condition, so that only data satisfying the filter condition is synchronized to the target table.
Data cleaning conversion falls into three classes. For field deletion, merging, and splitting: the DMETL data cleaning conversion component can be configured to delete, merge, or split fields according to conditions such as position or separator. For field content cleaning: field contents can be cleaned by Java functions. For date-time string formatting: the DMETL data cleaning conversion component is configured and the format() method of the date conversion is called to format the date-time string; if an error occurs, exception information is thrown.
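The splitting, merging, and date-time formatting cases can be illustrated with a small Python analogue. The text above refers to Java functions and a format() method inside DMETL; the helpers below (`split_field`, `merge_fields`, `format_datetime`) are hypothetical stand-ins built on the Python standard library and only show the behavior, including raising an exception when formatting fails.

```python
from datetime import datetime
from typing import List, Sequence


def split_field(value: str, separator: str) -> List[str]:
    """Split one field into several fields at each separator."""
    return value.split(separator)


def merge_fields(parts: Sequence[str], separator: str) -> str:
    """Merge several fields into one, joined by the separator."""
    return separator.join(parts)


def format_datetime(text: str, source_format: str, target_format: str) -> str:
    """Re-format a date-time string; raise ValueError if it cannot be parsed."""
    try:
        parsed = datetime.strptime(text, source_format)
    except ValueError as exc:
        raise ValueError(
            f"cannot parse {text!r} with format {source_format!r}"
        ) from exc
    return parsed.strftime(target_format)
```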
Step 103: through the added data components, the source data undergoes data synchronization processing according to the designed test flow. Following the flow designed in step 102, data reading, data interaction conversion, and data loading complete the synchronization of the source data from the source table to the target table, and thereby complete the corresponding synchronization test process.
In summary, the ETL software testing method provided by the invention enables automated testing of ETL software, including acquisition of the data source, data interaction conversion, and result match checking. When result matching is checked, matching can proceed step by step from simple to difficult, which greatly reduces the test run time and improves matching efficiency. Meanwhile, the unmatched differing data can be put to effective use and fed back to the tester promptly: by matching the differing data against the intermediate data, the link that caused the match failure can be determined, which helps the tester quickly and accurately locate the problem in the test flow, make targeted adjustments, and improve debugging efficiency.
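As a rough picture of how the three components cooperate in step 103, the following Python sketch runs an in-memory flow. It is a simplification under stated assumptions: tables are plain lists of tuples, and `run_sync_flow`, `keep`, and `clean` are hypothetical names standing in for the configured components rather than DMETL calls.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[object, ...]


def run_sync_flow(
    source_table: Sequence[Row],
    keep: Callable[[Row], bool],
    clean: Callable[[Row], Row],
) -> List[Row]:
    """Read each source row, filter it, clean it, and load it into the target."""
    target_table: List[Row] = []
    for row in source_table:              # data reading
        if not keep(row):                 # data filtering design
            continue
        target_table.append(clean(row))   # cleaning conversion, then loading
    return target_table


source = [(1, " alice "), (2, " bob "), (3, " carol ")]
target = run_sync_flow(
    source,
    keep=lambda row: row[0] != 2,               # filter condition
    clean=lambda row: (row[0], row[1].strip()),  # field content cleaning
)
```

The resulting `target` plays the role of the test result that is later matched against the pre-generated expected results.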
Embodiment 2:
On the basis of the ETL software testing method provided in Embodiment 1, the invention also provides an ETL software testing device that can be used to implement the above method. Fig. 7 is a schematic diagram of the device architecture of this embodiment of the invention. The ETL software testing device of this embodiment includes one or more processors 21 and a memory 22; Fig. 7 takes one processor 21 as an example.
The processor 21 and the memory 22 can be connected by a bus or by other means; Fig. 7 takes a bus connection as an example.
As a non-volatile computer-readable storage medium for the ETL software testing method, the memory 22 can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the ETL software testing method in Embodiment 1. By running the non-volatile software programs, instructions, and modules stored in the memory 22, the processor 21 executes the various functional applications and data processing of the ETL software testing device, that is, implements the ETL software testing method of Embodiment 1.
The memory 22 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 22 optionally includes memory located remotely from the processor 21, and such remote memory can be connected to the processor 21 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the ETL software testing method of Embodiment 1 above, for example the steps shown in Fig. 1 and Figs. 3-6 described above.
Those of ordinary skill in the art will appreciate that all or part of the steps of the methods in the embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.
The above are merely preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the protection scope of the invention.

Claims (10)

1. An ETL software testing method, characterized in that expected results are generated in advance for source data, the method comprising:
importing the source data into a preset test flow for processing, wherein the test flow includes data reading, data interaction conversion, and data loading;
obtaining the processed test result, matching the test result against the expected results step by step, and recording the matching result;
feeding the recorded matching result back to the tester;
wherein the test result and the expected results each consist of a two-dimensional table, and matching the test result against the expected results step by step specifically comprises: matching the data structure of the test result with the data structure of the expected results; after the data structures match successfully, matching the number of rows of the test result with the number of rows of the expected results; and after the row counts match successfully, matching each row of data in the test result against the corresponding row of data in the expected results, row by row.
2. The ETL software testing method according to claim 1, characterized in that matching the data structure of the test result with the data structure of the expected results specifically comprises: matching the number of columns of the test result with the number of columns of the expected results; and after the column counts match successfully, matching the data definition of each column of the test result with the data definition of each column of the expected results.
3. The ETL software testing method according to claim 1, characterized in that after the row counts of the test result and the expected results match successfully, the method further comprises: counting the total data volume of the test result, and matching the total data volume of the test result with the total data volume of the expected results; and after the total data volumes match successfully, proceeding to match each row of data in the test result against the corresponding row of data in the expected results.
4. The ETL software testing method according to claim 1 or 3, characterized in that matching each row of data in the test result against the corresponding row of data in the expected results specifically comprises:
counting the data volume of each row in the test result;
matching, row by row, the data volume of each row in the test result with the data volume of the corresponding row in the expected results;
after the data volumes of all rows match successfully, matching, row by row, the specific data of each row in the test result with the specific data of the corresponding row in the expected results.
5. The ETL software testing method according to claim 1 or 3, characterized in that matching each row of data in the test result against the corresponding row of data in the expected results specifically comprises:
counting the data volume of each row in the test result, and arranging the rows of the two-dimensional table of the test result in ascending order of data volume;
matching, row by row, the data volume of each row in the test result with the data volume of the corresponding row in the expected results;
after the data volumes of all rows match successfully, matching, row by row in the arranged order, the specific data of each row in the test result with the specific data of the corresponding row in the expected results.
6. The ETL software testing method according to claim 1, characterized in that when the data structures of the test result and the expected results do not match, or the row counts do not match, or the data of any row does not match between the test result and the expected results, the method further comprises:
exporting the differing data between the test result and the expected results according to the data matching result;
importing the source data into the preset test flow again for processing, and recording the intermediate data generated by each link during processing;
matching the differing data against the intermediate data generated by each link, thereby determining the link of the test flow in which the differing data appears, and generating a test report that is fed back to the tester.
7. The ETL software testing method according to claim 1, characterized in that, in the course of matching each row of data in the test result against the corresponding row of data in the expected results, data matching is stopped when any row of data in the test result first fails to match the corresponding row of data in the expected results; or data matching is stopped when the proportion of differing data between the test result and the expected results reaches a preset threshold.
8. The ETL software testing method according to claim 1, characterized in that importing the source data into the preset test flow for processing specifically comprises:
creating a data synchronization test flow, adding a data reading component, a data cleaning conversion component, and a data loading component to the test flow, and setting the source table to be synchronized, wherein the source table stores the source data to be synchronized;
applying different conversion designs to the test flow according to the function under test, the conversion designs including one or more of an incremental data synchronization design, a data filtering design, and a data cleaning conversion design;
through the added data components, subjecting the source data to data synchronization processing according to the designed test flow.
9. The ETL software testing method according to claim 1, characterized in that after the recorded matching result is fed back to the tester, the method further comprises writing and executing an ant script; writing the ant script specifically comprises:
writing a script for the preconditions of the test flow;
calling the code written for importing the source data into the test flow and for obtaining and matching the test result, to complete the test flow and the data matching;
calling the code written for feeding back the matching result, to complete the feedback of the matching result;
writing a recovery script for restoring the source data and the expected results of the test to their initial state;
executing the ant script specifically comprises:
executing the ant script at a preset period, thereby performing periodic ETL software testing.
10. An ETL software testing device, characterized by comprising at least one processor and a memory, the at least one processor and the memory being connected by a data bus, the memory storing instructions executable by the at least one processor, the instructions, when executed by the processor, being used to carry out the ETL software testing method of any one of claims 1-9.
CN201811366641.3A 2018-11-16 2018-11-16 ETL software testing method and device Active CN109634846B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811366641.3A CN109634846B (en) 2018-11-16 2018-11-16 ETL software testing method and device

Publications (2)

Publication Number Publication Date
CN109634846A true CN109634846A (en) 2019-04-16
CN109634846B CN109634846B (en) 2021-10-19

Family

ID=66068184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811366641.3A Active CN109634846B (en) 2018-11-16 2018-11-16 ETL software testing method and device

Country Status (1)

Country Link
CN (1) CN109634846B (en)

Also Published As

Publication number Publication date
CN109634846B (en) 2021-10-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Applicant after: Wuhan dream database Co., Ltd

Address before: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Applicant before: WUHAN DAMENG DATABASE Co.,Ltd.

CB03 Change of inventor or designer information

Inventor after: Fu Bowen

Inventor after: Yu Yuanlan

Inventor after: Feng Yuan

Inventor before: Fu Bowen

Inventor before: Yu Yuanlan

Inventor before: Feng Yuan

Inventor before: Fu Quan

GR01 Patent grant