CN114490413A

CN114490413A - Test data preparation method and device, storage medium and electronic equipment

Info

Publication number: CN114490413A
Application number: CN202210134172.2A
Authority: CN
Inventors: 王颖; 曹雯葭; 陆媛媛; 王琼璞
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2022-02-14
Filing date: 2022-02-14
Publication date: 2022-05-13

Abstract

The application discloses a test data preparation method and device, a storage medium and electronic equipment, and relates to the field of big data. The method comprises the following steps: acquiring a target batch script, wherein the target batch script is used for processing big data; analyzing the target code in the target batch script to obtain a tree structure of the target code and operators of the target code, wherein the tree structure represents structural information among the codes in the target code, and the operators represent mapping relations among the codes in the target code; generating a target data table and a target field in the target data table according to the tree structure and the operators; and analyzing the target data table and the target field to generate target test data. The method and the device solve the problem in the related art that test data preparation is inefficient because the test data is prepared manually.

Description

Test data preparation method and device, storage medium and electronic equipment

Technical Field

The application relates to the field of big data, in particular to a test data preparation method and device, a storage medium and electronic equipment.

Background

In the related art, when a big data batch script is tested, the test data is mainly prepared manually, and the program is then run automatically in a test environment for verification. The specific steps are as follows:

(1) manually analyzing and compiling the data-extraction logic list: a data-extraction script required for testing the big data batch script is written, wherein the data-extraction script consists of a series of SQL statements; the source data fields required by the target script under test and the operator logic to be tested must be accurately summarized from the big data batch script and located in the test database;

(2) manually preparing test data locally according to the compiled list: the script written manually in step (1) is executed to obtain the source data to be tested;

(3) manually importing the data into the test environment for testing;

(4) automatically loading and testing the big data script: the big data script under test is executed, the execution process is observed, and finally whether the running result meets expectations is checked.

However, the current big data testing schemes in the related art provide no method for automatically analyzing and fabricating test data, and test data preparation is still manual. Moreover, manual analysis takes considerable time: testers must read the big data batch script code themselves, analyze it, and write the data-extraction script using professional coding knowledge. In addition, when analyzing the code under test, testers differ in technical skill and in their understanding of the code logic, so manually written data-fabrication scripts are error-prone, which leads to rework, low working efficiency and similar problems.

No effective solution has yet been proposed for the problem in the related art that test data preparation is inefficient because the test data is prepared manually.

Disclosure of Invention

The application mainly aims to provide a method and a device for preparing test data, a storage medium and electronic equipment, so as to solve the problem that the efficiency of preparing the test data is low due to the fact that the test data is prepared manually in the related art.

In order to achieve the above object, according to one aspect of the present application, there is provided a method of preparing test data. The method comprises the following steps: acquiring a target batch script, wherein the target batch script is used for processing big data; analyzing the target codes in the target batch scripts to obtain a tree structure of the target codes and operators of the target codes, wherein the tree structure represents structural information among the codes in the target codes, and the operators represent mapping relations among the codes in the target codes; generating a target data table and a target field in the target data table according to the tree structure and the operator; and analyzing the target data table and the target field to generate target test data.

Further, after parsing the target data table and the target field to generate target test data, the method further includes: determining a target test file, wherein the target test file comprises two or more target test data; and verifying the correctness of the target batch script by utilizing the target test file.

Further, before verifying the correctness of the target batch script using the target test file, the method further comprises: determining the data volume of each target test data in the target test file and the association degree between every two target test data; obtaining a predicted result of running the target batch script according to the data volume and the correlation degree; determining a test case according to the target test data in the target test file; and executing the target batch scripts according to the test cases to obtain test results of the target batch scripts.

Further, verifying the correctness of the target batch script using the target test file comprises: judging whether the predicted result is the same as the test result; if the predicted result is the same as the test result, representing that the target batch script is correct; and if the predicted result is different from the test result, representing that the target batch script is wrong.

Further, analyzing the target data table and the target field, and generating target test data includes: determining an operator of the target field; analyzing an associated field in a target source table according to the operator of the target field, wherein the target source table is a table associated with the target data table; extracting data from the target source table according to the associated fields; and generating the target test data according to the data extracted from the source table.

Further, after analyzing the object codes in the object batch scripts to obtain the tree structures and operators corresponding to the object codes, the method further comprises: generating a first target script and a second target script according to the tree structure and the operator; analyzing the target code by using the first target script to obtain the target data table and the operator of the target field; and generating the target test data according to the second target script and the target field.

Further, determining the target test file comprises: acquiring first target test data and second target test data according to a preset mode, wherein the first target test data is target test data which accords with the target batch script operation scene, and the second target test data is target test data which does not accord with the target batch script operation scene; and determining the target test file according to the first target test data and the second target test data.

Further, before analyzing the target code in the target batch script to obtain the tree structure of the target code and the operators of the target code, the method further includes: judging whether the target batch script contains unparsed code; if unparsed code exists in the target batch script, taking the unparsed code as the target code; and if no unparsed code exists in the target batch script, determining that parsing of the target batch script is complete.

In order to achieve the above object, according to another aspect of the present application, there is provided a test data preparing apparatus. The device includes: a first obtaining unit, used for obtaining a target batch script, wherein the target batch script is used for processing big data; a first analysis unit, used for analyzing the target code in the target batch script to obtain a tree structure of the target code and operators of the target code, wherein the tree structure represents structural information among the codes in the target code, and the operators represent mapping relations among the codes in the target code; a first generation unit, used for generating a target data table and a target field in the target data table according to the tree structure and the operators; and a second analysis unit, used for analyzing the target data table and the target field to generate target test data.

Further, the apparatus further comprises: a first determining unit, configured to determine a target test file after analyzing the target data table and the target field and generating target test data, where the target test file includes two or more target test data; and a first verification unit, used for verifying the correctness of the target batch script by using the target test file.

Further, the apparatus further comprises: a second determining unit, configured to determine a data amount of each target test data in the target test file and a correlation between every two target test data before verifying correctness of the target batch script by using the target test file; the first processing unit is used for obtaining a predicted result of running the target batch script according to the data volume and the correlation degree; the third determining unit is used for determining a test case according to the target test data in the target test file; and the second processing unit is used for executing the target batch scripts according to the test cases to obtain the test results of the target batch scripts.

Further, the first verification unit includes: a first judgment module, used for judging whether the predicted result is the same as the test result; a first processing module, used for representing that the target batch script is correct if the predicted result is the same as the test result; and a second processing module, used for representing that the target batch script is wrong if the predicted result is different from the test result.

Further, the second parsing unit includes: a first determining module, configured to determine an operator of the target field; the first analysis module is used for analyzing the associated fields in the target source table according to the operators of the target fields, wherein the target source table is a table associated with the target data table; the first extraction module is used for extracting data from the target source table according to the associated field; and the first generation module is used for generating the target test data according to the data extracted from the source table.

Further, the apparatus further comprises: a second generation unit, used for generating a first target script and a second target script according to the tree structure and the operator after analyzing the target code in the target batch script to obtain the tree structure and the operator corresponding to the target code; a third processing unit, used for analyzing the target code by using the first target script to obtain the target data table and the operator of the target field; and a third generating unit, used for generating the target test data according to the second target script and the target field.

Further, the first determination unit includes: the first obtaining module is used for obtaining first target test data and second target test data according to a preset mode, wherein the first target test data is the target test data which accords with the running scene of the target batch script, and the second target test data is the target test data which does not accord with the running scene of the target batch script; and the second determining module is used for determining the target test file according to the first target test data and the second target test data.

Further, the apparatus further comprises: the first judgment unit is used for judging whether unresolved codes exist in the target batch script or not before analyzing the target codes in the target batch script to obtain a tree structure of the target codes and operators of the target codes; a fourth processing unit, configured to, if the unresolved code exists in the target batch script, take the unresolved code as the target code; and the fifth processing unit is used for representing that the target batch script is analyzed and completed if the unresolved codes do not exist in the target batch script.

In order to achieve the above object, according to another aspect of the present application, there is provided a computer-readable storage medium storing a program, wherein the program performs the method of preparing test data according to any one of the above.

To achieve the above object, according to another aspect of the present application, there is provided an electronic device including one or more processors and a memory for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method for preparing test data according to any one of the above.

Through the application, the following steps are adopted: acquiring a target batch script, wherein the target batch script is used for processing big data; analyzing the target codes in the target batch scripts to obtain a tree structure of the target codes and operators of the target codes, wherein the tree structure represents structural information among the codes in the target codes, and the operators represent mapping relations among the codes in the target codes; generating a target data table and a target field in the target data table according to the tree structure and the operator; the target data table and the target field are analyzed to generate target test data, and the problem that the test data preparation efficiency is low due to the fact that the test data is prepared manually in the related technology is solved. According to the tree structure of the target codes and the operator of the target codes, which are obtained by analyzing the target codes in the obtained target batch scripts, the target data table and the target fields in the target data table are generated, the target data table and the target fields are analyzed, and target test data are generated, so that test data can be automatically generated, and the effect of improving the preparation efficiency of the test data is achieved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:

FIG. 1 is a flow chart of a method for preparing test data provided according to an embodiment of the present application;

FIG. 2 is a flow chart of testing a big data batch script in an embodiment of the present application;

FIG. 3 is a flow chart of automated parsing of upstream data tables and corresponding fields in an embodiment of the present application;

FIG. 4 is a flow diagram of automated parsing of big data batch scripts in an embodiment of the present application;

FIG. 5 is a schematic diagram of an apparatus for preparing test data according to an embodiment of the present application;

FIG. 6 is a schematic diagram of an electronic device provided according to an embodiment of the present application.

Detailed Description

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be noted that, the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) referred to in the present disclosure are information and data authorized by the user or sufficiently authorized by each party.

The present invention is described below with reference to preferred implementation steps. FIG. 1 is a flowchart of a method for preparing test data according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:

Step S101, a target batch script is obtained, wherein the target batch script is used for processing big data.

In the application, big data refers to a data set that cannot be captured, managed and processed by conventional software tools within a certain time range; it is a massive, fast-growing and diversified information asset that yields stronger decision-making power, insight and process-optimization capability only under new processing modes. Batch scripts are a series of programs that run on a big data platform and are used for extracting and reprocessing (classifying, summarizing, calculating, etc.) data; they are generally written in database scripting languages, such as SQL (Structured Query Language) statements and stored procedures. The big data platform is a system for managing big data batch scripts; it provides program monitoring, alarming and management of the upstream and downstream dependencies among big data batches, so that the batches execute their steps in the correct order.

For example, a batch script for processing big data is acquired, that is, a series of programs running on a big data platform are acquired, and by using the programs, data can be extracted and reprocessed (classified, summarized, calculated, and the like).

Step S102, analyzing the target code in the target batch script to obtain a tree structure of the target code and operators of the target code, wherein the tree structure represents structural information among the codes in the target code, and the operators represent mapping relations among the codes in the target code.

In the present application, a tree structure is a hierarchical nested structure, i.e. the outer layer and the inner layer of the tree have similar structures, so the structure can be represented recursively. The trees of classical data structures are typical examples: a tree can be represented simply as a root, a left subtree and a right subtree, and the left and right subtrees in turn have their own subtrees. An operator refers to a mapping from one function space to another; the operator relations include mathematical formula operations on fields, classification, summarization, logical condition judgment, sorting, type conversion and the like. An upstream batch is a previous-level batch program on which the current batch depends: the current batch program can be executed only after the upstream batch has executed correctly, and otherwise it must wait until the upstream batch has finished executing. A big data batch program often has multiple upstream batches. A downstream batch is a next-level batch program that depends on the current batch: the downstream batch program can be executed only after the current batch has executed correctly. The dependency relationship represents the order of the big data batch tasks; the tasks must be executed in sequence according to the specific upstream and downstream relationships so that the data processing can finally be completed correctly.
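The upstream and downstream dependency relationship described above amounts to a topological ordering of batches. The following is a minimal sketch of that idea, not the method of the application: the batch names and the `upstream_of` mapping are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) is used only as a convenient stand-in for a platform scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical batches: each key maps to the set of upstream batches it depends on.
upstream_of = {
    "load_accounts": set(),
    "load_transactions": set(),
    "daily_summary": {"load_accounts", "load_transactions"},
    "risk_report": {"daily_summary"},
}


def execution_order(dependencies):
    """Return an order in which every batch runs after all of its upstream batches."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(upstream_of)
print(order)
```

Any order returned this way places `daily_summary` after both loading batches and `risk_report` last, which is the constraint the dependency relationship expresses.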

For example, after the library names and table names have been parsed, the table fields used, the field associations and the filter conditions are parsed from the most recent segment of code in the program; the parsing proceeds layer by layer and finally yields the upstream and downstream tree structure and the relational logic operators.
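As an illustration of what such a parsing step might produce, the sketch below extracts a small tree and a set of operators from a single `INSERT ... SELECT` statement. It is a toy under stated assumptions: the `TableNode` class, the `parse_insert_select` function and the sample SQL are all hypothetical, and a regular-expression approach like this would not cope with subqueries, commas inside function calls or quoted identifiers. The application does not disclose how its parser is implemented.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TableNode:
    """A table in the upstream/downstream tree, with the tables it reads from."""
    name: str
    upstream: list = field(default_factory=list)


def parse_insert_select(sql):
    """Toy parser for: INSERT INTO t SELECT expr AS col, ... FROM s1 JOIN s2 ON cond.

    Returns (tree, operators, join_condition).
    """
    flat = " ".join(sql.split())
    target = re.search(r"INSERT INTO (\w+)", flat, re.I).group(1)
    select_list = re.search(r"SELECT (.+?) FROM ", flat, re.I).group(1)
    sources = re.findall(r"(?:FROM|JOIN) (\w+)", flat, re.I)
    on_clause = re.search(r" ON (.+?)(?: WHERE |$)", flat, re.I)
    join_condition = on_clause.group(1) if on_clause else ""

    # Operators: each target field mapped to the expression that computes it.
    operators = {}
    for item in select_list.split(","):
        parts = re.split(r"\s+AS\s+", item.strip(), flags=re.I)
        expression = parts[0]
        # Without an alias, fall back to the bare column name.
        column = parts[1] if len(parts) == 2 else expression.split(".")[-1]
        operators[column] = expression

    tree = TableNode(target, [TableNode(name) for name in sources])
    return tree, operators, join_condition


SQL = """
INSERT INTO A
SELECT B.Id AS Id, B.Amount * C.Rate AS Val
FROM B JOIN C ON B.Id = C.Id
"""
tree, operators, join_condition = parse_insert_select(SQL)
print(tree.name, [node.name for node in tree.upstream], operators, join_condition)
```

Here the target table `A` is the downstream node, `B` and `C` are its upstream nodes, and the operators record that `Val` is computed from `B.Amount * C.Rate`.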

Step S103, generating a target data table and a target field in the target data table according to the tree structure and the operators.

For example, an upstream data table and corresponding fields are automatically generated in the database system according to the obtained upstream and downstream tree structure relational graph and the logic operator.
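One way to picture this step is to collect the fields that the operators reference in each upstream table and create those tables in a database. The sketch below does this with an in-memory SQLite database. The helper names `source_fields` and `create_upstream_tables`, the sample operators and the choice of SQLite are assumptions made for illustration only.

```python
import re
import sqlite3


def source_fields(operators, join_condition=""):
    """Collect, per upstream table, the fields referenced by operators and joins."""
    fields = {}
    for expression in list(operators.values()) + [join_condition]:
        # Match qualified references such as B.Amount (table.column).
        for table, column in re.findall(r"([A-Za-z_]\w*)\.([A-Za-z_]\w*)", expression):
            columns = fields.setdefault(table, [])
            if column not in columns:
                columns.append(column)
    return fields


def create_upstream_tables(conn, fields):
    """Create one table per upstream table, with untyped columns for each field."""
    for table, columns in fields.items():
        column_list = ", ".join(f'"{name}"' for name in columns)
        conn.execute(f'CREATE TABLE "{table}" ({column_list})')


# Hypothetical operators for target table A, as a parser might report them.
operators = {"Id": "B.Id", "Val": "B.Amount * C.Rate"}
fields = source_fields(operators, "B.Id = C.Id")
conn = sqlite3.connect(":memory:")
create_upstream_tables(conn, fields)
print(fields)
```

The columns are left untyped because the operators alone do not determine data types; a real tool would presumably take types from the source schema.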

Step S104, analyzing the target data table and the target field to generate target test data.

For example, the test data is obtained by analyzing the upstream data table and corresponding fields that are automatically generated in the database system. The process of generating test data is also called data fabrication, i.e. a method of preparing the source values for a big data test; an extraction method can be used for this purpose.

Through the steps S101 to S104, the target data table and the target field in the target data table are generated according to the tree structure of the target code and the operator of the target code obtained by analyzing the target code in the obtained target batch script, and the target test data is generated by analyzing the target data table and the target field, so that the test data can be automatically generated, and the effect of improving the preparation efficiency of the test data is achieved.

Optionally, in the method for preparing test data provided in the embodiment of the present application, after analyzing the target data table and the target field and generating the target test data, the method further includes: determining a target test file, wherein the target test file comprises two or more target test data; and verifying the correctness of the target batch script by using the target test file.

For example, after the test data is confirmed, an automation test is performed (data is imported, a batch program is executed, a data result is generated, and verification is performed). The method specifically comprises the following steps: determining a test file according to two or more obtained test data; and testing the batch scripts by using the test files, namely executing automatic tests after the test files are confirmed, and testing a series of programs corresponding to the batch scripts.

This scheme overcomes the drawbacks of manual test data preparation in the prior art: a standard tree-structure configuration model can be formed, the data-fabrication script is generated automatically, the data files required by the batch test are then generated, the automated test is carried out, and the correctness of the batch scheduling logic is verified at the same time. In addition, the program logic of the big data batch script is parsed automatically, and the operation logic applied to the corresponding fields (upstream and downstream associations, classification, summarization, mathematical formula calculation, conditional logic judgment and the like) is formed into logic operators automatically. On this basis the tree-structure relationship between upstream and downstream tables is identified, in which the most upstream nodes are the source data and the most downstream node is the final-layer data result of the test object, and the required test data file is generated automatically. This speeds up test data preparation, improves the accuracy of the data-extraction script compiled in the test phase, improves the preparation efficiency of test data, and optimizes the batch automated testing method.

Optionally, in the method for preparing test data provided in this embodiment of the present application, before verifying the correctness of the target batch script using the target test file, the method further includes: determining the data volume of each target test data in the target test file and the correlation degree between every two target test data; obtaining a predicted result of running the target batch scripts according to the data volume and the correlation degree; determining a test case according to target test data in the target test file; and executing the target batch scripts according to the test cases to obtain test results of the target batch scripts.

In this embodiment, FIG. 2 is a flowchart of testing a big data batch script in an embodiment of the present application. As shown in FIG. 2, the flow specifically includes: (1) automatically counting, through a tool, the data volume stored in the test file and the data association degree of the associated tables; (2) calculating the expected result for the existing test data according to that data volume and association degree; (3) automatically importing the data stored in the test file into a database through a tool; (4) deploying automated test cases according to the data stored in the test file; (5) automatically executing the big data batch script through a tool according to the test cases to obtain the test result of the big data batch script.
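The five steps of FIG. 2 can be sketched end to end on a tiny data set. Everything below is an assumption for illustration: the contents of `test_file`, the batch statement `BATCH_SQL`, the use of SQLite as the test database, and in particular the definition of the association degree as the share of rows in B whose Id also appears in C, which the application does not define.

```python
import sqlite3

# Hypothetical test file contents: rows for source tables B and C.
test_file = {
    "B": [(1, 100.0), (2, 200.0), (3, 300.0)],  # (Id, Amount)
    "C": [(1, 0.5), (2, 0.25)],                 # (Id, Rate)
}
# Hypothetical batch script under test.
BATCH_SQL = "INSERT INTO A SELECT B.Id, B.Amount * C.Rate FROM B JOIN C ON B.Id = C.Id"


def statistics(data):
    """Step 1: data volume per table and association degree of B with C on Id."""
    volume = {table: len(rows) for table, rows in data.items()}
    c_ids = {row[0] for row in data["C"]}
    matched = sum(1 for row in data["B"] if row[0] in c_ids)
    return volume, matched / volume["B"]


def predict(data):
    """Step 2: expected rows of A, computed independently of the database."""
    rate = dict(data["C"])
    return sorted((i, amount * rate[i]) for i, amount in data["B"] if i in rate)


def run_batch(data):
    """Steps 3 to 5: import the data, run the batch script, return its result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (Id INTEGER, Val REAL)")
    conn.execute("CREATE TABLE B (Id INTEGER, Amount REAL)")
    conn.execute("CREATE TABLE C (Id INTEGER, Rate REAL)")
    conn.executemany("INSERT INTO B VALUES (?, ?)", data["B"])
    conn.executemany("INSERT INTO C VALUES (?, ?)", data["C"])
    conn.execute(BATCH_SQL)
    return sorted(conn.execute("SELECT Id, Val FROM A").fetchall())


volume, degree = statistics(test_file)
predicted = predict(test_file)
result = run_batch(test_file)
print(volume, degree, predicted, result)
```

The point of computing `predicted` outside the database is that the expected result should not depend on the script being tested.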

In summary, by using the tool to automatically count the data volume stored in the test file and the data association degree of the associated tables, the expected result for the test data can be calculated quickly and accurately. In addition, the automated test cases can be deployed according to the data stored in the test file, so that the test result of the big data batch script is obtained automatically from the test cases and the big data batch script.

Optionally, in the method for preparing test data provided in the embodiment of the present application, verifying the correctness of the target batch script by using the target test file includes: judging whether the predicted result is the same as the test result or not; if the predicted result is the same as the test result, the target batch script is correct; and if the predicted result is different from the test result, representing that the target batch script is wrong.

For example, the generated test result can be automatically verified by determining whether the calculated predicted result is consistent with the obtained test result. Specifically: if the predicted result is the same as the test result, the big data batch script is correct; if the predicted result is different from the test result, the big data batch script is wrong.
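The comparison itself can be sketched as a multiset comparison of result rows. The `verify` function below is a hypothetical helper, and returning the differing rows alongside the verdict is an addition made here for diagnosability; the application only requires the same or different judgment.

```python
from collections import Counter


def verify(predicted, actual):
    """Return (is_correct, missing, unexpected) for two collections of result rows."""
    expected_rows, actual_rows = Counter(predicted), Counter(actual)
    missing = sorted((expected_rows - actual_rows).elements())     # predicted but absent
    unexpected = sorted((actual_rows - expected_rows).elements())  # present but not predicted
    return (not missing and not unexpected), missing, unexpected


same = verify([(1, 50.0), (2, 50.0)], [(2, 50.0), (1, 50.0)])
different = verify([(1, 50.0), (2, 50.0)], [(1, 50.0), (3, 75.0)])
print(same, different)
```

Using `Counter` rather than `set` keeps duplicate rows significant, which matters when a batch script wrongly emits a row twice.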

By the scheme, the verification efficiency of big data batch scripts can be improved.

Optionally, in the method for preparing test data provided in the embodiment of the present application, analyzing the target data table and the target field, and generating the target test data includes: determining an operator of the target field; analyzing an associated field in a target source table according to an operator of the target field, wherein the target source table is a table associated with a target data table; extracting data from the target source table according to the associated fields; and generating target test data according to the data extracted from the source table.

In this embodiment, FIG. 3 is a flowchart of automatically parsing an upstream data table and corresponding fields in the embodiment of the present application. As shown in FIG. 3, the flow specifically includes: (1) obtaining a segment of SQL (Structured Query Language) from the big data batch script; (2) using the parser to parse out the target table A and the operators of the Id field and the Val field of table A; (3) using the parser to parse out the associated field Id of source table B and source table C, from which the calculated field Val is obtained; (4) using the data file generator to extract data of source table B according to the Id field; (5) using the data file generator again, with reference to the extracted data of source table B, to extract data of source table C according to the Id field. The data extracted from source table B and source table C is the test data.
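Steps (4) and (5) can be sketched as two dependent queries: rows are taken from source table B first, and their Id values then drive the extraction from source table C. The `extract_test_data` function, the `limit` parameter, the column names `Amount` and `Rate`, and the SQLite test environment are all assumptions for illustration.

```python
import sqlite3


def extract_test_data(conn, limit=2):
    """Extract rows of source table B, then the rows of C that share B's Id values."""
    b_rows = conn.execute(
        "SELECT Id, Amount FROM B ORDER BY Id LIMIT ?", (limit,)
    ).fetchall()
    ids = [row[0] for row in b_rows]
    placeholders = ", ".join("?" for _ in ids)
    c_rows = conn.execute(
        f"SELECT Id, Rate FROM C WHERE Id IN ({placeholders}) ORDER BY Id", ids
    ).fetchall()
    return {"B": b_rows, "C": c_rows}


# Hypothetical test environment holding the source tables.
env = sqlite3.connect(":memory:")
env.execute("CREATE TABLE B (Id INTEGER, Amount REAL)")
env.execute("CREATE TABLE C (Id INTEGER, Rate REAL)")
env.executemany("INSERT INTO B VALUES (?, ?)", [(1, 100.0), (2, 200.0), (3, 300.0)])
env.executemany("INSERT INTO C VALUES (?, ?)", [(1, 0.5), (2, 0.25), (4, 0.9)])

extracted = extract_test_data(env, limit=2)
print(extracted)
```

Extracting C with reference to B is what keeps the two data sets associated, so that the join in the batch script has rows to match.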

By the scheme, the logic operator can be automatically analyzed to generate the tree structure, and the test data with the service scene is extracted, so that the test data is generated, and the preparation efficiency of the test data is improved.

Optionally, in the method for preparing test data provided in this embodiment of the present application, after analyzing the target code in the target batch script to obtain the tree structure and the operator corresponding to the target code, the method further includes: generating a first target script and a second target script according to the tree structure and the operator; analyzing the target code by using the first target script to obtain a target data table and an operator of a target field; and generating target test data according to the second target script and the target field.

For example, the first target script may be a parser, and the second target script may be a data file generator. As shown in FIG. 3, the parser parses out the target table A and the operators of the Id field and the Val field of table A; the data file generator extracts data of source table B according to the Id field and is then used again, with reference to the data of source table B, to extract data of source table C according to the Id field. Finally, the data extracted from source table B and source table C is the test data.

In summary, generating the test data by automatically parsing the big data batch script reduces the possibility of analysis errors, analysis omissions, preparation errors and preparation omissions, and also avoids manual preparation of the test data, so that both the accuracy and the efficiency of test data preparation are improved.

Optionally, in the method for preparing test data provided in the embodiment of the present application, determining the target test file includes: acquiring first target test data and second target test data according to a preset mode, wherein the first target test data is target test data that conforms to the running scenario of the target batch script, and the second target test data is target test data that does not conform to the running scenario of the target batch script; and determining the target test file according to the first target test data and the second target test data.

For example, according to the parsed tree structure and the relational logic operators, test data that satisfies the logic scenario and test data that does not satisfy the logic scenario are generated in proportion by extracting data from the test environment and modifying it, and a test file is formed.
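The proportional mix of conforming and non-conforming data can be sketched as follows. This is a hedged illustration only: the function `build_test_file`, the predicate `is_positive`, the mutation `negate_val`, the sample rows and the 50% ratio are all hypothetical stand-ins, since the present application does not specify the proportion or how extracted data is modified.

```python
def build_test_file(rows, conforms, mutate, total, conforming_ratio):
    """Mix rows that satisfy the logic scenario with rows that do not.

    `conforms` stands in for the parsed logical operator; `mutate` turns
    a conforming row into a non-conforming one (extraction, then modification).
    """
    pool = [row for row in rows if conforms(row)]
    if not pool:
        raise ValueError("no extracted row satisfies the scenario")
    n_ok = round(total * conforming_ratio)
    # Cycle through the pool so the requested row count is always reached.
    ok = [dict(pool[i % len(pool)]) for i in range(n_ok)]
    bad = [mutate(dict(pool[i % len(pool)])) for i in range(total - n_ok)]
    return ok + bad


# Hypothetical scenario: the script only processes rows with a positive Val.
def is_positive(row):
    return row["Val"] > 0


def negate_val(row):
    row["Val"] = -row["Val"]
    return row


extracted = [{"Id": 1, "Val": 15}, {"Id": 2, "Val": 8}, {"Id": 3, "Val": -4}]
test_file = build_test_file(extracted, is_positive, negate_val,
                            total=4, conforming_ratio=0.5)
```

Deriving the non-conforming rows from conforming ones keeps them realistic in every field except the one that the scenario condition inspects.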

In summary, the generated test file includes both data that conforms to the script running scenario and data that does not, so that the coverage of the test data is expanded and the accuracy of testing the big data batch script is improved.

Optionally, in the method for preparing test data provided in the embodiment of the present application, before parsing the target code in the target batch script to obtain the tree structure of the target code and the operator of the target code, the method further includes: judging whether unresolved code exists in the target batch script; if unresolved code exists in the target batch script, taking the unresolved code as the target code; and if no unresolved code exists in the target batch script, determining that parsing of the target batch script is complete.

In this embodiment, fig. 4 is a flowchart of automatically parsing a big data batch script in an embodiment of the present application. As shown in fig. 4, the process specifically includes: (1) inputting any single big data batch script; (2) locating the latest section of unresolved SQL code in the script, and if no unresolved SQL code exists, proceeding to the step of completing the automatic parsing; (3) inputting the unresolved SQL code into the parser to obtain the final upstream and downstream tree structure relational graph and the logical operators; (4) automatically generating the upstream data table and its corresponding fields in a database system according to the upstream and downstream tree structure relational graph and the logical operators; (5) marking the currently parsed SQL code as parsed, and returning to step (2); (6) completing the automatic parsing.
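The loop of fig. 4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the parser of the present application: splitting the script on semicolons, taking unresolved sections in script order, the regular-expression stand-in `parse_segment`, and the sample script are all hypothetical, and step (4), generating the upstream tables in a database system, is omitted.

```python
import re


def parse_segment(sql):
    """Toy stand-in for the parser: find the target table and source tables."""
    target = re.search(r"INSERT\s+INTO\s+(\w+)", sql, re.IGNORECASE)
    sources = re.findall(r"\b(?:FROM|JOIN)\s+(\w+)", sql, re.IGNORECASE)
    return {"target": target.group(1) if target else None, "sources": sources}


def parse_batch_script(script):
    """Parse unresolved sections one at a time until none remain."""
    segments = [s.strip() for s in script.split(";") if s.strip()]
    resolved = set()  # indices of sections already marked as parsed
    results = []
    while True:
        pending = [i for i in range(len(segments)) if i not in resolved]
        if not pending:      # step (2): nothing unresolved, parsing is complete
            break
        index = pending[0]   # assumption: take sections in script order
        results.append(parse_segment(segments[index]))  # step (3)
        resolved.add(index)  # step (5): mark as parsed, then loop again
    return results


# Hypothetical batch script with two SQL sections.
script = """
INSERT INTO A SELECT B.Id, B.X + C.Y AS Val FROM B JOIN C ON B.Id = C.Id;
INSERT INTO D SELECT Id FROM A;
"""
relations = parse_batch_script(script)
```

Each entry of `relations` records one upstream and downstream relationship, for example that table A is computed from tables B and C.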

Through this scheme, the big data batch script is parsed, and the logical operators and the tree structure can be output rapidly.

In summary, the preparation method of the test data provided by the embodiment of the present application obtains the target batch script, where the target batch script is used to process the big data; analyzing the target codes in the target batch scripts to obtain a tree structure of the target codes and operators of the target codes, wherein the tree structure represents structural information among the codes in the target codes, and the operators represent mapping relations among the codes in the target codes; generating a target data table and a target field in the target data table according to the tree structure and the operator; the target data table and the target field are analyzed to generate target test data, and the problem that the test data preparation efficiency is low due to the fact that the test data is prepared manually in the related technology is solved. According to the tree structure of the target codes and the operator of the target codes, which are obtained by analyzing the target codes in the obtained target batch scripts, the target data table and the target fields in the target data table are generated, the target data table and the target fields are analyzed, and target test data are generated, so that test data can be automatically generated, and the effect of improving the preparation efficiency of the test data is achieved.

It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from the one presented herein.

The embodiment of the present application further provides a device for preparing test data, and it should be noted that the device for preparing test data of the embodiment of the present application may be used to execute the method for preparing test data provided by the embodiment of the present application. The following describes a test data preparation apparatus provided in an embodiment of the present application.

Fig. 5 is a schematic diagram of a device for preparing test data according to an embodiment of the present application. As shown in fig. 5, the apparatus includes: a first acquisition unit 501, a first analysis unit 502, a first generation unit 503, and a second analysis unit 504.

Specifically, the first acquisition unit 501 is configured to acquire a target batch script, where the target batch script is used to process big data;

the first analysis unit 502 is configured to parse the target code in the target batch script to obtain a tree structure of the target code and operators of the target code, where the tree structure represents structural information among the codes in the target code, and the operators represent mapping relationships among the codes in the target code;

the first generation unit 503 is configured to generate a target data table and a target field in the target data table according to the tree structure and the operators;

the second analysis unit 504 is configured to parse the target data table and the target field to generate target test data.

To sum up, in the device for preparing test data provided in the embodiment of the present application, the first acquisition unit 501 acquires a target batch script, where the target batch script is used to process big data; the first analysis unit 502 parses the target code in the target batch script to obtain a tree structure of the target code and operators of the target code, where the tree structure represents structural information among the codes in the target code, and the operators represent mapping relationships among the codes in the target code; the first generation unit 503 generates a target data table and a target field in the target data table according to the tree structure and the operators; and the second analysis unit 504 parses the target data table and the target field to generate target test data. This solves the problem in the related art that test data preparation efficiency is low because the test data is prepared manually.

Optionally, in the apparatus for preparing test data provided in this embodiment of the present application, the apparatus further includes: a first determining unit, configured to determine a target test file after the target data table and the target field are parsed to generate the target test data, where the target test file includes at least two pieces of target test data; and a first verification unit, configured to verify the correctness of the target batch script by using the target test file.

Optionally, in the apparatus for preparing test data provided in this embodiment of the present application, the apparatus further includes: a second determining unit, configured to determine, before the correctness of the target batch script is verified by using the target test file, the data volume of each piece of target test data in the target test file and the association degree between every two pieces of target test data; a first processing unit, configured to obtain a predicted result of running the target batch script according to the data volume and the association degree; a third determining unit, configured to determine a test case according to the target test data in the target test file; and a second processing unit, configured to execute the target batch script according to the test case to obtain a test result of the target batch script.

Optionally, in the apparatus for preparing test data provided in the embodiment of the present application, the first verification unit includes: a first judgment module, configured to judge whether the predicted result is the same as the test result; a first processing module, configured to determine that the target batch script is correct if the predicted result is the same as the test result; and a second processing module, configured to determine that the target batch script is wrong if the predicted result is different from the test result.
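The comparison performed by the first verification unit can be sketched as follows. The sketch assumes, purely for illustration, that both the predicted result and the test result are collections of row tuples and that row order does not matter; the present application does not specify either representation, and the function name `script_is_correct` is hypothetical.

```python
from collections import Counter


def script_is_correct(predicted_rows, test_rows):
    """Return True when the predicted result equals the test result.

    Rows are compared as multisets, so ordering differences are ignored
    but duplicated or missing rows are still detected.
    """
    return Counter(predicted_rows) == Counter(test_rows)


# Hypothetical results for a target table with fields (Id, Val).
same = script_is_correct([(1, 15), (2, 20)], [(2, 20), (1, 15)])
different = script_is_correct([(1, 15)], [(1, 16)])
```

A multiset comparison is chosen here because batch jobs frequently write rows in a nondeterministic order.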

Optionally, in the apparatus for preparing test data provided in the embodiment of the present application, the second analysis unit includes: a first determining module, configured to determine the operator of the target field; a first analysis module, configured to parse out the associated field in a target source table according to the operator of the target field, where the target source table is a table associated with the target data table; a first extraction module, configured to extract data from the target source table according to the associated field; and a first generation module, configured to generate the target test data according to the data extracted from the target source table.

Optionally, in the apparatus for preparing test data provided in this embodiment of the present application, the apparatus further includes: a second generation unit, configured to generate a first target script and a second target script according to the tree structure and the operator after the target code in the target batch script is parsed to obtain the tree structure and the operator corresponding to the target code; a third processing unit, configured to parse the target code by using the first target script to obtain the target data table and the operator of the target field; and a third generation unit, configured to generate the target test data according to the second target script and the target field.

Optionally, in the apparatus for preparing test data provided in the embodiment of the present application, the first determining unit includes: the first acquisition module is used for acquiring first target test data and second target test data according to a preset mode, wherein the first target test data is the target test data which accords with the running scene of the target batch script, and the second target test data is the target test data which does not accord with the running scene of the target batch script; and the second determining module is used for determining the target test file according to the first target test data and the second target test data.

Optionally, in the apparatus for preparing test data provided in this embodiment of the present application, the apparatus further includes: a first judgment unit, configured to judge whether unresolved code exists in the target batch script before the target code in the target batch script is parsed to obtain the tree structure of the target code and the operator of the target code; a fourth processing unit, configured to take the unresolved code as the target code if unresolved code exists in the target batch script; and a fifth processing unit, configured to determine that parsing of the target batch script is complete if no unresolved code exists in the target batch script.

The device for preparing test data comprises a processor and a memory, wherein the first acquisition unit 501, the first analysis unit 502, the first generation unit 503, the second analysis unit 504 and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.

The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and the preparation efficiency of the test data is improved by adjusting kernel parameters.

The memory may include, among computer-readable media, volatile memory such as random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.

An embodiment of the present invention provides a computer-readable storage medium on which a program is stored, the program implementing the preparation method of the test data when executed by a processor.

The embodiment of the invention provides a processor, which is used for running a program, wherein the preparation method of test data is executed when the program runs.

As shown in fig. 6, an embodiment of the present invention provides an electronic device, where the device includes a processor, a memory, and a program stored in the memory and executable on the processor, and the processor executes the program to implement the following steps: acquiring a target batch script, wherein the target batch script is used for processing big data; analyzing the target codes in the target batch scripts to obtain a tree structure of the target codes and operators of the target codes, wherein the tree structure represents structural information among the codes in the target codes, and the operators represent mapping relations among the codes in the target codes; generating a target data table and a target field in the target data table according to the tree structure and the operator; and analyzing the target data table and the target field to generate target test data.

When executing the program, the processor further implements the following steps: after the target data table and the target field are parsed to generate the target test data, determining a target test file, wherein the target test file includes at least two pieces of target test data; and verifying the correctness of the target batch script by using the target test file.

When executing the program, the processor further implements the following steps: before the correctness of the target batch script is verified by using the target test file, determining the data volume of each piece of target test data in the target test file and the association degree between every two pieces of target test data; obtaining a predicted result of running the target batch script according to the data volume and the association degree; determining a test case according to the target test data in the target test file; and executing the target batch script according to the test case to obtain a test result of the target batch script.

When executing the program, the processor further implements the following steps: verifying the correctness of the target batch script by using the target test file comprises: judging whether the predicted result is the same as the test result; if the predicted result is the same as the test result, determining that the target batch script is correct; and if the predicted result is different from the test result, determining that the target batch script is wrong.

When executing the program, the processor further implements the following steps: parsing the target data table and the target field to generate the target test data comprises: determining the operator of the target field; parsing out the associated field in a target source table according to the operator of the target field, wherein the target source table is a table associated with the target data table; extracting data from the target source table according to the associated field; and generating the target test data according to the data extracted from the target source table.

When executing the program, the processor further implements the following steps: after the target code in the target batch script is parsed to obtain the tree structure and the operator corresponding to the target code, generating a first target script and a second target script according to the tree structure and the operator; parsing the target code by using the first target script to obtain the target data table and the operator of the target field; and generating the target test data according to the second target script and the target field.

When executing the program, the processor further implements the following steps: determining the target test file comprises: acquiring first target test data and second target test data according to a preset mode, wherein the first target test data is target test data that conforms to the running scenario of the target batch script, and the second target test data is target test data that does not conform to the running scenario of the target batch script; and determining the target test file according to the first target test data and the second target test data.

When executing the program, the processor further implements the following steps: before the target code in the target batch script is parsed to obtain the tree structure of the target code and the operator of the target code, judging whether unresolved code exists in the target batch script; if unresolved code exists in the target batch script, taking the unresolved code as the target code; and if no unresolved code exists in the target batch script, determining that parsing of the target batch script is complete. The device herein may be a server, a PC, a PAD, a mobile phone, etc.

The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: acquiring a target batch script, wherein the target batch script is used for processing big data; parsing the target code in the target batch script to obtain a tree structure of the target code and operators of the target code, wherein the tree structure represents structural information among the codes in the target code, and the operators represent mapping relationships among the codes in the target code; generating a target data table and a target field in the target data table according to the tree structure and the operators; and parsing the target data table and the target field to generate target test data.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: after the target data table and the target field are parsed to generate the target test data, determining a target test file, wherein the target test file includes at least two pieces of target test data; and verifying the correctness of the target batch script by using the target test file.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: before the correctness of the target batch script is verified by using the target test file, determining the data volume of each piece of target test data in the target test file and the association degree between every two pieces of target test data; obtaining a predicted result of running the target batch script according to the data volume and the association degree; determining a test case according to the target test data in the target test file; and executing the target batch script according to the test case to obtain a test result of the target batch script.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: verifying the correctness of the target batch script by using the target test file comprises: judging whether the predicted result is the same as the test result; if the predicted result is the same as the test result, determining that the target batch script is correct; and if the predicted result is different from the test result, determining that the target batch script is wrong.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: parsing the target data table and the target field to generate the target test data comprises: determining the operator of the target field; parsing out the associated field in a target source table according to the operator of the target field, wherein the target source table is a table associated with the target data table; extracting data from the target source table according to the associated field; and generating the target test data according to the data extracted from the target source table.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: after the target code in the target batch script is parsed to obtain the tree structure and the operator corresponding to the target code, generating a first target script and a second target script according to the tree structure and the operator; parsing the target code by using the first target script to obtain the target data table and the operator of the target field; and generating the target test data according to the second target script and the target field.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: determining the target test file comprises: acquiring first target test data and second target test data according to a preset mode, wherein the first target test data is target test data that conforms to the running scenario of the target batch script, and the second target test data is target test data that does not conform to the running scenario of the target batch script; and determining the target test file according to the first target test data and the second target test data.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: before the target code in the target batch script is parsed to obtain the tree structure of the target code and the operator of the target code, judging whether unresolved code exists in the target batch script; if unresolved code exists in the target batch script, taking the unresolved code as the target code; and if no unresolved code exists in the target batch script, determining that parsing of the target batch script is complete.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include, among computer-readable media, volatile memory such as random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims

1. A method for preparing test data, comprising:

acquiring a target batch script, wherein the target batch script is used for processing big data;

analyzing the target codes in the target batch scripts to obtain a tree structure of the target codes and operators of the target codes, wherein the tree structure represents structural information among the codes in the target codes, and the operators represent mapping relations among the codes in the target codes;

generating a target data table and a target field in the target data table according to the tree structure and the operator;

and analyzing the target data table and the target field to generate target test data.

2. The method of claim 1, wherein after parsing the target data table and the target field to generate target test data, the method further comprises:

determining a target test file, wherein the target test file comprises at least two pieces of target test data;

and verifying the correctness of the target batch scripts by utilizing the target test file.

3. The method of claim 2, wherein prior to verifying the correctness of the target batch script using the target test file, the method further comprises:

determining the data volume of each target test data in the target test file and the association degree between every two target test data;

obtaining a predicted result of running the target batch script according to the data volume and the association degree;

determining a test case according to the target test data in the target test file;

and executing the target batch scripts according to the test cases to obtain test results of the target batch scripts.

4. The method of claim 3, wherein verifying the correctness of the target batch script using the target test file comprises:

judging whether the predicted result is the same as the test result;

if the predicted result is the same as the test result, determining that the target batch script is correct;

and if the predicted result is different from the test result, determining that the target batch script is wrong.

5. The method of claim 1, wherein parsing the target data table and the target field to generate target test data comprises:

determining an operator of the target field;

analyzing an associated field in a target source table according to the operator of the target field, wherein the target source table is a table associated with the target data table;

extracting data from the target source table according to the associated fields;

and generating the target test data according to the data extracted from the source table.

6. The method of claim 1, wherein after parsing the target code in the target batch script to obtain the tree structure and the operator corresponding to the target code, the method further comprises:

generating a first target script and a second target script according to the tree structure and the operator;

analyzing the target code by using the first target script to obtain the target data table and the operator of the target field;

and generating the target test data according to the second target script and the target field.

7. The method of claim 2, wherein determining a target test file comprises:

acquiring first target test data and second target test data according to a preset mode, wherein the first target test data is target test data that conforms to the running scenario of the target batch script, and the second target test data is target test data that does not conform to the running scenario of the target batch script;

and determining the target test file according to the first target test data and the second target test data.
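A minimal sketch of the split described in claim 7 is given below. The scenario rule in `conforms_to_scenario`, the JSON layout of the test file, and the sample rows are all assumptions made for illustration; the claim leaves the "preset manner" unspecified.

```python
import json


def conforms_to_scenario(row):
    """Hypothetical rule: the script expects an account id and a positive amount."""
    return bool(row.get("account_id")) and row.get("amount", 0) > 0


def build_test_file(candidate_rows, conforms=conforms_to_scenario):
    """Split candidate rows into conforming and non-conforming test data."""
    first = [r for r in candidate_rows if conforms(r)]       # matches the scenario
    second = [r for r in candidate_rows if not conforms(r)]  # violates the scenario
    return json.dumps(
        {"first_target_test_data": first, "second_target_test_data": second},
        indent=2,
    )


candidates = [
    {"account_id": "A001", "amount": 100},
    {"account_id": "", "amount": 50},
    {"account_id": "A002", "amount": -5},
]
test_file = build_test_file(candidates)
```

Keeping both groups in one file lets a test run cover the normal path and the rejection path of the script together.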

8. The method of claim 1, wherein before parsing the target code in the target batch script to obtain the tree structure of the target code and the operator of the target code, the method further comprises:

determining whether unparsed code exists in the target batch script;

if unparsed code exists in the target batch script, taking the unparsed code as the target code;

and if no unparsed code exists in the target batch script, determining that parsing of the target batch script is complete.
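The check in claim 8 amounts to a loop over the code that has not yet been parsed. The sketch below assumes, purely for illustration, that the batch script is a sequence of statements separated by semicolons; `parse_statement` is a hypothetical placeholder for the step that would build the tree structure and operators.

```python
from collections import deque


def parse_statement(statement):
    """Placeholder parser: records the statement and its leading keyword."""
    return {"statement": statement,
            "first_token": statement.split()[0].upper()}


def parse_batch_script(script_text):
    """Parse a batch script statement by statement until none remain."""
    # Assumption: statements are separated by ';' and contain no embedded ';'.
    pending = deque(s.strip() for s in script_text.split(";") if s.strip())
    parsed = []
    while pending:                        # unparsed code still exists
        target_code = pending.popleft()   # take it as the target code
        parsed.append(parse_statement(target_code))
    return parsed                         # nothing left: parsing is complete


result = parse_batch_script(
    "insert into t select a from s; select count(*) from t;"
)
```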

9. An apparatus for preparing test data, comprising:

a first obtaining unit, configured to obtain a target batch script, wherein the target batch script is used for processing big data;

a first analysis unit, configured to parse the target code in the target batch script to obtain a tree structure of the target code and an operator of the target code, wherein the tree structure represents structural information among codes in the target code, and the operator represents a mapping relation among the codes in the target code;

a first generation unit, configured to generate a target data table and a target field in the target data table according to the tree structure and the operator;

and a second analysis unit, configured to parse the target data table and the target field to generate target test data.

10. A computer-readable storage medium storing a program, wherein the program, when run, executes the test data preparation method of any one of claims 1 to 8.

11. An electronic device comprising one or more processors and memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of test data preparation of any one of claims 1 to 8.