CN109726096B - Test data generation method and device - Google Patents

Info

Publication number
CN109726096B
Authority
CN
China
Prior art keywords
test data
file
generating
memory
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711317105.XA
Other languages
Chinese (zh)
Other versions
CN109726096A (en)
Inventor
梁新刚
梁双春
滕滨
白国涛
张琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Suzhou Software Technology Co Ltd
Publication of CN109726096A
Application granted
Publication of CN109726096B
Legal status: Active
Anticipated expiration

Abstract

The application relates to the technical field of mobile communication, and in particular to a test data generation method and device that address two problems in the prior art: the rules for generating test data are inflexible to modify, and test data generation is inefficient. The test data generation method provided by the embodiment of the application comprises: when an instruction for generating test data is received, parsing a stored configuration file that contains the rule for generating the test data; obtaining an executable file for generating the test data by using a reflection mechanism and the rule obtained by parsing the configuration file; and running the executable file to generate test data that satisfies the rule. Because the rule is obtained by parsing the configuration file, it can be changed simply by editing the configuration file. This makes modification flexible, and because the code for generating the test data does not need to be redeveloped, test data generation becomes more efficient.

Description

Test data generation method and device
The present application claims priority to Chinese patent application No. 201711027591.1, entitled "A test data generating method and apparatus", filed with the Chinese Patent Office on 27/10/2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of mobile communications technologies, and in particular, to a method and an apparatus for generating test data.
Background
With the rapid development of big data technology, big data products have appeared in many industries. Testing these products is an important step in practice, but because testing requires a large volume of data, the test data usually has to be generated in advance.
In the prior art, the rules for generating test data for a big data product are fixed in program code. To modify a rule, developers must change all of the relevant program code and then regenerate the test data.
Therefore, in the prior art, modifying the rules for generating test data is inflexible, and generating test data is inefficient.
Disclosure of Invention
The embodiments of the application provide a test data generation method and device to solve the prior-art problems that the rules for generating test data are inflexible to modify and that test data generation is inefficient.
The test data generation method provided by the embodiment of the application comprises the following steps:
when an instruction for generating test data is received, parsing a stored configuration file, wherein the configuration file comprises a rule for generating the test data;
obtaining an executable file for generating the test data by using a reflection mechanism and the rule for generating the test data obtained by parsing the configuration file;
and running the executable file to generate test data that satisfies the rule.
The test data generation device provided by the embodiment of the application comprises:
the analysis module, configured to parse a stored configuration file when an instruction for generating test data is received, wherein the configuration file comprises a rule for generating the test data;
the executable file generation module, configured to obtain an executable file for generating the test data by using a reflection mechanism and the rule for generating the test data obtained by parsing the configuration file;
and the test data generation module, configured to run the executable file to generate test data that satisfies the rule.
An electronic device provided in an embodiment of the present application includes at least one processing unit and at least one storage unit, where the storage unit stores program codes, and when the program codes are executed by the processing unit, the electronic device is caused to execute the steps of the test data generation method.
A computer-readable storage medium provided in an embodiment of the present application includes a program code, and when the program code runs on an electronic device, the electronic device is caused to execute the steps of the test data generation method.
In the embodiment of the application, when an instruction for generating test data is received, the stored configuration file containing the rule for generating the test data is parsed; an executable file for generating the test data is obtained by using a reflection mechanism and the rule obtained by parsing the configuration file; and the executable file is then run to generate test data that satisfies the rule.
Drawings
FIG. 1 is a flowchart of a test data generation method provided in an embodiment of the present application;
FIG. 2 is a flowchart of another test data generation method provided in an embodiment of the present application;
FIG. 3 is a flowchart of local-to-local test data generation provided in an embodiment of the present application;
FIG. 4 is a flowchart of local-to-HDFS test data generation provided in an embodiment of the present application;
FIG. 5 is a flowchart of local-to-database test data generation provided in an embodiment of the present application;
FIG. 6 is a flowchart of database-to-local test data generation provided in an embodiment of the present application;
FIG. 7 is a flowchart of database-to-HDFS test data generation provided in an embodiment of the present application;
FIG. 8 is a flowchart of database-to-database test data generation provided in an embodiment of the present application;
FIG. 9 is a flowchart of another database-to-database test data generation provided in an embodiment of the present application;
FIG. 10 is a structural diagram of a test data generation device provided in an embodiment of the present application;
FIG. 11 is a schematic diagram of the hardware structure of an electronic device for implementing the test data generation method provided in an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.
Example one
FIG. 1 is a flowchart of a test data generation method provided in the embodiment of the present application. The method includes the following steps:
S101: When an instruction for generating test data is received, parse the stored configuration file containing the rule for generating the test data.
The rule for generating test data contained in the configuration file is preset by a technician according to the test requirement.
Optionally, a source file may be consulted when the configuration file is written. For example, suppose each record in a source file has three fields: name, age, and mobile phone number. When the rule for generating test data is determined with reference to this source file, each generated record may be specified to contain four fields: name, age, mobile phone number, and gender. A generation rule is then specified for each field: the name is any two Chinese characters; the age takes a value in the range 0-100; the mobile phone number starts with the digit 1, and each of its remaining 10 digits is generated randomly from 0-9; and the gender is chosen randomly between male and female.
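The field rules in this example can be sketched in Java as follows. This is a minimal illustration only: the class and method names are hypothetical, and drawing the two Chinese characters from the Unicode range U+4E00 to U+9FA5 is an assumption, since the text does not say which character set is used.

```java
import java.util.Random;

// Hypothetical rule set mirroring the example in the text; all names are illustrative.
class SampleRecordGenerator {
    private final Random random;

    SampleRecordGenerator(long seed) {
        this.random = new Random(seed);
    }

    // Name: any two Chinese characters (assumed range U+4E00..U+9FA5).
    String name() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2; i++) {
            sb.append((char) (0x4E00 + random.nextInt(0x9FA5 - 0x4E00 + 1)));
        }
        return sb.toString();
    }

    // Age: an integer in the range 0-100 inclusive.
    int age() {
        return random.nextInt(101);
    }

    // Mobile phone number: first digit 1, remaining 10 digits random in 0-9.
    String phone() {
        StringBuilder sb = new StringBuilder("1");
        for (int i = 0; i < 10; i++) {
            sb.append(random.nextInt(10));
        }
        return sb.toString();
    }

    // Gender: chosen at random between the two allowed values.
    String gender() {
        return random.nextBoolean() ? "male" : "female";
    }

    // One comma-separated line of test data with the four fields.
    String nextLine() {
        return name() + "," + age() + "," + phone() + "," + gender();
    }
}
```

Calling `new SampleRecordGenerator(seed).nextLine()` repeatedly yields one record per call; a fixed seed makes the output reproducible.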
S102: Obtain an executable file for generating the test data by using a reflection mechanism and the rule for generating the test data obtained by parsing the configuration file.
For example, a Java file implementing the rule for generating test data can be obtained by using the reflection functions provided in a Java development tool together with the rule obtained by parsing the configuration file; the Java file is then compiled into an executable Class file.
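As a rough illustration of how reflection can connect parsed rules to code, the sketch below resolves generator methods by name at run time. It is a deliberate simplification and not the flow described above: it does not construct or compile a Java source file, and `FieldGenerators`, `ReflectiveRuleRunner`, and the rule map layout are all hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical library of field generators that a parsed rule refers to by method name.
class FieldGenerators {
    public static String fixedOne() { return "1"; }
    public static String yes() { return "yes"; }
}

class ReflectiveRuleRunner {
    // rules maps a field name to the name of a public static no-arg method on FieldGenerators.
    static String generateLine(Map<String, String> rules) {
        StringBuilder line = new StringBuilder();
        try {
            for (Map.Entry<String, String> rule : rules.entrySet()) {
                // Look the generator up by name, as a parsed configuration entry would require.
                Method generator = FieldGenerators.class.getMethod(rule.getValue());
                Object value = generator.invoke(null); // static method, so no receiver object
                if (line.length() > 0) {
                    line.append(',');
                }
                line.append(value);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("unknown or failing generator", e);
        }
        return line.toString();
    }
}
```

Passing an insertion-ordered map such as a `LinkedHashMap` keeps the fields in configuration order.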
S103: Run the executable file to generate test data that satisfies the rule.
Optionally, parsing the configuration file may also yield the total amount of test data to be generated, the number of threads to use when generating it, and the storage path for the generated test data. When test data is generated, the specified number of threads is started and the task of generating test data is distributed among them. Each thread runs the executable file, and the test data it produces is stored under the storage path parsed from the configuration file. Generation stops once the total amount of test data stored under the storage path reaches the total amount to be generated.
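The thread start-up and stop condition described above might look like the following sketch. It assumes a shared counter hands out rows one at a time and uses an in-memory list in place of the storage path; the class name, the one-minute timeout, and the `rowGenerator` parameter are illustrative choices, not details from the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

class ParallelGeneration {
    // Starts threadCount workers; each claims one row at a time until totalRows have been claimed.
    static List<String> run(int threadCount, long totalRows, Supplier<String> rowGenerator) {
        AtomicLong claimed = new AtomicLong();
        // Stand-in for the storage path: a thread-safe in-memory sink.
        List<String> sink = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        for (int t = 0; t < threadCount; t++) {
            pool.execute(() -> {
                // getAndIncrement hands out each row index exactly once across all threads.
                while (claimed.getAndIncrement() < totalRows) {
                    sink.add(rowGenerator.get());
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("generation did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sink;
    }
}
```

Claiming rows through a single atomic counter keeps the total exact without any per-thread quota arithmetic, at the cost of some contention on the counter.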
In this process, multithreading improves the efficiency of generating massive test data; in a specific implementation, it can be combined with a cache mechanism to generate massive test data with high performance.
Specifically, for any thread, the test data generated by the thread may first be written line by line into the memory object corresponding to that thread. When the number of lines of test data in the memory object reaches the cache line number threshold, the test data in the memory object is stored under the specified storage path. The cache line number threshold is obtained by parsing the configuration file.
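A per-thread memory object with a line-count threshold can be sketched as below. The `LineBuffer` name and the `Consumer` used as the persistence target are assumptions made for illustration; in practice the consumer would write to the configured storage path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Per-thread memory object: buffers lines and flushes once the cache line number threshold is reached.
class LineBuffer {
    private final List<String> lines = new ArrayList<>();
    private final Consumer<List<String>> flushTarget; // stands in for the storage path
    private int threshold;

    LineBuffer(int threshold, Consumer<List<String>> flushTarget) {
        this.threshold = threshold;
        this.flushTarget = flushTarget;
    }

    // Appends one line and persists the buffer when the threshold is reached.
    void write(String line) {
        lines.add(line);
        if (lines.size() >= threshold) {
            flush();
        }
    }

    // Hands a copy of the buffered lines to the target, then empties the buffer.
    void flush() {
        if (lines.isEmpty()) {
            return;
        }
        flushTarget.accept(new ArrayList<>(lines));
        lines.clear();
    }

    // Allows the threshold to be lowered later, for example when memory runs short.
    void setThreshold(int newThreshold) {
        this.threshold = newThreshold;
    }
}
```

Each thread owns its own `LineBuffer`, so the buffer itself needs no synchronization; a final `flush()` call persists whatever remains below the threshold.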
In addition, to prevent heavy memory consumption from reducing the efficiency of generating massive test data, memory usage may be monitored while test data is being generated. If the available memory is determined to be smaller than a target value, an optimization coefficient for the cache line number threshold is determined from the maximum memory of the Java virtual machine, the memory already used by the Java virtual machine, the number of threads, the size of the source file, and the number of data lines in the source file; the cache line number threshold is then optimized based on this coefficient. Here the target value equals the product of the number of bytes of the memory object and the number of threads, and the source file is the file that the technician consulted when determining the configuration file. In this way, the cache line number threshold is reduced appropriately when memory is short, so that the task of generating massive test data with high performance can proceed smoothly.
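The monitoring condition can be sketched with the standard `Runtime` API. Treating available memory as the maximum heap minus the heap currently in use is an assumption, since the text does not define how available memory is measured, and the class and method names are hypothetical. The sketch covers only the trigger condition and does not attempt the optimization coefficient itself.

```java
class MemoryMonitor {
    // Target value from the text: bytes of one memory object times the number of threads.
    static long targetValue(long memoryObjectBytes, int threadCount) {
        return memoryObjectBytes * threadCount;
    }

    // Assumed definition of available memory: JVM maximum heap minus heap currently in use.
    static long availableBytes() {
        Runtime runtime = Runtime.getRuntime();
        long used = runtime.totalMemory() - runtime.freeMemory();
        return runtime.maxMemory() - used;
    }

    // True when the cache line number threshold should be optimized.
    static boolean shouldOptimize(long availableBytes, long memoryObjectBytes, int threadCount) {
        return availableBytes < targetValue(memoryObjectBytes, threadCount);
    }
}
```

A monitoring thread could poll `shouldOptimize(MemoryMonitor.availableBytes(), objectBytes, threads)` at intervals and lower the threshold when it returns true.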
For example, in a specific implementation, the optimization coefficient OPT_RATION for optimizing the cache line number threshold may be determined according to the following formula:
Figure BDA0001503993110000051 (formula image; not reproduced in the text)
Wherein:
MEM_JVM_MAX is the maximum memory of the Java virtual machine, in bytes;
BYTE_JVM_ALREADY_USED is the memory already used by the Java virtual machine, in bytes;
BYTE_SRC_DATA is the size of the source file, in bytes;
THREAD_CURRENT_LAUNCH is the number of threads;
LINE_CACHE is the cache line number threshold;
LINE_SRC_DATA is the number of lines of data in the source file.
Further, the cache line number threshold LINE_CACHE may be optimized according to the following formula:
Figure BDA0001503993110000061 (formula image; not reproduced in the text)
where 0 < α < 1; for example, α = 0.36.
In addition, prior-art methods generate test data through expansion operations between databases, which is a significant limitation. The method provided in this embodiment of the present application changes the rules for generating test data by modifying the configuration file, so it suits a variety of test data generation scenarios, such as local -> local, local -> Hadoop Distributed File System (HDFS), database -> database, database -> file, database -> HDFS, and file -> database, where:
local -> local means that the rule for generating test data is determined according to the data structure of a local file a, and the generated test data is stored in a local file b;
local -> HDFS means that the rule for generating test data is determined according to the data structure of a local file c, and the generated test data is stored in a file d in the HDFS;
database -> database means that the rule for generating test data is determined according to the storage structure of the data stored in database e, and the generated test data is stored in database f;
database -> file means that the rule for generating test data is determined according to the data structure of the data stored in database g, and the generated test data is stored in a local file h;
database -> HDFS means that the rule for generating test data is determined according to the data structure of the data stored in database i, and the generated test data is stored in a file j in the HDFS;
file -> database means that the rule for generating test data is determined according to the data structure of a local file k, and the generated test data is stored in database l.
Optionally, if the test data is stored in a local file or in a file in the HDFS, it may be stored in a specified file format, such as CSV, and the size of a single CSV file may be limited. In this case, the amount of data written to a CSV file in a single write may be preset. For any thread, when the test data in the memory object corresponding to the thread is stored under the storage path, the thread writes the test data to a single CSV file in batches of that size. If the CSV file reaches the size limit, a new CSV file is created and the remaining test data in the memory object is written to it in batches. The single-write data amount may be determined in advance by a technician according to network IO and disk IO performance.
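The batched write with a per-file size limit could be sketched as follows for the local-file case. The file naming scheme `part-N.csv`, the class name, and the choice to check the limit once per batch (so a file may exceed the limit by at most one batch) are assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class RollingCsvWriter {
    private final Path directory;
    private final long maxFileBytes; // size limit of a single CSV file
    private final int batchLines;    // preset amount of data per write, in lines
    private int fileIndex = 0;
    private long currentBytes = 0;

    RollingCsvWriter(Path directory, long maxFileBytes, int batchLines) {
        this.directory = directory;
        this.maxFileBytes = maxFileBytes;
        this.batchLines = batchLines;
    }

    // Writes the lines in batches; starts a new file once the current one has reached the limit.
    void writeAll(List<String> lines) {
        for (int from = 0; from < lines.size(); from += batchLines) {
            List<String> batch = lines.subList(from, Math.min(from + batchLines, lines.size()));
            if (currentBytes >= maxFileBytes) {
                fileIndex++;
                currentBytes = 0;
            }
            byte[] bytes = (String.join("\n", batch) + "\n").getBytes(StandardCharsets.UTF_8);
            try {
                Files.write(currentFile(), bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            currentBytes += bytes.length;
        }
    }

    Path currentFile() {
        return directory.resolve("part-" + fileIndex + ".csv");
    }

    int fileCount() {
        return fileIndex + 1;
    }
}
```

Writing to HDFS would need a different output API; only the batching and rollover logic carries over.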
Example two
In the embodiment of the application, a template of the configuration file can be provided to the user. The user only needs to fill in the rules for generating test data in the template according to the specification, or modify the rules already there, and save the completed configuration file. The rules can thus be modified flexibly through the stored configuration file, and the modified configuration file is then used to generate the test data.
Specifically, FIG. 2 is a flowchart of another test data generation method provided in the embodiment of the present application. The method includes the following steps:
S201: Write the configuration file dmcfg.xml.
In a specific implementation, a technician can design the rules for generating test data according to the data structure of an existing file stored in a file system or a distributed system. The rules define how each field of a test data record is generated. The technician then writes the rules into the configuration file dmcfg.xml.
The configuration file also includes at least the total amount of test data to be generated, the number of threads used to generate the test data, and the storage path of the generated test data.
S202: Generate dm.jar from dmcfg.xml by using a reflection mechanism.
In a specific implementation, dmcfg.xml can first be checked against an XML Schema Definition (XSD) to determine whether it is an XML file that meets the specification. Once this is confirmed, the rules for generating test data in dmcfg.xml are parsed, and a corresponding Java file for generating the test data is constructed by using the reflection mechanism. The Java file is compiled into a Class file executable by the computer, and the Class file is injected into the dm.jar package, where it replaces the related Class file or is added as a new one.
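The XSD check can be sketched with the standard `javax.xml.validation` API. The `ConfigValidator` name is hypothetical, and the schema and XML are passed as strings only to keep the sketch self-contained; the actual structure of dmcfg.xml is not given in the text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ConfigValidator {
    // Returns true when the XML text conforms to the XSD text, false otherwise.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for schema violations, malformed XML, or a malformed schema.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would run this check before parsing any rules, so that a malformed configuration file is rejected before any Java file is constructed.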
S203: Execute sh.jar.
sh.jar is executed to generate a command line file that allocates memory for the Java virtual machine.
S204: Execute dm.jar.
The command line file is then executed to configure the memory parameters of the Java virtual machine and run the newly generated project package dm.jar, which generates the test data.
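What the generated command line might contain can be sketched as below. The 0.75 fraction is the figure this example gives for the maximum Java virtual machine memory relative to available physical memory A; the use of the `-Xmx` option, the megabyte rounding, and the class name are assumptions, since the text does not show the contents of the command line file.

```java
class JvmCommandLine {
    // Fraction of available physical memory given to the JVM, per the 0.75A figure in this example.
    static final double HEAP_FRACTION = 0.75;

    // Builds a launch command that caps the JVM heap at 0.75 of the available physical memory.
    static String build(long availablePhysicalBytes, String jarName) {
        long heapMegabytes = (long) (availablePhysicalBytes * HEAP_FRACTION) / (1024L * 1024L);
        return "java -Xmx" + heapMegabytes + "m -jar " + jarName;
    }
}
```

With 8 GiB of available physical memory, `JvmCommandLine.build(8L * 1024 * 1024 * 1024, "dm.jar")` gives `java -Xmx6144m -jar dm.jar`.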
In a specific implementation, to make test data generation as efficient as possible, generation may be performed concurrently, taking advantage of the high processing speed of the central processing unit (CPU) and memory.
Specifically, each CPU thread may handle one data generation task. The test data generated by each thread is first cached line by line in the memory object corresponding to that thread, which supports high-speed writes. When the number of data lines cached in the memory object reaches the preset cache line number threshold, a persistence operation is performed.
Optionally, if the storage location is a local file or a file in the HDFS, a suitable single-write data amount may be set by weighing network IO and disk IO performance. During persistence, each thread then writes the cached data in its memory object to the file in batches of that size.
In addition, the cache line number threshold can be optimized according to the current memory usage of the computer.
Specifically, suppose that when a tool implementing the method of the embodiment of the present application starts, the detected available physical memory capacity is A. The maximum memory allocated to the Java virtual machine may then be 0.75A, which ensures that physical memory is used efficiently, and an initial cache line number threshold is set according to the maximum memory of the Java virtual machine. While test data is being generated, the available memory space may be monitored. If the available memory space is smaller than the product of the number of bytes of the memory object and the number of threads, an optimization coefficient OPT_RATION for optimizing the cache line number threshold is determined according to the following formula:
Figure BDA0001503993110000091 (formula image; not reproduced in the text)
Wherein:
MEM_JVM_MAX is the maximum memory of the Java virtual machine, in bytes;
BYTE_JVM_ALREADY_USED is the memory already used by the Java virtual machine, in bytes;
BYTE_SRC_DATA is the size of the source file, in bytes;
THREAD_CURRENT_LAUNCH is the number of threads;
LINE_CACHE is the cache line number threshold;
LINE_SRC_DATA is the number of lines of data in the source file.
Further, the cache line number threshold LINE_CACHE is optimized according to the following formula:
Figure BDA0001503993110000092 (formula image; not reproduced in the text)
where 0 < α < 1; for example, α = 0.36.
Therefore, when the available memory space shrinks, the cache line number threshold is reduced appropriately, so that the computer can still run normally when memory is short and the task of generating test data can proceed smoothly.
The data source and destination storage modes supported by the embodiment of the application can be extended without changing the architecture design. Extension only requires changing the configuration file and adding a handler for the new configuration item; existing functions and programs are not affected. Multiple data sources are supported, such as the local file system, the HDFS file system, HugeTable, Greenplum, PostgreSQL, and MySQL. Moreover, the embodiment can be implemented in the Java language, and because the JRE provides good cross-platform support, the method provided by the embodiment can be used in both Windows and Linux environments.
The following illustrates a case where test data generation is performed among a plurality of data sources according to an embodiment of the present application.
1) Local-to-local test data generation.
In a specific implementation, local-to-local test data generation may be performed according to the flow shown in FIG. 3:
S301: Write the source file data into memory.
Here, the data in the source file may be read through a secondary cache mechanism according to the storage address of the local source file parsed from the configuration file, and the read data may be stored as a source string List.
S302: Generate test data.
The data in the source string List is traversed and modified according to the rule for generating test data, producing a target string List.
S303: Write the test data into the target file.
The target string List is stored in the target file through the secondary cache mechanism. Whether the target file is compressed, whether its size is limited, and how many target files may be stored are determined by configuration information obtained from the configuration file.
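The traversal in S302 can be sketched as a function from a source List to a target List. The rule shown here, which replaces one comma-separated field with a generated value, is a hypothetical example; the class and method names are also illustrative, and reading and writing the files (S301 and S303) are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

class ListTransform {
    // Hypothetical rule: replace the comma-separated field at fieldIndex with a generated value.
    static UnaryOperator<String> replaceField(int fieldIndex, Supplier<String> generator) {
        return row -> {
            String[] fields = row.split(",", -1);
            fields[fieldIndex] = generator.get();
            return String.join(",", fields);
        };
    }

    // S302: traverse the source List and build the target List by applying the rule to each row.
    static List<String> generate(List<String> source, UnaryOperator<String> rule) {
        List<String> target = new ArrayList<>(source.size());
        for (String row : source) {
            target.add(rule.apply(row));
        }
        return target;
    }
}
```

Because the source List is left untouched, the same source rows can be passed through several rules to produce multiple target batches.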
2) Local-to-HDFS test data generation.
In a specific implementation, local-to-HDFS test data generation may be performed according to the flow shown in FIG. 4:
S401: Write the source file data into memory.
Here, the data in the source file may be read through a secondary cache mechanism according to the storage address of the local source file parsed from the configuration file, and the read data may be stored as a source string List.
S402: Generate test data.
The data in the source string List is traversed and modified according to the rule for generating test data, producing a target string List.
S403: Write the test data to the HDFS.
The target string List is stored in the HDFS through the secondary cache mechanism; data written to the HDFS is stored in compressed form by default.
3) Local-to-database test data generation.
In a specific implementation, local-to-database test data generation may be performed according to the flow shown in FIG. 5:
S501: Write the source file data into memory.
Here, the data in the source file may be read through a secondary cache mechanism according to the storage address of the local source file parsed from the configuration file, and the read data may be stored as a source string List.
S502: Generate test data.
The data in the source string List is traversed and modified according to the rule for generating test data, producing a target string List.
S503: Write the test data into the database.
The target string List is traversed to generate a set of precompiled SQL statements, which is inserted into the database in segmented batches.
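A segmented batch insert with precompiled SQL can be sketched with the standard JDBC `PreparedStatement` batch calls. The table name, the assumption that every column is inserted as a string, and the comma-separated row format are illustrative; the table name is concatenated into the SQL text and must therefore come from trusted configuration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInserter {
    // Builds "INSERT INTO <table> VALUES (?, ?, ...)" with one placeholder per column.
    static String insertSql(String table, int columns) {
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" VALUES (");
        for (int i = 0; i < columns; i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        return sql.append(")").toString();
    }

    // Inserts comma-separated rows, sending one batch to the database every segmentSize rows.
    static void insert(Connection connection, String table, List<String> rows, int segmentSize)
            throws SQLException {
        if (rows.isEmpty()) {
            return;
        }
        int columns = rows.get(0).split(",", -1).length;
        try (PreparedStatement statement = connection.prepareStatement(insertSql(table, columns))) {
            int pending = 0;
            for (String row : rows) {
                String[] fields = row.split(",", -1);
                for (int i = 0; i < fields.length; i++) {
                    statement.setString(i + 1, fields[i]); // JDBC parameters are 1-based
                }
                statement.addBatch();
                if (++pending == segmentSize) {
                    statement.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                statement.executeBatch(); // send the final partial segment
            }
        }
    }
}
```

The segment size bounds how many rows are held in the driver before each round trip; a suitable value depends on the database and network.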
4) Database-to-local test data generation.
In a specific implementation, database-to-local test data generation may be performed according to the flow shown in FIG. 6:
S601: Write the data in the database into memory.
Here, the data in the database may be read in batches, and the read data may be written into memory and stored as a source string List.
S602: Generate test data.
The data in the source string List is traversed and modified according to the rule for generating test data, producing a target string List.
S603: Write the test data into the target file.
The target string List is stored in a local target file through a secondary cache mechanism. Whether the target file is compressed, whether its size is limited, and how many target files may be stored are determined by configuration information obtained from the configuration file.
5) Database-to-HDFS test data generation.
In a specific implementation, database-to-HDFS test data generation may be performed according to the flow shown in FIG. 7:
S701: Write the data in the database into memory.
Here, the data in the database may be read in batches, and the read data may be written into memory and stored as a source string List.
S702: Generate test data.
The data in the source string List is traversed and modified according to the rule for generating test data, producing a target string List.
S703: Write the test data to the HDFS.
The target string List is stored in the HDFS through a secondary cache mechanism; data written to the HDFS is stored in compressed form by default.
6) Database-to-database test data generation.
Optionally, when data is expanded and modified across databases, the flow shown in FIG. 8 may be followed:
S801: Write the data in the source database into memory.
Here, the data in the source database may be read in batches, and the read data may be written into memory and stored as a source string List.
S802: Generate test data.
The data in the source string List is traversed and modified according to the rule for generating test data, producing a target string List.
S803: Write the test data into the target database.
The target string List is traversed to generate a set of precompiled SQL statements, which is inserted into the target database in segmented batches.
Optionally, when the database is operated on directly, for example a distributed data warehouse or clustered database such as HugeTable, Greenplum, or PostgreSQL, or a commonly used database such as MySQL, the flow shown in FIG. 9 may be followed:
S901: Copy the database table.
Here, the data table in the database to be processed may be copied first.
S902: Perform the insert operation.
The data in the corresponding fields of the copy table is modified according to the rule for generating test data, and the modified data is inserted back into the original table. If a required partition of a partitioned table does not exist, the partition is created automatically before the insert is performed.
S903: Delete the copy table.
After the insert operation is performed, the copy table is deleted.
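The three steps S901 to S903 can be sketched as a sequence of SQL statements built in Java. This is an assumption-laden illustration: the `_copy` suffix is a hypothetical naming choice, the rule is expressed as a SQL select list so that the modification happens while reading from the copy rather than by updating the copy in place, and partition creation is not covered. The `CREATE TABLE ... AS SELECT` form is accepted by PostgreSQL and MySQL, but other systems may need different syntax.

```java
import java.util.Arrays;
import java.util.List;

class CopyTableFlow {
    // selectList carries the rule as SQL expressions, for example "id + 1000000, name".
    static List<String> statements(String table, String selectList) {
        String copy = table + "_copy"; // hypothetical name for the temporary copy table
        return Arrays.asList(
            "CREATE TABLE " + copy + " AS SELECT * FROM " + table,               // S901: copy the table
            "INSERT INTO " + table + " SELECT " + selectList + " FROM " + copy,  // S902: insert modified rows back
            "DROP TABLE " + copy);                                               // S903: delete the copy
    }
}
```

Each run of the three statements doubles the row count of the original table, so repeated runs expand the data volume quickly. Identifiers are concatenated into the SQL text and must come from trusted configuration.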
Example three
Based on the same inventive concept, the embodiment of the present application further provides a test data generation device corresponding to the test data generation method. Because the device solves the problem on a principle similar to that of the method, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.
FIG. 10 is a structural diagram of a test data generation device provided in the embodiment of the present application. The device includes:
the analysis module 1001, configured to parse a stored configuration file when an instruction for generating test data is received, where the configuration file includes a rule for generating the test data;
an executable file generating module 1002, configured to obtain an executable file for generating the test data by using a reflection mechanism and the rule for generating the test data obtained by parsing the configuration file;
a test data generating module 1003, configured to run the executable file to generate test data that satisfies the rule.
Optionally, the total amount of test data to be generated, the number of threads used when generating the test data, and the storage path of the generated test data are also obtained by analyzing the configuration file, and the test data generation module 1003 is specifically configured to:
start a corresponding number of threads according to the number of threads used when generating the test data;
distribute the task of generating the test data to each thread;
run the executable file by using each thread, and store the test data generated by the thread under the storage path;
and stop generating test data when it is determined that the total amount of test data stored under the storage path has reached the total amount of test data to be generated.
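The thread handling described above can be sketched with a fixed thread pool and a shared counter. This is a minimal sketch under stated assumptions: the class `ParallelGenerator`, the use of `ExecutorService`, and the row-claiming scheme are illustrative choices, and the actual generation and storage of a row is replaced by a counter increment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical driver: N workers generate rows until the configured total is reached. */
final class ParallelGenerator {
    private ParallelGenerator() {}

    /** Runs threadCount workers and returns the number of rows stored. */
    static long run(long totalRows, int threadCount) {
        AtomicLong claimed = new AtomicLong(); // row indexes handed out so far
        AtomicLong stored = new AtomicLong();  // rows actually "stored"
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        for (int t = 0; t < threadCount; t++) {
            pool.submit(() -> {
                // Each worker claims the next row index and stops once the total is reached.
                while (claimed.getAndIncrement() < totalRows) {
                    // Placeholder for: run the generated code and store the row.
                    stored.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("generation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stored.get();
    }
}
```

Claiming a row index before generating it keeps the stored total from overshooting the configured total even when several threads finish at the same moment.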
Optionally, the test data generation module 1003 is specifically configured to:
write the test data generated by the thread, line by line, into the memory object corresponding to the thread;
and when it is determined that the number of lines of test data written into the memory object has reached a cache line number threshold, store the test data in the memory object under the storage path by using the thread, where the cache line number threshold is obtained by analyzing the configuration file.
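The buffering behaviour can be sketched as a small per-thread line buffer that hands its contents to a sink once the threshold is reached. The class `LineBuffer` and the `Consumer`-based sink are assumptions made for illustration; in the embodiment the sink would write to the storage path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical per-thread memory object that flushes at a line-count threshold. */
final class LineBuffer {
    private final int cacheLineThreshold;
    private final Consumer<List<String>> sink; // stands in for writing under the storage path
    private final List<String> lines = new ArrayList<>();

    LineBuffer(int cacheLineThreshold, Consumer<List<String>> sink) {
        if (cacheLineThreshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.cacheLineThreshold = cacheLineThreshold;
        this.sink = sink;
    }

    /** Adds one generated line and flushes when the threshold is reached. */
    void append(String line) {
        lines.add(line);
        if (lines.size() >= cacheLineThreshold) {
            flush();
        }
    }

    /** Hands any buffered lines to the sink and empties the buffer. */
    void flush() {
        if (lines.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(lines)); // pass a copy so the sink can keep it
        lines.clear();
    }
}
```

Each thread would own one `LineBuffer`, so no synchronization is needed on the buffer itself; a final `flush()` stores whatever remains below the threshold.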
Optionally, the apparatus further includes a monitoring module 1004 and an optimization module 1005:
the monitoring module 1004 is configured to monitor memory usage in the process of generating test data;
the optimization module 1005 is configured to, if it is determined that the available memory is smaller than a target value, determine an optimization coefficient for optimizing the cache line number threshold according to the maximum memory of the Java virtual machine, the memory already used by the Java virtual machine, the number of threads, the size of the source file, and the number of data lines in the source file, and optimize the cache line number threshold based on the optimization coefficient, where the target value is equal to the product of the number of bytes of the memory object and the number of threads, and the source file is a reference file used to determine the configuration file.
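The trigger condition for optimization can be sketched with the standard `java.lang.Runtime` memory queries. The class `MemoryMonitor` and its method names are hypothetical, and treating available memory as maximum memory minus used memory is an assumption about how the monitoring is done. The computation of the optimization coefficient itself is not shown.

```java
/** Hypothetical check for whether the cache line number threshold should be optimized. */
final class MemoryMonitor {
    private MemoryMonitor() {}

    /** Estimates available heap as max memory minus currently used memory. */
    static long availableBytes(Runtime rt) {
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    /**
     * Returns true when available memory is below the target value, where the
     * target value is the memory object size in bytes times the number of threads.
     */
    static boolean needsOptimization(long availableBytes, long memoryObjectBytes, int threadCount) {
        long target = Math.multiplyExact(memoryObjectBytes, (long) threadCount);
        return availableBytes < target;
    }
}
```

A monitoring loop could call `MemoryMonitor.needsOptimization(MemoryMonitor.availableBytes(Runtime.getRuntime()), objectBytes, threads)` periodically, where `objectBytes` and `threads` are values the caller already tracks.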
Optionally, the optimization module 1005 specifically determines the optimization coefficient $OPT_{RATION}$ for optimizing the cache line number threshold according to the following formula:
Figure BDA0001503993110000141
where $MEM_{JVM\_MAX}$ is the maximum memory of the Java virtual machine, in bytes; $BYTE_{JVM\_ALREADY\_USED}$ is the memory already used by the Java virtual machine, in bytes; $BYTE_{SRC\_DATA}$ is the size of the source file, in bytes; $THREAD_{CURRENT\_LAUNCH}$ is the number of threads; $LINE_{CACHE}$ is the cache line number threshold; and $LINE_{SRC\_DATA}$ is the number of data lines in the source file.
Optionally, the optimization module 1005 specifically optimizes the cache line number threshold $LINE_{CACHE}$ according to the following formula:
Figure BDA0001503993110000151
where $0 < \alpha < 1$.
Optionally, if the test data under the storage path is stored in a specified file format, and the size of a single file stored in the specified file format is limited, the test data generation module 1003 is specifically configured to:
write, by using the thread, the test data in the memory object into a single file of the specified file format in batches, according to a preset amount of data per write;
and if it is determined that the size of the single file has reached the size limit, create a new file and write the remaining test data in the memory object into the newly created file in batches.
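Batch writing with a per-file size limit can be sketched as follows. The class `RollingWriter`, the `part-N.txt` naming, and the plain-text output are illustrative assumptions. This sketch starts a new file before a batch would push the current file past the limit, which is one possible reading of "reaches the limited size".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical writer that stores lines in batches and rolls over at a size limit. */
final class RollingWriter {
    private final Path dir;
    private final long maxFileBytes;
    private final int batchLines;   // preset number of lines per write
    private int fileIndex = 0;
    private long currentBytes = 0;

    RollingWriter(Path dir, long maxFileBytes, int batchLines) {
        if (maxFileBytes <= 0 || batchLines <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.dir = dir;
        this.maxFileBytes = maxFileBytes;
        this.batchLines = batchLines;
    }

    /** Writes the lines in batches and returns the number of files used so far. */
    int write(List<String> lines) {
        for (int from = 0; from < lines.size(); from += batchLines) {
            int to = Math.min(from + batchLines, lines.size());
            byte[] bytes = (String.join("\n", lines.subList(from, to)) + "\n")
                    .getBytes(StandardCharsets.UTF_8);
            // Start a new file if this batch would exceed the size limit.
            if (currentBytes > 0 && currentBytes + bytes.length > maxFileBytes) {
                fileIndex++;
                currentBytes = 0;
            }
            try {
                Files.write(currentFile(), bytes,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            currentBytes += bytes.length;
        }
        return fileIndex + 1;
    }

    /** Path of the file currently being written. */
    Path currentFile() {
        return dir.resolve("part-" + fileIndex + ".txt");
    }
}
```

A batch larger than the limit is still written whole into an empty file, so the limit should be chosen to be at least as large as one batch.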
EXAMPLE IV
Fig. 11 is a schematic diagram of the hardware structure of an electronic device for implementing the test data generation method provided in an embodiment of the present application. The electronic device includes at least one processing unit 1101 and at least one storage unit 1102, where the storage unit stores program code, and when the program code is executed by the processing unit, the electronic device is caused to execute the steps of the test data generation method.
EXAMPLE V
An embodiment of the present application provides a computer-readable storage medium that includes program code; when the program code runs on an electronic device, the electronic device is caused to execute the steps of the test data generation method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (14)

1. A method for generating test data, comprising:
when an instruction for generating test data is received, analyzing a stored configuration file, wherein the configuration file comprises a rule for generating the test data;
obtaining an executable file for generating the test data by utilizing a reflection mechanism and a rule for generating the test data obtained by analyzing the configuration file;
running the executable file to generate test data meeting the rule;
wherein obtaining the executable file for generating the test data by utilizing the reflection mechanism and the rule for generating the test data obtained by analyzing the configuration file comprises:
obtaining a Java file for realizing the rule of generating the test data by utilizing a function for realizing a reflection mechanism provided in a Java development tool and a rule for generating the test data obtained by analyzing a configuration file, and compiling the Java file to obtain an executable Class file;
wherein the total amount of test data to be generated, the number of threads used when generating the test data, and the storage path of the generated test data are further obtained by analyzing the configuration file, and running the executable file to generate test data meeting the rule comprises:
starting threads with corresponding number according to the number of the threads used when the test data is generated;
distributing the task for generating the test data to each thread;
running the executable file by utilizing each thread, and storing the test data generated by utilizing the thread under the storage path;
and stopping generating the test data when determining that the total amount of the test data stored in the storage path reaches the total amount of the test data required to be generated.
2. The method of claim 1, wherein storing test data generated with the thread under the storage path comprises:
writing the test data generated by the thread into the memory object corresponding to the thread line by line;
and when determining that the line number of the test data written into the memory object reaches a cache line number threshold, storing the test data in the memory object under the storage path by using the thread, wherein the cache line number threshold is obtained by analyzing the configuration file.
3. The method of claim 2, further comprising:
monitoring the use condition of a memory in the process of generating test data;
if the available memory is determined to be smaller than a target value, determining an optimization coefficient for optimizing the cache line number threshold according to the maximum memory of the Java virtual machine, the memory already used by the Java virtual machine, the number of threads, the size of a source file, and the number of data lines in the source file, wherein the target value is equal to the product of the number of bytes of the memory object and the number of threads, and the source file is a reference file used to determine the configuration file;
optimizing the cache line number threshold based on the optimization coefficient.
4. The method of claim 3, wherein the optimization coefficient $OPT_{RATION}$ for optimizing the cache line number threshold is determined according to the following formula:
Figure FDA0003276160580000021
where $MEM_{JVM\_MAX}$ is the maximum memory of the Java virtual machine; $BYTE_{JVM\_ALREADY\_USED}$ is the memory already used by the Java virtual machine; $BYTE_{SRC\_DATA}$ is the source file size; $THREAD_{CURRENT\_LAUNCH}$ is the number of threads; $LINE_{CACHE}$ is the cache line number threshold; and $LINE_{SRC\_DATA}$ is the number of data lines in the source file.
5. The method of claim 4, wherein the cache line number threshold $LINE_{CACHE}$ is optimized according to the following formula:
Figure FDA0003276160580000022
where $0 < \alpha < 1$.
6. The method of claim 2, wherein if the test data in the storage path is stored in a specified file format and the size of a single file stored in the specified file format is limited, using the thread to store the test data in the memory object in the storage path comprises:
according to a preset single data write-in amount, utilizing the thread to write the test data in the memory object into a single file stored in the specified file format in batches;
and if the size of the single file is determined to reach the limited size, creating a new file, and writing the residual test data in the memory object into the newly created file in batches.
7. A test data generation apparatus, comprising:
the analysis module is used for analyzing a stored configuration file when receiving an instruction for generating test data, wherein the configuration file comprises a rule for generating the test data;
the executable file generation module is used for obtaining an executable file for generating the test data by utilizing a reflection mechanism and a rule for generating the test data obtained by analyzing the configuration file;
the test data generation module is used for operating the executable file to generate test data meeting the rule;
the executable file generation module is specifically used for obtaining a Java file for realizing the rule of generating the test data by utilizing a function for realizing a reflection mechanism provided in a Java development tool and a rule for generating the test data obtained by analyzing a configuration file, and compiling the Java file to obtain an executable Class file;
wherein the total amount of test data to be generated, the number of threads used when generating the test data, and the storage path of the generated test data are further obtained by analyzing the configuration file, and the test data generation module is specifically used for:
starting threads with corresponding number according to the number of the threads used when the test data is generated;
distributing the task for generating the test data to each thread;
running the executable file by utilizing each thread, and storing the test data generated by utilizing the thread under the storage path;
and stopping generating the test data when determining that the total amount of the test data stored in the storage path reaches the total amount of the test data required to be generated.
8. The apparatus of claim 7, wherein the test data generation module is specifically configured to:
writing the test data generated by the thread into the memory object corresponding to the thread line by line;
and when determining that the line number of the test data written into the memory object reaches a cache line number threshold, storing the test data in the memory object under the storage path by using the thread, wherein the cache line number threshold is obtained by analyzing the configuration file.
9. The apparatus of claim 8, further comprising a monitoring module and an optimization module:
the monitoring module is used for monitoring the use condition of the memory in the process of generating the test data;
the optimization module is configured to, if it is determined that the available memory is smaller than a target value, determine an optimization coefficient for optimizing the cache line number threshold according to the maximum memory of the Java virtual machine, the memory already used by the Java virtual machine, the number of threads, the source file size, and the number of data lines in the source file, and optimize the cache line number threshold based on the optimization coefficient, where the target value is equal to the product of the number of bytes of the memory object and the number of threads, and the source file is a reference file used to determine the configuration file.
10. The apparatus of claim 9, wherein the optimization module specifically determines the optimization coefficient $OPT_{RATION}$ for optimizing the cache line number threshold according to the following formula:
Figure FDA0003276160580000041
where $MEM_{JVM\_MAX}$ is the maximum memory of the Java virtual machine; $BYTE_{JVM\_ALREADY\_USED}$ is the memory already used by the Java virtual machine; $BYTE_{SRC\_DATA}$ is the source file size; $THREAD_{CURRENT\_LAUNCH}$ is the number of threads; $LINE_{CACHE}$ is the cache line number threshold; and $LINE_{SRC\_DATA}$ is the number of data lines in the source file.
11. The apparatus of claim 10, wherein the optimization module specifically optimizes the cache line number threshold $LINE_{CACHE}$ according to the following formula:
Figure FDA0003276160580000042
where $0 < \alpha < 1$.
12. The apparatus of claim 8, wherein if the test data in the storage path is stored in a specified file format and the size of a single file stored in the specified file format is limited, the test data generation module is specifically configured to:
according to a preset single data write-in amount, utilizing the thread to write the test data in the memory object into a single file stored in the specified file format in batches;
and if the size of the single file is determined to reach the limited size, creating a new file, and writing the residual test data in the memory object into the newly created file in batches.
13. An electronic device, comprising at least one processing unit and at least one memory unit, wherein the memory unit stores program code that, when executed by the processing unit, causes the electronic device to perform the steps of the method of any of claims 1 to 6.
14. A computer-readable storage medium, comprising program code which, when run on an electronic device, causes the electronic device to perform the steps of the method of any of claims 1 to 6.
CN201711317105.XA 2017-10-27 2017-12-12 Test data generation method and device Active CN109726096B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711027591 2017-10-27
CN2017110275911 2017-10-27

Publications (2)

Publication Number Publication Date
CN109726096A CN109726096A (en) 2019-05-07
CN109726096B true CN109726096B (en) 2022-04-01

Family

ID=66293457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711317105.XA Active CN109726096B (en) 2017-10-27 2017-12-12 Test data generation method and device

Country Status (1)

Country Link
CN (1) CN109726096B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110362314B (en) * 2019-07-12 2023-10-24 Oppo广东移动通信有限公司 Information processing method and device, computer readable medium and electronic equipment
CN110908891A (en) * 2019-09-18 2020-03-24 泰康保险集团股份有限公司 Test data generation method and device, electronic equipment and storage medium
CN110851357A (en) * 2019-11-04 2020-02-28 紫光云技术有限公司 Test data automatic construction method based on multiple database types
CN113535585A (en) * 2021-08-03 2021-10-22 广域铭岛数字科技有限公司 Test data generation method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1786925A (en) * 2005-03-08 2006-06-14 中国科学院软件研究所 TTCN-3 testing system basedon C++ mapping and its testing method
CN105095077A (en) * 2015-07-17 2015-11-25 北京奇虎科技有限公司 Automated testing method and device for user interfaces

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101221585A (en) * 2008-02-03 2008-07-16 华为技术有限公司 Data storage method and device
CN103186639B (en) * 2011-12-31 2017-10-10 腾讯科技(北京)有限公司 Data creation method and system
CN102722537A (en) * 2012-05-22 2012-10-10 苏州阔地网络科技有限公司 Database test data generation method and system thereof
CN103064968A (en) * 2012-12-31 2013-04-24 中国电子科技集团公司第十五研究所 Standardized data packing method based on cache
GB2516986B (en) * 2013-08-06 2017-03-22 Barclays Bank Plc Automated application test system
CN105138685B (en) * 2015-09-17 2018-11-16 福建新大陆软件工程有限公司 A kind of Performance Test System towards HBase
CN105512042B (en) * 2015-12-22 2018-09-04 广东金赋科技股份有限公司 A kind of automatic generation method of the test data of database, device and test system

Also Published As

Publication number Publication date
CN109726096A (en) 2019-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant