Test database creation method and test database creation system
Technical field
The present invention relates to the field of computer software testing, and more particularly to a test database creation method and a test database creation system.
Background art
With the development of the Internet and the rapid growth of emerging trading models such as B2B and B2C, the fields of big data and cloud computing are attracting more and more research teams, and data is expected to become the most valuable product of the future. The speed at which information is processed depends on how fast the database management system processes data, and research on database management systems is ultimately research on database performance. The TPC-E benchmark proposed by the Transaction Processing Performance Council (TPC) is the authoritative standard for evaluating database transaction processing in big data scenarios.
TPC-E is the new OLTP benchmark that replaces TPC-C. The TPC-E benchmark simulates existing enterprise application environments more realistically and accurately, makes substantial innovations and improvements to the test model, and places more emphasis on the fidelity of the model. TPC-E is modeled on an American stock exchange. The test model, shown in Fig. 1, includes everyday operations such as online trading, account inquiries and market research.
The brokerage firm simulated by the test also interacts with the external financial market, performing the corresponding operations as the market changes and updating the related account and market information. The model covers not only a C2B environment but also a B2B environment. This business model is more familiar and easier to understand, and it is closer to the actual application environment of existing users.
The quality and scale of the test environment of a database benchmark determine the accuracy of the database evaluation result. Therefore, when a benchmark test system is used to test the performance of various databases, creating the benchmark test database environment is particularly important.
At present, when a prior-art benchmark test system creates a test database environment under a TPC-E load, every field simply uses the digits "0" and "1" as test samples in place of real data, and the emphasis is mainly on evaluating how well the database handles the logical relationships of the TPC-E benchmark model. However, in a benchmark test environment created in this way, the test sample data is of a single type and its composition differs greatly from real data, which increases the error of the performance test and strongly affects the test result.
Summary of the invention
To solve the above technical problem, the present invention proposes a test database creation method, characterized by comprising:
Step S1: obtaining a real data source, the real data source including data of non-numeric types;
Step S2: building a source data word-frequency library from the real data source;
Step S3: building a test database, the test database including test data tables;
Step S4: triggering a task transaction, the task transaction updating or adding test data in the test data tables according to data records generated from the source data word-frequency library.
Preferably, the real data source includes data elements, and the source data word-frequency library includes the data elements and the word frequency corresponding to each data element; step S2 includes:
Step S2-1: analyzing the word frequency of each data element, the word frequency reflecting the number of times the data element occurs in the real data source;
Step S2-2: building the source data word-frequency library.
Preferably, step S4 includes:
Step S4-1: generating a data record set, the word frequency of each data element in the data records being consistent with the word frequency of that data element in the source data word-frequency library;
Step S4-2: taking an original data record in the data record set as the input information of the task transaction and starting the task transaction, the task transaction updating or adding test data in the test data tables;
Step S4-3: determining whether the test database meets a preset requirement; if it does, creation of the test database ends; if it does not, returning to step S4-2.
Preferably, the method of determining in step S4-3 whether the test database meets the preset requirement includes: determining whether the size of the test database meets the preset requirement.
Preferably, the test data includes customer information data, and the test data tables include a customer information table that stores the customer information data; the method of determining in step S4-3 whether the test database meets the requirement includes: determining whether the customer information data in the customer information table meets the preset requirement.
Preferably, the task transactions include a trade execution transaction; the trade execution transaction updates or adds the customer information data in the customer information table, and the trade execution transaction triggers task transactions of other types to update or add test data in the test data tables; in step S4-2, the trade execution transaction is started.
Preferably, the data record includes data fields, and each data field is composed of multiple data elements; step S4-1 includes:
Step S4-1-1: selecting data elements from the source data word-frequency library;
Step S4-1-2: composing a data field from the data elements;
Step S4-1-3: composing the data record from the data fields;
Step S4-1-4: adding the data record to the data record set.
The present invention also provides a test database creation system, characterized by comprising:
a data source acquisition module for obtaining a real data source that includes data of non-numeric types, the real data source including data elements;
a storage module that stores a source data word-frequency library built from the real data source and a test database; the source data word-frequency library includes the data elements and the word frequency corresponding to each data element, and the test database includes test data tables;
an execution module that executes task transactions; a task transaction updates or adds test data in the test data tables according to data records generated from the source data word-frequency library.
Preferably, the storage module stores a data record set, the word frequency of each data element in the data records being consistent with the word frequency of that data element in the source data word-frequency library; the execution module includes:
a data record set generation unit that generates data records from the source data word-frequency library and adds the data records to the data record set;
a task transaction trigger unit that takes an original data record in the data record set as the input information of the task transaction and starts the task transaction, the task transaction updating or adding test data in the test data tables;
a judging unit that determines whether the test database meets a preset requirement.
Preferably, the test data includes customer information data, and the test data tables include a customer information table that stores the customer information data;
the task transactions include a trade execution transaction, and the trade execution transaction updates or adds the customer information data in the customer information table;
the task transaction trigger unit starts the trade execution transaction, and the trade execution transaction triggers task transactions of other types to update or add test data in the test data tables;
the judging unit determines whether the customer information data in the customer information table meets the preset requirement.
The present invention uses a real data source derived from a stock exchange and builds a word-frequency library by performing word-frequency analysis on that real data source. Data elements are then chosen according to their word frequency to compose data fields, and this is repeated to generate all the fields of one data row, thereby building a data record. By using diversified data types and building the model through probability statistics, the TPC-E benchmark model is simulated more realistically, so that database performance is evaluated more faithfully.
Brief description of the drawings
Fig. 1 shows the TPC-E benchmark test model;
Fig. 2 is a schematic diagram of the test database creation system of an embodiment of the present invention;
Fig. 3 is a flowchart of the test database creation method of an embodiment of the present invention;
Fig. 4 is the schema and entity-relationship diagram of the test data tables of an embodiment of the present invention;
Fig. 5 is a hierarchy diagram of the test data tables of an embodiment of the present invention.
Detailed description
The specific embodiments below merely explain the present invention and do not limit it. After reading this specification, a person skilled in the art may modify the embodiments as needed without making a creative contribution, and all such modifications are protected by patent law as long as they fall within the scope of the claims of the present invention.
Embodiment one
A test database creation system suitable for creating the test database environment of a TPC-E benchmark test system. As shown in Fig. 2, the test database creation system of this embodiment includes:
1. Data source acquisition module: obtains real data from a stock exchange and preprocesses the data to obtain the real data source. Preprocessing here generally means removing information that would expose user privacy, such as transaction amounts and bank card numbers, from the real data. The real data source obtained in this way keeps a rich variety of data types (including non-numeric data such as character data).
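The preprocessing described above can be sketched in Python as follows. This is only an illustration: the field names `transaction_amount` and `bank_card_number` and the helper `strip_sensitive` are assumptions introduced here, since the specification names the kinds of information removed but not their field names.

```python
# Hypothetical field names: the specification only says that transaction
# amounts, bank card numbers and similar private details are removed.
SENSITIVE_FIELDS = {"transaction_amount", "bank_card_number"}


def strip_sensitive(record: dict) -> dict:
    """Return a copy of the record without the privacy-sensitive fields."""
    return {key: value for key, value in record.items()
            if key not in SENSITIVE_FIELDS}


raw = {
    "customer_name": "Zhang San",
    "bank_card_number": "0000",
    "transaction_amount": 12.5,
}
clean = strip_sensitive(raw)  # only the non-sensitive fields remain
```

The non-numeric fields that survive this step are what the word-frequency analysis later operates on.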
2. Storage module: stores the source data word-frequency library, the data record set and the test data tables.
The source data word-frequency library is built from the real data source. The real data source includes multiple data elements; for example, the surname ("Zhang") and the given name ("San") in a customer name (such as "Zhang San") can each serve as one data element. The number of occurrences of each data element in the real data source is counted and taken as the word frequency of that data element, and the data element and its word frequency are stored in the source data word-frequency library. Preferably, the data elements in the source data word-frequency library are arranged in order of word frequency.
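A minimal sketch of building such a word-frequency library is given below. It assumes that customer names arrive as "<surname> <given name>" strings and that the library is kept as in-memory lists; the function name `build_word_frequency_library` and the dictionary layout are illustrative choices, not part of the specification.

```python
from collections import Counter


def build_word_frequency_library(names):
    """Count surname and given-name elements separately.

    Assumes each name is written as "<surname> <given name>".
    """
    surnames, given_names = Counter(), Counter()
    for full_name in names:
        surname, given = full_name.split(" ", 1)
        surnames[surname] += 1
        given_names[given] += 1
    # most_common() returns (element, count) pairs sorted by descending count,
    # which matches the preferred ordering by word frequency.
    return {
        "surname": surnames.most_common(),
        "given_name": given_names.most_common(),
    }


library = build_word_frequency_library(
    ["Zhang San", "Zhang Si", "Li San", "Zhang San"]
)
```

Here `library["surname"]` lists "Zhang" with a count of 3 ahead of "Li" with a count of 1.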
The data record set includes data records, and the word frequency of each data element across all the data records is consistent with the word frequency of that data element in the source data word-frequency library.
The test data tables in this embodiment form the runtime test data environment created for a SQLite-oriented TPC-E benchmark test system. The environment includes 33 data tables; Table 1 shows the 33 test data tables in the test database of this embodiment. Of the 33 test data tables, 27 contain foreign key constraints, for a total of 50 foreign keys. Each test data table includes multiple data fields; the data fields use ten data types, such as Char (character), Integer (integer) and Byte (byte), and there are 188 columns in total. The schema, field information, reference relationships, index entries and entity relationships of the 33 test data tables are described in detail in Fig. 4, where PK marks a table's primary key and FK marks a foreign key. Fig. 5 shows the hierarchical relationship of the 33 test data tables.
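To illustrate how a primary key and a foreign key constrain two such tables in SQLite, here is a small sketch using Python's `sqlite3` module. Only the CUSTOMER table is named in this specification; the CUSTOMER_ACCOUNT table and every column name below are assumptions made for the example, and the actual 33-table schema is the one given in Fig. 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CUSTOMER (
    C_ID     INTEGER PRIMARY KEY,      -- PK
    C_L_NAME CHAR(25) NOT NULL
);
CREATE TABLE CUSTOMER_ACCOUNT (        -- illustrative second table
    CA_ID   INTEGER PRIMARY KEY,       -- PK
    CA_C_ID INTEGER NOT NULL REFERENCES CUSTOMER (C_ID)  -- FK
);
""")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Zhang')")
conn.execute("INSERT INTO CUSTOMER_ACCOUNT VALUES (10, 1)")


def insert_orphan_account():
    """Try to insert an account whose customer does not exist."""
    try:
        conn.execute("INSERT INTO CUSTOMER_ACCOUNT VALUES (11, 999)")
        return True
    except sqlite3.IntegrityError:
        # The foreign key constraint rejects the row.
        return False
```

Because of such constraints, rows must be added to parent tables before the tables that reference them, which is one reason the hierarchy of Fig. 5 matters when data is loaded.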
3. Execution module: executes task transactions. TPC-E includes 12 transaction types; each transaction type is described in Table 2:
When a task transaction is executed, it updates or adds test data in each related test data table, and the execution of one task transaction can also trigger the execution of other related task transactions.
Specifically, the execution module further includes:
Data record set generation unit: generates data records from the source data word-frequency library and adds them to the data record set. Matlab is used to fit the source data word-frequency library so as to output multiple data elements, such that the word frequency of each output data element is the same as its word frequency in the source data word-frequency library. The data records required as input for a task transaction are generated according to the characteristics of the task transaction to be executed. A data record generally consists of multiple data fields, and a data field generally includes one or more data elements. Data elements of the corresponding type are selected according to the type of the data field to compose the field, multiple data fields then compose a data record, and the data record is stored in the data record set.
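The generation step can be sketched in Python with frequency-weighted sampling. Note that this substitutes `random.choices` for the Matlab fitting named in the specification, so the output frequencies only approximate the library frequencies in proportion rather than matching them exactly; the `LIBRARY` contents and the helper names are hypothetical.

```python
import random

# Hypothetical word-frequency library: element type -> list of (element, count).
LIBRARY = {
    "surname": [("Zhang", 3), ("Li", 1)],
    "given_name": [("San", 3), ("Si", 1)],
}


def pick_element(element_type, rng):
    """Draw one element with probability proportional to its word frequency."""
    elements, counts = zip(*LIBRARY[element_type])
    return rng.choices(elements, weights=counts, k=1)[0]


def make_record(rng):
    """Compose a data field from elements, then a record from fields."""
    name_field = pick_element("surname", rng) + " " + pick_element("given_name", rng)
    return {"customer_name": name_field}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
record_set = [make_record(rng) for _ in range(1000)]
```

With enough records, the share of each surname in `record_set` approaches its share in the library (about three quarters "Zhang" in this toy data).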
Task transaction trigger unit: takes an original data record in the data record set as the input information of a task transaction and starts the task transaction, which updates or adds test data in the test data tables. Preferably, the data records in the data record set are used in turn as the input information of the task transaction to start it. In this embodiment the trade execution transaction is preferably started. The trade execution transaction requires as input a data record that includes customer information, and executing it adds or updates customer information data in CUSTOMER (the customer information table). Meanwhile, the trade execution transaction can also trigger different trade transactions to add or update other test data in other test data tables.
Judging unit: determines whether the test database meets a preset requirement. The preset requirement is a system parameter configured in the test database creation system; it can be the size of the test database, or the number of test data entries in a particular test data table of the test database. Table 3 is the configuration parameter list of the test database creation system; it specifies that the number of customer information data entries in CUSTOMER (the customer information table) of the test database is 5000:
For example, Table 4 shows the number of records in each of the 33 tables when, in accordance with the characteristics of the SQLite database, the number of customers of the TPC-E test database is set to 5000; in this case the size of the test database is 2.8 GB.
When the judging unit determines that the size of the test database is greater than or equal to the preset value, creation of the test database ends. The data is saved in the TXT files of the DATA folder, and the data loading operation is completed with EGenLoader from the EGen package provided by TPC. When the judging unit determines that the size of the test database is less than the preset value, the test database has not yet reached the preset requirement, and the transaction execution unit needs to continue executing task transactions to extend the test database until its size reaches the preset value.
Alternatively, when the judging unit determines that the number of test data entries in a particular test data table of the test database is greater than or equal to the preset value, creation of the test database ends. The data is saved in the TXT files of the DATA folder, and the data loading operation is completed with EGenLoader from the EGen package provided by TPC. When the judging unit determines that the number of test data entries in that test data table is less than the preset value, the test database has not yet reached the preset requirement, and the transaction execution unit needs to continue executing task transactions to extend the test database until the number of test data entries in that test data table reaches the preset value.
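The count-based criterion can be sketched as a loop that keeps executing a transaction until the CUSTOMER table holds the preset number of rows. The function `run_trade_execution` is a hypothetical stand-in for the trade execution transaction (the real transaction touches many more tables), and the column `C_NAME` is an assumption for this example.

```python
import sqlite3

TARGET_CUSTOMERS = 5000  # preset requirement, as in the configuration above

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (C_ID INTEGER PRIMARY KEY, C_NAME TEXT)")


def run_trade_execution(conn, record):
    """Stand-in for the trade execution transaction: adds one customer row."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO CUSTOMER (C_NAME) VALUES (?)",
            (record["customer_name"],),
        )


def customer_count(conn):
    return conn.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()[0]


def fill_database(conn, records, target):
    """Feed records to the transaction until the preset count is reached."""
    for record in records:
        if customer_count(conn) >= target:
            break  # the judging step says the requirement is met
        run_trade_execution(conn, record)
    return customer_count(conn)


records = ({"customer_name": f"customer-{i}"} for i in range(10_000))
final_count = fill_database(conn, records, TARGET_CUSTOMERS)
```

Checking the count before each transaction mirrors the judging unit: creation stops as soon as the requirement is met, even if unused data records remain.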
Taking the COMPANY table as an example, the core program of the execution module, implemented in Python, is shown in Table 5:
As shown in Fig. 3, the test database creation method based on the above test database creation system includes:
Step S1: obtain a real data source, the real data source including data of non-numeric types. The real data source includes data elements.
Step S2: build a source data word-frequency library from the real data source; the source data word-frequency library includes the data elements and the word frequency corresponding to each data element. Specifically:
Step S2-1: analyze the word frequency of each data element; the word frequency reflects the number of times the data element occurs in the real data source.
Step S2-2: build the source data word-frequency library.
Step S3: build a test database, the test database including test data tables. The test data includes customer information data, and the test data tables include a customer information table that stores the customer information data.
Step S4: trigger a task transaction; the task transaction updates or adds test data in the test data tables according to data records generated from the source data word-frequency library. Specifically:
Step S4-1: generate a data record set; the word frequency of each data element in the data records is consistent with the word frequency of that data element in the source data word-frequency library. A data record is composed of multiple data fields, and a data field is in turn composed of multiple data elements. Step S4-1 further includes:
Step S4-1-1: select data elements from the source data word-frequency library;
Step S4-1-2: compose a data field from the data elements;
Step S4-1-3: compose a data record from the data fields;
Step S4-1-4: add the data record to the data record set.
Step S4-2: take an original data record in the data record set as the input information of the task transaction and start the task transaction; the task transaction updates or adds test data in the test data tables. In this embodiment the trade execution transaction is preferably started. The trade execution transaction can update or add customer information data in the customer information table; meanwhile, it can trigger task transactions of other types to update or add test data in the test data tables.
Step S4-3: determine whether the test database meets the preset requirement; if it does, creation of the test database ends; if it does not, return to step S4-2. The method of determining whether the test database meets the preset requirement can be: determining whether the size of the test database meets the preset requirement; or alternatively: determining whether the customer information data in the customer information table meets the preset requirement.
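For the size-based variant of step S4-3, one possible check in Python reads the page count and page size that SQLite reports for the database. This is a sketch under assumptions: the file name, the threshold values and the helper names are hypothetical, and a real deployment might instead inspect the size of the database file on disk.

```python
import os
import sqlite3
import tempfile


def database_size_bytes(conn):
    """Size of the database as page count times page size."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return page_count * page_size


def meets_size_requirement(conn, min_bytes):
    """Size-based form of the preset requirement check."""
    return database_size_bytes(conn) >= min_bytes


# Hypothetical database file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "tpce_test.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE CUSTOMER (C_ID INTEGER PRIMARY KEY, C_NAME TEXT)")
with conn:
    conn.executemany(
        "INSERT INTO CUSTOMER (C_NAME) VALUES (?)",
        (("x" * 100,) for _ in range(1000)),
    )
size = database_size_bytes(conn)
```

The creation loop would return to step S4-2 for as long as `meets_size_requirement` reports that the preset size has not been reached.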
The specific embodiments described herein merely illustrate the spirit of the present invention. A person skilled in the technical field of the present invention may make various modifications or additions to the described embodiments, or substitute them in similar ways, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.