CN113434414A - Data testing method and device, electronic equipment and storage medium

Info

Publication number: CN113434414A
Application number: CN202110721429.XA
Authority: CN
Inventors: 向乾; 尤薇; 李桂芸
Original assignee: Ping An Bank Co Ltd
Current assignee: Ping An Bank Co Ltd
Priority date: 2021-06-28
Filing date: 2021-06-28
Publication date: 2021-09-24

Abstract

The invention relates to the technical field of testing and discloses a data testing method comprising the following steps: identifying the data version of sample data in a training sample data set; when the sample data is of a first version, calling the test case corresponding to the first version as the test case of the training sample data set; when the sample data is of a second version, calculating a first similarity between the training sample data set and reference data and, for the reference data whose first similarity meets a preset condition, calculating a second similarity between field-level data in that reference data and a plurality of standard fields; screening out a target test case according to the second similarity; and using a test engine to execute the target test case on the data to be tested to obtain a test result. In addition, the invention relates to blockchain technology, and the first similarity may be stored in a node of a blockchain. The invention also provides a data testing device, an electronic device, and a computer-readable storage medium. The invention can solve the problem of low data testing efficiency.

Description

Data testing method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of test technologies, and in particular, to a data testing method and apparatus, an electronic device, and a computer-readable storage medium.

Background

With the advent of the big data era, the volume of data of all kinds keeps growing, and certain requirements are placed on the accuracy of that data, so data testing is required. Traditional data testing generally requires data test cases to be written manually. Manual writing depends on the testers' experience and on how well they understand the business, so test coverage cannot be guaranteed. Moreover, data testing lags behind data development, so immediate testing cannot be achieved and efficiency needs to be improved. A better data testing method is therefore urgently needed.

Disclosure of Invention

The invention provides a data testing method, a data testing device and a computer readable storage medium, and mainly aims to solve the problem of low data testing efficiency.

In order to achieve the above object, the present invention provides a data testing method, which includes:

acquiring a historical data set, and screening the historical data set according to a pre-compiled check rule to obtain a training sample data set;

identifying a data version of sample data in the training sample data set;

when the sample data in the training sample data set is a first version, calling a test case corresponding to the first version as a test case of the training sample data set;

when the sample data in the training sample data set is a second version, calculating a first similarity between the training sample data set and pre-acquired reference data, and taking test cases corresponding to one or more groups of reference data with the first similarity being greater than or equal to a preset similarity threshold as the test cases of the training sample data set;

extracting field-level data in one or more groups of corresponding reference data of which the first similarity is greater than or equal to a preset similarity threshold, calculating second similarities between the field-level data and a plurality of standard fields in a standard field library, and taking test cases corresponding to the standard fields of which the second similarities are greater than or equal to the field threshold as field-level test cases;

determining at least one item in the test cases of the training sample data set, the field level test cases and the table level test cases corresponding to the field level test cases as a target test case;

and inputting the target test case into a preset test engine, and executing the target test case to test the pre-acquired data to be tested by using the test engine to obtain a test result.

Optionally, the screening the historical data set according to a pre-compiled check rule to obtain a training sample data set includes:

verifying the historical data set by using a basic data check rule in the check rules to obtain first data that conforms to the basic data check rule;

screening out, from the historical data set, second data that conforms to a user-defined check rule in the check rules;

and summarizing the first data and the second data to obtain the training sample data set.

Optionally, the verifying the historical data set by using a basic data check rule in the check rules to obtain first data that conforms to the basic data check rule includes:

determining whether there is duplicate historical data in the historical data set;

and if duplicate historical data exists, deleting the duplicate historical data, and screening out, as the first data, the historical data in the historical data set whose acquisition time is greater than or equal to a preset time threshold.

Optionally, the calculating a first similarity between the training sample data set and pre-acquired reference data includes:

calculating a covariance between the training sample data set and the reference data;

and calculating according to the covariance and a preset Pearson correlation formula to obtain a first similarity between the training sample data set and the pre-acquired reference data.

Optionally, said calculating covariance between said training sample data set and said reference data comprises:

calculating the covariance between the training sample data set and the reference data using the following formula:

$$\operatorname{cov}(X, Y) = E\big[(X-\mu)(Y-\upsilon)\big]$$

wherein cov(X, Y) is the covariance, X is the training sample data set, Y is the reference data, μ is the mathematical expectation of the training sample data set, and υ is the mathematical expectation of the reference data.
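For a finite sample, the expectation in the covariance is usually estimated by an average. As a sketch only, assuming the training sample data set and the reference data can be represented as paired numeric sequences x_1, ..., x_n and y_1, ..., y_n of equal length (an assumption; the source does not specify how the data are encoded numerically), the covariance may be written as:

$$\operatorname{cov}(X, Y) \approx \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)(y_i-\upsilon), \qquad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \upsilon = \frac{1}{n}\sum_{i=1}^{n} y_i$$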

Optionally, the calculating according to the covariance and a preset pearson correlation formula to obtain a first similarity between the training sample data set and the pre-acquired reference data includes:

and calculating to obtain a first similarity between the training sample data set and the pre-acquired reference data by using the following formula:

$$\rho_{x,y} = \frac{\operatorname{cov}(X, Y)}{\sigma_x \sigma_y}$$

where ρ_{x,y} is the first similarity, cov(X, Y) is the covariance, and σ_x and σ_y are the standard deviations of the training sample data set and the pre-acquired reference data, respectively.
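As an illustration of the covariance and Pearson correlation steps described above, the following Python sketch computes the first similarity for two equal-length numeric sequences. The numeric encoding of the data sets and the function names `covariance` and `pearson_similarity` are assumptions made for illustration; they are not given in the source.

```python
import math


def covariance(xs, ys):
    """Population covariance E[(X - mu)(Y - upsilon)] of two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    mu = sum(xs) / len(xs)
    upsilon = sum(ys) / len(ys)
    return sum((x - mu) * (y - upsilon) for x, y in zip(xs, ys)) / len(xs)


def pearson_similarity(xs, ys):
    """First similarity: cov(X, Y) divided by the product of the standard deviations."""
    sigma_x = math.sqrt(covariance(xs, xs))  # standard deviation of xs
    sigma_y = math.sqrt(covariance(ys, ys))  # standard deviation of ys
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("Pearson correlation is undefined for constant data")
    return covariance(xs, ys) / (sigma_x * sigma_y)
```

The result lies between -1 and 1, so a preset similarity threshold would be chosen within that range.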

Optionally, the identifying a data version of sample data in the training sample data set comprises:

acquiring a version information corresponding table;

and identifying the data version corresponding to the sample data according to the mapping relation between the sample data and the data version in the version information corresponding table.

In order to solve the above problem, the present invention also provides a data testing apparatus, comprising:

the data screening module is used for acquiring a historical data set, and screening the historical data set according to a pre-compiled check rule to obtain a training sample data set;

the data version identification module is used for identifying the data version of the sample data in the training sample data set;

a test case generation module, configured to: when the sample data in the training sample data set is a first version, call a test case corresponding to the first version as the test case of the training sample data set; when the sample data in the training sample data set is a second version, calculate a first similarity between the training sample data set and pre-acquired reference data, and use test cases corresponding to one or more groups of reference data whose first similarity is greater than or equal to a preset similarity threshold as the test cases of the training sample data set; and extract field-level data from the one or more groups of reference data whose first similarity is greater than or equal to the preset similarity threshold, calculate second similarities between the field-level data and a plurality of standard fields in a standard field library, and use test cases corresponding to the standard fields whose second similarity is greater than or equal to the field threshold as field-level test cases;

and the test execution module is used for determining at least one of the obtained test cases of the training sample data set, the field level test cases and the table level test cases corresponding to the field level test cases as a target test case, inputting the target test case into a preset test engine, and executing the target test case on the pre-obtained data to be tested by using the test engine to obtain a test result.

In order to solve the above problem, the present invention also provides an electronic device, including:

a memory storing at least one instruction; and

and the processor executes the instructions stored in the memory to realize the data testing method.

In order to solve the above problem, the present invention further provides a computer-readable storage medium, which stores at least one instruction, and the at least one instruction is executed by a processor in an electronic device to implement the data testing method.

In the invention, pre-compiled check rules are executed on the historical data set, and the data that meets the requirements is screened out as the training sample data set. The data version of the sample data in the training sample data set is then determined, and the sample data is processed according to its data version. When the sample data is of the first version, the test case corresponding to the first version is called as the test case of the training sample data set; because the first version is an old version for which preset test cases already exist, calling those test cases directly improves the efficiency of data testing. When the sample data is of the second version, the first similarity between the training sample data set and the pre-acquired reference data is calculated, and the test cases corresponding to the one or more groups of reference data screened out by the first similarity are used as the test cases of the training sample data set. Field-level data is then extracted from that reference data, the second similarity is calculated and used for screening, and the test cases corresponding to the standard fields whose second similarity is greater than or equal to the field threshold are used as the field-level test cases; screening from the perspective of field-level data ensures the accuracy of the test cases. Finally, the target test case is input into the test engine, and the test engine executes it on the data to be tested to obtain a test result. Therefore, the data testing method, the data testing device, the electronic equipment and the computer readable storage medium provided by the invention can solve the problem of low data testing efficiency.

Drawings

Fig. 1 is a schematic flow chart of a data testing method according to an embodiment of the present invention;

Fig. 2 is a functional block diagram of a data testing apparatus according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an electronic device implementing the data testing method according to an embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The embodiment of the application provides a data testing method. The execution subject of the data testing method includes, but is not limited to, at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided by the embodiments of the present application. In other words, the data testing method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, or a cloud server cluster.

Fig. 1 is a schematic flow chart of a data testing method according to an embodiment of the present invention.

In this embodiment, the data testing method includes:

S1, obtaining a historical data set, and screening the historical data set according to a pre-written check rule to obtain a training sample data set.

In the embodiment of the invention, the historical data set may be obtained from the database in which it is stored by using Java statements with a data-calling function. The historical data set includes data information such as data source tables, calculation rules, calculation cycles, release cycles, and fields.

Specifically, the screening the historical data set according to a pre-written check rule to obtain a training sample data set includes:

verifying the historical data set by using a basic data check rule in the check rules to obtain first data that conforms to the basic data check rule;

screening out, from the historical data set, second data that conforms to a user-defined check rule in the check rules;

and summarizing the first data and the second data to obtain the training sample data set.

In detail, the pre-written check rules include a basic data check rule and a user-defined check rule. The basic data check rule performs simple verification and processing of the historical data set with respect to the basic properties of the data.

In the embodiment of the present invention, the basic data check rule includes, but is not limited to, deduplication and checking against the acquisition time. The user-defined check rule is a check rule written according to the job purpose.

For example, if the job purpose is "obtain a list of female clients whose average daily assets over the past week reach 10,000 yuan", the user-defined check rule may be that the average daily assets are greater than or equal to 10,000 yuan, that the gender of the client is female, and the like.
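The example rule above can be sketched as a simple predicate in Python. The `ClientRecord` type, its field names, and the `"F"` gender encoding are hypothetical and chosen only for illustration; the source does not define a record layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientRecord:
    client_id: str
    gender: str              # "F" or "M" (hypothetical encoding)
    avg_daily_assets: float  # average daily assets over the past week, in yuan


def custom_check(record, min_assets=10_000.0):
    """User-defined check rule for the example job purpose."""
    return record.gender == "F" and record.avg_daily_assets >= min_assets


def screen_by_custom_rule(records):
    """Return the records that satisfy the user-defined check rule (the second data)."""
    return [r for r in records if custom_check(r)]
```

Expressing each rule as a predicate keeps the screening step independent of the job purpose: a different job would only swap in a different `custom_check`.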

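Further, the verifying the historical data set by using a basic data check rule in the check rules to obtain first data that conforms to the basic data check rule includes:

determining whether there is duplicate historical data in the historical data set;

and if duplicate historical data exists, deleting the duplicate historical data, and screening out, as the first data, the historical data in the historical data set whose acquisition time is greater than or equal to a preset time threshold.

The acquisition time of historical data refers to the time at which the data was retrieved from the database.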
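A minimal Python sketch of the basic data check rule (deduplication followed by the acquisition-time check) is given here. It assumes each historical record is a `(payload, acquired_at)` tuple with a hashable payload and that two records are duplicates when their payloads are equal; both are assumptions, since the source does not define the record layout or the duplicate criterion.

```python
from datetime import datetime


def basic_check(records, time_threshold):
    """Drop duplicate records, then keep those acquired at or after time_threshold.

    records: iterable of (payload, acquired_at) tuples (hypothetical layout).
    Returns the first data as a list, preserving the original order.
    """
    seen = set()
    first_data = []
    for payload, acquired_at in records:
        if payload in seen:
            continue  # duplicate historical data is deleted
        seen.add(payload)
        if acquired_at >= time_threshold:
            first_data.append((payload, acquired_at))
    return first_data
```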

S2, identifying the data version of the sample data in the training sample data set.

In this embodiment of the present invention, the identifying the data version of the sample data in the training sample data set includes:

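acquiring a version information correspondence table;

and identifying the data version corresponding to the sample data according to the mapping relation between the sample data and the data version in the version information correspondence table.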

In detail, the version information correspondence table includes mapping relationships between a plurality of pieces of sample data and data versions, and the data versions of the sample data in the training sample data set can be identified according to the version information correspondence table.
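The lookup in the version information correspondence table can be sketched as a dictionary access. The sample identifiers and the version labels `"v1"` and `"v2"` are hypothetical; the source only states that the table maps sample data to data versions.

```python
FIRST_VERSION = "v1"   # hypothetical label for the first version
SECOND_VERSION = "v2"  # hypothetical label for the second version


def identify_version(sample_id, version_table):
    """Return the data version recorded for sample_id in the correspondence table."""
    try:
        return version_table[sample_id]
    except KeyError:
        raise KeyError(f"no version recorded for sample {sample_id!r}") from None
```

Raising an error for an unmapped sample is a design choice of this sketch; the source does not say how samples missing from the table are handled.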

S3, when the sample data in the training sample data set is a first version, calling a test case corresponding to the first version as the test case of the training sample data set.

In the embodiment of the present invention, the first version is a version corresponding to reference data that is calculated and processed in advance, and when the sample data in the training sample data set is the first version, a test case corresponding to the first version is called as the test case of the training sample data set.

S4, when the sample data in the training sample data set is a second version, calculating a first similarity between the training sample data set and the pre-acquired reference data, and taking the test cases corresponding to one or more groups of reference data with the first similarity being greater than or equal to a preset similarity threshold as the test cases of the training sample data set.

In the embodiment of the present invention, the second version refers to a version corresponding to reference data that is not subjected to calculation and processing, and when the sample data in the training sample data set is the second version, the corresponding test case cannot be directly obtained, so that similarity calculation and screening are required. The reference data is data which is stored in the database in advance and used for comparison, and can be obtained by calling from the database through a high-level program with a data calling function.

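Specifically, the calculating a first similarity between the training sample data set and the pre-acquired reference data includes:

calculating a covariance between the training sample data set and the reference data;

and calculating the first similarity between the training sample data set and the pre-acquired reference data according to the covariance and a preset Pearson correlation formula.

In detail, the covariance between the training sample data set and the reference data is calculated using the following formula:

$$\operatorname{cov}(X, Y) = E\big[(X-\mu)(Y-\upsilon)\big]$$

where cov(X, Y) is the covariance, X is the training sample data set, Y is the reference data, μ is the mathematical expectation of the training sample data set, and υ is the mathematical expectation of the reference data. The covariance measures how the two variables vary together.

Further, the first similarity between the training sample data set and the pre-acquired reference data is calculated from the covariance using the following Pearson correlation formula:

$$\rho_{x,y} = \frac{\operatorname{cov}(X, Y)}{\sigma_x \sigma_y}$$

where ρ_{x,y} is the first similarity, and σ_x and σ_y are the standard deviations of the training sample data set and the pre-acquired reference data, respectively.

Specifically, the first similarity is compared with the preset similarity threshold, and the test cases corresponding to the one or more groups of reference data whose first similarity is greater than or equal to the preset similarity threshold are used as the test cases of the training sample data set.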
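Steps S3 and S4 together amount to a branch on the data version followed by threshold screening. The sketch below assumes the first similarities have already been computed and are passed in as a dictionary keyed by reference group name; the version labels `"first"` and `"second"`, the parameter names, and the dictionary layout are all hypothetical.

```python
def select_dataset_test_cases(version, first_version_cases, similarities,
                              cases_by_reference, threshold):
    """Pick the test cases of the training sample data set.

    version: "first" or "second" (hypothetical labels).
    first_version_cases: test cases already stored for the first version.
    similarities: {reference group name: first similarity to the training set}.
    cases_by_reference: {reference group name: list of test cases}.
    """
    if version == "first":
        # S3: the preset test cases of the first version are reused directly.
        return list(first_version_cases)
    # S4: keep the test cases of every reference group that meets the threshold.
    selected = []
    for name, similarity in similarities.items():
        if similarity >= threshold:
            selected.extend(cases_by_reference.get(name, []))
    return selected
```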

S5, extracting field level data in one or more groups of corresponding reference data with the first similarity being greater than or equal to a preset similarity threshold, calculating second similarities between the field level data and a plurality of standard fields in a standard field library, and taking the test case corresponding to the standard field with the second similarity being greater than or equal to the field threshold as the field level test case.

In the embodiment of the invention, the field level data in the corresponding one or more reference data with the first similarity being greater than or equal to the preset similarity threshold is extracted, the second similarity between the field level data and a plurality of standard fields in a standard field library is calculated, the test cases are screened according to the second similarity, the corresponding test cases can be further screened from the field angle based on the field level similarity calculation, and the accuracy of the test cases is ensured.

Specifically, the field-level data in the one or more groups of reference data whose first similarity is greater than or equal to the preset similarity threshold is extracted by parsing SQL to trace the source of each field in the table.

Further, calculating a second similarity between the field-level data and a plurality of standard fields in a standard field library, comprising:

and respectively calculating second similarity between the field-level data and a plurality of standard fields in the standard field library by using a similarity formula.

In detail, the embodiment of the present invention may use any of several methods to calculate the second similarity between the field-level data and the plurality of standard fields in the standard field library, including, but not limited to, the cosine similarity formula and the Euclidean distance.

Optionally, in an embodiment of the present invention, the calculating the second similarity between the field-level data and the plurality of standard fields in the standard field library includes:

calculating a second similarity between the field-level data and a plurality of standard fields in the standard field library using the following formula:

$$\cos(a, b) = \frac{a \cdot b}{|a|\,|b|}$$

wherein cos(a, b) is the second similarity, a is the field vector, b is the standard vector, and |a| and |b| are the moduli of the field vector and the standard vector, respectively.

The field level data and the plurality of standard fields in the standard field library can be vectorized according to a preset word2vec algorithm to obtain the field vector and the standard vector.

Specifically, the test case corresponding to the standard field with the second similarity greater than or equal to the field threshold is used as the field-level test case.
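The cosine-similarity screening of step S5 can be sketched as follows. The sketch assumes the field vector and the standard vectors have already been produced (for example by a word2vec model, which is not reproduced here) and are plain lists of floats; the function names and dictionary layout are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| * |b|) for two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def field_level_test_cases(field_vector, standard_vectors,
                           cases_by_standard_field, field_threshold):
    """Collect test cases of standard fields whose second similarity meets the threshold."""
    selected = []
    for name, vector in standard_vectors.items():
        if cosine_similarity(field_vector, vector) >= field_threshold:
            selected.extend(cases_by_standard_field.get(name, []))
    return selected
```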

S6, determining at least one item of the test cases of the training sample data set, the field level test cases and the table level test cases corresponding to the field level test cases as target test cases.

In the embodiment of the present invention, the table-level test case corresponding to a field-level test case refers to the test case of the reference data from which the field-level data was extracted. The target test case is further determined according to the obtained response request: when the response request is a full test, the test cases of the training sample data set are determined as the target test case; when the response request is a field test, the field-level test cases are determined as the target test case; and when the response request is a table-level test, the table-level test cases corresponding to the field-level test cases are determined as the target test case.
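The mapping from response request to target test case can be sketched as a small dispatch function. The request labels `"full"`, `"field"`, and `"table"` are hypothetical names for the three request types described above.

```python
def determine_target_test_cases(request_type, dataset_cases, field_cases, table_cases):
    """Map the request type to the target test cases of step S6."""
    targets = {
        "full": dataset_cases,   # full test: test cases of the training sample data set
        "field": field_cases,    # field test: field-level test cases
        "table": table_cases,    # table-level test: table-level test cases
    }
    if request_type not in targets:
        raise ValueError(f"unknown request type: {request_type!r}")
    return list(targets[request_type])
```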

S7, inputting the target test case into a preset test engine, and executing the target test case to test the pre-acquired data to be tested by using the test engine to obtain a test result.

In the embodiment of the invention, a Hive engine is used for the test cases of the training sample data set and for the field-level test cases, to ensure the stability of the SQL queries, and a Presto engine is used for the table-level test cases corresponding to the field-level test cases, to improve test efficiency.

In detail, the Hive engine is a data warehouse infrastructure tool for processing structured data in Hadoop. It is built on top of Hadoop to facilitate query and analysis, provides a simple SQL query function, and can convert SQL statements into MapReduce tasks for execution. The Presto engine is an open-source distributed SQL query engine that is suitable for real-time interactive analysis and query, supports massive data, and can address the problem of low processing speed.
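The pairing of test case levels with engines described above can be sketched as a routing table. The sketch assumes each test case is a `(level, sql)` tuple and only groups the SQL text by engine name; it does not connect to Hive or Presto, and the level labels are hypothetical.

```python
# Engine chosen for each test case level, following the embodiment's pairing.
ENGINE_BY_LEVEL = {"dataset": "hive", "field": "hive", "table": "presto"}


def route_test_cases(test_cases):
    """Group (level, sql) test cases by the engine that should run them."""
    routed = {"hive": [], "presto": []}
    for level, sql in test_cases:
        try:
            engine = ENGINE_BY_LEVEL[level]
        except KeyError:
            raise ValueError(f"unknown test case level: {level!r}") from None
        routed[engine].append(sql)
    return routed
```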

The data to be tested can be data which needs to be detected and evaluated and is obtained in a daily test environment.

In this embodiment, after the data to be tested is obtained, the test engine may be used to execute the target test case on the pre-obtained data to be tested, so as to obtain a test result.

In the invention, pre-compiled check rules are executed on the historical data set, and the data that meets the requirements is screened out as the training sample data set. The data version of the sample data in the training sample data set is then determined, and the sample data is processed according to its data version. When the sample data is of the first version, the test case corresponding to the first version is called as the test case of the training sample data set; because the first version is an old version for which preset test cases already exist, calling those test cases directly improves the efficiency of data testing. When the sample data is of the second version, the first similarity between the training sample data set and the pre-acquired reference data is calculated, and the test cases corresponding to the one or more groups of reference data screened out by the first similarity are used as the test cases of the training sample data set. Field-level data is then extracted from that reference data, the second similarity is calculated and used for screening, and the test cases corresponding to the standard fields whose second similarity is greater than or equal to the field threshold are used as the field-level test cases; screening from the perspective of field-level data ensures the accuracy of the test cases. Finally, the target test case is input into the test engine, and the test engine executes it on the data to be tested to obtain a test result. Therefore, the data testing method provided by the invention can solve the problem of low data testing efficiency.

Fig. 2 is a functional block diagram of a data testing apparatus according to an embodiment of the present invention.

The data testing device 100 of the present invention can be installed in an electronic device. According to the implemented functions, the data testing apparatus 100 may include a data screening module 101, a data version identification module 102, a test case generation module 103, and a test execution module 104. The module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device and that can perform a fixed function, and that are stored in a memory of the electronic device.

In the present embodiment, the functions regarding the respective modules/units are as follows:

the data screening module 101 is configured to obtain a historical data set, and screen the historical data set according to a pre-compiled check rule to obtain a training sample data set;

the data version identification module 102 is configured to identify a data version of sample data in the training sample data set;

the test case generating module 103 is configured to: when the sample data in the training sample data set is a first version, invoke a test case corresponding to the first version as the test case of the training sample data set; when the sample data in the training sample data set is a second version, calculate a first similarity between the training sample data set and pre-acquired reference data, and use test cases corresponding to one or more groups of reference data whose first similarity is greater than or equal to a preset similarity threshold as the test cases of the training sample data set; and extract field-level data from the one or more groups of reference data whose first similarity is greater than or equal to the preset similarity threshold, calculate second similarities between the field-level data and a plurality of standard fields in a standard field library, and use test cases corresponding to the standard fields whose second similarity is greater than or equal to the field threshold as field-level test cases;

the test execution module 104 is configured to determine at least one of the obtained test cases of the training sample data set, the field-level test cases, and the table-level test cases corresponding to the field-level test cases as a target test case, input the target test case into a preset test engine, and execute the target test case on the pre-obtained data to be tested by using the test engine to obtain a test result.

In detail, when in use, the modules of the data testing apparatus 100 perform the following steps:

Step one, obtaining a historical data set, and screening the historical data set according to a pre-compiled check rule to obtain a training sample data set.

Step two, identifying the data version of the sample data in the training sample data set, which includes acquiring a version information correspondence table and identifying the data version corresponding to the sample data according to the mapping relation between the sample data and the data version in that table.

Step three, when the sample data in the training sample data set is a first version, calling a test case corresponding to the first version as the test case of the training sample data set.

Step four, when the sample data in the training sample data set is a second version, calculating a first similarity between the training sample data set and the pre-acquired reference data, and taking the test cases corresponding to one or more groups of reference data with the first similarity being greater than or equal to a preset similarity threshold as the test cases of the training sample data set. The covariance used in calculating the first similarity is given by:

$$\operatorname{cov}(X, Y) = E\big[(X-\mu)(Y-\upsilon)\big]$$

Step five, extracting field-level data in the one or more groups of reference data with the first similarity being greater than or equal to the preset similarity threshold, calculating second similarities between the field-level data and a plurality of standard fields in a standard field library, and taking the test cases corresponding to the standard fields with the second similarity being greater than or equal to the field threshold as the field-level test cases.

Step six, determining at least one of the test cases of the training sample data set, the field-level test cases, and the table-level test cases corresponding to the field-level test cases as a target test case.

Step seven, inputting the target test case into a preset test engine, and executing the target test case on the pre-acquired data to be tested by using the test engine to obtain a test result.

In the invention, pre-compiled check rules are executed on the historical data set, and the data that meets the requirements is screened out as the training sample data set. The data version of the sample data in the training sample data set is then determined, and the sample data is processed according to its data version. When the sample data is of the first version, the test case corresponding to the first version is called as the test case of the training sample data set; because the first version is an old version for which preset test cases already exist, calling those test cases directly improves the efficiency of data testing. When the sample data is of the second version, the first similarity between the training sample data set and the pre-acquired reference data is calculated, and the test cases corresponding to the one or more groups of reference data screened out by the first similarity are used as the test cases of the training sample data set. Field-level data is then extracted from that reference data, the second similarity is calculated and used for screening, and the test cases corresponding to the standard fields whose second similarity is greater than or equal to the field threshold are used as the field-level test cases; screening from the perspective of field-level data ensures the accuracy of the test cases. Finally, the target test case is input into the test engine, and the test engine executes it on the data to be tested to obtain a test result. Therefore, the data testing device provided by the invention can solve the problem of low data testing efficiency.

Fig. 3 is a schematic structural diagram of an electronic device implementing a data testing method according to an embodiment of the present invention.

The electronic device 1 may comprise a processor 10, a memory 11, a communication interface 12 and a bus 13, and may further comprise a computer program, such as a data testing program, stored in the memory 11 and executable on the processor 10.

The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of a data test program, but also to temporarily store data that has been output or is to be output.

The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (e.g., data test programs, etc.) stored in the memory 11 and calling data stored in the memory 11.

The communication interface 12 is used for communication between the electronic device and other devices, and includes a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), which are typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a display or an input unit such as a keyboard, and may optionally be a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the electronic device and for displaying a visualized user interface.

The bus 13 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 13 may be divided into an address bus, a data bus, a control bus, etc. The bus 13 is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.

Fig. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in Fig. 3 does not constitute a limitation of the electronic device 1, which may comprise fewer or more components than those shown, may combine some components, or may use a different arrangement of components.

For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.

Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may further comprise a user interface, which may be a display or an input unit such as a keyboard, and may optionally be a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.

The data test program stored in the memory 11 of the electronic device 1 is a combination of instructions, which when executed in the processor 10, can implement:

identifying a data version of sample data in the training sample data set;

Specifically, the specific implementation method of the processor 10 for the instruction may refer to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not described herein again.

Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).

The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:

identifying a data version of sample data in the training sample data set;

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the modules is only one logical functional division, and other divisions may be used in an actual implementation.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, where each data block contains information on a batch of network transactions that is used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
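To illustrate how data blocks can be linked by a cryptographic method, the following Python sketch chains blocks through SHA-256 hashes and stores a first-similarity value as the block payload. It is a simplified, single-process illustration under stated assumptions and not the blockchain platform used by the invention: the names `make_block`, `verify_chain` and `GENESIS_HASH`, the JSON serialization, and the payload layout are all hypothetical, and no consensus mechanism or network layer is modelled.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # hypothetical placeholder for "no previous block"


def _digest(payload: Dict[str, Any], prev_hash: str) -> str:
    """Hash the payload together with the previous block's hash."""
    body = json.dumps({"payload": payload, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def make_block(payload: Dict[str, Any], prev_hash: str) -> Dict[str, Any]:
    """Create a block whose hash depends on its payload and its predecessor."""
    return {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(payload, prev_hash),
    }


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Check that every block links to its predecessor and is unmodified."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != _digest(block["payload"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each hash covers the previous block's hash, changing a stored similarity value in any block invalidates that block and, if its hash were recomputed, every block after it.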

Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, not any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A method for data testing, the method comprising:

identifying a data version of sample data in the training sample data set;

2. The data testing method of claim 1, wherein the screening the historical data set according to a pre-written verification rule to obtain a training sample data set, comprises:

3. The data testing method of claim 2, wherein the verifying the historical data set using a base data verification rule of the verification rules to obtain first data that conforms to the base data verification rule comprises:

4. The data testing method of claim 1, wherein said calculating a first similarity between the training sample data set and pre-acquired reference data comprises:

5. The data testing method of claim 4, wherein said calculating covariance between said training sample data set and said reference data comprises:

cov(X, Y) = E[(X - μ)(Y - υ)]
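As a minimal sketch of the covariance formula above, the following Python function evaluates it on paired samples. Because the symbol definitions are not reproduced in this claim, the sketch assumes that μ and υ denote the means of X and Y and that the expectation E is taken as the plain average over equal-length samples (the population form, dividing by n). The function name `covariance` is illustrative only.

```python
from typing import Sequence


def covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Population covariance E[(X - mu)(Y - nu)] of two paired samples."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")

    n = len(x)
    mu = sum(x) / n  # assumed meaning of μ: mean of X
    nu = sum(y) / n  # assumed meaning of υ: mean of Y

    # Average of the products of the deviations from the two means.
    return sum((a - mu) * (b - nu) for a, b in zip(x, y)) / n
```

For example, `covariance([1, 2, 3], [2, 4, 6])` averages the deviation products 2, 0 and 2, giving 4/3.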

6. The data testing method of claim 4, wherein the calculating according to the covariance and a preset Pearson correlation formula to obtain a first similarity between the training sample data set and pre-acquired reference data comprises:
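The preset Pearson correlation formula is not reproduced in this claim. Purely as an illustration, the Python sketch below uses the textbook Pearson coefficient, the covariance divided by the product of the two standard deviations, and assumes that this is the kind of formula intended. The function name `pearson_similarity` and the choice to raise an error for constant inputs are assumptions, not part of the claim.

```python
import math
from typing import Sequence


def pearson_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Textbook Pearson coefficient: cov(X, Y) / (sigma_X * sigma_Y)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Population covariance and standard deviations (all divide by n).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)

    if sigma_x == 0 or sigma_y == 0:
        # The coefficient is undefined when either input is constant.
        raise ValueError("Pearson coefficient is undefined for constant data")

    return cov / (sigma_x * sigma_y)
```

The result lies between -1 and 1, so it can be compared directly against a threshold when screening reference data; which threshold the method uses is not stated here.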

7. The data testing method of claim 1, wherein said identifying data versions of sample data in the training sample data set comprises:

acquiring a version information corresponding table;

8. A data testing apparatus, characterized in that the apparatus comprises:

9. An electronic device, characterized in that the electronic device comprises:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a data testing method according to any one of claims 1 to 7.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out a data testing method according to any one of claims 1 to 7.