CN109684198B

CN109684198B - Method, device, medium and electronic equipment for acquiring data to be tested

Info

Publication number: CN109684198B
Application number: CN201811348735.8A
Authority: CN
Inventors: 陈家荣
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2018-11-13
Filing date: 2018-11-13
Publication date: 2024-03-29
Anticipated expiration: 2038-11-13
Also published as: CN109684198A

Abstract

The invention relates to the technical field of data analysis, and discloses a method, a device, a medium and electronic equipment for acquiring data to be tested, wherein the method is implemented in a current node, the data to be tested is used for executing a test task, and the method comprises the following steps: determining a plurality of pending test data source nodes from the test data source nodes; for each pending test data source node, acquiring the data exchange amount and the data exchange times with the current node; determining a target test data source node from the plurality of pending test data source nodes based on the data exchange amount and the data exchange times; acquiring a label of data in the target test data source node; acquiring a test task description input by a user; and acquiring data to be tested based on the label and the test task description. According to the method, the node from which the data to be tested is acquired is determined first, and the data to be tested is then acquired from that node according to the labels of the data in it, so that the accuracy of acquiring the data to be tested is ensured and the acquisition efficiency of the data to be tested is improved.

Description

Method, device, medium and electronic equipment for acquiring data to be tested

Technical Field

The present invention relates to the field of data analysis, and in particular, to a method, an apparatus, a medium, and an electronic device for acquiring data to be tested.

Background

With the development of the software and information industries, the scale and complexity of software development keep increasing. Meanwhile, as market competition intensifies and user demands become harder to meet, higher requirements are placed on software functions, and software has to be updated frequently to add new functions. When the functions in a piece of software are integrated to a certain extent, the software may be referred to as a system. To test such a system, some test data is selected and input into it. However, the components, units and functions of such a system are often so numerous that it is not possible to test all of them in a single test. To make the test results more targeted, each test is generally performed for only one of these components, units or functions.

In the prior art, the test data required to complete a test task is mainly acquired by manually selecting data related to the test task from the large volume of data held by the upstream and downstream nodes of a system, and then inputting the selected data into the system for testing. However, system data is not easy for people to read, so errors often occur during selection; even when the selected data is correct, the sheer amount of data means that a great deal of time is spent finding suitable data, which seriously reduces working efficiency.

The prior art therefore has the following defects: because the data is selected manually for testing, the proportion of acquired data that is suitable for testing is low, and both the accuracy and the efficiency of test data acquisition are low.

Disclosure of Invention

In the technical field of data analysis, in order to solve the technical problem of low test data acquisition efficiency in the related art, the invention provides a method, a device, a medium and electronic equipment for acquiring data to be tested.

According to an aspect of the present application, there is provided a method for acquiring data to be tested, the method being implemented in a current node, the data to be tested being used for performing a test task, the method comprising:

determining a plurality of pending test data source nodes matched with the test task from all test data source nodes establishing communication with the current node, wherein the test data source nodes comprise data, and the data is provided with a label;

for each determined pending test data source node, acquiring the data exchange amount and the data exchange times between the pending test data source node and the current node in a predetermined time period;

determining, from the plurality of pending test data source nodes, a target test data source node from which the data to be tested is to be acquired, based on the data exchange amount and the data exchange times;

acquiring a label of the data in the target test data source node;

acquiring a test task description input by a user;

and acquiring data to be tested based on the label and the test task description.

According to another aspect of the present application, there is provided a data acquisition device to be tested, the device belonging to a current node, the data to be tested being used for performing a test task, the device comprising:

the first determining module is configured to determine a plurality of pending test data source nodes matched with the test task from all test data source nodes establishing communication with the current node, wherein the test data source nodes comprise data with labels;

the first acquisition module is configured to acquire the data exchange amount and the data exchange times of the pending test data source node and the current node in a preset time period for each determined pending test data source node;

a second determining module configured to determine a target test data source node from the plurality of pending test data source nodes for obtaining data to be tested therefrom based on the data exchange amount and the data exchange number;

the second acquisition module is configured to acquire the label of the data in the target test data source node;

the third acquisition module is configured to acquire a test task description input by a user;

and the data to be tested acquisition module is configured to acquire the data to be tested based on the tag and the test task description.

According to another aspect of the present application, there is provided a computer readable program medium storing computer program instructions which, when executed by a computer, cause the computer to perform the method as described above.

According to another aspect of the present application, there is provided an electronic device including:

a processor;

a memory having stored thereon computer readable instructions which, when executed by the processor, implement a method as described above.

The technical scheme provided by the embodiment of the invention can comprise the following beneficial effects:

The method for acquiring the data to be tested provided by the invention comprises the following steps: determining a plurality of pending test data source nodes matched with the test task from all test data source nodes establishing communication with the current node, wherein the test data source nodes comprise data, and the data is provided with a label; for each determined pending test data source node, acquiring the data exchange amount and the data exchange times between the pending test data source node and the current node in a predetermined time period; determining, from the plurality of pending test data source nodes, a target test data source node from which the data to be tested is to be acquired, based on the data exchange amount and the data exchange times; acquiring a label of the data in the target test data source node; acquiring a test task description input by a user; and acquiring data to be tested based on the label and the test task description.

According to the method, the target test data source node for acquiring the data to be tested is determined, and then the data is acquired from the target test data source node according to the test task description input by the user and the label of the data in the target test data source node, so that the accuracy of acquiring the data to be tested is ensured, and the acquisition efficiency of the data to be tested is improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is a schematic view of an application scenario of a data acquisition method to be tested according to an exemplary embodiment;

FIG. 2 is a flow chart illustrating a method of data acquisition to be tested, according to an exemplary embodiment;

FIG. 3 is a detailed flow diagram of step 230 according to an embodiment illustrated in the corresponding embodiment of FIG. 2;

FIG. 4 is a detailed flow diagram of step 240 according to one embodiment shown in the corresponding embodiment of FIG. 2;

FIG. 5 is a detailed flow diagram of step 260 according to an embodiment illustrated by the corresponding embodiment of FIG. 2;

FIG. 6 is a detailed flow chart of step 260 according to another embodiment shown in the corresponding embodiment of FIG. 2;

FIG. 7 is a detailed flow chart of step 260 according to another embodiment shown in the corresponding embodiment of FIG. 2;

FIG. 8 is a flow chart of predetermined rules according to one embodiment illustrated in the corresponding embodiment of FIG. 7;

FIG. 9 is a block diagram illustrating a data acquisition device to be tested in accordance with an exemplary embodiment;

FIG. 10 is an exemplary block diagram of an electronic device implementing the above-described method, according to an exemplary embodiment;

FIG. 11 illustrates a computer readable storage medium for implementing the above-described method, according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.

Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities.

The disclosure first provides a method for acquiring data to be tested. Data is an abstract information carrier, such as words, symbols and numbers, that can be used to characterize the properties of an objective thing. The storage format of the data includes, but is not limited to, TXT, JSON, EXCEL, XML, CSV and the like. In general, many systems process the data input to them for the purpose of interpreting that data. For example, in the financial field, the financial data in a report is input into a financial data analysis system, and the financial data analysis system produces an analysis result for the data in the report. In the process of maintaining and upgrading software such as a data analysis system, the page layout, the software logic and even the whole architecture of the system are often adjusted, so the software usually needs to be tested after maintenance and upgrading. Data already analyzed by the system under test before the upgrade is input into the upgraded system, and the output of the system before the upgrade is compared with the output of the system after the upgrade. If the two outputs are consistent, the upgrade has introduced no problem and the performance of the system before and after the upgrade is consistent, which achieves the purpose of testing the upgraded system.

Thus, in one embodiment, the test data may be data that has been analyzed historically by the system under test, that is, data for which the system has already produced a complete analysis.

In another embodiment, the test data may be a table of correspondence between standard data and analysis results manually established by a user according to experience, and the analysis results in the table may be used to determine whether the performance of the system after upgrading is consistent. It should be understood that the test data and the system capable of analyzing the test data referred to in the present invention are not limited to those listed above, and the system capable of analyzing the test data may be any system capable of processing the data and outputting the result or performing a certain action according to the input of the data.

The environment in which the present invention is implemented may be a portable mobile device, such as a smart phone, tablet, notebook, PDA (Personal Digital Assistant), or a variety of stationary devices, such as a computer device, field terminal, desktop, server, workstation, etc.

Fig. 1 is a schematic view of an application scenario of a data acquisition method to be tested according to an exemplary embodiment. As shown in fig. 1, the center cylinder represents the current node, the cylinders arranged above the current node represent nodes that transmit data to the current node, the cylinders arranged below the current node represent nodes that receive data output by the current node, and the arrows represent the transmission directions of the data. A node is an abstract representation of a terminal device; in this embodiment the nodes are drawn as shapes to help the reader follow the transfer of data, much as objects are treated as particles in physics.

It should be noted that the schematic diagram shown in fig. 1 is only one embodiment of the present invention, and in a practical scenario, the connection relationship between the nodes and the transmission direction between the node data are likely to be completely different from the present embodiment, so the present embodiment should not be construed as limiting the present invention.

Fig. 2 is a flow chart illustrating a method of data acquisition to be tested, according to an exemplary embodiment. As shown in fig. 2, the method comprises the steps of:

Step 210, determining a plurality of pending test data source nodes matched with the test task from all the test data source nodes establishing communication with the current node.

A node is an abstract representation of a terminal or terminal component with arithmetic processing functionality.

In one embodiment, the implementation terminal of the present invention is abstracted to the current node.

In one embodiment, the computing architecture network in which the implementation terminal of the present invention resides is abstracted to the current node.

The transmission medium for establishing communication can be a wired transmission medium or a wireless transmission medium; the wired transmission medium may be twisted pair, coaxial cable, optical fiber, etc., and the wireless transmission medium may be electromagnetic wave such as laser, microwave, infrared, etc.

In one embodiment, a test data source node establishing communication with a current node is obtained first, and then a plurality of undetermined test data source nodes matched with the test task are determined from the test data source nodes establishing communication with the current node.

In one embodiment, by sending a request to each node, it is determined whether a communication connection is established between one node and the current node based on whether the current node can receive a response from each node.

In one embodiment, the IP addresses of all nodes that have established communication with the current node are stored in the storage space of the implementation terminal of the present invention, and whether a node has established communication with the current node is determined by checking whether the IP address of that node matches one of the stored IP addresses.

In one embodiment, the test data source nodes establishing communication with the current node are known to the implementation terminal of the present invention, and the plurality of pending test data source nodes matching the test task are determined by identifying which parts of the code of the system under test have been modified and, from that, which output data will change.

In one embodiment, each code file of the system under test is associated in advance with the pending test data source nodes affected by a modification of that file, and the association is stored in a correspondence table. It is then determined which code file has been modified, and the correspondence table is queried with the modified code file to obtain the pending test data source nodes.

In one embodiment, whether a code file has been modified is determined by a hash algorithm. The hash algorithm computes a hash value from the content of each code file; when the hash value of a code file changes, the code file is considered to have changed. The hash algorithm may be, for example, SHA-1 or MD5.
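The hash comparison and the correspondence-table lookup described above can be sketched as follows. This is only an illustrative sketch: the file names, node names, the in-memory dictionaries and the helper names (`file_digest`, `modified_files`, `pending_nodes`) are assumptions introduced here and do not appear in the original disclosure.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Return the SHA-1 hex digest of a code file's content."""
    return hashlib.sha1(content).hexdigest()


def modified_files(old_digests: dict, current_files: dict) -> list:
    """List the files whose digest differs from the previously recorded one."""
    return sorted(
        name
        for name, content in current_files.items()
        if old_digests.get(name) != file_digest(content)
    )


def pending_nodes(changed: list, table: dict) -> list:
    """Query the correspondence table with each modified file."""
    nodes = []
    for name in changed:
        for node in table.get(name, []):
            if node not in nodes:  # keep each node once, in first-seen order
                nodes.append(node)
    return nodes


# Hypothetical data: digests recorded before the upgrade, files after it,
# and a correspondence table from code file to pending source nodes.
baseline = {"valuation.py": file_digest(b"v1"), "report.py": file_digest(b"r1")}
current = {"valuation.py": b"v2", "report.py": b"r1"}
table = {"valuation.py": ["node_a", "node_b"], "report.py": ["node_c"]}

changed = modified_files(baseline, current)
candidates = pending_nodes(changed, table)
print(changed, candidates)
```

Only the file whose content changed contributes nodes, so unrelated source nodes are never considered as candidates.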

In one embodiment, each test task and its corresponding pending test data source nodes are stored in a correspondence table, so that the plurality of pending test data source nodes matching the test task can be determined directly from the test task sent or input by a user. A test task is a standardized phrase description. For example, for a financial data analysis system, one test task may be to estimate the stock value of a listed company based on that company's financial reports, in which case the test task is "stock value estimation".

Step 220, for each determined pending test data source node, obtaining the data exchange amount and the data exchange times between the pending test data source node and the current node in a predetermined time period.

In one embodiment, the data exchange refers only to downstream data sent by each node to the current node.

In one embodiment, the data exchange refers to the sum of the uplink and downlink data sent between the current node and each node.

In one embodiment, the system log in the current node may record the data exchange amount and the data exchange times between the current node and other nodes, and by reading the content in the system log, the data exchange amount and the data exchange times between each pending test data source node and the current node in the past preset time period may be obtained.
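As a rough sketch of reading these two quantities from a log, the example below aggregates exchange records that fall inside a time window. The record layout `(timestamp, peer_node, num_bytes)` and the function name `exchange_stats` are assumptions made for illustration; the original does not specify a log format.

```python
from collections import defaultdict


def exchange_stats(records, start, end):
    """Aggregate log records into {peer: (total_bytes, exchange_count)}.

    Only records with start <= timestamp < end are counted.
    """
    totals = defaultdict(lambda: [0, 0])
    for timestamp, peer, num_bytes in records:
        if start <= timestamp < end:
            totals[peer][0] += num_bytes  # data exchange amount
            totals[peer][1] += 1          # data exchange times
    return {peer: (amount, count) for peer, (amount, count) in totals.items()}


# Hypothetical log entries: (timestamp in seconds, peer node, bytes exchanged).
log = [
    (100, "node_a", 500),
    (150, "node_b", 20),
    (200, "node_a", 300),
    (900, "node_a", 50),  # outside the window used below
]
stats = exchange_stats(log, start=0, end=600)
print(stats)
```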

In one embodiment, a statistics node is further provided in the node network formed by the nodes. Whenever data exchange occurs between any two nodes in the node network, the amount of data exchanged and the time of the exchange are reported to the statistics node, so that the statistics node can count the data exchange amount and the data exchange times between any two nodes. The implementation terminal of the present invention sends the statistics node a request for the data exchange amount and the data exchange times between the current node and other nodes, and then obtains, from the response returned by the statistics node to the current node, the data exchange amount and the data exchange times between each pending test data source node and the current node in the past predetermined time period.

Step 230, determining a target test data source node for acquiring the data to be tested from the plurality of pending test data source nodes based on the data exchange amount and the data exchange times.

In many cases, a considerable portion of the pending test data source nodes exchange data with the current node, but only a very small part of that data is valuable; if data were acquired from these nodes, its quality would be poor and it would not meet the requirements. Therefore, the target test data source node is determined based on the data exchange amount and the data exchange times.

In one embodiment, when the data exchange amount of a pending test data source node is greater than the data exchange amount threshold and the data exchange number is greater than the data exchange number threshold, the pending test data source node is considered as the target test data source node.
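A minimal sketch of this double-threshold rule is given below. The node names, the sample figures and the threshold values are hypothetical; the original leaves both thresholds unspecified.

```python
def select_by_threshold(stats, amount_threshold, count_threshold):
    """Keep nodes whose amount AND count both exceed their thresholds."""
    return [
        node
        for node, (amount, count) in stats.items()
        if amount > amount_threshold and count > count_threshold
    ]


# Hypothetical per-node figures: (data exchange amount, data exchange times).
node_stats = {"node_a": (800, 2), "node_b": (20, 1), "node_c": (900, 1)}
targets = select_by_threshold(node_stats, amount_threshold=100, count_threshold=1)
print(targets)
```

Here `node_c` is rejected despite its large exchange amount because its exchange count does not exceed the count threshold.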

Step 240, obtaining the label of the data in the target test data source node.

A tag is an abstract representation, such as words or symbols, that characterizes the nature or identity of a thing and can convey information. For example, the tag of corporate financial statement data may be "financial statement" or "finance".

In one embodiment, each test data source node records a manually assigned tag for each data.

In one embodiment, the labels of the data in the target test data source nodes are obtained by crawling each target test data source node through the embedded script of the implementation terminal of the invention.

Step 250, obtaining a test task description input by a user.

A test task description is a specific and detailed explanation of a test task made by a user in natural language. For example, a test task description may include the aim of the test task, the steps of the test task, the purpose of the test task, the points requiring attention in the test task, the completion standard of the test task, the evaluation index of the test task and the like.

In one embodiment, after the target test data source node is determined, a prompt is sent to the user asking the user to input a test task description.

In one embodiment, the user inputs the test task description in advance and stores the test task description in the implementation terminal of the present invention, and when the test task description needs to be acquired, the test task description is directly read from the storage space in the implementation terminal of the present invention.

Step 260, acquiring data to be tested based on the label and the test task description.

Because the labels and the test task descriptions are composed of words in many cases, the data to be tested can be obtained through text matching.

In one embodiment, when the string of a tag of a data appears in the test task description, that data may be considered to be required by the test task, and the data is acquired as the data to be tested.
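The text-matching rule of this embodiment can be sketched as a plain substring test. The data names, tags and description below are hypothetical, and the match is case-sensitive, which is an assumption rather than something the original states.

```python
def matches_task(tags, description):
    """True if the string of at least one tag appears in the description."""
    return any(tag in description for tag in tags)


def select_data(data_tags, description):
    """Return the names of the data whose tags match the description."""
    return [name for name, tags in data_tags.items() if matches_task(tags, description)]


# Hypothetical tagged data in a target test data source node.
data_tags = {
    "report_2017": ["net profit", "total stock"],
    "hr_roster": ["employee id"],
}
description = "Estimate stock value from net profit and total stock of a listed company."
selected = select_data(data_tags, description)
print(selected)
```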

FIG. 3 is a detailed flow diagram of step 230 according to an embodiment illustrated in the corresponding embodiment of FIG. 2. As shown in fig. 3, step 230 includes the steps of:

Step 231, acquiring the weight of the data exchange amount and the weight of the data exchange times, respectively.

In one embodiment, the weights are set empirically and stored in the memory space of the terminal of the present invention.

In one embodiment, the weights are determined as follows: target test data source nodes are first selected manually and sorted by the amount of data to be tested historically obtained from each of them, and the top predetermined number of these nodes are taken. These nodes are then ranked by data exchange amount and by data exchange times, respectively; the number of nodes whose rank by data exchange amount is smaller than their rank by data exchange times is obtained; the ratio of this number to the predetermined number is taken as the weight of the data exchange amount, and the difference between 1 and that weight is taken as the weight of the data exchange times.

Step 232, determining, for each pending test data source node, a weighted sum of the data exchange amount and the data exchange times.
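One possible reading of this weight-derivation procedure is sketched below. It assumes that both rankings are in descending order with rank 1 for the largest value, and that ties keep their input order; neither detail is stated in the original. All names and figures are hypothetical.

```python
def rank_positions(values):
    """Map each node to its rank (1 = largest value) within `values`."""
    ordered = sorted(values, key=lambda node: values[node], reverse=True)
    return {node: position + 1 for position, node in enumerate(ordered)}


def derive_weights(history, amounts, counts, top_n):
    """Return (weight of exchange amount, weight of exchange times)."""
    # Top-N manually selected nodes by historically obtained test data volume.
    top = sorted(history, key=lambda node: history[node], reverse=True)[:top_n]
    amount_rank = rank_positions({node: amounts[node] for node in top})
    count_rank = rank_positions({node: counts[node] for node in top})
    # Nodes ranked higher (smaller rank) by amount than by exchange times.
    better = sum(1 for node in top if amount_rank[node] < count_rank[node])
    w_amount = better / top_n
    return w_amount, 1 - w_amount


# Hypothetical history: amount of test data previously obtained per node.
history = {"a": 50, "b": 40, "c": 30, "d": 5}
amounts = {"a": 900, "b": 100, "c": 500, "d": 10}
counts = {"a": 2, "b": 9, "c": 5, "d": 1}
w_amount, w_count = derive_weights(history, amounts, counts, top_n=3)
print(w_amount, w_count)
```

In this sample only node `a` ranks higher by amount than by exchange times, so the weight of the data exchange amount is 1/3 and the weight of the data exchange times is 2/3.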

In one embodiment, the weighted sum is determined by: acquiring a product of the data exchange amount and the weight of the data exchange amount as a first product; obtaining the product of the data exchange times and the weight of the data exchange times as a second product; calculating a sum of the first product and the second product; and taking the sum as a weighted sum of the data exchange amount and the data exchange times.

Step 233, sorting the pending test data source nodes by the weighted sum in descending order.

Step 234, determining a target test data source node according to the ranking.

In one embodiment, pending test data source nodes having a rank less than a first predetermined number are treated as target test data source nodes.

In one embodiment, the first predetermined number is preset empirically.
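Steps 232 to 234 can be sketched as follows. The weights, node figures and the value of the first predetermined number are hypothetical. Ranks are treated as 0-based here, so "rank less than the first predetermined number" keeps that many leading nodes; this indexing convention is an assumption.

```python
def weighted_sum(amount, count, w_amount, w_count):
    """Sum of the first product (amount) and the second product (times)."""
    first_product = amount * w_amount
    second_product = count * w_count
    return first_product + second_product


def pick_targets(stats, w_amount, w_count, first_predetermined_number):
    """Sort nodes by weighted sum, largest first, and keep the leading ones."""
    ranked = sorted(
        stats,
        key=lambda node: weighted_sum(*stats[node], w_amount, w_count),
        reverse=True,
    )
    return ranked[:first_predetermined_number]


# Hypothetical per-node figures: (data exchange amount, data exchange times).
pending = {"node_a": (800, 2), "node_b": (20, 40), "node_c": (300, 10)}
top_nodes = pick_targets(pending, w_amount=0.25, w_count=0.75,
                         first_predetermined_number=2)
print(top_nodes)
```

The raw amount and count are combined without rescaling, as in the description; whether they should first be brought to a common scale is not addressed by the original.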

In one embodiment, after each pending test data source node is ranked from large to small according to the weighted sum, an interface for presenting a ranking result is loaded, a user selection is received through the interface, and the pending test data source node selected by the user is used as a target test data source node.

Specifically, the interface loaded for presenting the sorting result may be a Web page, an HTML5 page, an App (Application) interface, or the like.

In summary, the embodiment shown in fig. 3 has the advantage that, by calculating the weighted sum, the different roles played by the data exchange amount and the data exchange times in determining the target test data source node are properly reflected, so that the selected target test data source node is more reasonable.

FIG. 4 is a detailed flow diagram of step 240 according to an embodiment illustrated by the corresponding embodiment of FIG. 2. As shown in fig. 4, step 240 includes the steps of:

step 241, for each data in the target test data source node, determining a location of a predetermined string in the data.

A data is a collection of data items that can be entered into the system as a complete whole and allows the system to output a comprehensive data analysis result. For example, a financial statement of a listed company contains many sub-data, and if the financial statement is analyzed to obtain a stock valuation for the listed company, the financial statement made up of all the sub-data is regarded as one data.

The predetermined string is a data field within the data that has a specific positional relationship with the data field that may serve as a tag. For example, "NAME" may be used as a predetermined string, and "the NAME" may also be used as a predetermined string.

In one embodiment, the data is stored in a database, the position of a string is represented by a line number and a column number, and the position of the predetermined string in the data can be obtained by querying the database.

Step 242, determining the interception position of the tag according to the position of the predetermined character string.

The positional relationship between the position of the predetermined character string and the interception position of the tag may be varied according to the storage manner of each data in the database.

In one embodiment, the line number of the interception position of the tag is the same as the line number of the position of the predetermined string, and the column number of the interception position of the tag is greater by 1 than the column number of the position of the predetermined string.

In one embodiment, the column number of the interception position of the tag is the same as the column number of the position of the predetermined string, and the line number of the interception position of the tag is greater by 1 than the line number of the position of the predetermined string.

Step 243, obtaining the character string intercepted at the intercepting position as the label of the data.

Since the arrangement of each kind of data follows specific rules, in the embodiment described above the accuracy of tag acquisition is improved by determining the position of the tag from the position of the predetermined string and then intercepting the tag at that position.
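Steps 241 to 243 can be illustrated with a small sketch in which a data is modeled as a grid of cells (a list of lines, each a list of columns). The grid model, the sample cells and the helper names are assumptions; the two offset settings correspond to the "same line, next column" and "same column, next line" embodiments.

```python
def find_position(grid, marker):
    """Return (line, column) of the first cell equal to the predetermined string."""
    for line_no, line in enumerate(grid):
        for col_no, cell in enumerate(line):
            if cell == marker:
                return line_no, col_no
    return None


def extract_tag(grid, marker, line_offset=0, col_offset=1):
    """Intercept the cell located at a fixed offset from the predetermined string."""
    position = find_position(grid, marker)
    if position is None:
        return None
    line_no, col_no = position[0] + line_offset, position[1] + col_offset
    if 0 <= line_no < len(grid) and 0 <= col_no < len(grid[line_no]):
        return grid[line_no][col_no]
    return None  # interception position falls outside the data


# Hypothetical layouts: the tag sits right of, or directly below, "NAME".
horizontal = [["NAME", "financial statement"], ["net profit", "1200"]]
vertical = [["NAME", "YEAR"], ["finance", "2017"]]

tag_right = extract_tag(horizontal, "NAME")
tag_below = extract_tag(vertical, "NAME", line_offset=1, col_offset=0)
print(tag_right, tag_below)
```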

Fig. 5 is a detailed flow diagram of step 260 according to an embodiment illustrated by the corresponding embodiment of fig. 2. As shown in fig. 5, step 260 includes the steps of:

step 261, for each tag of each data in the target test data source node, obtaining the number of the tag in the test task description as the first number.

In one embodiment, a counter is provided in the implementation terminal of the present invention, for each tag, the test task description is traversed, whether the character string in the test task description is equal to the character string of the tag is determined, and the counter is incremented by 1 whenever it is determined that one character string is equal to the character string of the tag.

Step 262, for each data in the target test data source node, determining a sum of the first number of all tags in the data.

Step 263, ordering each data in the target test data source node according to the sum from big to small.

In one embodiment, the data in the target test data source node are ordered by the sum using a bubble sort algorithm.

In one embodiment, the data in the target test data source node are ordered by the sum using a quicksort algorithm.

Step 264, obtaining the first second-predetermined-number data in the ordering as the data to be tested.

For each data, the larger the sum, the more relevant the data is to the test task description. The method has the advantages that the possibility that the acquired data to be tested is not matched with the test task description is reduced, and the accuracy of the acquired data to be tested is improved.
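Steps 261 to 264 can be sketched as below. The data names, tags and description are hypothetical; `str.count` counts non-overlapping occurrences, and Python's built-in `sorted` is used in place of the bubble sort or quicksort named in the embodiments, which does not change the resulting order of distinct sums.

```python
def count_occurrences(tag, description):
    """First number: non-overlapping occurrences of the tag in the description."""
    return description.count(tag) if tag else 0


def rank_data(data_tags, description, second_predetermined_number):
    """Sum the first numbers per data, sort descending, keep the leading data."""
    sums = {
        name: sum(count_occurrences(tag, description) for tag in tags)
        for name, tags in data_tags.items()
    }
    ordered = sorted(sums, key=lambda name: sums[name], reverse=True)
    return ordered[:second_predetermined_number], sums


# Hypothetical tagged data and test task description.
tagged = {
    "report_a": ["net profit", "total stock"],
    "report_b": ["cash flow per share"],
    "report_c": ["total stock"],
}
task = "Check net profit and total stock; net profit must match the prior run."
to_test, tag_sums = rank_data(tagged, task, second_predetermined_number=2)
print(to_test, tag_sums)
```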

Fig. 6 is a detailed flow chart of step 260 according to another embodiment shown in the corresponding embodiment of fig. 2. As shown in fig. 6, step 260 includes the steps of:

step 261', clustering all data in the target test data source node based on the label to divide the data into a plurality of classes.

In one embodiment, the clustering employs the K-means algorithm, specifically as follows: a predetermined number of tags are extracted from the tags of each data, the tags extracted from each data are sorted in a predetermined order, and the tags extracted from each data are converted into a vector according to that order; the number of classes to be formed is obtained, and a corresponding number of the resulting vectors are taken as initial cluster centers, so that each initial cluster center belongs to exactly one class; each vector converted from the tags extracted from a data is assigned to the class of the initial cluster center closest to it; the center vector of each class is then computed, and all the vectors are clustered again using the computed center vectors as cluster centers; this is repeated until the newly computed center vectors of all classes are identical to the center vectors of the previous round of clustering.

In one embodiment, for example, there are three data in total: the tags extracted from the first data are "market date", "net profit" and "total stock"; the tags extracted from the second data are "net profit", "net profit growth rate" and "total stock"; and the tags extracted from the third data are "profit per share", "cash flow per share" and "net profit". If the predetermined order used to convert the tags into vectors is "market date", "net profit", "total stock", "net profit growth rate", "profit per share", "cash flow per share", then the vector established for the first data is (1,1,1,0,0,0), the vector established for the second data is (0,1,1,1,0,0), and the vector established for the third data is (0,1,0,0,1,1).
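The tag-to-vector conversion and the clustering loop can be sketched in plain Python as follows. The tag order and the three tag sets are taken from the example; the use of squared Euclidean distance, the choice of the first k vectors as initial centers, the iteration cap and the function names are assumptions, since the original does not fix them.

```python
ORDER = ["market date", "net profit", "total stock",
         "net profit growth rate", "profit per share", "cash flow per share"]


def to_vector(tags, order=ORDER):
    """1 where the data carries the tag at that position of the order, else 0."""
    present = set(tags)
    return tuple(1 if name in present else 0 for name in order)


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_vector(vectors):
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def k_means(vectors, k, max_iter=100):
    """Cluster until the class centers stop changing (or max_iter is hit)."""
    centers = [tuple(float(x) for x in v) for v in vectors[:k]]  # assumed init
    assignment = []
    for _ in range(max_iter):
        assignment = [
            min(range(k), key=lambda j: squared_distance(v, centers[j]))
            for v in vectors
        ]
        new_centers = []
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            new_centers.append(mean_vector(members) if members else centers[j])
        if new_centers == centers:  # same centers as the previous round
            break
        centers = new_centers
    return assignment, centers


vectors = [
    to_vector(["market date", "net profit", "total stock"]),
    to_vector(["net profit", "net profit growth rate", "total stock"]),
    to_vector(["profit per share", "cash flow per share", "net profit"]),
]
assignment, centers = k_means(vectors, k=2)
print(vectors, assignment)
```

With only three data the grouping is sensitive to the initial centers and to tie-breaking, so the class assignment printed here should be read as an illustration of the loop rather than a meaningful result.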

Step 262', determining, for each tag of the data in each class, the number of occurrences of the tag in the test task description as a third number.

In one embodiment, a counter is provided in the implementation terminal of the present invention, and the number of each tag in the test task description may be counted.

Step 263', for each tag of the data in a class, determining the number of data in the class having the tag as a fourth number.

Step 264', determining the classes having a tag whose third number is greater than a third number threshold and whose fourth number is greater than a fourth number threshold.

Step 265', acquiring the data in the determined classes as the data to be tested.

In summary, this embodiment has the advantage that classes of data with similar tags are obtained by clustering, the tags in each class are then screened according to how well they match the test task description, and the classes that best fit the test task description are determined, which improves the likelihood that the acquired data to be tested meets the requirements of the test task.
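Steps 262' to 265' can be illustrated with the following sketch. It assumes that the third number is obtained by plain substring counting with `str.count` and that a class qualifies as soon as one of its tags exceeds both thresholds; the function name, the sample classes, the description text and the threshold values are hypothetical.

```python
from typing import Dict, List


def select_classes(
    classes: Dict[int, List[List[str]]],
    description: str,
    third_threshold: int,
    fourth_threshold: int,
) -> List[int]:
    """Return the ids of the classes that contain at least one qualifying tag."""
    selected: List[int] = []
    for class_id, members in classes.items():
        # Collect every distinct tag carried by the data of this class.
        tags = {tag for data_tags in members for tag in data_tags}
        for tag in tags:
            # Third number: occurrences of the tag in the test task description.
            third = description.count(tag)
            # Fourth number: how many data in the class carry the tag.
            fourth = sum(1 for data_tags in members if tag in data_tags)
            if third > third_threshold and fourth > fourth_threshold:
                selected.append(class_id)
                break
    return selected


# Hypothetical clustering result: class id -> tag lists of the data in the class.
clustered = {
    0: [["market date", "net profit", "total stock"],
        ["profit per share", "cash flow per share", "net profit"]],
    1: [["net profit", "net profit growth rate", "total stock"]],
}
task = "Check net profit and total stock, then verify net profit by market date."
chosen = select_classes(clustered, task, third_threshold=1, fourth_threshold=1)
```

Note that substring counting also matches a tag inside a longer tag (for example "net profit" inside "net profit growth rate"), so a real implementation may need stricter matching.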

Fig. 7 is a detailed flow chart of step 260 according to another embodiment shown in the corresponding embodiment of fig. 2. As shown in fig. 7, step 260 includes the steps of:

Step 261", for each tag of each data in the target test data source node, obtaining the number of occurrences of the tag in the test task description as a fifth number.

In one embodiment, a counter is provided in the implementation terminal of the present invention; the test task description is scanned starting from its first character, and the counter is incremented by 1 each time the character string of a tag is found in the test task description.

Step 262", for each tag of each data in the target test data source node, determining the related tags corresponding to the tag.

Step 263", for each related tag corresponding to the tag, obtaining the number of occurrences of the related tag in the test task description as a sixth number.

Step 264", obtaining, for each data in the target test data source node, the sum of the fifth numbers of all tags of the data and the sum of the sixth numbers of the related tags corresponding to all of those tags.

Step 265", obtaining the data to be tested based on the sum of the fifth numbers and the sum of the sixth numbers according to a predetermined rule.

In one embodiment, the predetermined rule is: for each data, obtaining the product of the sum of the fifth numbers and the sum of the sixth numbers; sorting the data in the target test data source node by this product from large to small; and taking a predetermined number of the top-ranked data as the data to be tested.

In summary, in the embodiment shown in fig. 7, the data to be tested is obtained according to the number of occurrences, in the test task description, of the tags of each data and of their related tags, so that the different roles of the tags and the related tags in obtaining the data to be tested are both taken into account, and the accuracy of the obtained data to be tested is improved.
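The product-based rule can be sketched as below. The text does not specify here how related tags are determined, so the `related` mapping is a hypothetical lookup table; the function names, the sample data names and the description text are likewise assumptions, and occurrences are counted by simple substring matching.

```python
from typing import Dict, List, Tuple


def tag_sums(
    tags: List[str], related: Dict[str, List[str]], description: str
) -> Tuple[int, int]:
    """Return (sum of fifth numbers, sum of sixth numbers) for one data."""
    fifth_sum = sum(description.count(tag) for tag in tags)
    sixth_sum = sum(
        description.count(rel) for tag in tags for rel in related.get(tag, [])
    )
    return fifth_sum, sixth_sum


def top_by_product(
    data: Dict[str, List[str]],
    related: Dict[str, List[str]],
    description: str,
    n: int,
) -> List[str]:
    """Rank data by fifth_sum * sixth_sum, largest first, and keep the first n."""
    def product(name: str) -> int:
        fifth_sum, sixth_sum = tag_sums(data[name], related, description)
        return fifth_sum * sixth_sum

    return sorted(data, key=product, reverse=True)[:n]


# Hypothetical data (name -> tags) and hypothetical related-tag table.
data = {
    "report_a": ["net profit", "total stock"],
    "report_b": ["market date"],
    "report_c": ["cash flow per share"],
}
related = {
    "net profit": ["revenue"],
    "total stock": ["share capital"],
    "market date": ["listing"],
    "cash flow per share": ["cash"],
}
description = ("Compare net profit with revenue and revenue growth; "
               "check the market date and listing status.")
picked = top_by_product(data, related, description, n=2)
```

Because the two sums are multiplied, a data whose tags appear in the description but whose related tags do not (or the reverse) receives a product of zero and ranks last under this rule.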

Fig. 8 is a flow chart of predetermined rules according to an embodiment illustrated by the corresponding embodiment of fig. 7. As shown in fig. 8, the method comprises the following steps:

Step 810, dividing the range of the sums of the fifth numbers, from large to small, into a third predetermined number of intervals, wherein the sum of the fifth numbers of each data belongs to only one interval.

Step 820, sorting all data in the target test data source node by the sum of the fifth numbers from large to small, wherein data whose sums of the fifth numbers fall in the same interval are sorted by the sum of the sixth numbers from large to small.

Step 830, acquiring the data ranked within the top predetermined proportion as the data to be tested.

In one embodiment, the predetermined ratio is empirically set.

In one embodiment, the predetermined ratio is set based on the number of data in the target test data source node. For example, the number of data is divided into a plurality of sections, and each section is assigned a different predetermined ratio. This avoids acquiring too little data to be tested when the target test data source node contains only a small number of data.
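One way to picture a section-dependent ratio is a simple lookup, sketched below. The section boundaries and ratio values are purely hypothetical, chosen only to show a larger ratio being applied when the node holds fewer data.

```python
from bisect import bisect_right

# Hypothetical sections of the data count: [0, 100), [100, 1000), [1000, inf).
SECTION_BOUNDS = [100, 1000]
# Hypothetical ratio per section; smaller nodes get a larger ratio.
SECTION_RATIOS = [0.5, 0.2, 0.1]


def predetermined_ratio(data_count: int) -> float:
    """Return the ratio of the section that the data count falls into."""
    return SECTION_RATIOS[bisect_right(SECTION_BOUNDS, data_count)]
```

For instance, under these assumed values a node with 50 data would use a ratio of 0.5, while a node with 5000 data would use 0.1.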

In summary, the embodiment shown in fig. 8 has the advantage of balancing the roles of the tags and associated tags in acquiring data to be tested.
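Steps 810 to 830 can be sketched as follows. The text does not state how the interval boundaries are chosen, so equal-width intervals between the smallest and largest sum of the fifth numbers are assumed here; the function names and the sample scores are also hypothetical.

```python
import math
from typing import Dict, List, Tuple


def interval_index(value: int, low: int, high: int, n_intervals: int) -> int:
    """Index of the equal-width interval containing value (0 = lowest interval)."""
    if high == low:
        return 0
    width = (high - low) / n_intervals
    # The maximum value is clamped into the top interval.
    return min(int((value - low) / width), n_intervals - 1)


def rank_with_intervals(
    scores: Dict[str, Tuple[int, int]], n_intervals: int, proportion: float
) -> List[str]:
    """scores maps a data name to (sum of fifth numbers, sum of sixth numbers)."""
    if not scores:
        return []
    fifth_sums = [fifth for fifth, _ in scores.values()]
    low, high = min(fifth_sums), max(fifth_sums)
    # Higher interval first; within one interval, larger sixth sum first.
    ranked = sorted(
        scores,
        key=lambda name: (
            interval_index(scores[name][0], low, high, n_intervals),
            scores[name][1],
        ),
        reverse=True,
    )
    keep = math.ceil(len(ranked) * proportion)
    return ranked[:keep]


# Hypothetical per-data sums: name -> (fifth sum, sixth sum).
scores = {"a": (9, 1), "b": (8, 5), "c": (3, 7), "d": (1, 2)}
full_order = rank_with_intervals(scores, n_intervals=2, proportion=1.0)
top_half = rank_with_intervals(scores, n_intervals=2, proportion=0.5)
```

In this example "a" and "b" fall in the same upper interval, so "b" is ranked ahead of "a" because of its larger sum of sixth numbers, even though its sum of fifth numbers is slightly smaller.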

The following are device embodiments of the present invention.

The disclosure also provides a device for acquiring data to be tested. Fig. 9 is a block diagram illustrating a data acquisition device to be tested according to an exemplary embodiment. As shown in fig. 9, the apparatus 900 includes:

a first determining module 910, configured to determine a plurality of pending test data source nodes matched with the test task from all test data source nodes establishing communication with the current node, where the test data source nodes include data, and the data has a tag;

a first obtaining module 920, configured to obtain, for each determined pending test data source node, the data exchange amount and the number of data exchanges between the pending test data source node and the current node within a predetermined period of time;

a second determining module 930 configured to determine a target test data source node from the plurality of pending test data source nodes for obtaining data to be tested therefrom based on the data exchange amount and the data exchange number;

a second obtaining module 940 configured to obtain a tag of the data in the target test data source node;

a third obtaining module 950 configured to obtain a test task description input by a user;

a to-be-tested data acquisition module 960 configured to acquire data to be tested based on the tag and the test task description.

According to a third aspect of the present disclosure, there is also provided an electronic device capable of implementing the above method.

Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "system."

An electronic device 1000 according to this embodiment of the present invention is described below with reference to fig. 10. The electronic device 1000 shown in fig. 10 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in fig. 10, the electronic device 1000 is embodied in the form of a general purpose computing device. Components of electronic device 1000 may include, but are not limited to: the at least one processing unit 1010, the at least one memory unit 1020, and a bus 1030 that connects the various system components, including the memory unit 1020 and the processing unit 1010.

The storage unit stores program code that is executable by the processing unit 1010, such that the processing unit 1010 performs the steps according to the various exemplary embodiments of the present invention described in the "exemplary methods" section of this specification.

The memory unit 1020 may include readable media in the form of volatile memory units such as Random Access Memory (RAM) 1021 and/or cache memory unit 1022, and may further include Read Only Memory (ROM) 1023.

Storage unit 1020 may also include a program/utility 1024 having a set (at least one) of program modules 1025, such program modules 1025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

Bus 1030 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 1000 can also communicate with one or more external devices 1200 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 1000, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 1000 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 1050. Also, electronic device 1000 can communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 1060. As shown, the network adapter 1060 communicates with other modules of the electronic device 1000 over the bus 1030. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with the electronic device 1000, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.

According to a sixth aspect of the present disclosure, there is also provided a computer readable storage medium having stored thereon a program product capable of implementing the method described herein above. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the "exemplary methods" section of this specification, when said program product is run on the terminal device.

Referring to fig. 11, a program product 1100 for implementing the above-described method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).

Furthermore, the above-described drawings are only schematic illustrations of processes included in the method according to the exemplary embodiment of the present invention, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.

It is to be understood that the invention is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims

1. A method of obtaining data to be tested, the method comprising:

determining a plurality of undetermined test data source nodes matched with a test task from all test data source nodes establishing communication with the current node, wherein the test data source nodes comprise data, and the data are provided with labels;

respectively acquiring the weight of the data exchange amount and the weight of the data exchange times;

determining a weighted sum of the data exchange amount and the data exchange times for each test data source node;

sequencing all the undetermined test data source nodes according to the weighted sum from big to small;

determining a target test data source node according to the sorting;

determining, for each data in the target test data source node, a location of a predetermined string in the data; determining the interception position of the tag according to the position of the preset character string; acquiring a character string intercepted at the intercepting position as a tag of the data;

acquiring a test task description input by a user;

acquiring data to be tested based on the tag and the test task description;

the obtaining the data to be tested based on the tag and the test task description specifically includes:

for each tag of each data in the target test data source node, acquiring the number of the tag in the test task description as a first number; determining, for each data in the target test data source node, a sum of the first number of all tags in the data; sequencing each data in the target test data source node according to the sum from big to small; acquiring a second preset number of data sequenced in front as data to be tested; or alternatively

clustering all data in the target test data source node based on the tags to divide the data into a plurality of classes; determining, for each tag of the data in each class, the number of occurrences of the tag in the test task description as a third number; for each tag of data in a class, determining the number of data in the class having the tag as a fourth number; determining a class having a tag whose third number is greater than a third number threshold and whose fourth number is greater than a fourth number threshold; acquiring data in the determined class as data to be tested; or alternatively

for each tag of each data in the target test data source node, acquiring the number of occurrences of the tag in the test task description as a fifth number; determining, for each tag of each data in the target test data source node, the related tags corresponding to the tag; for each related tag corresponding to the tag, acquiring the number of occurrences of the related tag in the test task description as a sixth number; respectively obtaining, for each data in the target test data source node, the sum of the fifth numbers of all tags of the data and the sum of the sixth numbers of the related tags corresponding to all of those tags; and acquiring data to be tested based on the sum of the fifth numbers and the sum of the sixth numbers according to a predetermined rule.

2. The method according to claim 1, characterized in that said predetermined rules comprise in particular:

dividing the sum of the fifth numbers into a third predetermined number of sum intervals of the fifth numbers from large to small, wherein the sum of the fifth numbers of each data belongs to only one interval;

sorting all data in the target test data source node according to the sum of the fifth number from large to small, wherein the data in the same sum interval of the fifth number is sorted according to the sum of the sixth number from large to small;

and acquiring the data of the predetermined proportion sequenced before as the data to be tested.

3. A device for obtaining data to be tested, wherein the device belongs to a current node, and the data to be tested is used for executing a test task, and the device comprises:

a second determining module configured to: respectively acquire the weight of the data exchange amount and the weight of the data exchange times; determine a weighted sum of the data exchange amount and the data exchange times for each test data source node; sort all the pending test data source nodes according to the weighted sum from large to small; and determine a target test data source node according to the sorting;

a second acquisition module configured to determine, for each data in the target test data source node, a location of a predetermined string in the data; determining the interception position of the tag according to the position of the preset character string; acquiring a character string intercepted at the intercepting position as a tag of the data;

the data to be tested obtaining module is configured to obtain data to be tested based on the tag and the test task description;

the data acquisition module to be tested is further configured to:

4. A computer readable program medium, characterized in that it stores computer program instructions, which when executed by a computer, cause the computer to perform the method according to any one of claims 1 to 2.

5. An electronic device, the electronic device comprising:

a processor;

a memory having stored thereon computer readable instructions which, when executed by the processor, implement the method of any of claims 1 to 2.