WO2023024474A1

WO2023024474A1 - Data set determination method and apparatus, and computer device and storage medium

Info

Publication number: WO2023024474A1
Application number: PCT/CN2022/079074
Authority: WO
Inventors: 张元瀚; 黄耿石; 刘冬阳; 滕家宁; 王坤; 尹榛菲; 邵婧
Original assignee: 上海商汤智能科技有限公司
Priority date: 2021-08-26
Filing date: 2022-03-03
Publication date: 2023-03-02
Also published as: CN113704519A; CN113704519B

Abstract

A data set determination method and apparatus, and a computer device and a storage medium. The method comprises: acquiring a semantic database, which includes a plurality of pieces of semantic information (S101); creating a plurality of pieces of tag data on the basis of the semantic database, wherein one piece of tag data corresponds to one semantic category, the tag data includes an object tag belonging to a corresponding semantic category, and semantic categories corresponding to the plurality of pieces of tag data are categories by which an omni-vision representation test can be performed on a model to be tested (S103); and on the basis of a preset data set, determining matching data for object tags of at least part of the tag data, and on the basis of the matching data, determining test data sets respectively corresponding to the at least part of the tag data, so as to obtain a plurality of test data sets (S105).

Description

Method, device, computer equipment and storage medium for determining a data set

This application claims priority to Chinese patent application No. 202110986886.1, filed with the China Patent Office on August 26, 2021 and entitled "a method, device, computer equipment, and storage medium for determining a data set", the entire contents of which are incorporated herein by reference.

Technical field

The present disclosure relates to the field of computer technology, and in particular, to a method, device, computer equipment, and storage medium for determining a data set.

Background

In the field of computer vision, the performance of a designed model needs to be tested, typically on a corresponding test set. However, an existing test set is usually a preset data set, for example the ImageNet data set. Because an existing test set mixes test data covering multiple types of objects in various scenarios, testing a model on it does not reveal how the model performs on the test data of each individual type of object. As a result, testing the model with an existing test set affects the robustness of the model, and in turn its processing accuracy.

Summary of the invention

Embodiments of the present disclosure at least provide a method, device, computer equipment, and storage medium for determining a data set.

In a first aspect, an embodiment of the present disclosure provides a method for determining a data set, including: acquiring a semantic database containing multiple pieces of semantic information; creating multiple pieces of label data based on the semantic database, where one piece of label data corresponds to one semantic category, the label data includes object labels belonging to the corresponding semantic category, and the semantic categories corresponding to the multiple pieces of label data are categories with which an omni-vision representation test can be performed on a model to be tested; and determining, based on a preset data set, matching data for the object labels of at least part of the label data, and determining, based on the matching data, test data sets respectively corresponding to the at least part of the label data, to obtain multiple test data sets.

As can be seen from the above, the embodiments of the present disclosure process the semantic database to obtain label data corresponding to multiple semantic categories, and create test data sets corresponding to those semantic categories based on the label data. When the model to be tested is tested on these test data sets, it can be tested comprehensively, yielding its omni-vision representation performance. This testing approach can improve the robustness of the model to be tested, and thereby its processing accuracy.

In an optional implementation, there are multiple semantic databases, and creating multiple pieces of label data based on the semantic databases includes: fusing the semantic information in the multiple semantic databases to obtain a fused semantic database, where the fused semantic database includes multiple pieces of fused semantic information and hierarchical information among them; and determining multiple semantic categories to be divided, and dividing the fused semantic database into the multiple pieces of label data according to the multiple semantic categories.

As can be seen from the above, semantically fusing multiple semantic databases yields a more comprehensive semantic database, namely the fused semantic database. Label data determined from the fused semantic database covers richer semantic categories. When the model to be tested is tested on the test data sets corresponding to this label data, an omni-vision test of the model can be carried out, yielding its omni-vision representation performance.

In an optional implementation, fusing the semantic information in the multiple semantic databases to obtain the fused semantic database includes: determining semantic information to be fused in a first semantic database among the multiple semantic databases, where the semantic information to be fused has no next-level semantic information in the first semantic database; determining, based on the hierarchical information among the semantic information in the first semantic database, the semantic path on which the semantic information to be fused is located, where the semantic path includes at least one piece of semantic information; and fusing, based on the high-level semantic information that precedes the semantic information to be fused in the semantic path, the semantic information to be fused with the semantic information in a second semantic database to obtain the fused semantic database, where the second semantic database is a database other than the first semantic database among the multiple semantic databases.

As can be seen from the above, the semantic path on which the semantic information to be fused is located is determined from the hierarchical information among the semantic information, and the semantic information to be fused is then fused with the semantic information in the second semantic database according to that path. In this way, the mapping relationship between the semantic information to be fused and the semantic information in the second semantic database can be determined quickly and accurately, so that as much of the semantic information to be fused as possible is fused with the semantic information in the second semantic database, yielding a fused semantic database that contains more comprehensive semantic information.

In an optional implementation, fusing the semantic information to be fused with the semantic information in the second semantic database, based on the high-level semantic information that precedes the semantic information to be fused in the semantic path, to obtain the fused semantic database includes: determining target semantic information among the high-level semantic information in order of level from high to low, where the target semantic information has corresponding semantic information in the second semantic database; and fusing the semantic information to be fused into the next level of the semantic information that corresponds to the target semantic information in the second semantic database, to obtain the fused semantic database.
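The fusion step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `fuse_leaf`, the dictionary representation of the two databases, and in particular the rule that, when several high-level entries have a counterpart in the second database, the lowest-level one is kept. The text only states that the high-level semantic information is examined from high to low.

```python
def fuse_leaf(path, mapping, second_children):
    """Attach the last entry of `path` under its counterpart in the second database.

    path: semantic path in the first database, ordered from highest to lowest
          level; the last entry is the semantic information to be fused.
    mapping: hypothetical lookup from first-database entries to their
             corresponding entries in the second database.
    second_children: second database as {entry: [next-level entries]};
                     updated in place.
    Returns the second-database entry that received the new child, or None.
    """
    leaf, high_level = path[-1], path[:-1]
    target = None
    for name in high_level:          # scan from high level to low level
        if name in mapping:
            target = name            # assumption: keep the lowest match found
    if target is None:
        return None                  # no counterpart, nothing to fuse
    anchor = mapping[target]
    next_level = second_children.setdefault(anchor, [])
    if leaf not in next_level:
        next_level.append(leaf)
    return anchor


# Toy data, not taken from any real semantic database.
path = ["animal", "mammal", "cat", "domestic cat"]
mapping = {"animal": "animal", "mammal": "mammal", "cat": "cat"}
second_children = {"animal": ["mammal"], "mammal": ["cat", "dog"]}
anchor = fuse_leaf(path, mapping, second_children)
```

In this toy run, "domestic cat" is added as next-level semantic information of "cat" in the second database.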

In the embodiments of the present disclosure, fusing the semantic information in multiple semantic databases into a fused semantic database yields richer and more comprehensive semantic information. When multiple pieces of label data are determined from the fused semantic database, label data corresponding to multiple semantic categories can be obtained, enabling an omni-vision representation test of the model to be tested. This can improve the robustness of the model to be tested, broaden its scope of application, and improve its processing accuracy.

In an optional implementation, the fused semantic database is a tree-structured database, and dividing the fused semantic database into the multiple pieces of label data according to the multiple semantic categories includes: determining, in the tree-structured database, the nodes corresponding to at least part (for example, each) of the semantic categories to obtain multiple target nodes; dividing the tree-structured database using each target node as a root node to obtain multiple sub-tree-structured databases, where one sub-tree-structured database corresponds to one target node; and determining the multiple pieces of label data based on the sub-tree-structured databases, where the object labels in a piece of label data are the semantic information in the corresponding sub-tree-structured database.
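The division into sub-trees can be sketched as follows. This is only an illustration under stated assumptions: the tree is represented as a dictionary from each node to its next-level nodes, the names `descendants` and `split_into_label_data` are hypothetical, and the object labels of a category are taken to be the nodes below the target node (whether the target node itself counts as an object label is not specified in the text).

```python
def descendants(children, root):
    """Collect every node below `root` in a tree given as {node: [children]}."""
    found, stack = [], list(children.get(root, []))
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children.get(node, []))
    return sorted(found)


def split_into_label_data(children, categories):
    """Use each category node as a sub-tree root and gather its object labels."""
    return {category: descendants(children, category) for category in categories}


# Toy fused semantic database, for illustration only.
FUSED = {
    "entity": ["flower", "mammal"],
    "flower": ["rose", "jasmine"],
    "mammal": ["cat", "dog"],
    "cat": ["domestic cat"],
}
label_data = split_into_label_data(FUSED, ["flower", "mammal"])
```

Each key of `label_data` plays the role of one semantic category, and the associated list plays the role of the object labels of that piece of label data.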

In the embodiments of the present disclosure, the fused semantic database is divided, according to the semantic categories to be divided, into label data corresponding to multiple semantic categories, and multiple test data sets are then determined from the label data. This yields test data sets with which an omni-vision representation test can be performed on the model to be tested; when the model is tested on these test data sets, its performance on each semantic category can be determined.

In an optional implementation, the preset data set includes multiple pieces of data and the data labels of those pieces of data, and determining, based on the preset data set, matching data for the object labels of at least part (for example, each) of the label data includes: determining the object labels included in the label data; matching the data labels in the preset data set against the object labels to determine at least one set of matching labels; and determining, in the preset data set, at least one piece of data corresponding to the data label in at least one set (for example, each set) of matching labels, and determining that data as the data matching the object label in that set of matching labels.
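A minimal sketch of this matching step is given below. The matching rule is an assumption: the text does not say how a data label and an object label are compared, so the sketch uses a case-insensitive exact string comparison. The function name `build_test_set` and the file names are hypothetical.

```python
from collections import defaultdict


def build_test_set(object_labels, dataset):
    """Group the data whose data label matches one of the object labels.

    object_labels: object labels of one piece of label data.
    dataset: iterable of (data, data_label) pairs from the preset data set.
    Matching rule (assumption): case-insensitive exact string equality.
    """
    wanted = {label.lower() for label in object_labels}
    matches = defaultdict(list)
    for data, data_label in dataset:
        key = data_label.lower()
        if key in wanted:
            matches[key].append(data)
    return dict(matches)


# Toy preset data set; the file names are placeholders.
PRESET = [
    ("img_001.jpg", "rose"),
    ("img_002.jpg", "Jasmine"),
    ("img_003.jpg", "tabby cat"),
    ("img_004.jpg", "rose"),
]
flower_test_set = build_test_set(["rose", "jasmine", "tulip"], PRESET)
```

Here "tulip" has no matching data label, so it simply does not appear in the resulting test data set for the category.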

In the embodiments of the present disclosure, the preset data set may be the following two data sets: ImageNet and Places. Because ImageNet and Places contain a large number of natural pictures, test data sets determined from them are more comprehensive, and when the model to be tested is tested on these test data sets, its performance on at least some (for example, each) of the semantic categories can be determined.

In an optional implementation, the method further includes: testing the model to be tested on at least one test data set to obtain multiple test results; and calculating the average of the multiple test results and determining the average as the test result of the omni-vision representation test of the model to be tested.
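The averaging step can be sketched in a few lines. The per-category values below are made-up placeholders, not results from the disclosure, and the text does not state which metric a test result represents; the name `omni_vision_score` is hypothetical.

```python
from statistics import mean


def omni_vision_score(results):
    """Average the per-test-data-set results into a single overall result.

    results: {semantic category: test result on that category's test data set}.
    """
    if not results:
        raise ValueError("at least one test result is required")
    return mean(results.values())


# Placeholder values for illustration only.
per_category = {"flower": 0.8, "food": 0.6, "location": 0.7}
score = omni_vision_score(per_category)
```

An unweighted mean treats every semantic category equally regardless of how much data its test data set contains, which is one reading of "the average value of the multiple test results".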

In the embodiments of the present disclosure, the model to be tested is tested on multiple test data sets to obtain multiple test results, and the average of those results is taken as the test result of the omni-vision representation test. In this way, the omni-vision representation performance of the model to be tested can be determined quantitatively, and its robustness thereby determined. The test results can also guide technical personnel in training the model to be tested in a targeted manner, so that the model achieves better processing results on the test data of at least part (for example, each) of the semantic categories.

In an optional implementation, the method further includes: when no data matching a target object label in target label data is found in the preset data set, determining the target semantic category corresponding to the target label data; and searching the candidate databases for a matching database that matches the target semantic category, and searching the matching database for data matching the target object label.
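The fallback lookup can be sketched as below. The data layout is an assumption: the preset data set is a dictionary from label to data, and the candidate databases are keyed by the semantic category they cover. The name `find_matching_data` and all sample entries are hypothetical.

```python
def find_matching_data(target_label, target_category, preset, candidates):
    """Look in the preset data set first, then in a category-specific database.

    preset: {data label: [data]} for the preset data set.
    candidates: {semantic category: {data label: [data]}}, a hypothetical
                layout for the candidate databases.
    """
    data = preset.get(target_label)
    if data:
        return data
    # No match in the preset data set: pick the candidate database that
    # matches the semantic category, then search it for the object label.
    matching_db = candidates.get(target_category, {})
    return matching_db.get(target_label, [])


# Placeholder entries for illustration only.
preset = {"rose": ["img_001.jpg"]}
candidates = {"bird": {"kiwi": ["bird_017.jpg"]}}
kiwi_data = find_matching_data("kiwi", "bird", preset, candidates)
```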

Through the above processing method, a more comprehensive test data set can be obtained, and more accurate test results can be obtained when the model to be tested is fully tested according to the test data set.

In an optional implementation, the method further includes: when a target data label is found in the preset data set, determining the upper-level label of the target data label based on the hierarchical information among the data labels in the preset data set, where the target data label is a data label for which the object labels of the multiple pieces of label data contain no corresponding object label; determining the semantic information corresponding to the upper-level label, and determining, among the multiple pieces of label data, the semantic information that matches it; and adding the semantic information corresponding to the target data label, as new semantic information, to the next level of the matched semantic information, and determining matching data for the new semantic information based on the preset data set.

In the embodiments of the present disclosure, supplementing the semantic information corresponding to the object labels in the label data with the data labels in the preset data set enriches the semantic information in the label data and yields a more comprehensive fused semantic database, which can improve the accuracy with which the model to be tested is tested.

In a second aspect, an embodiment of the present disclosure further provides a device for determining a data set, including: an acquisition unit, configured to acquire a semantic database containing multiple pieces of semantic information; a creation unit, configured to create multiple pieces of label data based on the semantic database, where one piece of label data corresponds to one semantic category, the label data includes object labels belonging to the corresponding semantic category, and the semantic categories corresponding to the multiple pieces of label data are categories with which an omni-vision representation test can be performed on a model to be tested; and a determination unit, configured to determine, based on a preset data set, matching data for the object labels of at least part of the label data, and to determine, based on the matching data, test data sets respectively corresponding to the at least part of the label data, to obtain multiple test data sets.

In a third aspect, an embodiment of the present disclosure further provides a computer device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the computer device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps in the first aspect, or in any possible implementation of the first aspect, are performed.

In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps in the first aspect, or in any possible implementation of the first aspect.

In a fifth aspect, an optional implementation of the present disclosure further provides a computer program product, including computer-readable code, or a computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device performs the steps in the first aspect, or in any possible implementation of the first aspect.

In the embodiments of the present disclosure, a semantic database containing multiple pieces of semantic information is first acquired; multiple pieces of label data are then created based on the semantic database; and, based on a preset data set, matching data is determined for the object labels of at least part (for example, each) of the label data, yielding multiple test data sets. As can be seen from the above, the embodiments of the present disclosure process the semantic database to obtain label data corresponding to multiple semantic categories, and create test data sets corresponding to those semantic categories based on the label data. When the model to be tested is tested on these test data sets, it can be tested comprehensively, yielding its omni-vision representation performance. This testing approach can improve the robustness of the model to be tested, and thereby its processing accuracy.

In order to make the above-mentioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments will be described in detail below together with the accompanying drawings.

Brief description of the drawings

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings used in the embodiments are briefly introduced below. The accompanying drawings are incorporated into and constitute a part of this specification; they show embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting its scope. Those skilled in the art can also obtain other related drawings from these drawings.

FIG. 1 shows a flowchart of a method for determining a data set provided by an embodiment of the present disclosure;

FIG. 2 shows a schematic structural diagram of a tree-structured first semantic database provided by an embodiment of the present disclosure;

FIG. 3 shows a flow chart of specific steps for determining matching data for object tags of each tag data based on a preset data set in the method for determining a data set provided by an embodiment of the present disclosure;

FIG. 4 shows a schematic diagram of an apparatus for determining a data set provided by an embodiment of the present disclosure;

FIG. 5 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.

Detailed description

In order to make the purpose, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present disclosure.

It should be noted that like numerals and letters denote similar items in the following figures, therefore, once an item is defined in one figure, it does not require further definition and explanation in subsequent figures.

The term "and/or" herein describes only an association relationship and indicates that three relationships are possible; for example, "A and/or B" can mean: A exists alone, both A and B exist, or B exists alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of multiple items; for example, "including at least one of A, B, and C" can mean including any one or more elements selected from the set formed by A, B, and C.

It is found through research that an existing test set is usually a preset data set, for example the ImageNet data set. Because an existing test set mixes test data covering multiple types of objects in various scenarios, testing a model on it does not reveal how the model performs on the test data of each individual type of object. As a result, testing the model with an existing test set affects the robustness of the model, and in turn its processing accuracy.

Based on the above research, the present disclosure provides a method, device, computer equipment, and storage medium for determining a data set. The embodiments of the present disclosure process the semantic database to obtain label data corresponding to multiple semantic categories, and create test data sets corresponding to those semantic categories based on the label data. When the model to be tested is tested on these test data sets, it can be tested comprehensively, yielding its omni-vision representation performance. This testing approach can improve the robustness of the model to be tested, and thereby its processing accuracy.

To facilitate understanding of this embodiment, the method for determining a data set disclosed in the embodiments of the present disclosure is first introduced in detail. The method is generally executed by a computer device with certain computing power. In some possible implementations, the method may be implemented by a processor running computer-executable code.

Referring to FIG. 1, which is a flowchart of a method for determining a data set provided by an embodiment of the present disclosure, the method includes steps S101 to S105, wherein:

S101: Acquire a semantic database including multiple semantic information.

Here, the multiple pieces of semantic information contained in the semantic database can represent information about various entities; the information about an entity can also be called the concept information of an object.

The semantic information may be in Chinese or in a foreign language, which is not specifically limited in the present disclosure. For example, the semantic information may be Chinese terms such as those for cat, dog, pedestrian, and car, or English terms such as cat, domestic cat, and person.

In the embodiments of the present disclosure, in addition to the multiple pieces of semantic information, the semantic database may also contain hierarchical information among them, where the hierarchical information represents the membership relationships (or superior-subordinate relationships) among the pieces of semantic information.

For example, the multiple pieces of semantic information include mammal, reptile, tiger, dog, snake, lizard, and so on. Here, mammal and reptile can serve as one level of semantic information; tiger and dog belong to the next level under the category mammal, and snake and lizard belong to the next level under the category reptile. The relationships between mammal and tiger and dog, and between reptile and snake and lizard, constitute the hierarchical information in the semantic database (that is, the membership or superior-subordinate relationships).
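One simple way to hold this example in memory is a mapping from each piece of semantic information to its upper-level entry. The sketch below is only an illustration; the representation and the helper names `children_of` and `top_level` are assumptions, not part of the disclosure.

```python
# Hypothetical representation: each entry maps to its upper-level entry,
# with None marking an entry at the highest level of this toy database.
PARENT = {
    "mammal": None,
    "reptile": None,
    "tiger": "mammal",
    "dog": "mammal",
    "snake": "reptile",
    "lizard": "reptile",
}


def children_of(parent_map, name):
    """Return the next-level semantic information under `name`, sorted."""
    return sorted(child for child, parent in parent_map.items() if parent == name)


def top_level(parent_map):
    """Return the entries that have no upper-level entry, sorted."""
    return sorted(name for name, parent in parent_map.items() if parent is None)


mammals = children_of(PARENT, "mammal")
```

The `PARENT` mapping is the hierarchical information; the keys alone are the semantic information.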

In the embodiments of the present disclosure, multiple semantic databases may be acquired. The number may be, for example, 2, 3, or 4, and is not specifically limited in the present disclosure.

For example, two semantic databases may be acquired, whose semantic information represents objects in the natural environment; the two semantic databases may be the WordNet semantic database and the Wikidata semantic database. Other types of databases may also be used as the multiple semantic databases; they are not enumerated in the present disclosure.

S103: Create multiple pieces of label data based on the semantic database. One piece of label data corresponds to one semantic category (for example, each piece of label data may correspond to one semantic category), and each piece of label data contains object labels belonging to the corresponding semantic category. The semantic categories corresponding to the multiple pieces of label data are categories with which an omni-vision representation test can be performed on the model to be tested.

It can be seen from the above description that the semantic database contains multiple semantic information, wherein the multiple semantic information belongs to multiple semantic categories, for example, the multiple semantic categories can be person, food, location, bird, reptile, mammal, insect, fish, clothing, device, structure, vehicle, flower, herb, tree, fruit.

Here, setting the above semantic categories makes an omni-vision representation test of the model to be tested possible. An omni-vision representation test is a performance test in which the model to be tested is evaluated on test data (for example, natural pictures) under as many semantic categories as possible, so as to obtain the model's performance on the test data of each semantic category.

At this point, multiple tag data can be created based on the semantic database, and each tag data corresponds to one of the above multiple semantic categories. For example, the plurality of label data includes: label data 1, label data 2 and label data 3, wherein the label data 1 corresponds to the semantic category flower; the label data 2 corresponds to the semantic category food; the label data 3 corresponds to the semantic category location, etc.

Each piece of label data includes the object labels belonging to its semantic category. For example, label data 1 includes object labels belonging to the semantic category "flower", such as "rose" and "jasmine".

In the embodiment of the present disclosure, the object label in each label data can be understood as the semantic information under the corresponding semantic category in the semantic database.

S105: Based on the preset data set, determine matching data for the object labels of at least part (for example, each) of the label data, and, based on the matching data, determine the test data set corresponding to each of the at least part (for example, each) of the label data, to obtain multiple test data sets.

In the embodiments of the present disclosure, the semantic database is processed to obtain label data corresponding to multiple semantic categories, and test data sets corresponding to those semantic categories are created based on the label data. When the model to be tested is tested on these test data sets, it can be tested comprehensively, yielding its omni-vision representation performance. This testing approach can improve the robustness of the model to be tested, and thereby its processing accuracy.

In an optional implementation, when there are multiple semantic databases, creating multiple pieces of label data based on the semantic databases in S103 specifically includes the following process:

Step S1031: Fuse the semantic information in the multiple semantic databases to obtain a fused semantic database, where the fused semantic database includes multiple pieces of fused semantic information and hierarchical information among them;

Step S1032: Determine multiple semantic categories to be divided, and divide the fused semantic database into the multiple pieces of label data according to the multiple semantic categories.

When there are multiple semantic databases, the semantic information in them can be fused to obtain the fused semantic database; the fused semantic database can then be divided according to the semantic categories to be divided, to obtain multiple pieces of label data.

In the embodiments of the present disclosure, one semantic database may be selected from the multiple semantic databases as the benchmark semantic database. A semantic mapping relationship is then established between the semantic information in the benchmark semantic database and the semantic information in the remaining semantic databases, and the semantic information in the multiple semantic databases is fused according to this mapping relationship to obtain the fused semantic database.

For example, when two semantic databases are acquired, they may be the WordNet semantic database and the Wikidata semantic database. In this case, Wikidata may be selected as the benchmark semantic database, and WordNet is the remaining semantic database.

Here, the semantic mapping relationship may be established based on the semantic paths of those pieces of semantic information in the benchmark semantic database that have no next-level semantic information in the benchmark semantic database.

When selecting the benchmark semantic database, the semantic database that contains the larger amount of concept information (semantic information) among the multiple semantic databases may be chosen.

In an optional implementation manner, for S1031, the semantic information in multiple semantic databases is fused to obtain the fused semantic database, which specifically includes the following steps:

Step S11: determining the semantic information to be fused in the first semantic database of the plurality of semantic databases; the semantic information to be fused does not include semantic information of the next level in the first semantic database;

Step S12: Based on the hierarchical information among the semantic information in the first semantic database, determine the semantic path where the semantic information to be fused is located, and the semantic path contains at least one semantic information;

Step S13: Based on the high-level semantic information in the semantic path before the semantic information to be fused, the semantic information to be fused and the semantic information in the second semantic database are fused to obtain the fused semantic database. The second semantic database is a database other than the first semantic database among the plurality of semantic databases.

In the embodiment of the present disclosure, one or more semantic databases are selected from the multiple semantic databases as the first semantic database. The first semantic database here is the benchmark semantic database described above. In this case, the semantic database that contains the larger amount of concept information (semantic information) among the plurality of semantic databases may be determined as the first semantic database.

After the first semantic database is determined, the semantic information to be fused can be determined in the first semantic database according to the hierarchical information among the semantic information contained in the first semantic database. Here, semantic information that has no next-level semantic information in the first semantic database may be determined as the semantic information to be fused.

For example, FIG. 2 shows a tree-structured first semantic database. As can be seen from FIG. 2, the first semantic database includes node 1 and node 2, where node 1 includes node 11 to node 14, node 2 includes node 21 to node 23, and node 11 includes node 111 and node 112. Here, the semantic information corresponding to node 12 to node 14, node 21 to node 23, node 111 and node 112 does not include next-level semantic information. In this case, the semantic information corresponding to these nodes can be determined as the semantic information to be fused.

Afterwards, the semantic path of each piece of semantic information to be fused in the first semantic database can be determined. For example, for "node 111" in FIG. 2, the semantic path of the semantic information to be fused corresponding to node 111 may be: node 1-node 11-node 111.
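
The selection of the semantic information to be fused and the determination of its semantic path can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the dictionary `CHILDREN` is an assumed encoding of the tree in FIG. 2, and the function names `leaf_nodes` and `semantic_path` are hypothetical.

```python
# Hypothetical encoding of the FIG. 2 tree: each parent maps to its child nodes.
CHILDREN = {
    "node 1": ["node 11", "node 12", "node 13", "node 14"],
    "node 2": ["node 21", "node 22", "node 23"],
    "node 11": ["node 111", "node 112"],
}


def leaf_nodes(children):
    """Return nodes without a next level (the semantic information to be fused)."""
    nodes = set(children)
    for kids in children.values():
        nodes.update(kids)
    return sorted(n for n in nodes if not children.get(n))


def semantic_path(children, node):
    """Return the path from the top-level ancestor down to `node`."""
    parent = {kid: p for p, kids in children.items() for kid in kids}
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


print(leaf_nodes(CHILDREN))
print(semantic_path(CHILDREN, "node 111"))  # ['node 1', 'node 11', 'node 111']
```

Under this encoding, node 11 is not returned by `leaf_nodes` because it has child nodes, which agrees with the list of nodes given for FIG. 2.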

At this point, the semantic information to be fused and the semantic information in the second semantic database can be fused according to the high-level semantic information located before the semantic information to be fused in the semantic path. For example, according to the semantic information corresponding to "node 1" and the semantic information corresponding to "node 11", the semantic information to be fused corresponding to "node 111" may be fused with the semantic information in the second semantic database.

In a possible implementation, when the number of semantic databases is greater than 2, a first semantic database can be determined from the multiple semantic databases in the manner described above, and the semantic information to be fused in the first semantic database is then fused with the semantic information in each of the remaining semantic databases (that is, the second semantic databases). The specific fusion process is the process described in the above steps S11 to S13, which will not be repeated here.

In an optional implementation, S13 (fusing the semantic information to be fused with the semantic information in the second semantic database, based on the high-level semantic information located before the semantic information to be fused in the semantic path, to obtain the fused semantic database) comprises the following steps:

(1) Determine the target semantic information in the high-level semantic information according to the hierarchical order from high to low; the target semantic information includes corresponding semantic information in the second semantic database;

(2) Fusing the semantic information to be fused with the semantic information of the next level of the semantic information corresponding to the target semantic information in the second semantic database to obtain the fused semantic database.

In the embodiment of the present disclosure, after the semantic path of the semantic information to be fused is obtained, the high-level semantic information located before the semantic information to be fused in the first semantic database can be obtained, for example, the semantic information corresponding to "node 1" and the semantic information corresponding to "node 11" shown in FIG. 2. The obtained high-level semantic information can then be examined level by level to determine the target semantic information among it. The specific process is described as follows:

First, according to the semantic path, determine the upper-level semantic information of the semantic information to be fused, and judge whether the second semantic database contains semantic information corresponding to it. If so, this upper-level semantic information is determined as the target semantic information. If not, continue to the semantic information one level above it, and judge whether the second semantic database contains semantic information corresponding to that. If so, determine it as the target semantic information; otherwise, continue to search for higher-level semantic information along the semantic path.

Suppose the multiple semantic databases include the Wikidata database and the Wordnet database. Here, the Wikidata database may be selected as the first semantic database, and the Wordnet database as the second semantic database.

First, the semantic information to be fused, which has no next-level semantic information, is selected from the Wikidata semantic database. For example, the semantic information to be fused can be the Toyger information. The semantic path of the Toyger information in the Wikidata semantic database can then be determined; for example, the semantic path is Toyger-Domestic Cat-Cat.

After obtaining the above semantic path, the high-level semantic information of Toyger information can be determined, for example, Domestic Cat information and Cat information respectively. According to the obtained high-level semantic information, the target semantic information can be determined according to the hierarchical order from high to low (or understood as the hierarchical order from bottom to top). For example, the target semantic information is Domestic Cat information. At this time, the semantic information corresponding to the target semantic information in the Wordnet semantic database is also Domestic Cat information. At this point, the Toyger information (semantic information to be fused) in the Wikidata semantic database can be fused with the next-level semantic information of the Domestic Cat information in the Wordnet semantic database.

For each semantic information to be fused in the Wikidata semantic database, the manner described above can be used to fuse the semantic information to be fused with the semantic information in the Wordnet semantic database. After fusing each semantic information to be fused, a corresponding fused semantic database can be obtained.
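
The search for the target semantic information and the subsequent fusion can be sketched as follows. The sketch rests on several assumptions that are not stated in the disclosure: the second semantic database is encoded as a dictionary from each entry to its list of next-level entries, entries in the two databases are treated as corresponding when their names are identical, and the path is written from the highest level down to the entry to be fused. The names `find_target` and `fuse_entry` are hypothetical.

```python
def find_target(path, second_db):
    """Return the nearest ancestor on `path` that also exists in `second_db`.

    `path` is ordered from the highest level down to the entry to be fused.
    """
    for ancestor in reversed(path[:-1]):
        if ancestor in second_db:  # assumption: entries correspond by name
            return ancestor
    return None


def fuse_entry(path, second_db):
    """Attach the last entry of `path` under its target in `second_db`."""
    entry = path[-1]
    target = find_target(path, second_db)
    if target is None:
        return None  # no ancestor is shared; the source does not cover this case
    if entry not in second_db[target]:
        second_db[target].append(entry)
    second_db.setdefault(entry, [])
    return target


# Hypothetical fragment of the second database (child lists keyed by entry).
wordnet = {"Cat": ["Domestic Cat"], "Domestic Cat": []}
print(fuse_entry(["Cat", "Domestic Cat", "Toyger"], wordnet))  # Domestic Cat
print(wordnet["Domestic Cat"])  # ['Toyger']
```

If the second database had no Domestic Cat entry, the same call would fall back to Cat, which mirrors the upward search along the semantic path described above.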

In the embodiment of the present disclosure, when the number N of acquired semantic databases is greater than 2, one of the N semantic databases can be selected as the first semantic database, and one of the remaining N-1 semantic databases can be arbitrarily selected as the second semantic database. The semantic information to be fused is then selected from the first semantic database and fused with the semantic information in the second semantic database, which completes the fusion of the two semantic databases and yields a fused semantic database M. After that, a semantic database is selected from the remaining N-2 semantic databases as the first semantic database, the above-mentioned fused semantic database M is used as the second semantic database, and the semantic information is fused again, and so on, until the semantic information of all acquired semantic databases has been fused, resulting in the final fused semantic database.
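
The iterative fusion of more than two semantic databases can be sketched as a simple fold. In this sketch the pairwise fusion is passed in as a hypothetical function `fuse_pair(first, second)`; the function `union_pair`, which works on plain sets of entry names, is only a stand-in used to make the example runnable and is not the fusion procedure of steps S11 to S13.

```python
def fuse_all(databases, fuse_pair):
    """Fold N databases into one by repeated pairwise fusion.

    `fuse_pair(first, second)` is a caller-supplied function that fuses the
    entries of `first` into `second` and returns the fused database.
    """
    if len(databases) < 2:
        raise ValueError("at least two semantic databases are required")
    fused = fuse_pair(databases[0], databases[1])
    for remaining in databases[2:]:
        # The earlier result M now plays the role of the second database.
        fused = fuse_pair(remaining, fused)
    return fused


# Stand-in pairwise fusion on plain sets of entry names, for illustration only.
def union_pair(first, second):
    return second | first


print(sorted(fuse_all([{"Toyger"}, {"Cat"}, {"Tiger"}], union_pair)))
# ['Cat', 'Tiger', 'Toyger']
```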

In an optional implementation, when the fused semantic database is a tree-structured database, S1032 (dividing the fused semantic database into the multiple label data according to the multiple semantic categories) specifically includes the following steps:

Step S21: Determining nodes corresponding to at least some (for example, each) semantic categories in the tree-structured database to obtain a plurality of target nodes;

Step S22: Using the target node as a root node, divide the tree-structured database to obtain a plurality of sub-tree-structured databases, wherein one sub-tree-structured database corresponds to one target node;

Step S23: Determine the plurality of tag data based on the plurality of sub-tree structured databases, wherein the object tags in the tag data are semantic information in the corresponding sub-tree structured databases.

In an embodiment of the present disclosure, the multiple semantic databases may be tree-structured databases, where each node in a tree-structured database may represent a piece of semantic information, and each piece of semantic information may represent corresponding object information. Each node in the tree-structured database can contain corresponding child nodes, and the hierarchical relationship between a node and its child nodes constitutes the hierarchical information between the semantic information corresponding to the node and the semantic information corresponding to its child nodes.

After merging multiple semantic databases in the manner described above to obtain the fused semantic database, a tree-structured fused semantic database can also be obtained. Therefore, the fusion semantic database of the tree structure may also contain a plurality of nodes, and each node may contain a corresponding child node, and each node is used to represent semantic information in the fusion semantic database.

Here, after the multiple semantic categories to be divided are determined, the node corresponding to each semantic category can be determined in the tree-structured fused semantic database. For example, the multiple semantic categories can be person, food, location, bird, reptile, mammal, insect, fish, clothing, device, structure, vehicle, flower, herb, tree, and fruit. Suppose, for example, that the multiple semantic categories are person, food, and location. In this case, it can be determined that the nodes corresponding to these semantic categories are node A, node B, and node C, respectively; node A, node B, and node C are the above-mentioned multiple target nodes.

After a plurality of target nodes are determined, each target node may be used as a root node to divide the tree-structured database, so as to obtain multiple sub-tree-structured databases.

After the multiple sub-tree-structured databases are obtained, for each sub-tree-structured database, the semantic information contained in it can be determined as the object labels in the corresponding label data, and the hierarchical information among the semantic information contained in the sub-tree-structured database can be determined as the hierarchical information among the object labels contained in the corresponding label data.
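
Steps S21 to S23 can be sketched as follows, under the assumption that the tree-structured fused semantic database is encoded as a dictionary in which every node is a key and leaves map to empty lists. The tree `FUSED`, its entries below person and food, and the function names `subtree` and `split_by_categories` are hypothetical and serve only as an illustration.

```python
def subtree(children, root):
    """Return the child map of `root` and all of its descendants."""
    result, stack = {}, [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        result[node] = list(kids)
        stack.extend(kids)
    return result


def split_by_categories(children, categories):
    """Build one label data (a subtree) per semantic category."""
    return {c: subtree(children, c) for c in categories if c in children}


# Hypothetical fused tree; every node is a key, and leaves map to empty lists.
FUSED = {
    "entity": ["person", "food"],
    "person": ["athlete", "teacher"],
    "food": ["bread"],
    "athlete": [], "teacher": [], "bread": [],
}

label_data = split_by_categories(FUSED, ["person", "food"])
print(sorted(label_data["person"]))  # ['athlete', 'person', 'teacher']
print(label_data["food"])            # {'food': ['bread'], 'bread': []}
```

Each returned subtree keeps both the entries (the object labels) and their parent-child links (the hierarchical information among the object labels).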

Here, the number and names of the semantic categories to be divided can be determined according to the actual needs of the model to be tested, and are not specifically limited here.

In an optional implementation, as shown in FIG. 3, when the preset data set contains a plurality of data and the data tags of the plurality of data, determining, in the above step S105 and based on the preset data set, matching data for the object tags of at least part (for example, each) of the tag data specifically includes the following steps:

Step S1051: Determine the object tags included in the tag data;

Step S1052: Match the data tags in the preset data set with the object tags to determine at least one set of matching tags;

Step S1053: Determine at least one piece of data in the preset data set that corresponds to the data tag in at least one group (for example, each group) of matching tags, and determine the corresponding at least one piece of data as the matching data of the object tag in that group of matching tags.

Here, the preset data set may be a collection of natural pictures. In addition, the preset data set may also be a set containing other types of data, which will not be described in detail in this disclosure.

In the embodiment of the present disclosure, the object tags included in each tag data are first determined, and then the data tags included in the preset data set are matched with the object tags to obtain at least one set of matching tags.

Here, the process of matching the object labels in the label data with the data labels in the preset data set can be understood as comparing the semantic information corresponding to the object label with the semantic information corresponding to the data label. When the two pieces of semantic information are the same or similar, the matching is successful, and the successfully matched object label and data label form a group of matching labels.

Identical semantic information can be understood as, for example, the object label being bike and the data label also being bike; similar semantic information can be understood as, for example, the object label being bike and the data label being bicycle. Although the object label bike and the data label bicycle differ, they represent the same object. Therefore, in the embodiments of the present disclosure, similar semantic information may be interpreted as the object label and the data label corresponding to the same object.

After at least one group of matching tags is obtained in the manner described above, the data in the preset data set that corresponds to the data tag in each group of matching tags can be determined, and this data can then be used as the matching data of the object tag in that group of matching tags.

The matching data of the object tags in each tag data can be determined in the above manner. After that, the set of matching data of all object tags in a given tag data can be used as the test data set corresponding to that tag data, so that multiple test data sets are obtained.
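
Steps S1051 to S1053 can be sketched as follows. The disclosure does not specify how similar semantic information is detected, so this sketch assumes a hypothetical synonym table `SYNONYMS` that maps similar labels to one canonical name; the preset data set is likewise assumed to be a list of (data, data label) pairs, and the file names are invented for illustration.

```python
def canonical(label, synonyms):
    """Map a label to a canonical name so that similar labels compare equal."""
    label = label.strip().lower()
    return synonyms.get(label, label)


def match_data(object_labels, dataset, synonyms):
    """Return {object label: matching data} for one label data."""
    by_label = {}
    for item, data_label in dataset:
        by_label.setdefault(canonical(data_label, synonyms), []).append(item)
    return {
        obj: by_label.get(canonical(obj, synonyms), []) for obj in object_labels
    }


# Hypothetical synonym table and preset data set of (data, data label) pairs.
SYNONYMS = {"bicycle": "bike"}
DATASET = [("img_001.jpg", "bicycle"), ("img_002.jpg", "bike"), ("img_003.jpg", "car")]

matches = match_data(["bike", "bus"], DATASET, SYNONYMS)
test_set = [item for items in matches.values() for item in items]
print(matches)   # {'bike': ['img_001.jpg', 'img_002.jpg'], 'bus': []}
print(test_set)  # ['img_001.jpg', 'img_002.jpg']
```

The list `test_set` plays the role of the test data set for one label data; repeating the call for each label data would give the multiple test data sets.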

In the embodiment of the present disclosure, the above preset data set may be selected as the following two data sets: ImageNet and Places. Since ImageNet and Places contain a large number of natural pictures, determining the multiple test data sets based on ImageNet and Places yields more comprehensive data sets, and when the model to be tested is tested on the multiple test data sets, its performance on each semantic category can be determined.

In an optional implementation manner, the embodiment of the present disclosure also includes the following steps:

Step S11: performing test processing on the model to be tested through at least one (for example, each) test data set to obtain multiple test results;

Step S12: Calculate the average value of the plurality of test results, and determine the average value as the test result of the omni-vision representation test on the model to be tested.

In the embodiment of the present disclosure, the obtained multiple test data sets may each be input into the model to be tested for test processing, so that the model to be tested obtains one test result on each test data set. The average value of the obtained test results can then be calculated to obtain the test result of the omni-vision representation test of the model to be tested.

In the embodiment of the present disclosure, each test result can be used to reflect the performance of the model to be tested under the corresponding semantic category. For example, when a test result is greater than a certain threshold, it can be determined that the model to be tested obtains better processing results when processing data under that semantic category.

In the embodiment of the present disclosure, the model to be tested is tested on multiple test data sets to obtain multiple test results, and the average value of these test results is calculated to obtain the test result of the omni-vision representation test of the model to be tested. In this way, the omni-vision representation capability of the model to be tested can be determined quantitatively, and its robustness can thereby be assessed. The test results can also guide relevant technical personnel in carrying out targeted training of the model to be tested, so that it obtains better processing results on the test data under each semantic category.
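
The averaging in step S12 and the threshold comparison mentioned above can be sketched as follows. The per-category values in `RESULTS` are invented for illustration, the metric is assumed to be a score in [0, 1] such as accuracy, and the names `omni_score` and `below_threshold` are hypothetical.

```python
from statistics import mean


def omni_score(results):
    """Average the per-category test results into a single score."""
    if not results:
        raise ValueError("no test results")
    return mean(results.values())


def below_threshold(results, threshold):
    """List the semantic categories whose result does not exceed the threshold."""
    return sorted(c for c, score in results.items() if score <= threshold)


# Hypothetical per-category results (for example, accuracies in [0, 1]).
RESULTS = {"person": 0.5, "food": 0.75, "location": 1.0}
print(omni_score(RESULTS))            # 0.75
print(below_threshold(RESULTS, 0.6))  # ['person']
```

The categories returned by `below_threshold` are the ones for which targeted training of the model to be tested might be considered.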

In an optional embodiment, the disclosed method also includes the following steps:

Step S21: In the case where no data matching the target object label in the target label data is determined in the preset data set, determine the target semantic category corresponding to the target label data;

Step S22: Search for a matching database that matches the target semantic category in the candidate database, and search for data that matches the target object label in the matching database.

In the embodiment of the present disclosure, when data matching the target object label in the target label data cannot be determined in the preset data set, a matching database that matches the semantic category corresponding to the target label data can be searched for in the candidate database, and data matching the target object label can be searched for in the matching database.

Here, the candidate database refers to a database other than the above-mentioned preset data set. For example, the candidate database can be matching data obtained by searching the network according to the semantic category or the semantic information corresponding to the target object label; the candidate database can also be matching data provided by a user according to the semantic category and the semantic information. The candidate database is not specifically limited here and may be chosen to meet actual needs.
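
The fallback lookup of steps S21 and S22 can be sketched as follows. The sketch assumes that the preset data set is a dictionary from label to data and that the candidate database is a dictionary from semantic category to such a dictionary; both structures, their contents, and the name `find_matching_data` are hypothetical.

```python
def find_matching_data(target_label, category, preset, candidates):
    """Look in the preset data set first, then in the candidate databases."""
    data = preset.get(target_label, [])
    if data:
        return data
    matching_db = candidates.get(category, {})  # database for this category
    return matching_db.get(target_label, [])


# Hypothetical contents: label -> data, and category -> (label -> data).
PRESET = {"bike": ["img_001.jpg"]}
CANDIDATES = {"flower": {"tulip": ["web_017.jpg"]}}

print(find_matching_data("bike", "vehicle", PRESET, CANDIDATES))   # ['img_001.jpg']
print(find_matching_data("tulip", "flower", PRESET, CANDIDATES))   # ['web_017.jpg']
print(find_matching_data("tulip", "vehicle", PRESET, CANDIDATES))  # []
```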

In an optional embodiment, the disclosed method also includes the following steps:

Step S31: When a target data label is determined in the preset data set, determine the upper-level label of the target data label based on the hierarchical information between the data labels in the preset data set, wherein the target data label is a data label for which no corresponding object label exists among the object labels of the multiple label data;

Step S32: Determine the semantic information corresponding to the upper-level label, and determine the semantic information matching the semantic information corresponding to the upper-level label in the plurality of label data;

Step S33: Add the semantic information corresponding to the target data tag, as new semantic information, to the next-level semantic information of the matched semantic information, and determine matching data for the new semantic information based on the preset data set.

In the embodiment of the present disclosure, if no object tag matching the target data tag is found in the multiple tag data, the upper-level label of the target data tag can be determined according to the hierarchical information between the data tags in the preset data set, and the semantic information corresponding to the upper-level label can then be determined; this semantic information is denoted as M. Afterwards, the semantic information matching the semantic information M can be determined in the multiple tag data and is denoted as the semantic information N. The semantic information corresponding to the target data label in the preset data set is then added, as new semantic information, to the next-level semantic information of the semantic information N, and the data corresponding to the target data label in the preset data set is used as the matching data of the new semantic information.
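
Steps S31 to S33 can be sketched as follows. The sketch assumes that the semantic information M and N match when their names are identical, that the hierarchy of the preset data set is given as a child-to-parent dictionary, and that one label data is given as a dictionary from each entry to its next-level entries. All names and sample values (`add_unmatched_label`, `DATASET_PARENT`, "tabby", and so on) are hypothetical.

```python
def add_unmatched_label(target, dataset_parent, dataset_items, label_tree, matching):
    """Attach a data label that matched no object label under its upper-level label."""
    parent = dataset_parent.get(target)  # upper-level label (semantic information M)
    if parent is None or parent not in label_tree:
        return False  # no matching semantic information N; not covered by the source
    if target not in label_tree[parent]:
        label_tree[parent].append(target)
    label_tree.setdefault(target, [])
    matching[target] = list(dataset_items.get(target, []))
    return True


# Hypothetical data: hierarchy and items of the preset data set, and one label data.
DATASET_PARENT = {"tabby": "cat"}
DATASET_ITEMS = {"tabby": ["img_101.jpg", "img_102.jpg"]}
label_tree = {"mammal": ["cat"], "cat": []}
matching = {}

print(add_unmatched_label("tabby", DATASET_PARENT, DATASET_ITEMS, label_tree, matching))
print(label_tree["cat"])  # ['tabby']
print(matching["tabby"])  # ['img_101.jpg', 'img_102.jpg']
```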

Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.

Based on the same inventive concept, the embodiment of the present disclosure also provides a data set determination device corresponding to the data set determination method. Because the problem-solving principle of the device in the embodiment of the present disclosure is similar to that of the above-mentioned data set determination method, the implementation of the device can refer to the implementation of the method, and repeated descriptions are omitted.

Referring to FIG. 4 , it is a schematic diagram of a device for determining a data set provided by an embodiment of the present disclosure. The device includes: an acquisition unit module 41, a creation unit module 42, and a determination unit module 43; wherein,

An acquisition unit module 41, configured to acquire a semantic database comprising a plurality of semantic information;

The creating unit module 42 is configured to create a plurality of label data based on the semantic database, where one label data corresponds to one semantic category, the label data includes object labels belonging to the corresponding semantic category, and the semantic categories corresponding to the plurality of label data are categories by which an omni-vision representation test can be performed on the model to be tested;

The determining unit module 43 is configured to determine, based on a preset data set, matching data for the object tags of at least part (for example, each) of the tag data, and to determine, based on the matching data, the test data sets respectively corresponding to the at least part (for example, each) of the tag data, so as to obtain a plurality of test data sets.

In the embodiment of the present disclosure, label data corresponding to multiple semantic categories is obtained by processing the semantic database, and test data sets corresponding to the multiple semantic categories are created based on the determined label data. When the performance of the model to be tested is tested on the determined multiple test data sets, the model can be tested comprehensively, so that its overall performance is obtained. Through this testing method, the robustness of the model to be tested can be improved, and in turn its processing accuracy.

In a possible implementation, the creating unit module is further configured to: fuse the semantic information in multiple semantic databases to obtain a fused semantic database, wherein the fused semantic database includes multiple pieces of fused semantic information and the hierarchical information among them; and determine multiple semantic categories to be divided, and divide the fused semantic database into the multiple label data according to the multiple semantic categories.

In a possible implementation, the creating unit module is further configured to: determine the semantic information to be fused in the first semantic database of the plurality of semantic databases, wherein the semantic information to be fused does not include next-level semantic information in the first semantic database; determine, based on the hierarchical information among the semantic information in the first semantic database, the semantic path where the semantic information to be fused is located, the semantic path containing at least one piece of semantic information; and fuse, based on the high-level semantic information located before the semantic information to be fused in the semantic path, the semantic information to be fused with the semantic information in the second semantic database to obtain the fused semantic database, wherein the second semantic database is a database other than the first semantic database among the plurality of semantic databases.

In a possible implementation, the creating unit module is further configured to: determine the target semantic information in the high-level semantic information according to the hierarchical order from high to low, wherein the target semantic information has corresponding semantic information in the second semantic database; and fuse the semantic information to be fused with the next-level semantic information of the semantic information corresponding to the target semantic information in the second semantic database, to obtain the fused semantic database.

In a possible implementation, the creating unit module is further configured to: determine the nodes corresponding to at least part (for example, each) of the semantic categories in the tree-structured database to obtain a plurality of target nodes; divide the tree-structured database by using the target nodes as root nodes to obtain multiple sub-tree-structured databases, wherein one sub-tree-structured database corresponds to one target node; and determine the plurality of label data based on the multiple sub-tree-structured databases, wherein the object labels in the label data are the semantic information in the corresponding sub-tree-structured database.

In a possible implementation, the determining unit module is further configured to: determine the object tags included in the tag data; match the data tags in the preset data set with the object tags to determine at least one group of matching tags; and determine at least one piece of data in the preset data set that corresponds to the data tag in at least one group (for example, each group) of matching tags, and determine the corresponding at least one piece of data as the matching data of the object tag in that group of matching tags.

In a possible implementation, the determining unit module is further configured to: perform test processing on the model to be tested through at least one (for example, each) test data set to obtain multiple test results; and calculate the average value of the multiple test results, and determine the average value as the test result of the omni-vision representation test on the model to be tested.

In a possible implementation, the determining unit module is further configured to: if no data matching the target object tag in the target tag data is determined in the preset data set, determine the target semantic category corresponding to the target tag data; and search for a matching database matching the target semantic category in the candidate database, and search for data matching the target object label in the matching database.

In a possible implementation, the determining unit module is further configured to: in the case that a target data tag is determined in the preset data set, determine the upper-level label of the target data label based on the hierarchical information between the data tags in the preset data set, wherein the target data label is a data label for which no corresponding object label exists among the object labels of the multiple label data; determine the semantic information corresponding to the upper-level label, and determine, among the plurality of label data, the semantic information that matches the semantic information corresponding to the upper-level label; and add the semantic information corresponding to the target data label, as new semantic information, to the next-level semantic information of the matched semantic information, and determine matching data for the new semantic information based on the preset data set.

For the description of the processing flow of each module in the device and the interaction flow between the modules, reference may be made to the relevant description in the above method embodiment, and details will not be described here.

Corresponding to the determination method of the data set in FIG. 1, the embodiment of the present disclosure also provides a computer device 500, as shown in FIG. 5, which is a schematic structural diagram of the computer device 500 provided by the embodiment of the present disclosure, including:

A processor 51, a memory 52, and a bus 53. The memory 52 is used to store execution instructions and includes a memory 521 and an external memory 522. The memory 521, also called the internal memory, is used to temporarily store computing data in the processor 51 and data exchanged with the external memory 522 such as a hard disk, and the processor 51 exchanges data with the external memory 522 through the memory 521. When the computer device 500 is running, the processor 51 communicates with the memory 52 through the bus 53, so that the processor 51 executes the following instructions:

Obtain a semantic database containing multiple pieces of semantic information;

Create a plurality of tag data based on the semantic database, where one tag data corresponds to one semantic category, the tag data includes object tags belonging to the corresponding semantic category, and the semantic categories corresponding to the plurality of tag data are categories by which an omni-vision representation test can be performed on the model to be tested;

Based on the preset data set, determine matching data for the object tags of at least part (for example, each) of the tag data, and determine, based on the matching data, the test data sets respectively corresponding to the at least part (for example, each) of the tag data, so as to obtain multiple test data sets.

An embodiment of the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the method for determining a data set described in the above-mentioned method embodiments are executed . Wherein, the storage medium may be a volatile or non-volatile computer-readable storage medium.

An embodiment of the present disclosure also provides a computer program product that carries program code. The instructions included in the program code can be used to execute the steps of the method for determining a data set described in the above method embodiments; for details, refer to the foregoing method embodiments, which are not repeated here.

The above computer program product may be implemented by hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the system and device described above can refer to the corresponding process in the foregoing method embodiments, which is not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be realized through some communication interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

If the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Finally, it should be noted that the above embodiments are only specific implementations of the present disclosure, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

A method for determining a data set, comprising:

acquiring a semantic database containing a plurality of pieces of semantic information;

creating a plurality of pieces of tag data based on the semantic database, wherein one piece of tag data corresponds to one semantic category, the tag data includes object tags belonging to the corresponding semantic category, and the semantic categories corresponding to the plurality of pieces of tag data are categories by which an omni-vision representation test can be performed on a model to be tested; and

determining, based on a preset data set, matching data for object tags of at least part of the tag data, and determining, based on the matching data, test data sets respectively corresponding to the at least part of the tag data, to obtain a plurality of test data sets.
The method according to claim 1, wherein there are a plurality of semantic databases, and creating the plurality of pieces of tag data based on the semantic databases comprises:

fusing semantic information in the plurality of semantic databases to obtain a fused semantic database, wherein the fused semantic database includes a plurality of pieces of fused semantic information and hierarchical information among the plurality of pieces of fused semantic information; and

determining a plurality of semantic categories to be divided, and dividing the fused semantic database into the plurality of pieces of tag data according to the plurality of semantic categories.
The method according to claim 2, wherein fusing the semantic information in the plurality of semantic databases to obtain the fused semantic database comprises:

determining semantic information to be fused in a first semantic database of the plurality of semantic databases, wherein the semantic information to be fused has no next-level semantic information in the first semantic database;

determining, based on hierarchical information among the semantic information in the first semantic database, a semantic path on which the semantic information to be fused is located, wherein the semantic path includes at least one piece of semantic information; and

fusing, based on high-level semantic information that precedes the semantic information to be fused in the semantic path, the semantic information to be fused with semantic information in a second semantic database to obtain the fused semantic database, wherein the second semantic database is a database other than the first semantic database among the plurality of semantic databases.
The method according to claim 3, wherein fusing, based on the high-level semantic information that precedes the semantic information to be fused in the semantic path, the semantic information to be fused with the semantic information in the second semantic database to obtain the fused semantic database comprises:

determining target semantic information among the high-level semantic information in hierarchical order from high to low, wherein the target semantic information has corresponding semantic information in the second semantic database; and

fusing the semantic information to be fused into the next-level semantic information of the semantic information that corresponds to the target semantic information in the second semantic database, to obtain the fused semantic database.
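The fusion step recited above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the first semantic database is modeled as a hypothetical child-to-parent map, the second as a hypothetical parent-to-children map, "corresponding semantic information" is approximated by identical strings, and, because the wording does not fix which matching ancestor is chosen, the sketch scans the path from high to low and keeps the lowest ancestor found in the second database.

```python
def semantic_path(parent_of, node):
    """Return the ancestors of node, ordered from the highest level to the lowest."""
    path = []
    current = parent_of.get(node)
    while current is not None:
        path.append(current)
        current = parent_of.get(current)
    return list(reversed(path))


def fuse_leaf(first_parent_of, second_children, leaf):
    """Attach a leaf of the first database under a matching ancestor in the second."""
    target = None
    for ancestor in semantic_path(first_parent_of, leaf):  # high to low
        if ancestor in second_children:
            target = ancestor  # assumption: keep the lowest matching ancestor
    if target is None:
        return False  # no ancestor has a counterpart in the second database
    if leaf not in second_children[target]:
        second_children[target].append(leaf)
    second_children.setdefault(leaf, [])  # the fused leaf has no children yet
    return True


# Hypothetical toy hierarchies.
first_parent_of = {"animal": None, "dog": "animal", "sled dog": "dog", "husky": "sled dog"}
second_children = {"animal": ["dog", "cat"], "dog": [], "cat": []}
fused = fuse_leaf(first_parent_of, second_children, "husky")
```

Here "sled dog" has no counterpart in the second database, so "husky" is attached as next-level semantic information of "dog".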
The method according to claim 2, wherein the fused semantic database is a tree-structured database, and dividing the fused semantic database into the plurality of pieces of tag data according to the plurality of semantic categories comprises:

determining nodes corresponding to at least part of the semantic categories in the tree-structured database to obtain a plurality of target nodes;

dividing the tree-structured database, using each target node as a root node, to obtain a plurality of sub-tree-structured databases, wherein one sub-tree-structured database corresponds to one target node; and

determining the plurality of pieces of tag data based on the plurality of sub-tree-structured databases, wherein the object tags in a piece of tag data are the semantic information in the corresponding sub-tree-structured database.
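A minimal sketch of the tree division recited above, assuming the tree-structured database is given as a hypothetical parent-to-children map and that each semantic category names exactly one node; `subtree_tags` and `divide_into_tag_data` are illustrative names only.

```python
def subtree_tags(children_of, root):
    """Collect root and all of its descendants with an iterative depth-first walk."""
    tags, stack = [], [root]
    while stack:
        node = stack.pop()
        tags.append(node)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(children_of.get(node, [])))
    return tags


def divide_into_tag_data(children_of, target_nodes):
    """Build one piece of tag data per target node, rooted at that node."""
    return {node: subtree_tags(children_of, node) for node in target_nodes}


# Hypothetical toy tree.
tree = {
    "entity": ["animal", "vehicle"],
    "animal": ["dog", "cat"],
    "dog": ["husky"],
    "vehicle": ["sedan"],
}
tag_data = divide_into_tag_data(tree, ["animal", "vehicle"])
```

Each resulting list holds the object tags of one sub-tree, including the target node itself.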
The method according to any one of claims 1 to 5, wherein the preset data set includes a plurality of pieces of data and data tags of the plurality of pieces of data; and

determining, based on the preset data set, matching data for the object tags of at least part of the tag data comprises:

determining the object tags included in the tag data;

matching the data tags in the preset data set with the object tags to determine at least one group of matching tags; and

determining, in the preset data set, at least one piece of data corresponding to the data tag in the at least one group of matching tags, and determining the corresponding at least one piece of data as the matching data of the object tag in that group of matching tags.
The method according to any one of claims 1 to 6, further comprising:

performing test processing on the model to be tested by using at least one test data set to obtain a plurality of test results; and

calculating an average value of the plurality of test results, and determining the average value as a test result of performing the omni-vision representation test on the model to be tested.
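The averaging step can be sketched as below. The per-category scores are hypothetical values chosen for illustration, and an unweighted arithmetic mean is assumed since the text does not specify any weighting.

```python
from statistics import mean


def overall_result(per_set_results):
    """Average the per-test-data-set results into a single overall score."""
    if not per_set_results:
        raise ValueError("at least one test result is required")
    return mean(per_set_results.values())


# Hypothetical per-category scores of a model under test.
results = {"animal": 0.75, "vehicle": 0.5, "scene": 0.25}
score = overall_result(results)
```

With an unweighted mean, every semantic category contributes equally regardless of how many samples its test data set contains.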
The method according to any one of claims 1 to 7, further comprising:

in a case where no data matching a target object tag in target tag data is found in the preset data set, determining a target semantic category corresponding to the target tag data; and

searching a candidate database for a matching database that matches the target semantic category, and searching the matching database for data matching the target object tag.
The method according to any one of claims 1 to 8, further comprising:

in a case where a target data tag is determined in the preset data set, determining an upper-level tag of the target data tag based on hierarchical information among the data tags in the preset data set, wherein the target data tag is a data tag whose corresponding object tag is not included in the object tags of the plurality of pieces of tag data;

determining semantic information corresponding to the upper-level tag, and determining, in the plurality of pieces of tag data, matching semantic information that matches the semantic information corresponding to the upper-level tag; and

adding the semantic information corresponding to the target data tag, as new semantic information, to the next-level semantic information of the matching semantic information, and determining corresponding matching data for the new semantic information based on the preset data set.
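The extension step recited above can be sketched as follows, under the assumptions that the hierarchy of data tags is a hypothetical child-to-parent map, that each piece of tag data is a parent-to-children map keyed by semantic category, and that tags match by exact string equality; all function and variable names are illustrative.

```python
def add_unmatched_tag(target_tag, data_tag_parent, tag_data_children):
    """Insert target_tag under the tag-data node that matches its upper-level tag."""
    upper = data_tag_parent.get(target_tag)
    if upper is None:
        return None  # the preset data set gives no upper-level tag
    for category, children_of in tag_data_children.items():
        if upper in children_of:
            if target_tag not in children_of[upper]:
                children_of[upper].append(target_tag)
            children_of.setdefault(target_tag, [])
            return category
    return None  # no tag data contains the upper-level tag


def matching_data(preset_data_set, tag):
    """Return the data in the preset data set carrying exactly this tag."""
    return [item for item, data_tag in preset_data_set if data_tag == tag]


# Hypothetical toy inputs.
data_tag_parent = {"corgi": "dog"}
tag_data_children = {"animal": {"animal": ["dog"], "dog": ["husky"], "husky": []}}
preset = [("img_010.jpg", "corgi"), ("img_011.jpg", "husky")]

category = add_unmatched_tag("corgi", data_tag_parent, tag_data_children)
new_matches = matching_data(preset, "corgi")
```

After the call, "corgi" sits beside "husky" as next-level semantic information of "dog", and its matching data is drawn from the same preset data set.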
A device for determining a data set, comprising:

an acquisition unit, configured to acquire a semantic database containing a plurality of pieces of semantic information;

a creation unit, configured to create a plurality of pieces of tag data based on the semantic database, wherein one piece of tag data corresponds to one semantic category, the tag data includes object tags belonging to the corresponding semantic category, and the semantic categories corresponding to the plurality of pieces of tag data are categories by which an omni-vision representation test can be performed on a model to be tested; and

a determining unit, configured to determine, based on a preset data set, matching data for object tags of at least part of the tag data, and to determine, based on the matching data, test data sets respectively corresponding to the at least part of the tag data, to obtain a plurality of test data sets.
A computer device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device is running, the processor and the memory communicate through the bus; and when the machine-readable instructions are executed by the processor, the steps of the method for determining a data set according to any one of claims 1 to 9 are executed.
A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the method for determining a data set according to any one of claims 1 to 9 are executed.
A computer program product, comprising computer-readable code, or a computer-readable storage medium carrying computer-readable code, wherein when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes steps for implementing the method for determining a data set according to any one of claims 1 to 9.