CN112948246B - AB test control method, device and equipment of data platform and storage medium - Google Patents

AB test control method, device and equipment of data platform and storage medium

Info

Publication number
CN112948246B
Authority
CN
China
Prior art keywords
data
target
data item
test
data table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110220283.0A
Other languages
Chinese (zh)
Other versions
CN112948246A (en
Inventor
惠盼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110220283.0A priority Critical patent/CN112948246B/en
Publication of CN112948246A publication Critical patent/CN112948246A/en
Application granted granted Critical
Publication of CN112948246B publication Critical patent/CN112948246B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/368 Test management for test version control, e.g. updating test cases to a new software version
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3692 Test management for test results analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Disclosed are an AB test control method, device and equipment of a data platform, and a storage medium, relating to the technical field of data processing and in particular to the field of data testing. The specific implementation is as follows: acquiring at least one test data item generated for an AB test; incrementally storing the test data items according to the association between each test data item and the data items already stored in the data platform, and adding a new data item tag to the incrementally stored data items; and querying, according to a test query condition, the data items stored in the data platform for matching query test results, so as to evaluate the effect of the AB test. The embodiments of the disclosure save both computation and storage space while keeping the number of queries in the AB test unchanged.

Description

AB test control method, device and equipment of data platform and storage medium
Technical Field
The disclosure relates to the technical field of data processing, in particular to the field of data testing, and specifically to an AB test control method, device and equipment of a data platform, and a storage medium.
Background
As the information age advances, the volume of data on networks keeps growing, and data platforms that provide data query services, typically ID (identity) mapping platforms, keep developing as well.
In the prior art, when a data platform needs to launch a new rule or strategy, its effect generally has to be evaluated through an AB test (ABtest). Accordingly, before the AB test is actually performed, a new copy of offline data must be regenerated from the old data and imported into the database in batches; the AB test of the new rule or strategy is then run against both the new and the old data.
In the course of implementing the present invention, the inventors found that the prior art requires the new and the old data to be stored separately in the database, which doubles both the number of query tasks and the storage volume. Moreover, an offline data stream with a large data volume has to be redeveloped for every AB test, which severely slows the launch of a new rule or strategy.
Disclosure of Invention
The disclosure provides an AB test control method, an AB test control device, AB test control equipment and a storage medium for a data platform.
According to an aspect of the present disclosure, there is provided an AB test control method of a data platform, including:
acquiring at least one test data item generated for an AB test;
according to the association relation between each test data item and the stored data items in the data platform, performing incremental storage on the test data items, and adding new data item labels into the incremental storage data items;
and inquiring the matched inquiry test results in each data item stored in the data platform according to the test inquiry conditions so as to evaluate the AB test effect.
According to another aspect of the present disclosure, there is provided an AB test control device of a data platform, including:
the test data item acquisition module is used for acquiring at least one test data item generated for the AB test;
the incremental storage module is used for carrying out incremental storage on the test data items according to the association relation between each test data item and the stored data items in the data platform, and adding new data item labels into the incremental storage data items;
and the query test module is used for querying the matched query test results in each data item stored in the data platform according to the test query conditions so as to evaluate the AB test effect.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the AB test control method of the data platform according to any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the AB test control method of the data platform according to any one of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements an AB test control method of a data platform according to any one of the embodiments of the present disclosure.
The embodiments of the disclosure save both computation and storage space while keeping the number of queries in the AB test unchanged.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an AB test control method for a data platform according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of an AB test control method for a data platform in accordance with an embodiment of the disclosure;
FIG. 3 is a schematic diagram of an AB test control device for a data platform in accordance with an embodiment of the disclosure;
FIG. 4 is a block diagram of an electronic device used to implement the AB test control method of the data platform of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of an AB test control method of a data platform according to an embodiment of the present disclosure. This embodiment is applicable to performing an AB test on a data platform by means of incremental storage. The method may be executed by an AB test device of the data platform; the device may be implemented in software and/or hardware and is generally integrated into the data platform or used together with it. Referring to Fig. 1, the method includes the following steps:
S110, at least one test data item generated for the AB test is acquired.
Generally, AB testing is mainly applied in software testing: two (A/B) or more (A/B/n) versions of a Web or App interface or flow are prepared, visitor groups of the same or similar composition randomly access these versions during the same time period, user experience data and business data are collected for each group, and the best version is finally identified through analysis and formally adopted.
In this embodiment, the idea of AB testing is applied to a data platform. Specifically, a copy of new data (at least one test data item) is constructed from the old data (at least one stored data item) currently held in the data platform. Comparing the data query results obtained when the new rule or strategy acts on the old data and on the new data allows the effect of the new rule to be evaluated.
The test data item specifically refers to a data item consistent with the data type of the stored data item in the data platform.
S120, performing incremental storage on the test data items according to the association relation between the test data items and the stored data items in the data platform, and adding new data item tags to the incrementally stored data items.
In this embodiment, for application scenarios in which the new and old data overlap heavily, the approach differs from the prior art, which has to load the whole of the new data into the data platform (database filling) before an AB test. The technical scheme of the present disclosure loads only the incremental data items, that is, the test data items that differ from the stored data items, which greatly reduces the total amount of data stored in the data platform.
In particular, when the data platform uses a relatively simple database solution (typically SimpleDB), its storage resources are limited; if the new and old data are both large and highly overlapping, the prior-art scheme cannot carry out the AB test on the platform at all. The incremental storage provided by the embodiments of the disclosure saves storage resources effectively and overcomes this shortcoming of existing AB tests.
To restate, unlike the prior art, which stores two complete copies of data (old and new), the embodiments of the present disclosure store, alongside the old data, only the incremental data items carrying a new data item tag.
Specifically, the new data item tag may be attached to the incremental data item as attribute information, or may be expressed by modifying the name of the incremental data item (for example, by adding a _new suffix); this embodiment does not limit the form.
Based on the above embodiments, old data item tags may be further added to stored data items that do not overlap with the test data items, so as to clearly distinguish which data items in the data platform belong to only the stored data items in the data platform, which data items belong to both the stored data items in the data platform and the test data items, and which data items belong to only the test data items in the data platform.
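The three groups distinguished above can be pictured with plain set operations. The following is a minimal sketch, assuming each data item is modelled as a (person attribute identifier, ID identifier) pair; the function name `classify_items` and the group names are hypothetical and do not come from the disclosure.

```python
from typing import Dict, Set, Tuple

# Assumed model: one data item = (person attribute identifier, ID identifier).
Item = Tuple[str, str]


def classify_items(stored: Set[Item], test: Set[Item]) -> Dict[str, Set[Item]]:
    """Split data items into the three groups described in the text."""
    return {
        "old_only": stored - test,  # candidates for an old data item tag
        "shared": stored & test,    # already stored, nothing to add
        "new_only": test - stored,  # incremental items, stored with a new data item tag
    }


stored_items = {("udwid1", "id1"), ("udwid2", "id3")}
test_items = {("udwid1", "id1"), ("udwid1", "id2")}
groups = classify_items(stored_items, test_items)
print(groups["new_only"])  # only this group has to be written to the platform
```

Only the `new_only` group adds to the stored volume, which is where the storage saving comes from when the two data sets overlap heavily.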
S130, according to the test query conditions, the matched query test results are queried in all data items stored in the data platform so as to evaluate the AB test effect.
In this embodiment, after only the incremental data items carrying the new data item tag have been loaded into the database, a test query condition may first be constructed from the new rule; data items satisfying the condition are then queried in the data platform; finally, the AB test effect of the new rule is evaluated according to the incremental data items (those with the new data item tag) contained in the query test results returned by the data platform.
In this technical scheme, the test data items for the AB test are stored in the data platform incrementally, and a new data item tag is added to the incrementally stored data. In real AB test scenarios where the new and old data overlap heavily, this saves both computation and storage space while keeping the number of queries unchanged, which improves AB test efficiency and speeds up the launch of new rules or strategies on the data platform.
Based on the above embodiments, the data platform is an ID mapping platform;
the ID mapping platform comprises a forward data table and a reverse data table, wherein a plurality of data items in the form of key value pairs are stored in the forward data table and the reverse data table;
in the forward data table, the data item takes a person attribute identifier as a key name and an ID identifier as a key value; in the reverse data table, the data item uses an ID identifier as a key name and uses a person attribute identifier as a key value.
In an alternative implementation manner of this embodiment, the data platform may specifically be an ID mapping platform, where the ID mapping platform specifically refers to a service platform for acquiring multiple ID identifiers belonging to the same user. The ID mapping service implements the functions of: the user enters a query ID, and the service queries for other ID identifiers that are affiliated to the same person as the ID.
Typically, the ID identifier may be a login account (userid) that the user has registered on a given web portal or social media site, or a device ID identifier of a terminal device used by the user. The device ID identifier may be an IMEI (International Mobile Equipment Identity) or a MAC (Media Access Control) address, which is not limited in this embodiment.
Specifically, in the data platform of the ID mapping service, a forward data table and a reverse data table are mainly maintained. The mapping relation between the person attribute identification and the ID identification is stored in the forward data table, and the mapping relation between the ID identification and the person attribute identification is stored in the reverse data table.
In this embodiment, the person attribute identification is identification information for uniquely identifying one user (person) identity, and it is understood that one user uniquely has one person attribute identification, and the user may have a plurality of different ID identifications.
In this embodiment, the specific query mode of the ID query service provided by the ID mapping platform is that, first, a reverse data table is queried according to a query ID identifier input by a user, a person attribute identifier corresponding to the query ID identifier is obtained, then, a forward data table is queried again according to the person attribute identifier obtained by query, and all ID identifiers corresponding to the person attribute identifier are obtained as a query result.
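The two-step lookup described above can be sketched as follows. This is an illustration only, assuming both tables are held as in-memory dictionaries; the function name `query_ids` and the sample identifiers are hypothetical.

```python
from typing import Dict, Set


def query_ids(query_id: str,
              forward: Dict[str, Set[str]],
              reverse: Dict[str, str]) -> Set[str]:
    """Return every ID identifier that belongs to the same person as query_id."""
    # Step 1: reverse data table, ID identifier -> person attribute identifier.
    person = reverse.get(query_id)
    if person is None:
        return set()
    # Step 2: forward data table, person attribute identifier -> all ID identifiers.
    return set(forward.get(person, set()))


forward_table = {"udwid1": {"id1", "id2"}}
reverse_table = {"id1": "udwid1", "id2": "udwid1"}
print(query_ids("id1", forward_table, reverse_table))
```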
The above alternative embodiment specifically defines the application scenario of the present disclosure, namely AB testing in an ID mapping platform. In this scenario, if the data items of the new data were also stored as a forward data table and a reverse data table, a new copy of data generated for the AB test would require its own forward data table and reverse data table. Four data tables (two forward and two reverse) would then be maintained in the ID mapping platform at the same time, and when one copy of the data is large, the four tables consume substantial storage resources. When the ID mapping platform is implemented on SimpleDB, an effective AB test may not be possible at all. In addition, even if the ID mapping platform has enough storage resources, all four data tables must be queried during the subsequent data query test, which increases the number of queries and reduces query efficiency.
In this embodiment, the specific feature of the ID mapping scenario, namely that data queries go through a forward data table and a reverse data table, is exploited: the data items stored in the existing forward and reverse data tables are simply updated according to the test data items, which implements incremental storage of the test data items in a simple and convenient way. In addition, incremental storage does not increase the number of forward or reverse data tables, so no redundant queries are added and the efficiency of the subsequent AB test is greatly improved.
Based on the above embodiments, according to the association relationship between each test data item and the stored data item in the data platform, incremental storage is performed on the test data item, and a new data item tag is added to the incremental storage data item, which may include:
acquiring target test data items from the test data items in turn;
the test data item is in a key value pair form, the key name in the test data item is a person attribute identifier, and the key value in the test data item is an ID identifier;
and updating and storing the forward data table and the reverse data table according to the difference between the target test data item and each data item currently stored in the forward data table and the reverse data table.
and returning to the step of acquiring target test data items from the test data items in turn, until all test data items have been processed.
In this alternative embodiment, each test data item may be acquired in turn, and each test data item may be used to update the forward data table and the reverse data table respectively (of course, if a certain test data item belongs to a stored data item in the data platform, there will not be any update effect on the forward data table and the reverse data table), so that a fast and accurate incremental storage effect may be achieved.
The forward data table and the reverse data table may be updated at the same time for each test data item, or the reverse data table may be updated after the forward data table is updated, or the forward data table may be updated after the reverse data table is updated, which is not limited in this embodiment.
Optionally, updating the forward data table according to the difference between the target test data item and each data item in the forward data table may include:
matching the target key name of the target test data item with the key name of each data item in the forward data table;
if the key name of the first target data item in the forward data table is matched with the target key name, continuing to compare whether the target key value is included in each key value of the first target data item;
if not, generating a new key name corresponding to the key name of the first target data item according to the new data item label, and adding the mapping relation between the new key name and the target key value into the forward data table;
and if the fact that the data item with the key name matched with the target key name does not exist in the forward data table is determined, the target test data item is added into the forward data table after the target key name in the target test data item is updated according to the new data item label.
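The forward-table branches listed above can be sketched as a small function. This is a hedged illustration, assuming the forward data table is a dictionary from key name to a set of key values and that the new data item tag is the suffix `_new`; the function name `update_forward` and its return strings are hypothetical.

```python
from typing import Dict, Set


def update_forward(forward: Dict[str, Set[str]],
                   key_name: str, key_value: str, tag: str = "_new") -> str:
    """Apply one target test data item (key_name: key_value) to the forward table."""
    new_key = key_name + tag  # key name carrying the new data item tag
    if key_name in forward:
        if key_value in forward[key_name]:
            # The item is already stored: the forward table is left unchanged.
            return "unchanged"
        # Known key name, unknown key value: store the value under the tagged key.
        forward.setdefault(new_key, set()).add(key_value)
        return "tagged_existing_key"
    # Unknown key name: tag it and add the whole item.
    forward.setdefault(new_key, set()).add(key_value)
    return "tagged_new_key"


forward_table = {"udwid1": {"id1"}}
update_forward(forward_table, "udwid1", "id2")
print(forward_table)
```

Both non-matching branches end up writing under the tagged key name; they are kept separate here only to mirror the two cases in the text.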
Optionally, updating the reverse data table according to the difference between the target test data item and each stored data item in the reverse data table may include:
matching the target key value of the target test data item with the key name of each data item in the reverse data table;
if the key name of the second target data item in the reverse data table is determined to be matched with the target key value, continuing to compare whether the target key name is included in each key value of the second target data item;
if not, according to the new data item label, updating the target key name, and then taking the target key name as a new key value of the second target data item, and adding the new key value into the reverse data table;
and if the fact that the data item with the key name matched with the target key value does not exist in the reverse data table is determined, updating the target key name in the target test data item according to a new data item label, and adding the target test data item into the reverse data table after changing the key value pair sequence of the target test data item.
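The reverse-table branches can be sketched in the same way. This is a minimal sketch, assuming the reverse data table is a dictionary from ID identifier to a set of person attribute identifiers and that the new data item tag is the suffix `_new`; the function name `update_reverse` and its return strings are hypothetical.

```python
from typing import Dict, Set


def update_reverse(reverse: Dict[str, Set[str]],
                   key_name: str, key_value: str, tag: str = "_new") -> str:
    """Apply one target test data item (key_name: key_value) to the reverse table."""
    tagged_name = key_name + tag  # target key name carrying the new data item tag
    if key_value in reverse:
        if key_name in reverse[key_value]:
            # The mapping is already stored: the reverse table is left unchanged.
            return "unchanged"
        # Known ID identifier, unknown person: append the tagged name as a new key value.
        reverse[key_value].add(tagged_name)
        return "appended"
    # Unknown ID identifier: swap key and value and add the tagged item.
    reverse[key_value] = {tagged_name}
    return "inserted"


reverse_table = {"id2": {"udwid2"}}
update_reverse(reverse_table, "udwid1", "id2")
print(reverse_table)
```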
Fig. 2 is a schematic diagram of another AB test control method of a data platform according to an embodiment of the present disclosure, which is a further refinement of the foregoing technical solution, specifically provides an implementation manner in which each test data item is used to update a forward data table first and then update a reverse data table. The technical solution in this embodiment may be combined with each of the alternatives in one or more embodiments described above. As shown in fig. 2, the AB test control method of the data platform may include:
S210, at least one test data item generated for the AB test is acquired.
S220, acquiring target test data items from the test data items in turn.
The test data item is in a key value pair form, the key name in the test data item is a person attribute identifier, and the key value in the test data item is an ID identifier.
S230, matching the target key name of the target test data item with the key name of each data item in the forward data table.
As described above, in the forward data table, the data item uses the person attribute identifier as the key name and the ID identifier as the key value, so that the target key name of the target test data item can be first matched with each key name in the forward data table, and it can be determined whether the person attribute identifier in the target test data item is stored in the data platform.
S240, judging whether a first target data item with a key name matched with the target key name exists in the forward data table or not: if yes, executing S250; otherwise, S260 is performed.
S250, continuously comparing whether each key value of the first target data item comprises the target key value or not: if yes, executing S270; otherwise, S280 is performed.
S260, after updating the target key name in the target test data item according to the new data item label, adding the target test data item into the forward data table, and executing S290.
S270, discarding the target test data item, and executing S2150.
S280, generating a new key name corresponding to the key name of the first target data item according to the new data item label, adding the mapping relation between the new key name and the target key value into the forward data table, and executing S290.
In this embodiment, if the forward data table contains a first target data item whose key name matches the target key name, and the key values of the first target data item do not include the target key value, the target test data item carries a new ID identifier for a user that already exists in the data platform. So as not to change the subsequent query logic, the new ID identifier is associated with a new user, that is, a new person attribute identifier is created: a new key name corresponding to the key name of the first target data item is generated according to the new data item tag, and the mapping between the new key name and the target key value is added to the forward data table.
In a specific example, suppose the target test data item is udwid1: id2 (person attribute identifier: ID identifier) and the first target data item in the forward data table is udwid1: id1. In this case, udwid1_new is generated from the target key name udwid1, and the key-value pair udwid1_new: id2 is added to the forward data table.
If the forward data table contains a first target data item whose key name matches the target key name, and the key values of the first target data item include the target key value, the entire content of the target test data item is already stored in the data platform. The target test data item can therefore be discarded directly, without updating either the forward data table or the reverse data table.
If the forward data table contains no first target data item whose key name matches the target key name, the target test data item carries a new ID identifier for a user that is new to the data platform. The target key name in the target test data item is therefore updated according to the new data item tag, and the target test data item is then added to the forward data table.
Continuing the example, if the target test data item is udwid1: id2 and the forward data table contains no data item with udwid1 as its key name, udwid1 in the target test data item is updated to udwid1_new, and udwid1_new: id2 is added to the forward data table.
With this arrangement, incremental data items not yet present in the forward data table can be added simply, with the test data item as the smallest unit of update, which improves the efficiency of updating the forward data table.
S290, matching the target key value of the target test data item with the key name of each data item in the reverse data table.
S2100, judging whether a second target data item with a key name matched with the target key value exists in the reverse data table or not: if yes, executing S2110; otherwise, S2120 is performed.
S2110, continuing to compare whether the key values of the second target data item include the target key name: if yes, executing S2130; otherwise, executing S2140.
S2120, updating a target key name in the target test data item according to the new data item label, adding the target test data item into the reverse data table after changing the key value pair sequence of the target test data item, and executing S2150.
In a specific example, if the target test data item is udwid1: id2 and the reverse data table contains no data item with id2 as its key name, udwid1 in the target test data item is updated to udwid1_new, and id2: udwid1_new is added to the reverse data table.
S2130, discard the target test data item, and execute S2150.
S2140, after updating the target key name according to the new data item label, adding the target key name as a new key value of the second target data item to the reverse data table, and executing S2150.
This embodiment departs from the conventional storage mode of the reverse data table, in which each data item uses an ID identifier as its key name and a single person attribute identifier as its key value.
Under the conventional mode, when a data item already stored in the data platform and a newly added test data item have the same ID identifier but different person attribute identifiers, the two can only be stored as separate data items in the reverse data table. A query test may then have to traverse the reverse data table to obtain all the required query test results.
In this embodiment, by contrast, even when the same ID identifier corresponds to two different person attribute identifiers, that ID identifier is still stored as a single data item in key-value-pair form: the target key name is added to the reverse data table as a new key value of the second target data item. This effectively reduces the number of data items in the reverse data table and can greatly reduce the time required by subsequent queries, because once a data item in the reverse data table is hit, the query does not need to continue.
In a specific example, if the target test data item is udwid1: id2 and the second target data item in the reverse data table is id2: udwid2, then after udwid1 in the target test data item is updated to udwid1_new, udwid1_new is added to the reverse data table as a new key value of the second target data item. The resulting second target data item in the reverse data table is id2: udwid2, udwid1_new.
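The reverse-table branches S2100 to S2140 can be sketched as follows. This is an illustrative sketch only: it assumes the reverse data table is an in-memory Python dict that maps each ID identifier to a list of person attribute identifiers, and that the new data item label is the suffix `_new` from the example above. The function name `update_reverse`, the constant `NEW_TAG`, and the string return values are hypothetical.

```python
NEW_TAG = "_new"  # assumed label format, taken from the udwid1_new example


def update_reverse(reverse, key_name, key_value):
    """Incrementally store one test data item (key_name: key_value).

    key_name is a person attribute identifier and key_value an ID identifier;
    the reverse table is keyed by ID identifier, so the pair is swapped.
    """
    tagged_name = key_name + NEW_TAG

    # S2100 -> S2120: no data item is keyed by this ID identifier, so the
    # tagged, swapped pair becomes a new data item.
    if key_value not in reverse:
        reverse[key_value] = [tagged_name]
        return "added"

    values = reverse[key_value]
    # S2110 -> S2130: the attribute identifier is already listed, so discard.
    # Checking the tagged name as well is an assumed guard against duplicates.
    if key_name in values or tagged_name in values:
        return "discarded"

    # S2140: append the tagged name as a further key value of the same item.
    values.append(tagged_name)
    return "appended"


reverse = {"id2": ["udwid2"]}
update_reverse(reverse, "udwid1", "id2")
print(reverse)  # {'id2': ['udwid2', 'udwid1_new']}
```

Because every ID identifier maps to exactly one data item, a later lookup by ID identifier is a single dictionary access rather than a scan over several items that share the same key name.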
S2150, judging whether all the test data items have been processed: if yes, executing S2160; otherwise, returning to S220.
S2160, according to the test query conditions, querying the matched query test results in each data item stored in the data platform to evaluate the AB test effect.
In an optional implementation of this embodiment, querying the matched query test results in the data items stored in the data platform according to the test query condition may include:
extracting a query ID identifier from the test query condition;
querying the reverse data table in the data platform for at least one target person attribute identifier matching the query ID identifier;
querying the forward data table in the data platform according to each target person attribute identifier to obtain all ID identifiers corresponding to each target person attribute identifier; and
using all the obtained ID identifiers as the query test results matching the test query condition.
With this arrangement, the forward data table and the reverse data table can be used together according to the data characteristics of the ID mapping platform, achieving an effective AB test.
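The two-step query through the reverse and forward data tables can be sketched as follows. This is a minimal sketch that assumes both tables are in-memory Python dicts whose values are lists; the function name `query_ids` is hypothetical, and removing duplicate ID identifiers from the result is an added assumption rather than something the text requires.

```python
def query_ids(forward, reverse, query_id):
    """Return all ID identifiers associated with query_id.

    Step 1: the reverse table maps the query ID identifier to its person
            attribute identifiers.
    Step 2: the forward table maps each of those attribute identifiers back
            to every ID identifier stored under it.
    """
    result = []
    for attribute in reverse.get(query_id, []):
        for id_identifier in forward.get(attribute, []):
            if id_identifier not in result:  # assumed de-duplication
                result.append(id_identifier)
    return result


forward = {"udwid2": ["id2", "id3"], "udwid1_new": ["id2"]}
reverse = {"id2": ["udwid2", "udwid1_new"], "id3": ["udwid2"]}
print(query_ids(forward, reverse, "id2"))  # ['id2', 'id3']
```

Entries carrying the new data item label are looked up exactly like the original ones, so the number of lookups per query is the same whether or not test data has been stored.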
According to the technical solution of this embodiment, the test data items for the AB test are stored incrementally in the data platform, and a new data item label is added to the incrementally stored data items. In a practical AB test scenario where the new and old data overlap heavily, this saves both computation and storage space while keeping the number of queries in the AB test unchanged, which improves AB test efficiency and, in turn, the efficiency of new rules or new strategies of the data platform.
FIG. 3 is a schematic structural diagram of an AB test control device for a data platform according to an embodiment of the disclosure, which can execute the AB test control method for the data platform according to any embodiment of the disclosure; referring to fig. 3, an AB test control device 300 of the data platform includes:
a test data item acquisition module 310, configured to acquire at least one test data item generated for an AB test;
an incremental storage module 320, configured to incrementally store the test data items according to the association between each test data item and the data items stored in the data platform, and to add a new data item label to the incrementally stored data items; and
a query testing module 330, configured to query the matched query test results in the data items stored in the data platform according to the test query condition, so as to evaluate the AB test effect.
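The division of the device into three modules can be pictured with the following sketch. It is only a structural illustration under assumptions: each module is modelled as a plain callable injected into a hypothetical `ABTestControlDevice` dataclass, and the field names, the `run` method, and the tuple representation of a test data item are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# A test data item: (person attribute identifier, ID identifier).
Item = Tuple[str, str]


@dataclass
class ABTestControlDevice:
    """Hypothetical composition of modules 310, 320 and 330."""

    acquire_items: Callable[[], Iterable[Item]]    # acquisition module 310
    store_incrementally: Callable[[Item], None]    # incremental storage module 320
    query: Callable[[str], List[str]]              # query testing module 330

    def run(self, query_id: str) -> List[str]:
        # Process every test data item in turn, then run the query test.
        for item in self.acquire_items():
            self.store_incrementally(item)
        return self.query(query_id)


stored: List[Item] = []
device = ABTestControlDevice(
    acquire_items=lambda: [("udwid1", "id2")],
    store_incrementally=stored.append,
    query=lambda query_id: [query_id],
)
print(device.run("id2"))  # ['id2']
```

Injecting the modules as callables keeps the sketch independent of any particular table implementation; real storage and query logic would be supplied in place of the stand-in lambdas.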
According to the technical solution of this embodiment, the test data items for the AB test are stored incrementally in the data platform, and a new data item label is added to the incrementally stored data items. In a practical AB test scenario where the new and old data overlap heavily, this saves both computation and storage space while keeping the number of queries in the AB test unchanged, which improves AB test efficiency and, in turn, the efficiency of new rules or new strategies of the data platform.
Based on the above embodiments, the data platform may be an ID mapping platform;
the ID mapping platform includes a forward data table and a reverse data table, each of which stores a plurality of data items in key-value-pair form;
in the forward data table, each data item takes a person attribute identifier as its key name and an ID identifier as its key value;
in the reverse data table, each data item takes an ID identifier as its key name and a person attribute identifier as its key value.
Based on the foregoing embodiments, the incremental storage module 320 may include:
a target test data item acquisition unit, configured to acquire target test data items from the test data items in sequence;
wherein each test data item is in key-value-pair form, the key name in the test data item is a person attribute identifier, and the key value in the test data item is an ID identifier;
an updating storage unit, configured to update and store the forward data table and the reverse data table according to the differences between the target test data item and the data items currently stored in the forward data table and the reverse data table; and
a return execution unit, configured to return to acquiring target test data items from the test data items in sequence until all the test data items have been processed.
On the basis of the above embodiments, the updating storage unit may specifically be used to:
matching the target key name of the target test data item against the key name of each data item in the forward data table;
if the key name of a first target data item in the forward data table matches the target key name, continuing to compare whether the key values of the first target data item include the target key value;
if not, generating a new key name corresponding to the key name of the first target data item according to the new data item label, and adding the mapping between the new key name and the target key value to the forward data table; and
if no data item whose key name matches the target key name exists in the forward data table, updating the target key name in the target test data item according to the new data item label, and then adding the target test data item to the forward data table.
On the basis of the above embodiments, the updating storage unit may specifically be used to:
matching the target key value of the target test data item against the key name of each data item in the reverse data table;
if the key name of a second target data item in the reverse data table matches the target key value, continuing to compare whether the key values of the second target data item include the target key name;
if not, updating the target key name according to the new data item label, and then adding the updated target key name to the reverse data table as a new key value of the second target data item; and
if no data item whose key name matches the target key value exists in the reverse data table, updating the target key name in the target test data item according to the new data item label, swapping the key name and key value of the target test data item, and then adding the result to the reverse data table.
Based on the above embodiments, the query testing module 330 may specifically be configured to:
extracting a query ID identifier from the test query condition;
querying the reverse data table in the data platform for at least one target person attribute identifier matching the query ID identifier;
querying the forward data table in the data platform according to each target person attribute identifier to obtain all ID identifiers corresponding to each target person attribute identifier; and
using all the obtained ID identifiers as the query test results matching the test query condition.
The AB test control device of the data platform can execute the AB test control method of the data platform provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of the execution method. Technical details not described in detail in this embodiment may refer to the AB test control method of the data platform provided in any embodiment of the present disclosure.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 4 illustrates a schematic block diagram of an example electronic device 400 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 4, the device 400 includes a computing unit 401 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. The RAM 403 may also store various programs and data required for the operation of the device 400. The computing unit 401, ROM 402, and RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
Various components in device 400 are connected to I/O interface 405, including: an input unit 406 such as a keyboard, a mouse, etc.; an output unit 407 such as various types of displays, speakers, and the like; a storage unit 408, such as a magnetic disk, optical disk, etc.; and a communication unit 409 such as a network card, modem, wireless communication transceiver, etc. The communication unit 409 allows the device 400 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 401 may be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 401 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 401 performs the various methods and processes described above, for example, an AB test control method for a data platform. For example, in some embodiments, the AB test control method of the data platform may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 400 via the ROM 402 and/or the communication unit 409. When the computer program is loaded into RAM 403 and executed by computing unit 401, one or more steps of the AB test control method of the data platform described above may be performed. Alternatively, in other embodiments, the computing unit 401 may be configured to perform the AB test control method of the data platform in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), blockchain networks, and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system that overcomes the high management difficulty and weak service scalability of traditional physical hosts and VPS services.
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (6)

1. An AB test control method of a data platform, comprising:
acquiring at least one test data item generated for an AB test;
according to the association relation between each test data item and the stored data items in the data platform, performing incremental storage on the test data items, and adding new data item labels into the incremental storage data items;
querying the matched query test results in each data item stored in the data platform according to the test query conditions, so as to evaluate the AB test effect;
the data platform is an identity number (ID) mapping platform;
the ID mapping platform comprises a forward data table and a reverse data table, wherein a plurality of data items in the form of key value pairs are stored in the forward data table and the reverse data table;
in the forward data table, the data item takes a person attribute identifier as a key name and an ID identifier as a key value;
in the reverse data table, the data item takes an ID identifier as a key name and takes a person attribute identifier as a key value;
wherein performing incremental storage on the test data items according to the association relation between each test data item and the stored data items in the data platform, and adding new data item labels into the incremental storage data items, comprises:
acquiring target test data items from the test data items in turn;
the test data item is in a key value pair form, the key name in the test data item is a person attribute identifier, and the key value in the test data item is an ID identifier;
updating and storing the forward data table and the reverse data table according to the difference between the target test data item and each data item currently stored in the forward data table and the reverse data table;
returning to acquire target test data items from the test data items in turn until the processing of all the test data items is completed;
wherein updating the forward data table according to the differences between the target test data item and each data item in the forward data table comprises:
matching the target key name of the target test data item with the key name of each data item in the forward data table;
if the key name of the first target data item in the forward data table is matched with the target key name, continuously comparing whether each key value of the first target data item comprises a target key value or not;
if not, generating a new key name corresponding to the key name of the first target data item according to the new data item label, and adding the mapping relation between the new key name and the target key value into the forward data table;
if the fact that the data item with the key name matched with the target key name does not exist in the forward data table is determined, the target test data item is added into the forward data table after the target key name in the target test data item is updated according to a new data item label;
wherein updating the reverse data table according to the difference between the target test data item and each stored data item in the reverse data table comprises:
matching the target key value of the target test data item with the key name of each data item in the reverse data table;
if the key name of the second target data item in the reverse data table is determined to be matched with the target key value, continuing to compare whether the target key name is included in each key value of the second target data item;
if not, according to the new data item label, updating the target key name, and then taking the target key name as a new key value of the second target data item, and adding the new key value into the reverse data table;
and if the fact that the data item with the key name matched with the target key value does not exist in the reverse data table is determined, updating the target key name in the target test data item according to a new data item label, and adding the target test data item into the reverse data table after changing the key value pair sequence of the target test data item.
2. The method of claim 1, wherein querying matching query test results in each data item stored in the data platform according to test query conditions comprises:
extracting a query ID identifier from the test query condition;
querying at least one target person attribute identifier matched with the query ID identifier in a reverse data table in the data platform;
querying a forward data table in the data platform according to each target person attribute identifier to obtain all ID identifiers respectively corresponding to each target person attribute identifier;
and using all the obtained ID identifiers as query test results matched with the query conditions.
3. An AB test control device for a data platform, comprising:
the test data item acquisition module is used for acquiring at least one test data item generated for the AB test;
the incremental storage module is used for carrying out incremental storage on the test data items according to the association relation between each test data item and the stored data items in the data platform, and adding new data item labels into the incremental storage data items;
the query test module is used for querying the matched query test results in each data item stored in the data platform according to the test query conditions so as to evaluate the AB test effect;
the data platform is an identity number (ID) mapping platform;
the ID mapping platform comprises a forward data table and a reverse data table, wherein a plurality of data items in the form of key value pairs are stored in the forward data table and the reverse data table;
in the forward data table, the data item takes a person attribute identifier as a key name and an ID identifier as a key value;
in the reverse data table, the data item takes an ID identifier as a key name and takes a person attribute identifier as a key value;
wherein, increment storage module includes:
the target test data item acquisition unit is used for acquiring target test data items from the test data items in sequence;
the test data item is in a key value pair form, the key name in the test data item is a person attribute identifier, and the key value in the test data item is an ID identifier;
the updating storage unit is used for updating and storing the forward data table and the reverse data table according to the difference between the target test data item and each data item currently stored in the forward data table and the reverse data table;
the return execution unit is used for returning and executing to acquire target test data items from the test data items in turn until the processing of all the test data items is completed;
the updating storage unit is specifically configured to:
matching the target key name of the target test data item with the key name of each data item in the forward data table;
if the key name of the first target data item in the forward data table is matched with the target key name, continuously comparing whether each key value of the first target data item comprises a target key value or not;
if not, generating a new key name corresponding to the key name of the first target data item according to the new data item label, and adding the mapping relation between the new key name and the target key value into the forward data table;
if the fact that the data item with the key name matched with the target key name does not exist in the forward data table is determined, the target test data item is added into the forward data table after the target key name in the target test data item is updated according to a new data item label;
the updating storage unit is specifically configured to:
matching the target key value of the target test data item with the key name of each data item in the reverse data table;
if the key name of the second target data item in the reverse data table is determined to be matched with the target key value, continuing to compare whether the target key name is included in each key value of the second target data item;
if not, according to the new data item label, updating the target key name, and then taking the target key name as a new key value of the second target data item, and adding the new key value into the reverse data table;
and if the fact that the data item with the key name matched with the target key value does not exist in the reverse data table is determined, updating the target key name in the target test data item according to a new data item label, and adding the target test data item into the reverse data table after changing the key value pair sequence of the target test data item.
4. The device of claim 3, wherein the query testing module is specifically configured to:
extracting a query ID identifier from the test query condition;
querying at least one target person attribute identifier matched with the query ID identifier in a reverse data table in the data platform;
querying a forward data table in the data platform according to each target person attribute identifier to obtain all ID identifiers respectively corresponding to each target person attribute identifier;
and using all the obtained ID identifiers as query test results matched with the query conditions.
5. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-2.
6. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-2.
CN202110220283.0A 2021-02-26 2021-02-26 AB test control method, device and equipment of data platform and storage medium Active CN112948246B (en)

Publications (2)

Publication Number Publication Date
CN112948246A CN112948246A (en) 2021-06-11
CN112948246B true CN112948246B (en) 2023-08-04

Family

ID=76246603

Country Status (1)

Country Link
CN (1) CN112948246B (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant