CN113688124B - Data interference elimination method and related device - Google Patents

Data interference elimination method and related device

Info

Publication number
CN113688124B
Authority
CN
China
Prior art keywords
user
index
acquisition
difference information
acquisition index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110961015.4A
Other languages
Chinese (zh)
Other versions
CN113688124A (en)
Inventor
陈友洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Technology Co Ltd
Original Assignee
Guangzhou Huya Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Technology Co Ltd
Priority to CN202110961015.4A
Publication of CN113688124A
Application granted
Publication of CN113688124B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3692 Test management for test results analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the invention provide a data interference elimination method and a related device. First acquisition index difference information and second acquisition index difference information are obtained respectively, where the first acquisition index difference information represents the difference between a first user data set and a second user data set when a test strategy is not configured, and the second acquisition index difference information represents the difference between the first user data set and the second user data set when the test strategy is configured. Difference information between the second acquisition index difference information and the first acquisition index difference information is then calculated. This difference information eliminates the data interference that arises because the first user acquisition index and the second user acquisition index cannot be guaranteed to be homogeneous when the test strategy is not configured, so using it as the test measurement parameter improves the accuracy of data analysis.

Description

Data interference elimination method and related device
Technical Field
The present invention relates to the field of data analysis, and in particular, to a data interference cancellation method and related apparatus.
Background
For Internet-related software products, the usage of product functions and user behavior need to be analyzed accurately so that the product can be optimized. Statistical analysis of the data generated during the operation of a product is therefore often required.
In the prior art, data analysis is often carried out by setting up different experimental data groups and collecting and analyzing their data. Data collection typically uses different experimental strategies in different test phases. For example, in one phase, no test strategy is issued to the terminals of the different groups; the purpose of this phase is to collect data from each terminal while keeping the terminals of the different groups as homogeneous as possible. In another phase, a test strategy is issued to the terminals of the different groups, and the variable under examination is introduced to the terminals of a particular group according to the test strategy, so that data is collected from each terminal while the groups differ. Finally, the data of the different phases is analyzed, with the data of the homogeneous phase taken as the baseline, to obtain the analysis conclusion of the test.
However, in the above analysis based on different data groups, the data differences between the groups are often not truly eliminated in the homogeneous phase. The data difference of the homogeneous phase is therefore carried into the subsequent data analysis, which reduces the accuracy of the analysis result.
Disclosure of Invention
The invention aims to overcome the above deficiency in the prior art by providing a data interference elimination method and a related device that ensure the accuracy of the final result of a test experiment by eliminating, from the data collected when a test strategy is configured, the difference between the control group and the experimental group that already exists when the test strategy is not configured.
Embodiments of the invention may be implemented as follows:
in a first aspect, an embodiment of the present invention provides a method for data interference cancellation, where the method includes: respectively obtaining first acquisition index difference information and second acquisition index difference information; the first acquisition index difference information characterizes the difference information of a first user acquisition index contained in the first user data set and a second user acquisition index contained in the second user data set when a test strategy is not configured; the second acquisition index difference information characterizes the difference information of a third user acquisition index contained in the first user data set and a fourth user acquisition index contained in the second user data set when the testing strategy is configured; calculating difference information of the second acquired index difference information and the first acquired index difference information; and taking the difference information as a test measurement parameter.
Optionally, the first user acquisition index is a first user acquisition index in a first history period, and the second user acquisition index is a second user acquisition index in the first history period; the third user acquisition index is a third user acquisition index in a second history period, and the fourth user acquisition index is a fourth user acquisition index in the second history period;
The step of obtaining the first acquired index difference information and the second acquired index difference information respectively includes:
acquiring a first user acquisition mean value in the first historical period according to the first user acquisition index;
acquiring a second user acquisition mean value in the first historical period according to the second user acquisition index;
taking the difference value between the first user acquisition mean value and the second user acquisition mean value as the first acquisition index difference information;
acquiring a third user acquisition mean value in the second historical period according to the third user acquisition index;
acquiring a fourth user acquisition mean value in the second historical period according to the fourth user acquisition index;
and taking the difference value between the third user acquisition mean value and the fourth user acquisition mean value as the second acquisition index difference information.
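The averaging steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `acquisition_index_difference`, the sample values, the 3-day history periods, and the sign convention (first mean minus second mean, then second difference minus first difference) are all hypothetical choices made for the example.

```python
from statistics import mean


def acquisition_index_difference(first_values, second_values):
    """Difference between the means of two groups' acquisition indexes.

    Assumed convention: mean of the first group minus mean of the second.
    """
    return mean(first_values) - mean(second_values)


# Hypothetical daily viewing durations (minutes) over a 3-day history period.
# First history period: test strategy not configured.
first_user_idx = [30.0, 32.0, 34.0]    # from the first user data set
second_user_idx = [33.0, 35.0, 37.0]   # from the second user data set
# Second history period: test strategy configured.
third_user_idx = [31.0, 33.0, 35.0]    # from the first user data set
fourth_user_idx = [39.0, 41.0, 43.0]   # from the second user data set

# First and second acquisition index difference information.
first_diff = acquisition_index_difference(first_user_idx, second_user_idx)
second_diff = acquisition_index_difference(third_user_idx, fourth_user_idx)

# Difference information used as the test measurement parameter.
test_measurement = second_diff - first_diff
```

With these made-up numbers, the gap of -3.0 that already exists without the test strategy is removed from the gap of -8.0 observed with it, leaving -5.0.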
Optionally, before the step of obtaining the first acquired index difference information and the second acquired index difference information respectively, the method further includes:
respectively obtaining a first user prediction index corresponding to the first user acquisition index and a second user prediction index corresponding to the second user acquisition index through a prediction model;
Determining whether the first user prediction index and the first user acquisition index and the second user prediction index and the second user acquisition index meet a stable condition;
if yes, executing the step of respectively obtaining the first acquired index difference information and the second acquired index difference information;
if not, updating parameters of the prediction model until the first user prediction index and the first user acquisition index obtained through the updated prediction model and the obtained second user prediction index and the second user acquisition index meet the stability condition.
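The source does not specify the prediction model or the stability condition, so the following is only a hedged sketch of the check-then-update loop described above. Everything named here is an assumption: `ConstantLevelModel` is a toy model that predicts one constant level per user group, the stability condition is taken to be "every prediction lies within a relative tolerance of the acquired value", and `max_rounds` is a guard added so the loop always terminates.

```python
class ConstantLevelModel:
    """Toy prediction model (an assumption): one constant level per group."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.levels = {"first": 0.0, "second": 0.0}

    def predict(self, group, n_points):
        return [self.levels[group]] * n_points

    def update(self, group, acquired):
        # Parameter update: move the level toward the mean acquired value.
        target = sum(acquired) / len(acquired)
        self.levels[group] += self.learning_rate * (target - self.levels[group])


def meets_stability_condition(predicted, acquired, tolerance):
    """Assumed condition: each prediction is within a relative tolerance."""
    return all(abs(p - a) <= tolerance * abs(a)
               for p, a in zip(predicted, acquired))


def fit_until_stable(model, acquired_by_group, tolerance=0.1, max_rounds=100):
    """Return True once both groups are stable, False if max_rounds checks fail."""
    for _ in range(max_rounds):
        if all(
            meets_stability_condition(
                model.predict(group, len(values)), values, tolerance)
            for group, values in acquired_by_group.items()
        ):
            return True  # proceed to obtain the difference information
        for group, values in acquired_by_group.items():
            model.update(group, values)
    return False


acquired = {"first": [30.0, 32.0, 34.0], "second": [33.0, 35.0, 37.0]}
stable = fit_until_stable(ConstantLevelModel(), acquired)
```

In this sketch the difference information would only be computed when `fit_until_stable` returns `True`; data that no constant level can match within the tolerance never becomes stable.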
In a second aspect, an embodiment of the present invention provides a data interference cancellation apparatus, including: the information acquisition module is used for respectively acquiring the first acquisition index difference information and the second acquisition index difference information;
the first acquisition index difference information characterizes the difference information of a first user acquisition index contained in the first user data set and a second user acquisition index contained in the second user data set when a test strategy is not configured; the second acquisition index difference information characterizes the difference information of a third user acquisition index contained in the first user data set and a fourth user acquisition index contained in the second user data set when the testing strategy is configured;
The difference value calculation module is used for calculating difference value information of the second acquisition index difference information and the first acquisition index difference information;
and the parameter acquisition module is used for taking the difference information as a test measurement parameter.
Optionally, the first user acquisition index is a first user acquisition index in a first history period, and the second user acquisition index is a second user acquisition index in the first history period; the third user acquisition index is a third user acquisition index in a second history period, the fourth user acquisition index is a fourth user acquisition index in the second history period, and the information acquisition module comprises:
the average value acquisition unit is used for acquiring a first user acquisition average value in the first historical period according to the first user acquisition index; acquiring a second user acquisition mean value in the first historical period according to the second user acquisition index; acquiring a third user acquisition mean value in the second historical period according to the third user acquisition index; acquiring a fourth user acquisition mean value in the second historical period according to the fourth user acquisition index;
the difference information acquisition unit is used for taking the difference value between the first user acquisition mean value and the second user acquisition mean value as the first acquisition index difference information; and taking the difference value between the third user acquisition mean value and the fourth user acquisition mean value as the second acquisition index difference information.
Optionally, the apparatus further comprises: a stability judging module;
the stability judging module includes:
the prediction index obtaining unit is used for respectively obtaining a first user prediction index corresponding to the first user acquisition index and a second user prediction index corresponding to the second user acquisition index through a prediction model;
the stability judging unit is used for operating the information acquisition module to respectively acquire the first acquisition index difference information and the second acquisition index difference information when it determines that the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, meet a stability condition;
the stability judging unit is further configured to, when it determines that the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, do not meet the stability condition, update parameters of the prediction model until the first user prediction index and the second user prediction index obtained through the updated prediction model meet the stability condition together with the first user acquisition index and the second user acquisition index, respectively.
In a third aspect, an embodiment of the present invention provides a data interference cancellation system, including: a data acquisition device and the apparatus of any one of the preceding embodiments;
the data acquisition equipment is used for acquiring a first user data set and a second user data set when the test strategy is not configured and when the test strategy is configured respectively; when the testing strategy is not configured, the difference information of the first user acquisition index contained in the first user data set and the second user acquisition index contained in the second user data set is obtained; when the testing strategy is configured, the difference information of the third user acquisition index contained in the first user data set and the fourth user acquisition index contained in the second user data set is obtained.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including a processor and a memory, where the memory stores a computer program, and the processor implements the method according to any one of the foregoing embodiments when executing the computer program.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which is stored a computer program which, when executed by a processor, implements a method according to any one of the preceding embodiments.
Compared with the prior art, the invention has the following beneficial effects. First acquisition index difference information and second acquisition index difference information are obtained respectively: the first acquisition index difference information represents the difference between the first user acquisition index contained in the first user data set and the second user acquisition index contained in the second user data set when the test strategy is not configured, and the second acquisition index difference information represents the difference between the third user acquisition index contained in the first user data set and the fourth user acquisition index contained in the second user data set when the test strategy is configured. Difference information between the second acquisition index difference information and the first acquisition index difference information is then calculated. This difference information eliminates the data interference that arises because the first user acquisition index and the second user acquisition index cannot be guaranteed to be homogeneous when the test strategy is not configured, so using it as the test measurement parameter improves the accuracy of data analysis.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of an application scenario of a data interference cancellation system according to an embodiment of the present invention;
Fig. 2 is a schematic block diagram of an electronic device according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of a data interference cancellation method according to an embodiment of the present invention;
Fig. 4 is a flow chart illustrating the sub-steps of step S101 in Fig. 3;
Fig. 5 is a schematic diagram of data interference cancellation according to an embodiment of the present invention;
Fig. 6 is another flow chart of a data interference cancellation method according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a data interference cancellation device according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of another structure of a data interference cancellation device according to an embodiment of the present invention;
Fig. 9 is a schematic diagram of another structure of a data interference cancellation device according to an embodiment of the present invention.
Reference numerals: 10 - server; 20a - terminal device; 20b - terminal device; 120 - communication interface; 130 - processor; 110 - memory; 300 - data interference cancellation device; 310 - stability judging module; 311 - prediction index obtaining unit; 312 - stability judging unit; 320 - information acquisition module; 321 - average value acquisition unit; 322 - difference information acquisition unit; 340 - difference value calculation module; 360 - parameter acquisition module.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
In the description of the present invention, it should be noted that, if the terms "upper", "lower", "inner", "outer", and the like indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, or the azimuth or the positional relationship in which the inventive product is conventionally put in use, it is merely for convenience of describing the present invention and simplifying the description, and it is not indicated or implied that the apparatus or element referred to must have a specific azimuth, be configured and operated in a specific azimuth, and thus it should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and the like, if any, are used merely for distinguishing between descriptions and not for indicating or implying a relative importance.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.
Referring to fig. 1, a schematic diagram of a network system is provided in an embodiment of the present invention, where the network system may include a server 10, a first user group formed by at least one terminal device 20a, and a second user group formed by at least one terminal device 20 b;
the server 10 may be configured to maintain a test policy related to statistics and analysis data, and issue the test policy to terminals of the first user group and the second user group based on different stages; meanwhile, the server 10 may be configured to collect related user data belonging to the first user group and the second user group respectively, so as to form a first user data set corresponding to the first user group and a second user data set corresponding to the second user group;
the server 10 may further analyze the data in the first user data set and the second user data set to obtain a relevant analysis result.
It should be noted that the analysis function may be implemented by other devices, for example, the server 10 is only used to acquire the first user data set and the second user data set, and then send the data of these data sets to the device with the analysis function for analysis.
Optionally, the above network system may be used to provide a variety of services, including but not limited to multimedia streaming services, cloud gaming, and distributed storage. Taking live video as an example, the server 10 in the network system may be a server providing live video streaming, and the terminal device 20a and the terminal device 20b may be devices on which a live video application (APP) is installed. The server 10 may collect and analyze data related to the live video application on the terminal device 20a and the terminal device 20b for different analysis purposes.
The terminal device 20a and the terminal device 20b may acquire relevant data of the user when using the live video application, and report the relevant data to the server 10.
Continuing with the live video example, the first user group and the second user group may be divided according to various rules: for example, by user age, by the user's geographic region, or by user activity; alternatively, users may be divided into an experimental group and a control group along the dimension under test, based on specific test requirements. The present invention does not limit the rule for dividing the user groups.
It should be noted that the above terminal device may be, but is not limited to, a personal computer, a notebook computer, a tablet computer, a cell phone, and the like.
Referring to Fig. 2, which is a schematic diagram of an electronic device, the electronic device includes a memory 110, a communication interface 120, and a processor 130.
The memory 110, the communication interface 120, and the processor 130 are electrically connected to each other, directly or indirectly, to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The data interference cancellation device 300 comprises at least one software functional module that may be stored in the memory 110 in the form of software or firmware, or embedded in an Operating System (OS) of the server 10. The processor 130 is configured to execute executable modules stored in the memory 110, such as the software functional modules or computer programs included in the data interference cancellation device 300.
The memory 110 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), etc. The memory 110 is configured to store a program, and the processor 130 executes the program after receiving an execution instruction. The method executed by the server 10, as defined by the process disclosed in any embodiment of the present invention, may be applied to the processor 130 or implemented by the processor 130.
The processor 130 may be an integrated circuit chip with signal processing capabilities. The processor 130 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 130 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor.
It should be noted that the electronic device shown in Fig. 2 may be used to implement the server 10, the terminal device 20a, and the terminal device 20b in Fig. 1. When acting as a server, it may perform the corresponding steps to achieve the corresponding technical effects. When acting as a terminal device, the electronic device may further comprise other modules for implementing the corresponding functions of the terminal device, for example radio frequency circuits, I/O interfaces, batteries, touch screens, and microphones/speakers, which are not limited herein.
Further, taking the network system shown in Fig. 1 as an example, in the prior art, before the test strategy is configured, data of each terminal is collected on the premise that the terminals of the different groups (for example, an experimental group and a control group) are as homogeneous as possible; in some application scenarios this stage is also called the AA stage. In an actual test, however, the experimental group and the control group are difficult to make completely homogeneous in the stage where the test strategy is not configured, so a data difference always exists. In the stage where the test strategy is configured (also called the AB stage), the relevant data of the two groups may then also be heterogeneous, and the unwanted difference carried across the two stages reduces the accuracy of the final analysis result.
Based on the above-mentioned problems, in order to eliminate the above-mentioned differences and ensure the accuracy of the final result of the test experiment, the embodiment of the present invention provides a data interference elimination method, and fig. 3 is a schematic flow chart of the data interference elimination method provided by the embodiment of the present invention. The method comprises the following steps:
Step S101, first acquisition index difference information and second acquisition index difference information are respectively obtained.
Optionally, in order to eliminate the data difference, two pieces of difference information need to be obtained: the first acquisition index difference information, which characterizes the difference between a first user acquisition index contained in the first user data set and a second user acquisition index contained in the second user data set when the test strategy is not configured; and the second acquisition index difference information, which characterizes the difference between a third user acquisition index contained in the first user data set and a fourth user acquisition index contained in the second user data set when the test strategy is configured.
For example, the first user data set may be control group data and the second user data set may be experimental group data. The user acquisition index may be a user viewing duration index, a bullet-comment count index, a follow count index, and the like. The test strategy may further be as follows:
Taking the user viewing duration as the user acquisition index, and with reference to the server 10 shown in Fig. 1: an operator sets, through the server 10, the index to be collected to "user viewing duration" and assigns different terminal devices to the first user group and the second user group, as shown in Fig. 1. The server 10 may then issue a test strategy to the terminal devices; for example, the test strategy may instruct each terminal device to collect the viewing duration of its user. When a user uses the live video application on a terminal device, the times at which the user enters and exits the live broadcast room are used as statistical points, from which the "user viewing duration" of that user is obtained.
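As a rough illustration of how a terminal might derive the "user viewing duration" from enter and exit events, consider the sketch below. The event format (time-ordered `(timestamp, action)` pairs with the action strings `"enter"` and `"exit"`), the function name, and the sample timestamps are assumptions made for this example and are not specified in the source.

```python
from datetime import datetime


def viewing_duration_seconds(events):
    """Sum the time between each 'enter' event and the next 'exit' event.

    `events` is a time-ordered list of (timestamp, action) pairs. An 'enter'
    with no later 'exit' is ignored in this sketch.
    """
    total = 0.0
    entered_at = None
    for timestamp, action in events:
        if action == "enter":
            entered_at = timestamp
        elif action == "exit" and entered_at is not None:
            total += (timestamp - entered_at).total_seconds()
            entered_at = None
    return total


# Hypothetical events for one user: two visits to a live broadcast room.
events = [
    (datetime(2021, 8, 1, 20, 0, 0), "enter"),
    (datetime(2021, 8, 1, 20, 30, 0), "exit"),
    (datetime(2021, 8, 1, 21, 0, 0), "enter"),
    (datetime(2021, 8, 1, 21, 15, 0), "exit"),
]
duration = viewing_duration_seconds(events)  # 30 min + 15 min, in seconds
```

Per-user durations computed this way could then be reported to the server and aggregated per user group.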
Step S102, calculating difference information of the second acquired index difference information and the first acquired index difference information.
Continuing the example of step S101, the difference between the control group and the experimental group when the test strategy is configured (the second acquisition index difference information) is compared with the corresponding difference when the test strategy is not configured (the first acquisition index difference information) to obtain the difference information. The difference information obtained by subtracting the first acquisition index difference information from the second acquisition index difference information reflects the influence of the test strategy.
Step S103, taking the difference information as a test measurement parameter.
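Why subtracting one difference from the other can remove the pre-existing gap between the groups may be seen with a simple additive model. The model and every symbol in it are assumptions introduced here for illustration and do not appear in the source: $m_1, m_2$ are the mean acquisition indexes of the first and second user data sets when the test strategy is not configured, $m_3, m_4$ are the corresponding means when it is configured, $\mu_{AA}$ and $\mu_{AB}$ are stage-level baselines shared by both groups, $b_1, b_2$ are persistent group offsets, and $\tau_1, \tau_2$ are the effects of the test strategy on each group (zero for a group that does not receive the variable under test). The subtraction order, second difference minus first difference, is likewise an assumed convention.

```latex
\begin{aligned}
m_1 &= \mu_{AA} + b_1, & m_2 &= \mu_{AA} + b_2, \\
m_3 &= \mu_{AB} + b_1 + \tau_1, & m_4 &= \mu_{AB} + b_2 + \tau_2, \\
\Delta_1 &= m_1 - m_2 = b_1 - b_2, \\
\Delta_2 &= m_3 - m_4 = (b_1 - b_2) + (\tau_1 - \tau_2), \\
D &= \Delta_2 - \Delta_1 = \tau_1 - \tau_2 .
\end{aligned}
```

Under these assumptions the group offsets $b_1 - b_2$ cancel, so $D$ depends only on the effect of the test strategy; the cancellation relies on the offsets staying the same across the two stages.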
According to the data interference elimination method provided by the embodiment of the invention, first acquisition index difference information and second acquisition index difference information are obtained respectively: the first acquisition index difference information represents the difference between the first user acquisition index contained in the first user data set and the second user acquisition index contained in the second user data set when the test strategy is not configured, and the second acquisition index difference information represents the difference between the third user acquisition index contained in the first user data set and the fourth user acquisition index contained in the second user data set when the test strategy is configured. Difference information between the second acquisition index difference information and the first acquisition index difference information is then calculated. This difference information eliminates the data interference that arises because the first user acquisition index and the second user acquisition index cannot be guaranteed to be homogeneous when the test strategy is not configured, so using it as the test measurement parameter improves the accuracy of data analysis.
To obtain the first acquisition index difference information and the second acquisition index difference information respectively, a possible implementation is given on the basis of fig. 3. As shown in fig. 4, which is a schematic flow chart of the sub-steps of step S101 in fig. 3 provided by an embodiment of the present invention, step S101 includes:
Step S1011, obtaining a first user acquisition mean value in a first history period according to the first user acquisition index.
Optionally, taking the user's viewing duration as the user acquisition index, and assuming, based on the calculation requirement, that the first history period is the 7 days before the statistical moment in the stage in which the test strategy is not configured, the first user acquisition mean value is the 7-day mean viewing duration of the users in the first user data set.
Step S1012, obtaining a second user acquisition mean value in the first history period according to the second user acquisition index.
Continuing the above example, the second user acquisition mean value is the mean viewing duration of the users in the second user data set over the 7 days before the statistical moment in the stage in which the test strategy is not configured.
Step S1013, taking the difference between the first user acquisition mean value and the second user acquisition mean value as the first acquisition index difference information.
Step S1014, obtaining a third user acquisition mean value in a second history period according to the third user acquisition index.
Continuing the above example, the third user acquisition mean value is the mean viewing duration of the users in the first user data set over the 7 days before the statistical moment in the stage in which the test strategy is configured.
Step S1015, obtaining a fourth user acquisition mean value in the second history period according to the fourth user acquisition index.
Continuing the above example, the fourth user acquisition mean value is the mean viewing duration of the users in the second user data set over the 7 days before the statistical moment in the stage in which the test strategy is configured.
Step S1016, taking the difference between the third user acquisition mean value and the fourth user acquisition mean value as the second acquisition index difference information.
By calculating the first acquisition index difference information and the second acquisition index difference information separately, two quantities are obtained in a principled way: the difference between the control group and the experimental group when the test strategy is not configured, and the difference between the two groups when the test strategy is configured, which additionally contains the difference brought by the test strategy.
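Steps S1011 to S1016 can be sketched as follows. This is an illustrative sketch only: the function name `acquisition_index_difference` and the per-user values are hypothetical, the first user data set is assumed to be the control group, and the difference is assumed to be taken as experimental group minus control group.

```python
from statistics import mean


def acquisition_index_difference(control: list[float], experimental: list[float]) -> float:
    """Difference of the two group means over one history period (experimental minus control)."""
    return mean(experimental) - mean(control)


# Hypothetical per-user 7-day viewing durations in minutes.
control_before = [30.0, 28.0, 32.0]       # first user acquisition index
experimental_before = [33.0, 31.0, 32.0]  # second user acquisition index
control_after = [31.0, 29.0, 33.0]        # third user acquisition index
experimental_after = [38.0, 36.0, 37.0]   # fourth user acquisition index

# Steps S1011 to S1013: first acquisition index difference information.
first_difference = acquisition_index_difference(control_before, experimental_before)
# Steps S1014 to S1016: second acquisition index difference information.
second_difference = acquisition_index_difference(control_after, experimental_after)
```

With these values the first difference is 2.0 minutes and the second is 6.0 minutes.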
To facilitate understanding of how the second acquisition index difference information and the first acquisition index difference information are obtained and how the difference information is then calculated, please refer to fig. 5, which is a schematic diagram of data interference cancellation provided in an embodiment of the present invention.
Assume that at t=0 the server 10 in fig. 1 is not configured with a test policy, and the server 10 collects the first user acquisition index of the terminal device 20a in the first user group and the second user acquisition index of the terminal device 20b in the second user group, respectively.
For example, if the user acquisition index is the mean viewing duration over the past 7 days, the first user acquisition index is the 7-day mean viewing duration of the control group when the test strategy is not configured; correspondingly, the second user acquisition index is the 7-day mean viewing duration of the experimental group.
Further, the first acquisition index difference information = second user acquisition index - first user acquisition index, which can be expressed by the following formula:
$E(\Delta Y_i \mid D_i = 0)$
wherein $D_i = 0$ characterizes the moment t=0, i.e. the stage in which the test strategy is not configured; $\Delta Y_i$ represents the user acquisition index; $i$ is the user corresponding to a specific terminal device; and $E(\Delta Y_i \mid D_i = 0)$ characterizes the first acquisition index difference information.
Further, assume that at t=1 the server 10 in fig. 1 has been configured with a test policy, and the server 10 collects the third user acquisition index of the terminal device 20a in the first user group and the fourth user acquisition index of the terminal device 20b in the second user group, respectively.
Similarly, if the user acquisition index is the mean viewing duration over the past 7 days, the third user acquisition index is the 7-day mean viewing duration of the control group when the test strategy is configured; correspondingly, the fourth user acquisition index is the 7-day mean viewing duration of the experimental group.
Further, the second acquisition index difference information = fourth user acquisition index - third user acquisition index, which can be expressed by the following formula:
$E(\Delta Y_i \mid D_i = 1)$
wherein $D_i = 1$ characterizes the moment t=1, i.e. the stage in which the test strategy is configured, and $E(\Delta Y_i \mid D_i = 1)$ characterizes the second acquisition index difference information.
Furthermore, with reference to step S101 and fig. 4, the difference information may be the difference between the second acquisition index difference information and the first acquisition index difference information. Because this difference removes the non-homogeneous data interference of the t=0 stage that is still contained in the second acquisition index difference information, the accuracy of subsequently measuring the test with the difference information is improved.
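Using the notation of fig. 5, the difference information can be written as a single expression. The symbol $\tau$ is introduced here only for illustration and does not appear in the original description.

```latex
\tau = E(\Delta Y_i \mid D_i = 1) - E(\Delta Y_i \mid D_i = 0)
```

As a hypothetical numerical example, if the experimental group exceeds the control group by 2 minutes at t=0 and by 6 minutes at t=1, then $\tau = 6 - 2 = 4$ minutes is attributed to the test strategy.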
Optionally, because some data may change too sharply during data collection to serve as valid data for analysis, data with a relatively stable trend should be selected as the user acquisition indexes in the first user data set and the second user data set. To screen for valid data, a possible implementation is given below on the basis of fig. 4. As shown in fig. 6, which is a flow chart of another data interference cancellation method provided in an embodiment of the present invention, the method further includes:
In step S1001, a first user prediction index corresponding to the first user acquisition index and a second user prediction index corresponding to the second user acquisition index are obtained through the prediction model respectively.
In this embodiment, the prediction model may be a linear regression model:
Y=a+bX+c,
wherein X represents an input user acquisition index, Y represents a user prediction index, a and c represent intercept and error terms of the model respectively, and b represents a slope of the model. And inputting the first user acquisition index and the second user acquisition index into the prediction model to obtain a first user prediction index and a second user prediction index.
It should be noted that the prediction model may also be implemented using a regression function in the sklearn package. Because this machine learning package is already encapsulated, the server 10 can call the corresponding package to execute the prediction.
The invention is not limited to the specific implementation form of the prediction model.
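A linear prediction model of the form Y=a+bX can be sketched with ordinary least squares in plain Python, as below. This is an assumption-laden illustration: the helper names `fit_line` and `predict` and the sample data are hypothetical, and the error term c is treated as the residual rather than as a fitted parameter. A packaged regression function, for example scikit-learn's `LinearRegression`, could be substituted for the hand-written fit.

```python
from statistics import mean


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx           # slope
    a = y_bar - b * x_bar   # intercept
    return a, b


def predict(a: float, b: float, x: float) -> float:
    """User prediction index for an input user acquisition index x."""
    return a + b * x


# Hypothetical training data: x is a 7-day mean viewing duration,
# y is the viewing duration observed at the statistical moment.
xs = [10.0, 20.0, 30.0, 40.0]
ys = [12.0, 22.0, 32.0, 42.0]

a, b = fit_line(xs, ys)
first_user_prediction_index = predict(a, b, 50.0)
```

For this data the fit gives an intercept of 2 and a slope of 1, so an input of 50 is predicted as 52.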
Continuing the example in which the user acquisition index is the mean viewing duration over the past 7 days: for the first user acquisition index, in combination with the embodiment of fig. 5, the 7-day mean viewing duration of the control group when the test strategy is not configured may be input to the prediction model to obtain the viewing duration of the control group at time t=0, which serves as the first user prediction index. Similarly, in combination with the embodiment of fig. 5, the second user prediction index corresponding to the second user acquisition index may be obtained.
Step S1002, determining whether the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, both satisfy the stability condition.
Optionally, the error between the first user prediction index and the first user acquisition index and the error between the second user prediction index and the second user acquisition index are calculated; the data are regarded as stable when the errors between the first user prediction index and the first user acquisition index and the errors between the second user prediction index and the second user acquisition index conform to a normal distribution.
Optionally, when the errors conform to a normal distribution with a mean of zero, the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, are stable.
If so, the above step S101 may be performed.
If not, step S1003 is executed to update the parameters of the prediction model until the first user prediction index and the first user acquisition index obtained by the updated prediction model, and the obtained second user prediction index and the second user acquisition index all meet the stability condition.
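The stability check of step S1002 can be sketched as follows. The description does not name a specific normality test, so this sketch substitutes two assumptions of its own: the error mean is accepted as zero when it lies within 1.96 standard errors of zero, and normality is checked with the Jarque-Bera statistic against 5.991, the 5% critical value of a chi-square distribution with 2 degrees of freedom. The function name `is_stable` and all sample values are hypothetical, and the Jarque-Bera test is only a rough guide for small samples.

```python
import math
from statistics import mean


def is_stable(predicted: list[float], actual: list[float],
              z: float = 1.96, jb_critical: float = 5.991) -> bool:
    """True if prediction errors look like zero-mean normal noise (assumed criteria)."""
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    m = mean(errors)
    m2 = sum((e - m) ** 2 for e in errors) / n  # second central moment
    if m2 == 0.0:
        return m == 0.0  # identical errors are stable only if they are all zero
    m3 = sum((e - m) ** 3 for e in errors) / n
    m4 = sum((e - m) ** 4 for e in errors) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jarque_bera = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    mean_near_zero = abs(m) <= z * math.sqrt(m2 / n)
    return mean_near_zero and jarque_bera <= jb_critical


# Hypothetical indexes for the two user data sets.
first_actual = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
first_predicted = [11.0, 19.0, 30.5, 39.5, 51.5, 58.5]   # errors centred on zero
second_actual = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
second_predicted = [15.0, 25.5, 34.5, 45.0, 55.2, 64.8]  # errors biased by about +5

first_ok = is_stable(first_predicted, first_actual)
second_ok = is_stable(second_predicted, second_actual)
both_stable = first_ok and second_ok  # step S101 proceeds only if this is True
```

Here the first data set passes and the second fails because its errors are systematically offset from zero, so under this sketch the model parameters would be updated before step S101 is executed.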
Optionally, one possible way of updating the parameters of the prediction model is to adjust the parameters a, b and c described above so that the mean square error is minimized. Specifically, multiple groups of user viewing duration data are substituted into the prediction model for fitting, which yields multiple groups of fitted parameters a, b and c. The prediction model given by each group of parameters is then used for prediction, producing the predicted values yi' for that group, each of which has a corresponding real value yi. Finally, the mean square error of each group is calculated, and the parameters a, b and c with the smallest mean square error are selected as the parameters of the final prediction model.
Optionally, the mean square error is calculated by
$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - y_i')^2$
where n represents the number of users, yi represents the actual duration of each user, and yi' represents the predicted duration fitted with this model.
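Selecting the parameter group with the smallest mean square error can be sketched as follows. The helper names `mean_squared_error` and `select_parameters`, the candidate parameter groups and the data are hypothetical, and each candidate is reduced to an intercept a and a slope b on the assumption that the error term c is the residual.

```python
def mean_squared_error(actual: list[float], predicted: list[float]) -> float:
    """Average of the squared differences between real values yi and predictions yi'."""
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n


def select_parameters(candidates: list[tuple[float, float]],
                      xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return the (a, b) group whose predictions a + b * x have the smallest MSE."""
    def mse_for(params: tuple[float, float]) -> float:
        a, b = params
        return mean_squared_error(ys, [a + b * x for x in xs])
    return min(candidates, key=mse_for)


# Hypothetical viewing duration data and fitted parameter groups.
xs = [10.0, 20.0, 30.0]
ys = [12.0, 22.0, 32.0]
candidates = [(0.0, 1.0), (2.0, 1.0), (5.0, 0.9)]

best = select_parameters(candidates, xs, ys)
```

For this data the group (2.0, 1.0) reproduces every real value exactly, so it is selected.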
Referring to fig. 7, a schematic structural diagram of a data interference cancellation device according to an embodiment of the present invention is shown. The data interference cancellation device 300 includes: an information acquisition module 320, a difference calculation module 340, and a parameter acquisition module 360.
The information obtaining module 320 is configured to obtain first collected index difference information and second collected index difference information respectively;
the first acquisition index difference information characterizes the difference information of a first user acquisition index contained in the first user data set and a second user acquisition index contained in the second user data set when a test strategy is not configured; the second collected index difference information characterizes difference information of a third user collected index contained in the first user data set and a fourth user collected index contained in the second user data set when the testing strategy is configured.
The difference calculating module 340 is configured to calculate difference information between the second collected index difference information and the first collected index difference information.
The parameter acquisition module 360 is configured to use the difference information as a measurement parameter.
According to the data interference elimination device provided by the embodiment of the invention, the information acquisition module obtains the first acquisition index difference information and the second acquisition index difference information respectively. The first acquisition index difference information represents the difference between the first user acquisition index contained in the first user data set and the second user acquisition index contained in the second user data set when a test strategy is not configured. The second acquisition index difference information represents the difference between the third user acquisition index contained in the first user data set and the fourth user acquisition index contained in the second user data set when the test strategy is configured. The difference calculation module then calculates the difference between the second acquisition index difference information and the first acquisition index difference information, so that the resulting difference information eliminates the data interference that arises because the first user acquisition index and the second user acquisition index cannot be guaranteed to be homogeneous when the test strategy is not configured. The difference information is used as the test measurement parameter, which improves the accuracy of data analysis.
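The cooperation of the three modules can be sketched as a small set of classes. This is one possible arrangement under stated assumptions, not the device of fig. 7 itself: every class, method and field name is hypothetical, group means are used as in fig. 4, and each difference is assumed to be second data set minus first data set.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StageData:
    """Hypothetical container for one stage: per-user index values of both data sets."""
    first_set: list   # first user data set (assumed control group)
    second_set: list  # second user data set (assumed experimental group)


class InformationAcquisitionModule:
    """Produces one difference per stage (second data set minus first data set)."""

    def obtain(self, unconfigured: StageData, configured: StageData):
        first_diff = mean(unconfigured.second_set) - mean(unconfigured.first_set)
        second_diff = mean(configured.second_set) - mean(configured.first_set)
        return first_diff, second_diff


class DifferenceCalculationModule:
    """Subtracts the first difference information from the second."""

    def calculate(self, first_diff: float, second_diff: float) -> float:
        return second_diff - first_diff


class ParameterAcquisitionModule:
    """Exposes the difference information as the test measurement parameter."""

    def as_measurement_parameter(self, difference: float) -> dict:
        return {"test_measurement_parameter": difference}


class DataInterferenceCancellationDevice:
    """Chains the three modules in the order described for the device."""

    def __init__(self) -> None:
        self.information = InformationAcquisitionModule()
        self.difference = DifferenceCalculationModule()
        self.parameter = ParameterAcquisitionModule()

    def run(self, unconfigured: StageData, configured: StageData) -> dict:
        first_diff, second_diff = self.information.obtain(unconfigured, configured)
        difference = self.difference.calculate(first_diff, second_diff)
        return self.parameter.as_measurement_parameter(difference)


device = DataInterferenceCancellationDevice()
result = device.run(
    StageData(first_set=[30.0, 30.0], second_set=[32.0, 32.0]),  # test strategy not configured
    StageData(first_set=[31.0, 31.0], second_set=[37.0, 37.0]),  # test strategy configured
)
```

Keeping each module behind its own small interface mirrors the statement later in the description that the functional modules may exist alone or be integrated into a single part.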
As an implementation manner, referring to fig. 8 on the basis of fig. 7, fig. 8 shows a schematic structural diagram of an information obtaining module 320 of a data interference cancellation device 300 according to an embodiment of the present invention, where in the embodiment of the present invention, the information obtaining module 320 includes a mean value obtaining unit 321 and a difference information obtaining unit 322.
The average value obtaining unit 321 is configured to obtain a first user acquisition average value in a first history period according to a first user acquisition index; acquiring a second user acquisition mean value in the first history period according to the second user acquisition index; acquiring a third user acquisition mean value in a second historical period according to the third user acquisition index; and acquiring a fourth user acquisition mean value in the second historical period according to the fourth user acquisition index.
The difference information obtaining unit 322 is configured to use a difference value between the first user acquired average value and the second user acquired average value as first acquired index difference information; and taking the difference value between the third user acquisition mean value and the fourth user acquisition mean value as second acquisition index difference information.
As an implementation manner, referring to fig. 9 on the basis of fig. 7, fig. 9 shows another schematic structure of a data interference cancellation device 300 according to an embodiment of the present invention, and in this embodiment of the present invention, the data interference cancellation device 300 further includes a stability determination module 310. The stability determination module 310 includes a predictor obtaining unit 311 and a stability determination unit 312.
The prediction index obtaining unit 311 is configured to obtain, through a prediction model, a first user prediction index corresponding to the first user acquisition index and a second user prediction index corresponding to the second user acquisition index, respectively.
The stability judging unit 312 is configured to, when it determines that the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, all meet the stability condition, run the information obtaining module 320 to obtain the first acquisition index difference information and the second acquisition index difference information respectively.
The stability judging unit 312 is further configured to, when it determines that the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, do not meet the stability condition, update the parameters of the prediction model until the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, obtained through the updated prediction model meet the stability condition.
The data interference elimination device 300 subtracts the difference between the control group and the experimental group when the test strategy is not configured, which is indicated by the first acquisition index difference information, from the difference between the control group and the experimental group when the test strategy is configured together with the difference caused by the test strategy, which is indicated by the second acquisition index difference information. The resulting difference information serves as the test measurement parameter, so that the difference that already existed between the control group and the experimental group before the test strategy was configured is eliminated, and the accuracy of the final result of the AB test experiment is ensured.
The data interference cancellation system provided by the embodiment of the invention can adopt the architecture shown in fig. 1. The server 10 may execute the steps of fig. 3 to fig. 7, so that the system calculates, through the difference calculation module of the server, the difference between the second acquisition index difference information and the first acquisition index difference information. The resulting difference information eliminates the data interference that arises because the first user acquisition index and the second user acquisition index cannot be guaranteed to be homogeneous when the test strategy is not configured, and it is used as the test measurement parameter, thereby improving the accuracy of data analysis.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the embodiments of the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the embodiments of the present invention should be included in the protection scope of the embodiments of the present invention.

Claims (7)

1. A method for data interference cancellation, comprising:
respectively obtaining first acquisition index difference information and second acquisition index difference information;
the first acquisition index difference information characterizes the difference information of a first user acquisition index contained in the first user data set and a second user acquisition index contained in the second user data set when a test strategy is not configured; the second collection index difference information characterizes the difference information of a third user collection index contained in the first user data set and a fourth user collection index contained in the second user data set when a testing strategy is configured;
calculating difference information of the second acquired index difference information and the first acquired index difference information;
taking the difference information as a test measurement parameter;
the first user acquisition index is a first user acquisition index in a first history period, and the second user acquisition index is a second user acquisition index in the first history period; the third user acquisition index is a third user acquisition index in a second history period, and the fourth user acquisition index is a fourth user acquisition index in the second history period; the step of obtaining the first acquired index difference information and the second acquired index difference information respectively includes:
acquiring a first user acquisition mean value in the first historical period according to the first user acquisition index;
acquiring a second user acquisition mean value in the first historical period according to the second user acquisition index;
taking the difference value between the first user acquisition mean value and the second user acquisition mean value as the first acquisition index difference information;
acquiring a third user acquisition mean value in the second historical period according to the third user acquisition index;
acquiring a fourth user acquisition mean value in the second historical period according to the fourth user acquisition index;
and taking the difference value between the third user acquisition mean value and the fourth user acquisition mean value as the second acquisition index difference information.
2. The method of claim 1, further comprising, prior to the step of separately obtaining the first and second acquisition indicator difference information:
respectively obtaining a first user prediction index corresponding to the first user acquisition index and a second user prediction index corresponding to the second user acquisition index through a prediction model;
determining whether the first user prediction index and the first user acquisition index and the second user prediction index and the second user acquisition index meet a stable condition;
if yes, executing the step of respectively obtaining the first acquired index difference information and the second acquired index difference information;
if not, updating parameters of the prediction model until the first user prediction index and the first user acquisition index obtained through the updated prediction model and the obtained second user prediction index and the second user acquisition index meet the stability condition.
3. A data interference cancellation device, comprising:
the information acquisition module is used for respectively acquiring the first acquisition index difference information and the second acquisition index difference information; the first acquisition index difference information characterizes the difference information of a first user acquisition index contained in the first user data set and a second user acquisition index contained in the second user data set when a test strategy is not configured; the second collection index difference information characterizes the difference information of a third user collection index contained in the first user data set and a fourth user collection index contained in the second user data set when a testing strategy is configured;
the difference value calculation module is used for calculating difference value information of the second acquisition index difference information and the first acquisition index difference information;
the parameter acquisition module is used for taking the difference information as a test measurement parameter; the first user acquisition index is a first user acquisition index in a first history period, and the second user acquisition index is a second user acquisition index in the first history period; the third user acquisition index is a third user acquisition index in a second history period, the fourth user acquisition index is a fourth user acquisition index in the second history period, and the information acquisition module comprises:
the average value acquisition unit is used for acquiring a first user acquisition average value in the first historical period according to the first user acquisition index; acquiring a second user acquisition mean value in the first historical period according to the second user acquisition index; acquiring a third user acquisition mean value in the second historical period according to the third user acquisition index; acquiring a fourth user acquisition mean value in the second historical period according to the fourth user acquisition index;
the difference information acquisition unit is used for taking the difference value between the first user acquisition mean value and the second user acquisition mean value as the first acquisition index difference information; and taking the difference value between the third user acquisition mean value and the fourth user acquisition mean value as the second acquisition index difference information.
4. A device according to claim 3, characterized in that the device further comprises: a stability judging module;
the stability judging module includes:
the prediction index obtaining unit is used for respectively obtaining a first user prediction index corresponding to the first user acquisition index and a second user prediction index corresponding to the second user acquisition index through a prediction model;
the stability judging unit is used for determining that the first user prediction index and the first user acquisition index and the second user prediction index and the second user acquisition index meet stability conditions, and operating the information acquisition module to respectively acquire first acquisition index difference information and second acquisition index difference information;
the stability judging unit is further configured to, when determining that the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, do not meet the stability condition, update parameters of the prediction model until the first user prediction index and the first user acquisition index, and the second user prediction index and the second user acquisition index, obtained through the updated prediction model meet the stability condition.
5. A data interference cancellation system, the system comprising: a data acquisition device and the device of any one of claims 3 to 4;
the data acquisition device is used for acquiring the first user data set and the second user data set when the test strategy is not configured and when the test strategy is configured, respectively; the first acquisition index difference information is the difference information of the first user acquisition index contained in the first user data set and the second user acquisition index contained in the second user data set when the test strategy is not configured; and the second acquisition index difference information is the difference information of the third user acquisition index contained in the first user data set and the fourth user acquisition index contained in the second user data set when the test strategy is configured.
6. An electronic device comprising a processor and a memory, the memory storing a computer program, the processor implementing the method of any one of claims 1-2 when executing the computer program.
7. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of any of claims 1-2.
CN202110961015.4A 2021-08-20 2021-08-20 Data interference elimination method and related device Active CN113688124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110961015.4A CN113688124B (en) 2021-08-20 2021-08-20 Data interference elimination method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110961015.4A CN113688124B (en) 2021-08-20 2021-08-20 Data interference elimination method and related device

Publications (2)

Publication Number Publication Date
CN113688124A CN113688124A (en) 2021-11-23
CN113688124B true CN113688124B (en) 2023-10-31

Family

ID=78581028

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110961015.4A Active CN113688124B (en) 2021-08-20 2021-08-20 Data interference elimination method and related device

Country Status (1)

Country Link
CN (1) CN113688124B (en)

Also Published As

Publication number Publication date
CN113688124A (en) 2021-11-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant