CN111160566A - Sample generation method and device, computer readable storage medium and computer equipment - Google Patents
- Publication number
- CN111160566A
- Authority
- CN
- China
- Prior art keywords
- resource transfer
- time
- data
- information
- network media
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Human Resources & Organizations (AREA)
- Physics & Mathematics (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Medical Informatics (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Game Theory and Decision Science (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application relates to a sample generation method and apparatus, a computer-readable storage medium, and a computer device. The method comprises: acquiring resource transfer data of a target commodity after network media information is delivered, the resource transfer data including a resource transfer time; determining an information delivery time of the network media information, and determining target feature data from at least two sets of pre-stored candidate feature data according to the information delivery time and the resource transfer time; and generating a model training sample according to the target feature data and the resource transfer data. Because the target feature data is taken at a specific time, the method can improve the validity of the model training samples, which strengthens the prediction capability of the model and improves the accuracy of its prediction results.
Description
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a sample generation method, an apparatus, a computer-readable storage medium, and a computer device.
Background
At present, with the rapid development of artificial intelligence technology, solving practical problems with artificial intelligence methods such as machine learning or deep learning has become the trend of the times.
However, in existing machine learning modeling, feature crossing often causes a model to perform very poorly online. For example, a result is mistakenly used as a cause during model training. This lowers the accuracy of the data used for training and makes the rules learned by the model inconsistent with the facts, which in turn degrades the prediction accuracy of the final model.
Therefore, model training samples in the prior art are of low effectiveness for training models.
Disclosure of Invention
Based on this, it is necessary to provide a sample generation method, an apparatus, a computer-readable storage medium, and a computer device that address the technical problem in the prior art that model training samples are of low effectiveness for training models.
In one aspect, an embodiment of the present invention provides a sample generation method, including: acquiring resource transfer data of a target commodity after network media information is released; the resource transfer data includes a resource transfer time; determining information delivery time of network media information, and determining target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time and the resource transfer time; and generating a model training sample according to the target characteristic data and the resource transfer data.
In another aspect, an embodiment of the present invention provides a sample generation apparatus, including: the data acquisition module is used for acquiring resource transfer data of the target commodity after the network media information is released; the resource transfer data includes a resource transfer time; the characteristic determining module is used for determining the information delivery time of the network media information and determining target characteristic data from at least two groups of pre-stored candidate characteristic data according to the information delivery time and the resource transfer time; and the sample generation module is used for generating a model training sample according to the target characteristic data and the resource transfer data.
In yet another aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, the computer program, when executed by a processor, implementing the steps of: acquiring resource transfer data of a target commodity after network media information is released; the resource transfer data includes a resource transfer time; determining information delivery time of network media information, and determining target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time and the resource transfer time; and generating a model training sample according to the target characteristic data and the resource transfer data.
In another aspect, an embodiment of the present invention provides a computer device, including a memory and a processor, where the memory stores a computer program, and the processor implements the following steps when executing the computer program: acquiring resource transfer data of a target commodity after network media information is released; the resource transfer data includes a resource transfer time; determining information delivery time of network media information, and determining target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time and the resource transfer time; and generating a model training sample according to the target characteristic data and the resource transfer data.
According to the sample generation method, the sample generation apparatus, the computer-readable storage medium, and the computer device, the server acquires the resource transfer data of the target commodity after the network media information is delivered, and thereby obtains the resource transfer time contained in the resource transfer data. After determining the information delivery time of the network media information, the server uses the information delivery time and the resource transfer time to determine target feature data from at least two sets of pre-stored candidate feature data. The target feature data is then combined with the resource transfer data acquired in the preceding step to generate the model training sample. Because the target feature data is taken at a specific time, the method can improve the validity of the model training samples, which strengthens the prediction capability of the model and improves the accuracy of its prediction results.
Drawings
FIG. 1 is a diagram of an application environment of a sample generation method in one embodiment;
FIG. 2 is a block diagram of a computer device in one embodiment;
FIG. 3 is a schematic flow chart of a sample generation method in one embodiment;
FIG. 4 is a schematic flow chart diagram illustrating the target feature data determination step in one embodiment;
FIG. 5 is a flowchart illustrating the resource transfer data acquisition step in one embodiment;
FIG. 6 is a flowchart illustrating the prediction result obtaining step in one embodiment;
FIG. 7 is a schematic flow chart of model training data construction in an exemplary embodiment;
FIG. 8 is a block diagram showing the structure of a sample generation device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
First, it should be noted that the sample generation method provided in the present application may be applied to a delayed consumption scenario. A delayed consumption scenario is one in which, after network media information set for a target commodity is delivered, the effect of the delivery cannot be obtained immediately, but whether the delivery succeeded can be determined within a preset time. Examples include issuing a coupon and delivering an advertisement: in both cases the user may not consume when the network media information is delivered, but may consume by using the network media information some time later.
FIG. 1 is a diagram of an example application environment in which a sample generation method may be used in one embodiment. The sample generation method provided by the application can be applied to the application environment shown in fig. 1. The user terminal 110 communicates with the server 120 through a network, the user terminal 110 may specifically be a desktop terminal or a mobile terminal, the mobile terminal may specifically be at least one of a mobile phone, a tablet computer, a notebook computer, and the like, the server 120 may be implemented by an independent server or a server cluster formed by a plurality of servers, and the network includes but is not limited to: a wide area network, a metropolitan area network, or a local area network.
In practical application, the server 120 may deliver specified network media information to the user terminal 110, after receiving the network media information, the user terminal 110 may operate the network media information within a preset time period, so as to generate information operation behavior data, and after obtaining the information operation behavior data, the server 120 may convert the information operation behavior data into available data for model training, that is, a model training sample. The model may be a machine learning model or a deep learning model.
FIG. 2 is a diagram illustrating an internal structure of a computer device in one embodiment. The computer device may specifically be the server 120 in fig. 1. As shown in fig. 2, the computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external computer device through a network connection. The computer program is executed by a processor to implement a sample generation method.
Those skilled in the art will appreciate that the architecture shown in fig. 2 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, as shown in FIG. 3, a sample generation method is provided. The embodiment is mainly illustrated by applying the method to the server 120 in fig. 1. Referring to fig. 3, the sample generation method specifically includes the following steps:
S302, acquiring resource transfer data of the target commodity after network media information is delivered; the resource transfer data includes a resource transfer time.
The target commodity may be a virtual network object that is monitored during sample generation so that monitoring data can be extracted to generate a model training sample. Such objects may be classified by commodity attribute, for example vehicles, food, or clothing.
The network media information may refer to network promotion information for the target commodity, such as a coupon or an advertisement.
The resource transfer data may refer to data related to a resource transfer when a user performs a value transfer, value exchange, or the like for the target commodity through the network media information, for example the value exchange time (resource transfer time), the value exchange amount (resource transfer amount), or the value transfer order number (resource transfer order number).
The resource transfer time may be referred to as the sampling time (the time at which a sample is acquired). Specifically, it may be the time at which the user performs a value transfer or value exchange for the target commodity through the network media information. For example, if the user purchases a car using a coupon on October 10, 2019, then October 10, 2019 is the resource transfer time.
Specifically, before obtaining the resource transfer data of the target commodity, the server 120 may first construct the network media information of the target commodity, and then launch the network media information, that is, upload the network media information to the internet for network disclosure, so that when the user performs resource transfer with the target commodity through the network media information, the server 120 may further obtain the resource transfer data generated by resource transfer of the target commodity.
For example, if the server 120 puts a car, as the current target commodity, online on September 25, 2019, and the user purchases the car on October 10, 2019, the server 120 may determine that the resource transfer time of the car is October 10, 2019.
S304, determining an information delivery time of the network media information, and determining target feature data from at least two sets of pre-stored candidate feature data according to the information delivery time and the resource transfer time.
The information delivery time may be referred to as the sample time (the time at which the network media information to be delivered is constructed). Specifically, it may refer to the time at which network media information for the target commodity is generated and delivered, for example the online release time of a coupon or the broadcast time of an advertisement.
The candidate feature data may refer to user features, commodity features, and activity keyword features that are preset and stored in the database of the server 120, for example, feature data such as user gender, user age, degree of attention paid to commodities, commodity attributes, and shopping activity keywords (e.g., holiday keywords).
The target feature data may be the candidate feature data that is associated with the information delivery time and that serves as the cause in the sample being generated.
Specifically, after the server 120 obtains the resource transfer data of the target commodity, it can further determine the information delivery time of the network media information corresponding to the target commodity, and then, by comparing the information delivery time with the resource transfer time, determine from the at least two sets of pre-stored candidate feature data the target feature data for the model training sample currently to be generated.
It should be noted that each of the at least two sets of candidate feature data pre-stored by the server 120 has a corresponding feature generation time, that is, each set of candidate feature data is mapped to one feature generation time. After the server 120 determines the information delivery time of the network media information, it still needs to extract the feature data used to generate the model training sample. The feature generation time can be determined by comparing the information delivery time with the resource transfer time, and the target feature data to be used is then determined from that feature generation time.
For example, if the online release time of the coupon is September 25, 2019, and the resource transfer time of the target commodity is October 10, 2019, the information delivery time and the resource transfer time are not at the same time point. In this case, when the sample generation method is applied to a delayed consumption scenario to construct a prediction model, the target feature data is determined from the candidate feature data pre-stored in the server 120 by using the information delivery time (the online release time of the coupon); that is, the feature generation time of the target feature data is mapped to September 25, 2019.
S306, generating a model training sample according to the target feature data and the resource transfer data.
The model training sample may be data required to be used for model training, that is, data generated when a user performs resource transfer with a target commodity by using network intermediary information, for example, the user purchases the target commodity by using a coupon.
Specifically, the model training sample can be actually used for training a machine learning prediction model or a deep learning prediction model applied to a delayed consumption scene, and the target characteristic data and the resource transfer data are required to be determined for generating the model training sample, so that the data used for model training can be obtained by combining the target characteristic data and the resource transfer data.
More specifically, after model training of the machine learning prediction model or the deep learning prediction model is realized by using the model training sample, under a delayed consumption scene, the user demand degree of the network media information corresponding to the target commodity can be obtained through the trained prediction model, so that accurate delivery of the network media information is realized by using the user demand degree.
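Step S306 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation described in the application: the function `build_training_sample`, the field names, and the choice to derive a binary label from the resource transfer data are all assumptions made for the example.

```python
from datetime import date

def build_training_sample(target_features, transfer_data):
    """Combine target feature data with resource transfer data into one sample.

    Assumption: the features come from the feature set chosen in S304, and
    the label is 1 when a resource transfer (for example, a purchase made
    with a coupon) was observed. All field names are hypothetical.
    """
    return {
        "features": dict(target_features),           # copy, so the store is not mutated
        "label": 1 if transfer_data.get("transferred") else 0,
        "order_no": transfer_data.get("order_no"),   # kept only for traceability
        "transfer_time": transfer_data.get("time"),
    }

features = {"user_age": 30, "attention": 0.4}
transfer = {"order_no": "A001", "amount": 120.0,
            "time": date(2019, 10, 10), "transferred": True}
sample = build_training_sample(features, transfer)
print(sample)
```

Keeping the features and the label in separate fields makes it explicit which part of the sample is the cause (features as of the information delivery time) and which part is the result (the observed resource transfer).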
In this embodiment, the server acquires the resource transfer data of the target commodity after the network media information is delivered, and thereby obtains the resource transfer time contained in the resource transfer data. After determining the information delivery time of the network media information, the server uses the information delivery time and the resource transfer time to determine target feature data from at least two sets of pre-stored candidate feature data, and then combines the target feature data with the resource transfer data acquired in the preceding step to generate a model training sample. Because the target feature data is taken at a specific time, the method can improve the validity of the model training samples, which strengthens the prediction capability of the model and improves the accuracy of its prediction results.
As shown in fig. 4, in an embodiment, the determining an information delivery time of the network media information in step S304, and determining the target feature data from at least two sets of pre-stored candidate feature data according to the information delivery time and the resource transfer time specifically includes the following steps:
S3042, determining an information delivery time of the network media information.
Specifically, after the target commodity is determined, the server 120 may obtain the information delivery time of the network media information from historical log data. The historical log data records information about operations related to the target commodity, and this information includes the information delivery time of the network media information.
S3044, matching the information delivery time with the resource transfer time.
Specifically, the server 120 may perform a time comparison between the information release time and the resource transfer time to implement a task of matching the information release time and the resource transfer time.
For example, the information delivery time of September 25, 2019 is compared with the resource transfer time of October 10, 2019.
S3046, if the information delivery time does not match the resource transfer time, determining target feature data from at least two pre-stored sets of candidate feature data according to the information delivery time.
Specifically, if the server 120 determines that the information delivery time does not match the resource transfer time, that is, the sampling time (resource transfer time) and the sample time (information delivery time) are not at the same time point, the target feature data may be extracted by using the sample time (information delivery time) when the sample data is constructed, and a model training sample is then generated.
For example, if the information delivery time for the model training sample to be generated is September 25, 2019 and the resource transfer time is October 10, 2019, the two are not at the same time point. The information delivery time of September 25, 2019 is therefore selected as the time for determining the target feature data, and the data mapped to September 25, 2019 is acquired from the pre-stored candidate feature data and used as the target feature data.
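Steps S3042 to S3046 can be sketched in Python using the dates from the example above. The sketch rests on several assumptions: the store `CANDIDATE_FEATURES`, its feature values, and the function names are hypothetical, and two times are treated as matching only when they are the same calendar day.

```python
from datetime import date

# Hypothetical store: feature generation time -> one set of candidate feature data.
CANDIDATE_FEATURES = {
    date(2019, 9, 25): {"user_age": 30, "attention": 0.4},
    date(2019, 10, 10): {"user_age": 30, "attention": 0.9},
}

def times_match(delivery_time, transfer_time):
    """S3044: assume the times match only when they are the same day."""
    return delivery_time == transfer_time

def determine_target_features(delivery_time, transfer_time, store):
    """S3046: on a mismatch, look the features up by the information delivery time."""
    if not times_match(delivery_time, transfer_time):
        lookup_time = delivery_time
    else:
        lookup_time = transfer_time  # the two times coincide, so either works
    return store[lookup_time]

target = determine_target_features(
    date(2019, 9, 25), date(2019, 10, 10), CANDIDATE_FEATURES
)
print(target)
```

In this illustration, choosing the September 25 feature set keeps information that only became available at or after the purchase (here, the made-up higher `attention` value) out of the training sample.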
In this embodiment, the target feature data is determined by matching the information delivery time against the resource transfer time. This avoids feature crossing during model training, that is, it prevents a result from being mistakenly used as a cause, and it reproduces at training time the situation the model faces at prediction time. The data used for model training is therefore consistent with the facts, which improves the prediction capability of the model and the accuracy of its prediction results.
In an embodiment, if the information release time does not match the resource transfer time in step S3046, determining the target feature data from at least two sets of pre-stored candidate feature data according to the information release time includes the following steps:
S30462, if the information delivery time does not match the resource transfer time, determining candidate feature data matching the information delivery time from at least two sets of pre-stored candidate feature data as target feature data.
Specifically, to determine the candidate feature data matching the information delivery time as the target feature data, the server 120 may first determine the feature generation time of each set of candidate feature data, then match each feature generation time against the information delivery time, and finally take the candidate feature data whose feature generation time matches the information delivery time. That candidate feature data is used as the target feature data for generating the model training sample when the information delivery time does not match the resource transfer time.
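The text does not say what counts as a feature generation time "matching" the information delivery time. The sketch below assumes one possible reading: the most recent feature set generated on or before the information delivery time is taken as the match. Exact equality would also satisfy the description. The function name `match_feature_time` and the snapshot contents are hypothetical.

```python
from bisect import bisect_right
from datetime import date

def match_feature_time(generation_times, delivery_time):
    """Return the latest feature generation time not later than delivery_time.

    Assumption: "matching" means "most recent feature set available when the
    network media information was delivered". Returns None if none exists.
    """
    ordered = sorted(generation_times)
    idx = bisect_right(ordered, delivery_time)
    if idx == 0:
        return None  # no feature set had been generated yet
    return ordered[idx - 1]

# Hypothetical candidate feature sets keyed by feature generation time.
snapshots = {
    date(2019, 9, 20): {"attention": 0.2},
    date(2019, 9, 25): {"attention": 0.4},
    date(2019, 10, 10): {"attention": 0.9},
}

matched_time = match_feature_time(snapshots.keys(), date(2019, 9, 25))
target_features = snapshots[matched_time]
print(matched_time, target_features)
```

Under this reading a feature set generated after the delivery time is never selected, which is the property the embodiment relies on to avoid using a result as a cause.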
In this embodiment, when the server determines that the information release time does not match the resource transfer time, the target feature data is determined by using the information release time, which not only can further improve the effective rate of the model training sample, but also can improve the model prediction capability by using the effective model training sample, and can further improve the accuracy of the model prediction result.
As shown in fig. 5, in an embodiment, the acquiring resource transfer data of the target commodity after the network media information is delivered in step S302 specifically includes the following steps:
S3022, acquiring historical log data of the target commodity after the network media information is delivered.
The historical log data may refer to real-time status data of the target product.
Specifically, the resource transfer data of the target commodity is generated when a resource transfer is performed for the target commodity, and the resource transfer action involves the network media information. The server 120 can therefore acquire the resource transfer data of the target commodity after the network media information is delivered by acquiring and monitoring the historical log data of the target commodity after the delivery.
S3024, reading the resource transfer state in the history log data.
The resource transfer state may refer to the stored value transfer state of the target commodity, for example a transferred state or an untransferred state.
Specifically, the server 120 may obtain the resource transfer data of the target product by monitoring the change of the resource transfer state in the historical log data in real time.
S3026, when the resource transfer state is the transferred state, determining the history log data as the resource transfer data.
Specifically, the server 120 may determine currently existing history log data as the resource transfer data of the target commodity when reading and monitoring that the resource transfer state is the transferred state.
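Steps S3022 to S3026 can be sketched as a filter over log records. The record layout, the state strings, and the function `extract_transfer_data` are assumptions made for the example; the application does not specify a log format.

```python
TRANSFERRED = "transferred"      # hypothetical state value
UNTRANSFERRED = "untransferred"  # hypothetical state value

def extract_transfer_data(log_records):
    """S3024/S3026: keep the log records whose resource transfer state is 'transferred'."""
    return [record for record in log_records if record.get("state") == TRANSFERRED]

# Hypothetical historical log data for one target commodity.
history_logs = [
    {"order_no": "A001", "state": UNTRANSFERRED, "time": "2019-09-25"},
    {"order_no": "A001", "state": TRANSFERRED, "time": "2019-10-10", "amount": 120.0},
]

transfer_records = extract_transfer_data(history_logs)
print(transfer_records)
```

Here the record dated 2019-10-10 is the only one returned, and its `time` field plays the role of the resource transfer time.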
In this embodiment, the server may determine the time for acquiring the resource transfer data by acquiring the resource transfer state of the target commodity, and further acquire the resource transfer data of the target commodity after the network medium information is released, so that the efficiency of acquiring the resource transfer data may be improved, the efficiency of the model training sample may be further improved, the model prediction capability may be improved, and the accuracy of the model prediction result may be further improved.
As shown in fig. 6, in an embodiment, after generating the model training sample according to the target feature data and the resource transfer data in step S306, the method specifically includes the following steps:
S3082, inputting the model training sample into the user demand prediction model.
The user demand degree prediction model can be an algorithm model used for calculating the demand degree of the user on the currently-launched network media information, and can be applied to a delayed consumption scene to realize user demand degree prediction on the network media information corresponding to the target commodity.
Specifically, after the server 120 generates the model training sample by using the target feature data and the resource transfer data, the model training sample may be input to the user demand prediction model to calculate the user's degree of demand for the currently delivered network media information.
For example, it is predicted whether a user is interested in a particular coupon.
S3084, obtaining a prediction result output by the user demand degree prediction model, and obtaining the user demand degree of the network medium information.
Specifically, the output of the user demand degree prediction model is obtained, that is, the user demand degree for the network media information currently associated with the target commodity. The information can then be delivered to a targeted user group, so as to improve the service metrics of the actual application scenario.
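The application does not specify the form of the user demand degree prediction model. As a stand-in, the sketch below scores each user with a simple logistic function and selects a targeted user group by threshold. The weights, the bias, the feature names, and the 0.5 threshold are all made-up values for illustration.

```python
import math

def predict_demand(features, weights, bias=0.0):
    """Return a demand degree in (0, 1) from a logistic score (stand-in model)."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters; a real model would learn these from training samples.
weights = {"attention": 2.0, "is_holiday": 0.5}

# Hypothetical feature data for two users and one coupon.
users = {
    "u1": {"attention": 0.9, "is_holiday": 1.0},
    "u2": {"attention": 0.1, "is_holiday": 0.0},
}

demand = {uid: predict_demand(f, weights, bias=-1.0) for uid, f in users.items()}
targeted = [uid for uid, score in demand.items() if score >= 0.5]
print(demand, targeted)
```

With these made-up numbers only `u1` clears the threshold, so the coupon would be delivered to `u1` alone.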
In this embodiment, the server may input the currently generated model training sample to the user demand prediction model to obtain the user demand of the network media information, so as to determine the targeted delivery crowd of the network media information by using the user demand, thereby improving the service index of the actual application scenario and further satisfying the user demand.
In one embodiment, the step S3084 of obtaining the prediction result output by the user demand degree prediction model to obtain the user demand degree of the target commodity after the network media information is delivered specifically includes the following steps:
S30842, obtaining a prediction result output by the user demand degree prediction model through a preset recommendation algorithm, and obtaining the user demand degree of the network media information.
Specifically, a preset recommendation algorithm can improve the ability of the server 120 to predict the user demand degree of the network media information, making it easier for the server 120 to calculate the user's real demand for the network media information corresponding to the target commodity.
In this embodiment, the server can calculate the user demand degree of the network media information through a preset recommendation algorithm, which strengthens the model's prediction capability and improves the accuracy of the prediction result.
In one embodiment, the preset recommendation algorithm includes any one of a content-based recommendation algorithm, a collaborative filtering algorithm, a rule-based recommendation algorithm, a utility-based recommendation algorithm, and a knowledge-based recommendation algorithm.
The content-based recommendation algorithm may be an algorithm that finds the relevance of an item or content according to metadata of recommended items or content, and then recommends similar items to a user based on past preference records of the user.
The collaborative filtering algorithm may refer to an algorithm that recommends only by knowing the relationship between the user and the article, without considering the property of the article itself.
The rule-based recommendation algorithm may be a recommendation algorithm based on association rules, that is, an algorithm in which purchased commodities serve as the rule head and the recommended objects serve as the rule body.
The utility-based recommendation algorithm computes recommendations from the utility that an item has for the user. Its core problem is how to create a utility function for each user, so the user profile model is largely determined by the utility function adopted by the system.
The knowledge-based recommendation algorithm may be regarded as an inference technique: recommendations are derived by reasoning over knowledge of how items satisfy a user's needs.
Specifically, in practical applications, the server 120 may determine to use any one of the above recommendation algorithms according to the project requirements to calculate the user demand degree of the network media information corresponding to the target commodity.
In this embodiment, several recommendation algorithms are available to the server when it actually performs model calculation and prediction, which can further improve the efficiency and accuracy of model prediction.
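As one illustration of the algorithms listed above, the following sketch implements a small item-based collaborative filter that uses only the relationship between users and items, together with a registry from which the server could pick an algorithm by name. The registry `RECOMMENDERS`, the function names, and the coupon identifiers are assumptions made for this example; the application does not specify an implementation.

```python
import math
from collections import defaultdict


def item_cosine_similarity(interactions):
    """Cosine similarity between items, using only who interacted with what."""
    users_by_item = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            users_by_item[item].add(user)
    sims = {}
    for a, users_a in users_by_item.items():
        for b, users_b in users_by_item.items():
            if a != b:
                overlap = len(users_a & users_b)
                sims[(a, b)] = overlap / math.sqrt(len(users_a) * len(users_b))
    return sims


def collaborative_filtering(interactions, user, top_n=1):
    """Rank items the user has not seen by similarity to items they have."""
    sims = item_cosine_similarity(interactions)
    seen = interactions[user]
    all_items = set().union(*interactions.values())
    scores = {
        cand: sum(sims.get((cand, s), 0.0) for s in seen)
        for cand in all_items - seen
    }
    # Sort by descending score, breaking ties by item name for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_n]


# Registry keyed by algorithm name; only one entry is filled in for this sketch.
RECOMMENDERS = {"collaborative_filtering": collaborative_filtering}

interactions = {
    "u1": {"coupon_a", "coupon_b"},
    "u2": {"coupon_a", "coupon_b", "coupon_c"},
    "u3": {"coupon_a"},
    "u4": {"coupon_d"},
}
recommend = RECOMMENDERS["collaborative_filtering"]
result = recommend(interactions, "u3")
```

Note that no item attributes appear anywhere in the computation, which is the property that distinguishes collaborative filtering from the content-based approach.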
To facilitate a thorough understanding of the embodiments of the present application by those skilled in the art, a specific example is described below with reference to fig. 7. Fig. 7 is a schematic flowchart of model training data construction in an embodiment of the present application, applied to a "coupon recommendation service scenario for WeChat payment" in a delayed consumption scenario.
As can be seen in fig. 7, modeling includes three phases: data preparation, model training, and model prediction. The present application mainly provides a scheme for constructing model training data in the data preparation phase, that is, how to generate the sample data used for model training. First, the timeline in fig. 7 is described. It includes: "WeChat payment side specifies a coupon delivery plan" (a delivery plan for one type of coupon is constructed) on September 25, 2019; "coupon is exposed to a certain user" (network media information, that is, exposure of the coupon) on September 30, 2019; "user uses the coupon at checkout in a certain store" (the user has made a transaction using the coupon) on October 10, 2019; and "model training data is constructed" on October 11, 2019.
It can be seen that a model training sample is generated on October 10, 2019, when the user uses the coupon to perform a resource transfer operation on the target commodity. However, that sample generation time is really the sampling time, and feature data generated at the sampling time is prone to the feature crossing problem. Therefore, the target feature data corresponding to the sample time must be acquired instead, that is, feature data generated at a time associated with the sample time of September 25, 2019 shown in the figure.
After the sample time is determined to be September 25, 2019, the target feature data actually used is the data generated on September 24, 2019, because the feature data at that time point was already pre-stored in the server 120 before the "WeChat payment side specifies a coupon delivery plan" event (a delivery plan for one type of coupon is constructed) on September 25, 2019. Thus, the server 120 constructs the model training data using the "T + N" feature-sample combination pattern, that is, features at time T (September 24, 2019) combined with samples at time T + N (October 10, 2019), rather than the "T + 1" pattern, that is, features at time T (October 9, 2019) combined with samples at time T + 1 (October 10, 2019).
In this embodiment, acquiring the target feature data at a specific time improves the effective rate of the model training samples, which strengthens the model's prediction capability and improves the accuracy of the prediction result.
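The "T + N" selection in the fig. 7 example can be sketched as follows. The snapshot dictionary, the feature name `visits_7d`, and in particular the rule that a gap of more than one day between delivery and transfer counts as "not matched" are assumptions made for illustration; the dates are those of the example above.

```python
from datetime import date, timedelta

# Hypothetical daily feature snapshots, pre-stored and keyed by generation day.
candidate_features = {
    date(2019, 9, 24): {"visits_7d": 2},
    date(2019, 10, 9): {"visits_7d": 9},
}


def select_target_features(snapshots, delivery_time, transfer_time):
    """Pick the feature snapshot to pair with a sample observed at transfer_time."""
    delayed = transfer_time - delivery_time > timedelta(days=1)  # assumed rule
    # Delayed consumption: anchor on the delivery time ("T + N").
    # Otherwise: anchor on the transfer time (the usual "T + 1").
    anchor = delivery_time if delayed else transfer_time
    feature_day = max(d for d in snapshots if d < anchor)
    return feature_day, snapshots[feature_day]


delivery_time = date(2019, 9, 25)   # coupon delivery plan specified
transfer_time = date(2019, 10, 10)  # coupon used at checkout
feature_day, features = select_target_features(
    candidate_features, delivery_time, transfer_time
)
sample = {"features": features, "label": 1, "sample_day": transfer_time}
```

With these inputs the snapshot of September 24, 2019 is chosen rather than that of October 9, 2019, so no feature generated after the delivery plan enters the training sample.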
It should be understood that although the steps in the flowcharts of fig. 3-6 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 3-6 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time and may be performed at different times; nor are they necessarily performed sequentially, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
As shown in fig. 8, in an embodiment, a sample generation apparatus 800 is provided, where the apparatus 800 may be disposed in a sample generation system, and is configured to perform the sample generation method, where the sample generation apparatus 800 specifically includes: a data acquisition module 802, a feature determination module 804, and a sample generation module 806, wherein:
the data acquisition module 802 is configured to acquire resource transfer data of the target commodity after the network media information is delivered; the resource transfer data includes a resource transfer time;
the feature determination module 804 is configured to determine the information delivery time of the network media information, and to determine target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time and the resource transfer time;
and a sample generating module 806, configured to generate a model training sample according to the target feature data and the resource transfer data.
In one embodiment, the feature determination module 804 is further configured to determine the information delivery time of the network media information; match the information delivery time with the resource transfer time; and if the information delivery time does not match the resource transfer time, determine target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time.
In one embodiment, the feature determination module 804 is further configured to determine, from at least two groups of pre-stored candidate feature data, the candidate feature data that matches the information delivery time as the target feature data if the information delivery time does not match the resource transfer time.
In one embodiment, the data acquisition module 802 is further configured to acquire historical log data of the target commodity after the network media information is delivered; read a resource transfer state in the historical log data; and when the resource transfer state is the transferred state, determine the historical log data as the resource transfer data.
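The log filtering performed by the data acquisition module 802 can be sketched as follows. The record layout (`user`, `state`, `time`) and the state value `"transferred"` are hypothetical; the application does not define a log schema.

```python
from datetime import datetime

TRANSFERRED = "transferred"  # hypothetical value for the transferred state


def extract_resource_transfer_data(history_logs, delivery_time):
    """Keep post-delivery log records whose resource transfer state is transferred."""
    return [
        record
        for record in history_logs
        if record["time"] >= delivery_time and record["state"] == TRANSFERRED
    ]


history_logs = [
    {"user": "u1", "state": "transferred", "time": datetime(2019, 10, 10, 12, 30)},
    {"user": "u2", "state": "pending", "time": datetime(2019, 10, 2, 9, 0)},
    {"user": "u3", "state": "transferred", "time": datetime(2019, 9, 20, 8, 0)},
]
transfers = extract_resource_transfer_data(history_logs, datetime(2019, 9, 25))
```

Each retained record keeps its `time` field, which plays the role of the resource transfer time used later when the target feature data is selected.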
In one embodiment, the sample generating apparatus 800 further includes a prediction result obtaining module, configured to input the model training sample to the user demand prediction model; and obtaining a prediction result output by the user demand degree prediction model to obtain the user demand degree of the network media information.
In one embodiment, the prediction result obtaining module is further configured to obtain, through a preset recommendation algorithm, a prediction result output by the user demand degree prediction model to obtain the user demand degree of the network media information.
In this embodiment, the server acquires the resource transfer data of the target commodity after the network media information is delivered and thereby obtains the resource transfer time contained in that data. After determining the information delivery time of the network media information, the server uses the information delivery time and the resource transfer time to determine target feature data from at least two groups of pre-stored candidate feature data, and then combines the target feature data with the resource transfer data obtained in the preceding step to generate a model training sample. With this scheme, acquiring the target feature data at a specific time improves the effective rate of the model training samples, which strengthens the model's prediction capability and improves the accuracy of the prediction result.
In one embodiment, the sample generation apparatus provided herein may be implemented in the form of a computer program that is executable on a computer device such as that shown in fig. 2. The memory of the computer device may store various program modules that make up the sample generation apparatus, such as the data acquisition module 802, the feature determination module 804, and the sample generation module 806 shown in fig. 8. The computer program constituted by the respective program modules causes the processor to execute the steps in the sample generation method of the respective embodiments of the present application described in the present specification.
For example, the computer device shown in fig. 2 may execute step S302 by the data acquisition module 802 in the sample generation apparatus shown in fig. 8. The computer device may perform step S304 by the feature determination module 804. The computer device may perform step S306 by the sample generation module 806.
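The mapping between program modules and method steps can be sketched as a small composition of classes. Everything below is a simplified assumption for illustration: the class names mirror modules 802, 804, and 806, days are ISO date strings (which compare correctly as text), and a single feature snapshot is shared by all users.

```python
class DataAcquisitionModule:  # corresponds to module 802, step S302
    def run(self, logs):
        return [r for r in logs if r["state"] == "transferred"]


class FeatureDeterminationModule:  # corresponds to module 804, step S304
    def __init__(self, snapshots):
        self.snapshots = snapshots  # ISO day string -> feature dict

    def run(self, delivery_day):
        # Latest snapshot generated strictly before the delivery day.
        day = max(d for d in self.snapshots if d < delivery_day)
        return self.snapshots[day]


class SampleGenerationModule:  # corresponds to module 806, step S306
    def run(self, features, transfers):
        return [
            {"user": t["user"], "features": features, "label": 1}
            for t in transfers
        ]


class SampleGenerationApparatus:
    """Wires the three modules together in the order S302, S304, S306."""

    def __init__(self, snapshots):
        self.acquire = DataAcquisitionModule()
        self.determine = FeatureDeterminationModule(snapshots)
        self.generate = SampleGenerationModule()

    def run(self, logs, delivery_day):
        transfers = self.acquire.run(logs)
        features = self.determine.run(delivery_day)
        return self.generate.run(features, transfers)


apparatus = SampleGenerationApparatus(
    {"2019-09-24": {"visits_7d": 2}, "2019-10-09": {"visits_7d": 9}}
)
logs = [
    {"user": "u1", "state": "transferred"},
    {"user": "u2", "state": "pending"},
]
training_samples = apparatus.run(logs, "2019-09-25")
```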
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the above-described sample generation method. Here, the steps of the sample generation method may be steps in the sample generation methods of the respective embodiments described above.
In one embodiment, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above-described sample generation method. Here, the steps of the sample generation method may be steps in the sample generation methods of the respective embodiments described above.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination that contains no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A method of generating a sample, comprising the steps of:
acquiring resource transfer data of a target commodity after network media information is delivered; the resource transfer data includes a resource transfer time;
determining information delivery time of the network media information, and determining target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time and the resource transfer time;
and generating a model training sample according to the target feature data and the resource transfer data.
2. The method according to claim 1, wherein the determining an information delivery time of the network media information and determining target feature data from at least two pre-stored sets of candidate feature data according to the information delivery time and the resource transfer time comprises:
determining the information delivery time of the network media information;
matching the information delivery time with the resource transfer time;
and if the information delivery time does not match the resource transfer time, determining target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time.
3. The method according to claim 2, wherein if the information delivery time does not match the resource transfer time, determining target feature data from at least two pre-stored sets of candidate feature data according to the information delivery time comprises:
and if the information delivery time does not match the resource transfer time, determining candidate feature data matching the information delivery time from at least two groups of pre-stored candidate feature data as the target feature data.
4. The method of claim 1, wherein the obtaining resource transfer data of the target commodity after the network media information is delivered comprises:
acquiring historical log data of the target commodity after network media information is delivered;
reading a resource transfer state in the historical log data;
and when the resource transfer state is the transferred state, determining the historical log data as the resource transfer data.
5. The method of claim 1, further comprising, after generating model training samples from the target feature data and the resource transfer data:
inputting the model training sample into a user demand degree prediction model;
and obtaining a prediction result output by the user demand degree prediction model to obtain the user demand degree of the network media information.
6. The method according to claim 5, wherein the obtaining of the prediction result output by the user demand degree prediction model to obtain the user demand degree of the target commodity after the network media information is delivered comprises:
and obtaining a prediction result output by the user demand degree prediction model through a preset recommendation algorithm to obtain the user demand degree of the network media information.
7. The method according to claim 6, wherein the preset recommendation algorithm comprises any one of a content-based recommendation algorithm, a collaborative filtering algorithm, a rule-based recommendation algorithm, a utility-based recommendation algorithm, and a knowledge-based recommendation algorithm.
8. A sample generation device, the device comprising:
the data acquisition module is used for acquiring resource transfer data of the target commodity after the network media information is delivered; the resource transfer data includes a resource transfer time;
the feature determination module is used for determining the information delivery time of the network media information and determining target feature data from at least two groups of pre-stored candidate feature data according to the information delivery time and the resource transfer time;
and the sample generation module is used for generating a model training sample according to the target feature data and the resource transfer data.
9. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 7.
10. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911365416.2A CN111160566A (en) | 2019-12-26 | 2019-12-26 | Sample generation method and device, computer readable storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111160566A true CN111160566A (en) | 2020-05-15 |
Family
ID=70558190
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||