CN113343082A

CN113343082A - Hot field prediction model generation method and device, storage medium and equipment

Info

Publication number: CN113343082A
Application number: CN202110574931.2A
Authority: CN
Inventors: 邵佳帅; 闫开元; 于吉士; 陈松林; 张子实; 谭孟泷
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2021-05-25
Filing date: 2021-05-25
Publication date: 2021-09-03

Abstract

The application discloses a method, a device, a storage medium and equipment for generating a hot field prediction model. The method comprises the following steps: acquiring training characteristic data of a sample key field in a first historical time period; obtaining text-sending click data of the sample key field in a second historical time period, and determining a first training prediction result of the sample key field based on the text-sending click data; and initializing a prediction model, taking the training characteristic data as model input data, taking the first training prediction result as model output data, and training the prediction model to obtain a trained hot field prediction model. With the method and the device, the hot field prediction model can be generated based on the characteristics of the sample key field in one historical time period and the text-sending click data in another historical time period, so that hot fields can be determined, the production efficiency of hot text-sending content is improved, and the expansion of a content platform is ensured.

Description

Hot field prediction model generation method and device, storage medium and equipment

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for generating a hot field prediction model, a storage medium, and a device.

Background

With the continuous development of computer technology, terminals such as mobile phones and computers have become indispensable in people's daily life. Through content applications installed on these terminals, a content platform can push the text-sending content generated by authors to users for browsing. Because different authors have different interests and hobbies, the generated text-sending content covers a rich range of topics and meets the browsing needs of most users. However, the generation of existing text-sending content often depends on the author's creative inspiration or on the author's understanding of the content platform's users, and authors must keep track of the hot spots in different fields themselves. As a result, content that is not built around hot spots tends to receive little browsing feedback, the quality of the text-sending content cannot be ensured, and authors spend more time, which reduces the efficiency of generating hot text-sending content and hinders the expansion of the content platform.

Disclosure of Invention

The application provides a hot field prediction model generation method and device, a storage medium, and equipment, which can generate a hot field prediction model based on the characteristics of a sample key field in a historical time period before a set time and the text-sending click data in a historical time period after the set time, so that hot fields can be determined, the efficiency of generating hot text-sending content is improved, and the expansion of a content platform is ensured.

In a first aspect, an embodiment of the present application provides a hot field prediction model generation method, including:

acquiring training characteristic data of a sample key field in a first historical time period;

obtaining text-sending click data of the sample key field in a second historical time period, and determining a first training prediction result of the sample key field based on the text-sending click data;

initializing a prediction model, taking the training characteristic data as model input data, taking the first training prediction result as model output data, and training the prediction model to obtain a trained hot field prediction model;

the first history time period is a history time period before a set time, and the second history time period is a history time period after the set time.

In a second aspect, an embodiment of the present application provides a prediction model generation apparatus, including:

the training characteristic acquisition unit is used for acquiring training characteristic data of the sample key field in a first historical time period;

the training result determining unit is used for acquiring text-sending click data of the sample key field in a second historical time period and determining a first training prediction result of the sample key field based on the text-sending click data;

the model generation unit is used for initializing a prediction model, taking the training characteristic data as model input data and the first training prediction result as model output data, and training the prediction model to obtain a trained hot field prediction model.

In a third aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the above-mentioned method.

In a fourth aspect, an embodiment of the present application provides a computer device, including: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the steps of the method as described above.

In the embodiment of the application, the hot spot field prediction model is generated through the characteristics of the sample key field in the historical time period before the set time and the text clicking data in the historical time period after the set time, so that the hot spot field can be determined, the creation inspiration and the creation direction are provided for an author, the production efficiency of the hot spot text content is improved, and the expansion of a content platform is ensured.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a system architecture diagram of a topic push provided by an embodiment of the present application;

Fig. 2 is a schematic flowchart of a hot field prediction model generation method according to an embodiment of the present disclosure;

Fig. 3 is a schematic flowchart of a hot field prediction model generation method according to an embodiment of the present disclosure;

Fig. 4 is an exemplary schematic view of coordinate axes of a historical time period provided in an embodiment of the present application;

Fig. 5 is a schematic diagram illustrating examples of predictive model generation and use provided by embodiments of the present application;

Fig. 6 is a schematic flowchart of training feature data acquisition according to an embodiment of the present application;

Fig. 7 is a schematic flowchart of obtaining a first training prediction result according to an embodiment of the present application;

Fig. 8 is a schematic flowchart illustrating the predictive model determination provided in the embodiments of the present application;

Fig. 9 is a schematic diagram illustrating examples of predictive model generation and use provided by embodiments of the present application;

Fig. 10 is a schematic structural diagram of a prediction model generation apparatus according to an embodiment of the present application;

Fig. 11 is a schematic structural diagram of a prediction model generation apparatus according to an embodiment of the present application;

Fig. 12 is a schematic structural diagram of a training feature obtaining unit according to an embodiment of the present application;

Fig. 13 is a schematic structural diagram of a training result determining unit provided in an embodiment of the present application;

Fig. 14 is a schematic structural diagram of a model generation unit provided in an embodiment of the present application;

Fig. 15 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

In order to make the features and advantages of the present application more obvious and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Referring to fig. 1, a system architecture diagram of topic push is provided for the embodiment of the present application. As shown in fig. 1, the hot field prediction model generation method provided in this embodiment of the present application may be applied to a content platform scenario. Three entities exist in the content platform, namely an author, a user, and text-sending content: the author provides the text-sending content to the content platform, the user browses the text-sending content in the content platform, and the text-sending content is precisely matched with the user. The entity structure may be specifically divided into an author terminal device, a user terminal device, and a content service device, and the author terminal device and the content service device may be connected to each other through a network.

The author terminal device may specifically be a device used by an author to generate text-sending content, and may include, but is not limited to, a mobile phone, a personal computer, a notebook computer, a vehicle-mounted device, a wearable device, and other terminal devices having a content production function. The user terminal device may specifically be a device used by a user to browse text-sending content, and may include, but is not limited to, a mobile phone, a personal computer, a notebook computer, a vehicle-mounted device, a wearable device, and other terminal devices having a content browsing function. The content service device may specifically be a background service device that hosts the content platform and stores the text-sending content generated by authors, for example, a server or a service cluster. The text-sending content may specifically be content composed of multimedia data, and the multimedia data may include, but is not limited to, video, pictures, text, and the like.

In the embodiment of the present application, the generation of existing text-sending content often depends on the author's creative inspiration or on the author's understanding of the content platform's users, which tests the author's creative ability. A new author who joins the content platform cannot learn the creation direction of text-sending content in time, which may cause the loss of authors, reduce the text-sending content generated in the content platform, and further affect the expansion of the content platform. The relationship between the content platform and the author therefore needs to be improved to help the author decide what kind of content to generate.

The embodiment of the present application uses a prediction model generation apparatus as the execution subject. The prediction model generation apparatus may specifically be the content service device in the system architecture, or a model generation application program in the content service device. The embodiment specifically provides a hot field prediction model generation method: the prediction model generation device acquires training characteristic data of a sample key field in a first historical time period; the prediction model generation device acquires text-sending click data of the sample key field in a second historical time period, and determines a first training prediction result of the sample key field based on the text-sending click data; the prediction model generation device initializes a prediction model, takes the training characteristic data as model input data and the first training prediction result as model output data, and trains the prediction model to obtain a trained hot field prediction model; the first history time period is a history time period before a set time, and the second history time period is a history time period after the set time. The hot field prediction model is generated from the characteristics of the sample key field in the historical time period before the set time and the text-sending click data in the historical time period after the set time, so that hot fields can be determined, creative inspiration and a creation direction are provided for the author, the production efficiency of hot text-sending content is improved, and the expansion of the content platform is ensured.

Based on the system architecture shown in fig. 1, the hot field prediction model generation method provided by the embodiment of the present application will be described in detail below with reference to fig. 2 to 9.

Referring to fig. 2, a flow chart of a hot field prediction model generation method is provided in the embodiment of the present application. As shown in fig. 2, the method may include the following steps S101 to S103.

S101, acquiring training characteristic data of a sample key field in a first historical time period;

Specifically, the prediction model generation apparatus may obtain training feature data of a sample key field in a first history time period. The sample key field is extracted from the text-sending content published on the content platform during a history period, and the history period may be a fixed time period or may be set manually. The first history time period is a history time period before a set time; the set time may be a time point set by a user and close to the current time, and the duration of the first history time period may be set according to the requirements of actual model generation. For example, the third day before the current time may be selected as the set time, and the 7 days before the set time may be selected as the first history time period. The training feature data may specifically be feature data associated with the sample key field that is selected for training the prediction model, and may include, but is not limited to, at least one of a basic feature, a trend feature, and a related field feature. The basic feature represents the text-sending behavior feature of the sample key field, the trend feature represents the browsing amount feature of the text-sending content associated with the sample key field, and the related field feature represents the text-sending behavior feature and the browsing amount feature of the related fields of the sample key field.

S102, obtaining text-sending click data of the sample key field in a second historical time period, and determining a first training prediction result of the sample key field based on the text-sending click data;

Specifically, the prediction model generation device may obtain the text-sending click data of the sample key field in a second historical time period. The second historical time period is a historical time period after the set time, and its duration may also be set according to the requirements of actual model generation. Continuing the above example, with the third day before the current time as the set time, the 3 days after the set time may be used as the second historical time period. The text-sending click data may include at least one of the number of text-sending clicks, the ranking of the number of text-sending clicks, and the rising trend of the number of text-sending clicks.
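The example above (a set time T three days before the current day, a first period of days up to T, a second period of days after T) can be sketched as date arithmetic. This is a minimal sketch, not the patent's implementation: the function name `history_windows`, day-level granularity, and inclusive boundaries (first period T-m to T, second period T+1 to T+q) are assumptions chosen for illustration.

```python
from datetime import date, timedelta


def history_windows(today, set_offset_days=3, m=7, q=3):
    """Return (set_time, first_period, second_period).

    Assumptions (not specified by the source): time is counted in whole
    days, and both periods are inclusive (start, end) date pairs.
    """
    set_time = today - timedelta(days=set_offset_days)          # T
    first_period = (set_time - timedelta(days=m), set_time)     # T-m .. T
    second_period = (set_time + timedelta(days=1),              # T+1 .. T+q
                     set_time + timedelta(days=q))
    return set_time, first_period, second_period


set_time, first_period, second_period = history_windows(date(2021, 5, 25))
```

With the default offsets, the second period ends on the current day, so every click counted for the label is already known when training starts.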

S103, initializing a prediction model, taking the training characteristic data as model input data and the first training prediction result as model output data, and training the prediction model to obtain a trained hot field prediction model;

specifically, the prediction model generation device may initialize the prediction model, including determining a model type of the prediction model, establishing a basic architecture of the prediction model, and the like, and may train the prediction model by using the training feature data as model input data and the first training prediction result as model output data, so as to obtain model parameters in the prediction model. It can be understood that the prediction model generation apparatus may train the prediction model by using the training feature data and the first training prediction result respectively corresponding to all the sample key fields in the sample key field set, or may select the training feature data and the first training prediction result respectively corresponding to some sample key fields in the sample key field set, train the prediction model, and may select the training feature data and the first training prediction result according to actual requirements.

The prediction model generation device can substitute the model parameters into the prediction model to obtain a trained hot field prediction model. The hot field prediction model can be used to predict whether a target key field will be a hot key field in a future time period; this forward-looking prediction of hot key fields relies mainly on how the target key field's features behaved in a historical time period.
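Steps S101 to S103 can be illustrated with a small training loop. The source leaves the model type open, so the sketch below swaps in a plain logistic regression trained by batch gradient descent purely as a stand-in; the function names, the two-dimensional toy features, and the 0/1 labels (1 for a hot key field) are all assumptions.

```python
import math


def train_logistic(features, labels, epochs=2000, lr=0.5):
    """Fit (weights, bias) by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features   # "initializing" the stand-in model
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y   # predicted prob minus label
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_hot(model, x):
    """Return True if the stand-in model scores x as a hot key field."""
    weights, bias = model
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5


# Toy data: one feature vector per sample key field (first period),
# one 0/1 label per sample key field (derived from the second period).
X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
model = train_logistic(X, y)
```

Any classifier that accepts a feature vector and returns a hot or non-hot decision could take the place of `train_logistic` here.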

Referring to fig. 3, a flow chart of a hot field prediction model generation method is provided in the present embodiment. As shown in fig. 3, the method may include the following steps S201 to S206.

S201, counting the text contents released in the third history time period, and extracting sample key fields in the text contents;

Specifically, the prediction model generation device may count the text-sending content published in a third history time period. The third history time period may specifically be a history period, which may be a fixed time period or may be set manually, for example, a month or a year, and the third history time period comprises the first history time period and the second history time period. The prediction model generation device may count all the text-sending content published on the content platform in the third history time period and extract a sample key field set from that content. The sample key field set may include one or more sample key fields, and the sample key fields may be keywords in the text-sending content. It will be appreciated that different text-sending content may carry different sample key fields, and that each item of text-sending content may correspond to one or more sample key fields.
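Step S201 amounts to a filter-and-collect pass over the published content. The sketch below assumes each content record already carries its keywords and a publication date; the record layout (`published`, `key_fields`) and the function name are hypothetical, and the source does not say how keywords are extracted from the content itself.

```python
from datetime import date


def extract_sample_key_fields(contents, period_start, period_end):
    """Collect the key fields of content published in [period_start, period_end]."""
    key_fields = set()
    for item in contents:
        if period_start <= item["published"] <= period_end:
            key_fields.update(item["key_fields"])   # one item may have several
    return key_fields


contents = [
    {"published": date(2021, 5, 2), "key_fields": ["marathon", "sports"]},
    {"published": date(2021, 5, 20), "key_fields": ["sports"]},
    {"published": date(2021, 3, 1), "key_fields": ["film"]},   # outside period
]
sample_key_fields = extract_sample_key_fields(
    contents, date(2021, 4, 25), date(2021, 5, 25)
)
```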

S202, acquiring training characteristic data of the sample key field in a first historical time period;

S203, obtaining text-sending click data of the sample key field in a second historical time period, and determining a first training prediction result of the sample key field based on the text-sending click data;

Referring to fig. 4, an exemplary diagram of coordinate axes of a history time period is provided for the embodiment of the present application. Fig. 4 shows an exemplary distribution of historical time periods, where one day may be the unit of time. The third historical time period is O to T+q, the first historical time period is T-m to T, and the second historical time period is T+1 to T+q, where T is the set time, q is greater than or equal to 1, m is greater than or equal to 0, and T+q may be the current day or a day before the current day.

As can be seen from fig. 4, the third historical time period is a time period in a larger range before the current day, and is specifically used for screening out a sample key field that needs to be subjected to model training, the first historical time period and the second historical time period are specifically distributed according to set time, and the set time is generally selected at a time point closer to the current day, so that characteristic data selection of the sample key field and a prediction result are representative, and accuracy of prediction model training can be further ensured.

S204, initializing a prediction model, taking the training characteristic data as model input data and the first training prediction result as model output data, and training the prediction model to obtain a trained hot field prediction model;

The prediction model generation device can substitute the model parameters into the prediction model to obtain a trained hot field prediction model, the hot field is used for indicating that any key field belongs to a hot key field in a future time period, the hot field prediction model can be used for predicting whether a target key field belongs to the hot key field in the future time period, and the prospective prediction of the hot key words is realized mainly according to the characteristic expression of the target key field in a historical time period.

Referring to fig. 5, an exemplary diagram of the generation and use of a prediction model is provided for the embodiment of the present application. As shown in fig. 5, the prediction model generation device may obtain training feature data of a sample key field between T-m and T and text-sending click data between T+1 and T+q, where T-n denotes a day between T-m and T (n is less than or equal to m) and T+p denotes a day between T+1 and T+q (p is less than or equal to q). The device determines a first training prediction result of the sample key field from the text-sending click data, and then trains the prediction model with the training feature data as model input data and the first training prediction result as model output data, finally obtaining the hot field prediction model.

S205, acquiring the input target key field, and acquiring field characteristic data of the target key field in a fourth historical time period;

Specifically, the prediction model generation device may obtain an input target key field. The target key field may be a key field that an administrator of the content platform selects to push to authors, or a key field that an author in the content platform selects for popularity prediction. For a target key field input by an author, the content platform may provide an input interface for the hot field prediction model, and the author may enter the target key field in that interface from the author terminal device.

The prediction model generation device may further obtain field feature data of the target key field in a fourth historical time period, where the fourth historical time period is a historical time period before the current time, a duration of the fourth historical time period is the same as a duration of the first historical time period, the field feature data may specifically be feature data of the target key field, and specifically may include at least one of a basic feature, a trend feature, and a related field feature of the target key field in the fourth historical time period, where relevant descriptions of the basic feature, the trend feature, and the related field feature are described in training feature data of the sample key field, and are not described herein again.

S206, inputting the field characteristic data into the hot field prediction model to obtain a target prediction result of the target key field;

specifically, the prediction model generation device may input the field feature data into the hot field prediction model to obtain a target prediction result of the target key field. Referring to fig. 5 again, after the hot field prediction model is obtained by training the prediction model, the hot field prediction model can implement hot prediction on the input key field, when the input target key field is obtained, field feature data of the target key field in a fourth historical time period can be obtained, the field feature data is used as model input data of the hot field prediction model, a target prediction result of the target key field is obtained by calculation, and the prediction model generation device can output and display the target prediction result.
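Steps S205 and S206 can be sketched as follows. The source only states that the fourth period lies before the current time and has the same duration as the first period; ending it on the day before the current day is an assumption. The `model` and `feature_fn` arguments are hypothetical callables standing in for the trained hot field prediction model and the feature extraction.

```python
from datetime import date, timedelta


def fourth_period(today, first_period_days):
    """Inclusive (start, end) window with the same length as the first period.

    Assumption: the window ends on the last full day before `today`.
    """
    end = today - timedelta(days=1)
    start = end - timedelta(days=first_period_days - 1)
    return start, end


def predict_target(model, feature_fn, key_field, today, first_period_days):
    """Extract the target key field's features and run the trained model."""
    start, end = fourth_period(today, first_period_days)
    field_features = feature_fn(key_field, start, end)
    return model(field_features)


# Stubs for illustration only: a fixed feature vector and a threshold "model".
result = predict_target(
    model=lambda feats: sum(feats) > 1.0,
    feature_fn=lambda field, start, end: [0.7, 0.6],
    key_field="marathon",
    today=date(2021, 5, 25),
    first_period_days=7,
)
```

Keeping the fourth period the same length as the first means the model sees inputs on the same scale at prediction time as during training.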

In the embodiment of the application, a hot spot field prediction model is generated through the characteristics of the sample key field in the historical time period before the set time and the text clicking data in the historical time period after the set time, so that the hot spot field can be determined, the creation inspiration and the creation direction are provided for an author, the production efficiency of the hot spot text content is improved, and the expansion of a content platform is ensured; the set time is usually selected to be closer to the time point of the current day, so that the characteristic data selection of the key field of the sample and the prediction result are representative, and the accuracy of the prediction model training can be further ensured; the hot field prediction model can be used for predicting whether the target key field belongs to the hot key field in a future time period, and the prospective prediction of the hot key words is realized mainly according to the characteristic expression of the target key field in a historical time period.

Please refer to fig. 6, which provides a schematic flow chart of training feature data acquisition according to an embodiment of the present application. As shown in fig. 6, the embodiment of the present application mainly explains a specific implementation process of S101 and S202, and may include steps S301 to S304.

S301, obtaining basic characteristics of a sample key field in a first historical time period;

Specifically, the prediction model generating device may obtain the basic feature of the sample key field in the first history time period. The basic feature represents the text-sending behavior feature of the sample key field, which may specifically include: the number of authors, the author level, the number of author fans, and the multimedia type (e.g., text type, video type, picture type, etc.) of the first text-sending content published for the sample key field in the first history time period; the number of items of first text-sending content; and the click volume, within the first history time period, of the second text-sending content published for the sample key field in the third history time period. It is understood that the first text-sending content refers to the text-sending content associated with the sample key field published to the content platform within the first history time period, and the second text-sending content refers to the text-sending content associated with the sample key field published to the content platform within the third history time period. Because the first history time period lies within the third history time period, the first text-sending content is included in the second text-sending content.

S302, acquiring trend characteristics of the sample key field in a first historical time period;

specifically, the prediction model generation device may obtain a trend feature of the sample key field in a first history time period, where the trend feature is used to represent a browsing volume feature of the text content associated with the sample key field, and the browsing volume feature may specifically include a total number of days in which the click volume of the second text content on a single day in the first history time period belongs to a positive sample, a rise of the click volume of the second text content in the first history time period, and the like.

S303, acquiring related field characteristics of the sample key field in a first historical time period;

Specifically, the prediction model generating device may obtain the related field feature of the sample key field in the first history time period. The related field feature represents the text-sending behavior feature and the browsing amount feature of the related fields of the sample key field. A related field and the sample key field belong to the same content category, and the content category may be the first-level category to which the sample key field belongs, for example, entertainment, literature, or sports. The text-sending behavior feature of the related fields may include: the number of authors, the author level, the number of author fans, and the multimedia type (e.g., text type, video type, picture type, etc.) of the third text-sending content published for the related fields in the first history time period; the number of items of third text-sending content; and the click volume, within the first history time period, of the fourth text-sending content published for the related fields in the third history time period. The browsing amount feature of the related fields may include the total number of days in the first history time period on which the single-day click volume of the fourth text-sending content belongs to a positive sample, the rise in the click volume of the fourth text-sending content in the first history time period, and the like. It is understood that the third text-sending content refers to the text-sending content associated with the related fields published to the content platform within the first history time period, and the fourth text-sending content refers to the text-sending content associated with the related fields published to the content platform within the third history time period. Because the first history time period lies within the third history time period, the third text-sending content is included in the fourth text-sending content.

It should be noted that a positive sample in the embodiment of the present application refers to sample text-sending content or related text-sending content whose click volume satisfies a set condition, where the set condition may specifically be that the click volume is greater than or equal to a click number threshold.

S304, generating training feature data of the sample key field in a first historical time period based on the basic feature, the trend feature and the related field feature;

Specifically, the prediction model generating device may generate training feature data of the sample key field in the first historical time period based on the basic feature, the trend feature and the related field feature. It is understood that, when there are a plurality of sample key fields, each sample key field has its own basic feature, trend feature and related field feature.

In the embodiment of the present application, steps S301 to S303 may be executed simultaneously, or their execution order may be changed according to actual requirements.

In the embodiment of the application, the basic characteristics, the trend characteristics and the related field characteristics of the key field of the sample are obtained, so that the description of the performance of the key field of the sample in a certain time period is ensured, the accuracy of the hot field prediction model training is ensured, and the accuracy of hot prediction of the key field is further improved.
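Steps S301 to S304 can be sketched as building three small feature groups and concatenating them. The trend feature below follows the description (days whose single-day click volume reaches the threshold, plus the rise in click volume); measuring the rise as last day minus first day is an assumption, as are the function names and the toy numbers.

```python
def trend_features(daily_clicks, click_threshold):
    """daily_clicks: single-day click volumes in the first period, oldest first."""
    # Days whose click volume counts as a "positive sample".
    positive_days = sum(1 for c in daily_clicks if c >= click_threshold)
    # Assumption: the rise is measured as last day minus first day.
    rise = daily_clicks[-1] - daily_clicks[0]
    return [positive_days, rise]


def build_training_features(basic, trend, related):
    """Concatenate the three feature groups into one vector per sample key field."""
    return list(basic) + list(trend) + list(related)


# Toy values: basic = [authors, items of content], related = same two trend
# statistics computed for the related fields of the same content category.
basic = [12, 30]
trend = trend_features([10, 50, 120, 200], click_threshold=100)
related = trend_features([80, 90, 100, 95], click_threshold=100)
training_row = build_training_features(basic, trend, related)
```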

Please refer to fig. 7, which provides a schematic flowchart of the first training prediction result acquisition according to the embodiment of the present application. As shown in fig. 7, the embodiment of the present application mainly explains a specific implementation process of S102 and S203, and may include step S401 to step S402.

S401, acquiring the number of text clicks, the ranking of the number of text clicks and the rising trend of the number of text clicks of the sample key field in a second historical time period;

specifically, the prediction model generation device may obtain the number of text clicks, the ranking of the number of text clicks, and the rising trend of the number of text clicks of the sample key field in the second historical time period. The number of text clicks may specifically represent the click rate, in the second historical time period, of the fifth text content published for the sample key field in the third historical time period. The ranking of the number of text clicks may represent the position of the click rate of the fifth text content in the click rate ranking for the second historical time period; the click rate ranking includes the click rate of the text content of each sample key field in the second historical time period and may, for example, be ordered from the most clicks to the fewest. The rising trend of the number of text clicks may represent the average rise of the click rate of the fifth text content in the second historical time period.

S402, determining a first training prediction result of the sample key field based on the number of text clicks, the ranking of the number of text clicks and the rising trend of the number of text clicks;

specifically, the prediction model generation device may determine a first training prediction result of the sample key field based on the number of text clicks, the ranking of the number of text clicks, and the rising trend of the number of text clicks. The first training prediction result indicates whether the sample key field is a hot key field or a non-hot key field: a hot key field is a key field predicted to be a hot field in a future time period, and a non-hot key field is a key field predicted to be a non-hot field in the future time period. Optionally, the first training prediction result of the sample key field may be determined according to any of the following cases:

if the number of text clicks is greater than or equal to a number threshold and the ranking of the number of text clicks is in a first sorting range, determining that the first training prediction result of the sample key field is a hot key field;

if the number of text clicks is greater than or equal to the number threshold, the ranking of the number of text clicks is in a second sorting range, and the rising trend of the number of text clicks is greater than or equal to a rising threshold, determining that the first training prediction result of the sample key field is a hot key field;

if the number of text clicks is smaller than the number threshold, determining that the first training prediction result of the sample key field is a non-hot key field;

if the number of text clicks is greater than or equal to the number threshold, the ranking of the number of text clicks is in the second sorting range, and the rising trend of the number of text clicks is smaller than the rising threshold, determining that the first training prediction result of the sample key field is a non-hot key field;

and if the number of text clicks is greater than or equal to the number threshold and the ranking of the number of text clicks is in a third sorting range, determining that the first training prediction result of the sample key field is a non-hot key field.

It should be noted that the first sorting range is higher than the second sorting range, and the second sorting range is higher than the third sorting range. For example, the first sorting range may be the top 10 positions in the click rate ranking, the second sorting range may be positions 11 to 20, and the third sorting range may be position 21 and beyond.
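The labelling in S401 and S402 can be sketched as follows. This is an illustration under stated assumptions rather than the application's implementation: the rising trend is assumed to be the mean day-over-day change, the default range boundaries follow the example above (top 10, positions 11 to 20), and the function names, threshold values and click figures are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def click_statistics(
    daily_clicks: Dict[str, Sequence[int]],
) -> Dict[str, Tuple[int, int, float]]:
    """Return (total clicks, rank, average daily rise) per sample key field."""
    totals = {field: sum(days) for field, days in daily_clicks.items()}
    # Rank 1 is the field with the most clicks (ordered from most to fewest).
    ordered = sorted(totals, key=lambda field: totals[field], reverse=True)
    ranks = {field: position + 1 for position, field in enumerate(ordered)}
    stats = {}
    for field, days in daily_clicks.items():
        steps = [later - earlier for earlier, later in zip(days, days[1:])]
        # Assumption: the rising trend is the mean day-over-day change.
        rise = sum(steps) / len(steps) if steps else 0.0
        stats[field] = (totals[field], ranks[field], rise)
    return stats


def is_hot_key_field(
    clicks: int,
    rank: int,
    rise: float,
    click_threshold: int,
    rise_threshold: float,
    first_range_max: int = 10,
    second_range_max: int = 20,
) -> bool:
    """Apply the five cases listed above; True means hot key field."""
    if clicks < click_threshold:
        return False  # too few clicks, whatever the rank
    if rank <= first_range_max:
        return True  # first sorting range
    if rank <= second_range_max:
        return rise >= rise_threshold  # second sorting range
    return False  # third sorting range


# Hypothetical daily clicks in the second historical time period.
daily = {
    "a": [500, 600, 700],
    "b": [300, 300, 300],
    "c": [100, 250, 400],
    "d": [10, 10, 10],
    "e": [50, 50, 50],
}
stats = click_statistics(daily)
# Small range boundaries are used so that five fields cover every case.
labels = {
    field: is_hot_key_field(
        *stats[field],
        click_threshold=100,
        rise_threshold=50,
        first_range_max=1,
        second_range_max=3,
    )
    for field in daily
}
```

With these figures, field `a` is hot by rank alone, `c` is hot because it sits in the second range and is rising fast enough, `b` is in the second range but flat, `e` falls in the third range, and `d` is below the click threshold.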

In the embodiment of the application, the accuracy of the first training prediction result of the sample key field is ensured by using the number of text clicks, the ranking of the number of text clicks and the rising trend of the number of text clicks of the sample key field. The first training prediction result serves as the prediction basis of the sample key field, which ensures the accuracy of the subsequent screening of candidate prediction models and improves the accuracy of hot prediction for key fields.

Referring to fig. 8, a schematic flow chart of predictive model determination is provided in the embodiment of the present application. As shown in fig. 8, the embodiment of the present application mainly explains a specific implementation process of S103 and S204, and may include steps S501 to S504.

S501, initializing candidate prediction models of multiple types, taking the training characteristic data as model input data and the first training prediction result as model output data, and training the candidate prediction models of the multiple types to obtain multiple trained candidate prediction models;

specifically, the prediction model generation device may initialize multiple types of candidate prediction models, including determining model types of the candidate prediction models, establishing basic architectures of the candidate prediction models, and the like, and train the multiple types of candidate prediction models by using the training feature data as model input data and the first training prediction result as model output data to obtain model parameters of each type of candidate prediction models in the multiple types of candidate prediction models.

Because the prediction models of different model types have their own characteristics, in the actual model training process, the same model input data and model output data can be used to train the prediction models of different types respectively, specifically, the prediction model can be a plurality of types of candidate prediction models, and the candidate prediction models can include a linear model, a tree model, a deep neural network and the like.

The prediction model generation device may substitute the model parameters into the multiple types of candidate prediction models respectively to obtain multiple trained candidate prediction models, and it is understood that the prediction model generation device may obtain the model parameters corresponding to each candidate prediction model when training the multiple types of candidate prediction models respectively, and further may substitute the model parameters into the corresponding candidate prediction models to finally obtain the multiple trained candidate prediction models.

S502, inputting the training characteristic data into each candidate prediction model of the trained candidate prediction models respectively to obtain a second training prediction result corresponding to each candidate prediction model;

specifically, the prediction model generation device may input the training feature data again, as model input data, to each trained candidate prediction model, and may obtain the second training prediction result of the sample key field output by each candidate prediction model, where the second training prediction result is the prediction result actually calculated by each trained candidate prediction model.

S503, matching the second training prediction result based on the first training prediction result to obtain the prediction accuracy corresponding to each candidate prediction model;

specifically, the prediction model generation device may match the second training prediction result against the first training prediction result. It may be understood that the first training prediction result is determined from the number of text clicks, the ranking of the number of text clicks and the rising trend of the number of text clicks of the sample key field in the second historical time period, and is therefore used as the accurate prediction result of the sample key field, against which the second training prediction result of each candidate prediction model is matched to obtain its prediction accuracy.

S504, selecting a candidate prediction model with the highest prediction accuracy from the trained candidate prediction models as a hot field prediction model;

Please refer to fig. 9, which provides an exemplary illustration of the generation and use of the prediction model according to the embodiment of the present application. As shown in fig. 9, fig. 9 adds a screening process of candidate prediction models to fig. 5. The sample key field 1 and the sample key field 2 shown in fig. 9 may be the same sample key field, and the first training prediction result of the sample key field is determined from the text-sending click data. The prediction model generation apparatus may use the training feature data as the model input data of each of the multiple types of candidate prediction models, use the first training prediction result as the model output data of each of them, and train the multiple types of candidate prediction models respectively to obtain multiple trained candidate prediction models.

The prediction model generation device may input the training feature data to each candidate prediction model of the trained plurality of candidate prediction models again to obtain a second training prediction result corresponding to each candidate prediction model, match the second training prediction result based on the first training prediction result to obtain a prediction accuracy corresponding to each candidate prediction model, and finally select a candidate prediction model with the highest prediction accuracy from the trained plurality of candidate prediction models as the hot field prediction model.
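The screening in S502 to S504 can be sketched as follows, assuming each trained candidate is available as a callable that maps a feature vector to a hot or non-hot prediction. The three candidates below are hand-written stand-ins for illustration only, not trained models, and all names and numbers are hypothetical; training itself (S501) is outside this sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Model = Callable[[Sequence[float]], bool]


def prediction_accuracy(
    model: Model, features: List[Sequence[float]], first_results: List[bool]
) -> float:
    """Share of second training prediction results that match the first ones."""
    second_results = [model(x) for x in features]
    matches = sum(1 for s, f in zip(second_results, first_results) if s == f)
    return matches / len(first_results)


def select_hot_field_model(
    candidates: Dict[str, Model],
    features: List[Sequence[float]],
    first_results: List[bool],
) -> Tuple[str, Dict[str, float]]:
    """Return the name of the most accurate candidate and all accuracies."""
    scores = {
        name: prediction_accuracy(model, features, first_results)
        for name, model in candidates.items()
    }
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Hypothetical training feature data: [click rise, positive-sample days].
features = [[0.9, 5], [0.1, 1], [0.6, 4], [0.2, 6]]
first_results = [True, False, True, False]

# Stand-ins for already trained candidates of different model types.
candidates: Dict[str, Model] = {
    "linear": lambda x: 0.5 * x[0] + 0.1 * x[1] >= 0.6,
    "tree": lambda x: x[0] >= 0.5,
    "always_non_hot": lambda x: False,
}
best, scores = select_hot_field_model(candidates, features, first_results)
```

Here accuracy is measured on the same training feature data that was used for training, as the embodiment describes; evaluating on held-out data would be a different design choice that the application does not discuss.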

In the embodiment of the application, a plurality of candidate prediction models are obtained by providing a plurality of types of prediction models for model training, a second training prediction result can be obtained by using training feature data of a key field of the same sample for model calculation, and then a hot field prediction model with the highest prediction accuracy can be selected from the candidate prediction models according to the matching degree between the first training prediction result and the second training prediction result, so that the hot field prediction accuracy of the hot field prediction model on the key field is effectively ensured.

Based on the system architecture shown in fig. 1, the prediction model generation apparatus provided in the embodiment of the present application will be described in detail below with reference to fig. 10 to 14. It should be noted that the prediction model generation apparatus shown in fig. 10 to 14 is used for executing the methods of the embodiments shown in fig. 2 to 9 of the present application. For convenience of description, only the portions related to the embodiment of the present application are shown; for specific technical details that are not disclosed, please refer to the embodiments shown in fig. 2 to 9 of the present application.

Referring to fig. 10, a schematic structural diagram of a prediction model generation apparatus is provided in the embodiment of the present application. As shown in fig. 10, the prediction model generation apparatus 1 according to the embodiment of the present application may include: a training feature acquisition unit 11, a training result determination unit 12, and a model generation unit 13.

The training feature obtaining unit 11 is configured to obtain training feature data of the sample key field in a first historical time period;

a training result determining unit 12, configured to obtain text-sending click data of the sample key field in a second historical time period, and determine a first training prediction result of the sample key field based on the text-sending click data;

a model generating unit 13, configured to initialize a prediction model, use the training feature data as model input data, use the first training prediction result as model output data, and train the prediction model to obtain a trained hot field prediction model;

wherein the first history time period is a history time period before a set time, and the second history time period is a history time period after the set time.
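The relationship between the two time periods can be illustrated with a short sketch that splits dated click records at the set time. Treating the set time itself as the start of the second period is an assumption, as are the record layout, the function name and the dates.

```python
from datetime import date
from typing import List, Tuple

Record = Tuple[date, int]  # (publication or click date, click count)


def split_by_set_time(
    records: List[Record], set_time: date
) -> Tuple[List[Record], List[Record]]:
    """Split records into the first (before) and second (from set time on) periods."""
    first_period = [r for r in records if r[0] < set_time]
    second_period = [r for r in records if r[0] >= set_time]
    return first_period, second_period


# Hypothetical records for one sample key field.
records = [
    (date(2021, 5, 1), 120),
    (date(2021, 5, 10), 90),
    (date(2021, 5, 20), 300),
]
first_period, second_period = split_by_set_time(records, date(2021, 5, 10))
```

Features would then be derived from `first_period` and the first training prediction result from `second_period`, so the model learns to map earlier behavior to later popularity.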

Referring to fig. 11, a schematic structural diagram of a prediction model generation apparatus is provided in the embodiment of the present application. As shown in fig. 11, the prediction model generation apparatus 1 according to the embodiment of the present application may include: a training feature acquisition unit 11, a training result determination unit 12, a model generation unit 13, a sample field acquisition unit 14, a field feature acquisition unit 15, and a prediction result acquisition unit 16.

The sample field acquisition unit 14 is used for counting the text contents released in the third history time period and extracting sample key fields in the text contents;

wherein the third historical time period comprises the first historical time period and the second historical time period.

specifically, please refer to fig. 12, which provides a schematic structural diagram of a training feature obtaining unit according to an embodiment of the present application. As shown in fig. 12, the training feature obtaining unit 11 may include:

a basic feature obtaining subunit 111, configured to obtain a basic feature of the sample key field in the first historical time period;

wherein the basic features are used for representing the text behavior features of the sample key fields.

A trend feature obtaining subunit 112, configured to obtain a trend feature of the sample key field in a first historical time period;

wherein the trend feature is used to represent a browsing volume feature of the textual content associated with the sample key field.

A field characteristic obtaining subunit 113, configured to obtain relevant field characteristics of the sample key field in a first historical time period;

the related field characteristics are used for representing the text sending behavior characteristics of the related fields of the sample key fields and the browsing volume characteristics of the related fields.

A training feature generation subunit 114, configured to generate training feature data of the sample key field in the first historical time period based on the base feature, the trend feature and the related field feature.

specifically, please refer to fig. 13 together, which provides a schematic structural diagram of the training result determining unit according to the embodiment of the present application. As shown in fig. 13, the training result determining unit 12 may include:

the click data acquisition subunit 121 is configured to acquire the number of text clicks, the ranking of the number of text clicks, and the rising trend of the number of text clicks of the sample key field in a second historical time period;

a first result determination subunit 122, configured to determine a first training prediction result of the sample key field based on the number of text clicks, the ranking of the number of text clicks, and the rising trend of the number of text clicks;

in a specific implementation, the first result determination subunit 122 is specifically configured to:

the first sorting range is higher than the second sorting range.

if the number of text clicks is greater than or equal to a number threshold and the ranking of the number of text clicks is in a third sorting range, determining that a first training prediction result of the sample key field is a non-hot key field;

the second sorting range is higher than the third sorting range.

specifically, please refer to fig. 14, which provides a schematic structural diagram of the model generating unit according to the embodiment of the present application. As shown in fig. 14, the prediction model is a plurality of types of candidate prediction models, and the model generation unit 13 may include:

a candidate model generation subunit 131, configured to initialize multiple types of candidate prediction models, use the training feature data as model input data, use the first training prediction result as model output data, and train the multiple types of candidate prediction models to obtain multiple trained candidate prediction models;

a second result determining subunit 132, configured to input the training feature data to each candidate prediction model of the trained multiple candidate prediction models, respectively, so as to obtain a second training prediction result corresponding to each candidate prediction model;

an accuracy obtaining subunit 133, configured to match the second training prediction result based on the first training prediction result to obtain prediction accuracy corresponding to each candidate prediction model;

a prediction model generation subunit 134, configured to select, as a hot field prediction model, a candidate prediction model with the highest prediction accuracy from the trained multiple candidate prediction models.

A field characteristic obtaining unit 15, configured to obtain an input target key field, and obtain field characteristic data of the target key field in a fourth historical time period;

a prediction result obtaining unit 16, configured to input the field feature data into the hot field prediction model to obtain a target prediction result of the target key field.

In the embodiment of the application, a hot field prediction model is generated from the features of the sample key field in the historical time period before the set time and the text-sending click data in the historical time period after the set time, so that hot fields can be determined, authors are provided with creative inspiration and direction, the production efficiency of hot text content is improved, and the expansion of the content platform is ensured. The set time is usually selected to be a time point close to the current day, so that the selected feature data and the prediction results of the sample key field are representative, which further ensures the accuracy of prediction model training. By acquiring the basic features, the trend features and the related field features of the sample key field, the performance of the sample key field within a certain time period is described, which ensures the accuracy of hot field prediction model training and further improves the accuracy of hot prediction for key fields. The accuracy of the first training prediction result of the sample key field is ensured by using the number of text clicks, the ranking of the number of text clicks and the rising trend of the number of text clicks of the sample key field, and the first training prediction result serves as the prediction basis of the sample key field, which ensures the accuracy of the subsequent screening of candidate prediction models and improves the accuracy of hot prediction for key fields. Multiple candidate prediction models are obtained by training multiple types of prediction models; second training prediction results are obtained by running each model on the training feature data of the same sample key field; and the hot field prediction model with the highest prediction accuracy is then selected from the candidate prediction models according to the degree of matching between the first and second training prediction results, which effectively ensures the accuracy of the hot field prediction model's hot prediction for key fields. The hot field prediction model can be used to predict whether a target key field will be a hot key field in a future time period, so that forward-looking prediction of hot key words is achieved mainly from the feature performance of the target key field in a historical time period.

An embodiment of the present application further provides a computer storage medium, where the computer storage medium may store a plurality of program instructions, where the program instructions are suitable for being loaded by a processor and executing the method steps in the embodiments shown in fig. 2 to 9, and a specific execution process may refer to specific descriptions of the embodiments shown in fig. 2 to 9, which is not described herein again.

Referring to fig. 15, a schematic structural diagram of a computer device is provided according to an embodiment of the present application. As shown in fig. 15, the computer device 1000 may include: at least one processor 1001 (such as a CPU), at least one network interface 1004, an input/output interface 1003, a memory 1005, and at least one communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the processor 1001. As shown in fig. 15, the memory 1005, which is a kind of computer storage medium, may include an operating system, a network communication module, an input/output interface module, and a model generation application program.

In the computer apparatus 1000 shown in fig. 15, the input/output interface 1003 is mainly used as an interface for providing input for a user, acquiring data input by the user, and presenting the calculated data to the user.

In one embodiment, processor 1001 may be configured to invoke a model generation application stored in memory 1005 and perform the following operations in particular:

Optionally, before performing the step of obtaining the training feature data of the sample key field in the first historical time period, the processor 1001 further performs the following operations:

counting the text contents released in the third history time period, and extracting sample key fields in the text contents;

the third historical time period includes the first historical time period and the second historical time period.

Optionally, when the processor 1001 acquires training feature data of the sample key field in the first historical time period, specifically perform the following operations:

obtaining basic characteristics of a sample key field in a first historical time period, wherein the basic characteristics are used for representing the text-sending behavior characteristics of the sample key field;

acquiring trend characteristics of the sample key field in a first historical time period, wherein the trend characteristics are used for representing browsing volume characteristics of the text content associated with the sample key field;

acquiring related field characteristics of the sample key field in a first historical time period, wherein the related field characteristics are used for expressing the text sending behavior characteristics of the related field of the sample key field and the browsing volume characteristics of the related field;

generating training feature data of the sample key field in a first historical time period based on the base features, the trend features and the related field features.

Optionally, when the processor 1001 obtains the text-sending click data of the sample key field in the second historical time period and determines the first training prediction result of the sample key field based on the text-sending click data, the following operations are specifically performed:

acquiring the number of text clicks, the ranking of the number of text clicks and the rising trend of the number of text clicks of the sample key field in a second historical time period;

and determining a first training prediction result of the sample key field based on the number of text clicks, the ranking of the number of text clicks and the rising trend of the number of text clicks.

Optionally, when the processor 1001 determines the first training prediction result of the sample key field based on the number of text clicks, the ranking of the number of text clicks, and the rising trend of the number of text clicks, the following operations are specifically performed:

the first sorting range is higher than the second sorting range.

the second sorting range is higher than the third sorting range.

Optionally, the prediction model is a candidate prediction model of multiple types;

when the processor 1001 executes an initialization prediction model, uses the training feature data as model input data, and uses the first training prediction result as model output data, and trains the prediction model to obtain a trained hot field prediction model, the following operations are specifically executed:

initializing multiple types of candidate prediction models, taking the training characteristic data as model input data and the first training prediction result as model output data, and training the multiple types of candidate prediction models to obtain multiple trained candidate prediction models;

respectively inputting the training characteristic data to each candidate prediction model in the trained multiple candidate prediction models to obtain a second training prediction result corresponding to each candidate prediction model;

matching the second training prediction result based on the first training prediction result to obtain the prediction accuracy corresponding to each candidate prediction model;

and selecting the candidate prediction model with the highest prediction accuracy from the trained candidate prediction models as the hot field prediction model.

Optionally, the processor 1001 further performs the following operations:

acquiring the input target key field, and acquiring field characteristic data of the target key field in a fourth historical time period;

and inputting the field characteristic data into the hot field prediction model to obtain a target prediction result of the target key field.

Optionally, the candidate prediction model includes at least one of a linear model, a tree model, and a deep neural network.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above disclosure only illustrates preferred embodiments of the present application and is not to be construed as limiting the scope of the present application; equivalent variations and modifications made to the present application therefore still fall within its scope.

Claims

1. A hot field prediction model generation method is characterized by comprising the following steps:

2. The method of claim 1, wherein before obtaining the training feature data of the sample key field in the first historical time period, the method further comprises:

3. The method of claim 1, wherein obtaining training feature data for the sample key field over the first historical time period comprises:

4. The method of claim 1, wherein obtaining text-sending click data of the sample key field in a second historical time period and determining a first training prediction result of the sample key field based on the text-sending click data comprises:

5. The method of claim 4, wherein determining the first training prediction result of the sample key field based on the number of text clicks, the ranking of the number of text clicks, and the rising trend of the number of text clicks comprises:

the first sorting range is higher than the second sorting range.

6. The method of claim 4, wherein determining the first training prediction result of the sample key field based on the number of text clicks, the ranking of the number of text clicks, and the rising trend of the number of text clicks comprises:

the second sorting range is higher than the third sorting range.

7. The method of claim 1, wherein the predictive model is a plurality of types of candidate predictive models;

the initializing the prediction model, taking the training characteristic data as model input data, taking the first training prediction result as model output data, and training the prediction model to obtain a trained hot field prediction model, includes:

8. The method of claim 1, further comprising:

9. The method of claim 7, wherein the candidate predictive model comprises at least one of a linear model, a tree model, and a deep neural network.

10. A prediction model generation apparatus, comprising:

11. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the method according to any of claims 1-9.

12. A computer device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the steps of the method according to any of claims 1-9.