WO2023124029A1 - Deep learning model training method and apparatus, and content recommendation method and apparatus - Google Patents

Deep learning model training method and apparatus, and content recommendation method and apparatus Download PDF

Info

Publication number
WO2023124029A1
WO2023124029A1 (PCT/CN2022/106805)
Authority
WO
WIPO (PCT)
Prior art keywords
deep learning
feature
target
learning model
network layer
Prior art date
Application number
PCT/CN2022/106805
Other languages
French (fr)
Chinese (zh)
Inventor
陈意超
Original Assignee
Beijing Baidu Netcom Science and Technology Co., Ltd. (北京百度网讯科技有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co., Ltd.
Publication of WO2023124029A1 publication Critical patent/WO2023124029A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Definitions

  • the present disclosure relates to the technical field of artificial intelligence, in particular to deep learning, intelligent recommendation and other technical fields, and more specifically, to a deep learning model training method, content recommendation method, device, electronic equipment, media, and program product.
  • the deep learning model can be used to recommend relevant content.
  • however, in order to obtain a good deep learning model through training, a lot of labor and time costs need to be invested, and there is a high technical threshold, which leads to low training efficiency of the deep learning model.
  • the present disclosure provides a training method of a deep learning model, a content recommendation method, a device, an electronic device, a storage medium, and a program product.
  • a method for training a deep learning model, including: obtaining a configuration file, wherein the configuration file includes model type data and candidate feature configuration data; selecting an initial network layer type and an initial network layer structure based on the model type data; obtaining an initial deep learning model based on the initial network layer type and the initial network layer structure; processing a first training sample based on the candidate feature configuration data to obtain first training feature data; training the initial deep learning model using the first training feature data; and obtaining a target deep learning model based on the trained initial deep learning model.
  • a method for recommending content, including: determining object feature data for a target object; for target content in at least one candidate content, determining content feature data for the target content; inputting the object feature data and the content feature data into a target deep learning model to obtain an output result, wherein the target deep learning model is generated by the method according to the present disclosure, and the output result represents the degree of interest of the target object in the target content; and in response to the output result meeting a preset condition, recommending the target content to the target object.
  • a training device for a deep learning model including: an acquisition module, a selection module, a first acquisition module, a first processing module, a first training module, and a second acquisition module.
  • An acquisition module, configured to acquire a configuration file, wherein the configuration file includes model type data and candidate feature configuration data; a selection module, configured to select an initial network layer type and an initial network layer structure based on the model type data; a first obtaining module, configured to obtain an initial deep learning model based on the initial network layer type and the initial network layer structure; a first processing module, configured to process a first training sample based on the candidate feature configuration data to obtain first training feature data; a first training module, configured to train the initial deep learning model using the first training feature data; and a second obtaining module, configured to obtain a target deep learning model based on the trained initial deep learning model.
  • a content recommendation device including: a first determination module, a second determination module, an input module and a recommendation module.
  • the first determination module is used to determine the object feature data for the target object;
  • the second determination module is used to determine the content feature data for the target content for the target content in at least one candidate content;
  • the input module is used to input the object feature data and the content feature data into the target deep learning model to obtain an output result, wherein the target deep learning model is generated by the device according to the present disclosure, and the output result represents the degree of interest of the target object in the target content;
  • a recommendation module, configured to recommend the target content to the target object in response to the output result meeting a preset condition.
  • an electronic device including: at least one processor and a memory communicatively connected to the at least one processor.
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the above-mentioned deep learning model training method and/or content recommendation method.
  • a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to execute the above-mentioned deep learning model training method and/or content recommendation method.
  • a computer program product including a computer program.
  • when the computer program is executed by a processor, the above-mentioned deep learning model training method and/or content recommendation method are implemented.
  • FIG. 1 schematically shows a system architecture for training a deep learning model and recommending content according to an embodiment of the present disclosure
  • Fig. 2 schematically shows a flow chart of a method for training a deep learning model according to an embodiment of the present disclosure
  • FIG. 3 schematically shows a flow chart of a method for training a deep learning model according to another embodiment of the present disclosure
  • Fig. 4 schematically shows a schematic diagram of a training method of a deep learning model according to an embodiment of the present disclosure
  • Fig. 5 schematically shows a schematic diagram of a content recommendation method according to an embodiment of the present disclosure
  • Fig. 6 schematically shows a block diagram of a training device for a deep learning model according to an embodiment of the present disclosure
  • Fig. 7 schematically shows a block diagram of a content recommendation device according to an embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device for performing deep learning model training and/or content recommendation to implement an embodiment of the present disclosure.
  • FIG. 1 schematically shows a system architecture of deep learning model training and content recommendation according to an embodiment of the present disclosure. It should be noted that FIG. 1 is only an example of a system architecture to which the embodiments of the present disclosure can be applied, so as to help those skilled in the art understand the technical content of the present disclosure, but it does not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments, or scenarios.
  • a system architecture 100 may include clients 101 , 102 , 103 , a network 104 and a server 105 .
  • the network 104 is used as a medium for providing communication links between the clients 101 , 102 , 103 and the server 105 .
  • Network 104 may include various connection types, such as wires, wireless communication links, or fiber optic cables, among others.
  • Users may use the clients 101, 102, 103 to interact with the server 105 over the network 104 to receive or send messages, and the like.
  • Clients 101, 102, and 103 can be installed with various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, social platform software, etc. (just for example).
  • Clients 101, 102, 103 may be various electronic devices with display screens and supporting web browsing, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, and the like.
  • the clients 101, 102, and 103 in the embodiments of the present disclosure may, for example, run applications.
  • the server 105 may be a server that provides various services, such as a background management server that provides support for websites browsed by users using the clients 101 , 102 , 103 (just an example).
  • the background management server can analyze and process received user requests and other data, and feed back processing results (such as webpages, information, or data, etc. obtained or generated according to user requests) to the client.
  • the server 105 may also be a cloud server, that is, the server 105 has a cloud computing function.
  • the deep learning model training method and/or the content recommendation method provided by the embodiment of the present disclosure may be executed by the server 105 .
  • the deep learning model training device and/or the content recommendation device provided by the embodiments of the present disclosure may be set in the server 105 .
  • the deep learning model training method and/or content recommendation method provided by the embodiments of the present disclosure may also be executed by a server or server cluster that is different from the server 105 and can communicate with the clients 101 , 102 , 103 and/or the server 105 .
  • the deep learning model training device and/or content recommendation device may also be set on a server or server cluster that is different from the server 105 and can communicate with the clients 101, 102, 103 and/or the server 105.
  • the server 105 can receive training samples from the clients 101, 102, 103 through the network 104 and use the training samples to train the deep learning model; the server 105 can then send the trained deep learning model to the clients 101, 102, 103 through the network 104, and the clients can use the trained deep learning model to recommend content.
  • the server 105 may also directly use the deep learning model to recommend content.
  • a method for training a deep learning model and a method for recommending content according to an exemplary embodiment of the present disclosure will be described below with reference to FIGS. 2 to 5 in conjunction with the system architecture of FIG. 1 .
  • the deep learning model training method and the content recommendation method of the embodiments of the present disclosure can be executed, for example, by the server shown in FIG. 1; the server shown in FIG. 1 is the same as or similar to the electronic device described below.
  • Fig. 2 schematically shows a flowchart of a method for training a deep learning model according to an embodiment of the present disclosure.
  • the deep learning model training method 200 of the embodiment of the present disclosure may include, for example, operation S210 to operation S260.
  • In operation S210, a configuration file is obtained, and the configuration file includes model type data and candidate feature configuration data.
  • In operation S220, an initial network layer type and an initial network layer structure are selected based on the model type data.
  • In operation S230, an initial deep learning model is obtained based on the initial network layer type and the initial network layer structure.
  • In operation S240, the first training sample is processed based on the candidate feature configuration data to obtain first training feature data.
  • In operation S250, the initial deep learning model is trained using the first training feature data.
  • In operation S260, a target deep learning model is obtained based on the trained initial deep learning model.
  • the configuration file includes model type data, and the model type data, for example, characterizes the model type of the initial deep learning model, and the model type includes, for example, a deep neural network (Deep Neural Networks, DNN) type.
  • the initial network layer type includes, for example, attention layer, fully connected layer, pooling layer and other types of layers, and the initial network layer type may also represent the connection relationship between each layer.
  • the initial network layer structure for example, characterizes the number of nodes in each layer.
  • the initial network layer type may include multiple optional initial network layer types
  • the initial network layer structure may include multiple optional initial network layer structures
  • based on the model type data, the initial network layer type and initial network layer structure required by the DNN model may be selected from the multiple optional initial network layer types and the multiple optional initial network layer structures.
  • For example, different initial network layer types and different initial network layer structures can be selected in turn to build initial deep learning models, and each built initial deep learning model is trained.
  • the candidate feature configuration data represents a processing method for the first training sample.
  • the candidate feature configuration data represents a feature type and a feature dimension for extracting feature data from the first training sample. Processing the first training sample based on the candidate feature configuration data can obtain the first training feature data suitable for training the initial deep learning model.
  • the candidate feature configuration data includes feature types and feature dimensions for the first training sample.
  • the feature type includes, for example, age, gender, content category and other features
  • the feature dimension is, for example, the dimension of a feature vector
  • the dimension of a feature vector includes, for example, 1*128 dimension, 1*256 dimension, and so on.
  • For example, when training the first initial deep learning model, for the first training sample used to train the model, the age and gender features are selected from features such as age, gender, and content category, and the 1*128 dimension is selected from the feature dimensions 1*128 and 1*256; the first training sample is then processed to obtain 1*128-dimensional first training feature data for age and gender.
  • For example, when training another initial deep learning model, the gender and content category features are selected from features such as age, gender, and content category, and the 1*256 dimension is selected from the feature dimensions 1*128 and 1*256; the first training sample is then processed to obtain 1*256-dimensional first training feature data for gender and content category.
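  • As an illustration only, the following minimal Python sketch shows how candidate feature configuration data of this kind (a list of feature types plus a target feature dimension) might drive the processing of one training sample into a fixed-dimension vector. The configuration keys, the hashing-trick projection, and the sample fields are assumptions for the sketch, not the disclosure's actual data format.

```python
import numpy as np

# Hypothetical candidate feature configuration data: which feature types to
# use and the target feature dimension (e.g. 1*128 or 1*256).
feature_config_c1 = {"feature_types": ["age", "gender"], "feature_dim": 128}

def build_training_features(sample: dict, config: dict) -> np.ndarray:
    """Project the selected raw features of one training sample into a
    fixed-dimension vector using a simple hashing trick."""
    vec = np.zeros(config["feature_dim"], dtype=np.float32)
    for name in config["feature_types"]:
        value = sample.get(name)
        if value is None:
            continue
        # Hash "name=value" into one of feature_dim buckets.
        bucket = hash(f"{name}={value}") % config["feature_dim"]
        vec[bucket] += 1.0
    return vec.reshape(1, -1)  # shape (1, 128), i.e. a 1*128 feature vector

sample = {"age": 25, "gender": "female", "content_category": "sports"}
first_training_feature = build_training_features(sample, feature_config_c1)
print(first_training_feature.shape)  # (1, 128)
```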
  • the first training sample can be processed based on the candidate feature configuration data to obtain the first training feature data, and the initial deep learning model can be trained through the first training feature data.
  • the candidate feature configuration data may also include multiple candidate feature configuration data.
  • different candidate feature configuration data may be sequentially selected to process the first training samples used to train the corresponding initial deep learning models.
  • a target deep learning model can be obtained based on the initial deep learning model.
  • the initial deep learning model can be directly used as the target deep learning model, or model construction and model training can be re-performed based on the initial deep learning model to obtain the target deep learning model.
  • the model type data and the candidate feature configuration data are defined through the configuration file.
  • the corresponding initial network layer type and initial network layer structure can be selected based on the configuration file to construct the corresponding initial deep learning model, and based on the candidate feature configuration data to process the first training sample to obtain the corresponding first training feature data, so as to train the initial deep learning model based on the first training feature data, and then obtain the target deep learning model based on the initial deep learning model.
  • the initial neural network is constructed based on the configuration file and the first training sample is processed accordingly, so that multiple initial deep learning models can be trained automatically and quickly, which improves the efficiency of model training and reduces the cost of model training. With the configuration file, no code modification is required, lowering the technical threshold of model training.
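  • As a hedged illustration of this idea, the following Python sketch shows what a configuration file carrying model type data and candidate feature configuration data might look like, and how candidate network layer types, network layer structures, and feature configurations could be enumerated from it to build multiple initial deep learning models. All keys and the Cartesian enumeration are assumptions for the sketch; the disclosure does not fix a concrete schema here.

```python
from itertools import product

# A stand-in for the configuration file; keys are illustrative only.
config_file = {
    "model_type_data": "DNN",
    "candidate_network_layer_types": [
        ["fully_connected", "pooling"],       # e.g. candidate A1
        ["attention", "fully_connected"],     # e.g. candidate A2
    ],
    "candidate_network_layer_structures": [
        {"fully_connected_nodes": 128, "pooling_nodes": 64},    # e.g. candidate B1
        {"fully_connected_nodes": 256, "pooling_nodes": 128},   # e.g. candidate B2
    ],
    "candidate_feature_configs": [
        {"feature_types": ["age", "gender"], "feature_dim": 128},               # e.g. C1
        {"feature_types": ["gender", "content_category"], "feature_dim": 256},  # e.g. C2
    ],
}

def enumerate_initial_models(cfg):
    """Yield one (layer_type, layer_structure, feature_config) triple per
    initial deep learning model to be built and trained."""
    assert cfg["model_type_data"] == "DNN"
    yield from product(
        cfg["candidate_network_layer_types"],
        cfg["candidate_network_layer_structures"],
        cfg["candidate_feature_configs"],
    )

for layer_type, layer_structure, feature_config in enumerate_initial_models(config_file):
    print(layer_type, layer_structure, feature_config["feature_dim"])
```

  • In the disclosure's own example (see FIG. 4), each experiment pairs one candidate of each kind (A1/B1/C1, A2/B2/C2, and so on), so a zip-style pairing rather than a full Cartesian product is an equally valid reading of the enumeration step.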
  • Fig. 3 schematically shows a flowchart of a method for training a deep learning model according to another embodiment of the present disclosure.
  • the deep learning model training method 300 of the embodiment of the present disclosure may include, for example, operation S301 to operation S311 .
  • In operation S301, a configuration file is acquired, and the configuration file includes model type data and candidate feature configuration data.
  • In operation S302, an initial network layer type and an initial network layer structure are selected based on the model type data.
  • In operation S303, an initial deep learning model is obtained based on the initial network layer type and the initial network layer structure.
  • In operation S304, the first training sample is processed based on the candidate feature configuration data to obtain first training feature data.
  • In operation S305, the initial deep learning model is trained using the first training feature data.
  • operation S301 to operation S305 are the same as or similar to the operations of the above-mentioned embodiment, and will not be repeated here.
  • a target deep learning model is obtained based on the trained initial deep learning model; see operations S306 to S311.
  • the trained initial deep learning model includes at least one trained initial deep learning model, and the initial network layer type, initial network layer structure, or candidate feature configuration data corresponding to each trained initial deep learning model in the at least one trained initial deep learning model can be different.
  • the configuration file also includes evaluation conditions, which are used to evaluate the training effect of the initial deep learning model. The following operations S306 to S308 describe obtaining the target network layer type, target network layer structure, and target feature configuration data with better training effect by evaluating the initial deep learning model.
  • In operation S306, the verification sample is processed based on the candidate feature configuration data to obtain verification feature data.
  • In operation S307, the verification feature data is respectively input into the at least one trained initial deep learning model, and at least one verification result is obtained.
  • In operation S308, based on the at least one verification result and the evaluation condition, the target network layer type, target network layer structure, and target feature configuration data are determined from the network layer type set, the network layer structure set, and the feature configuration data set, respectively.
  • a set of network layer types, a set of network layer structures, and a set of feature configuration data corresponding to the multiple initial deep learning models are obtained.
  • the set of network layer types includes initial network layer types for multiple trained initial deep learning models.
  • the set of network layer structures includes initial network layer structures for a plurality of trained initial deep learning models.
  • the feature configuration data set includes initial feature configuration data for a plurality of trained initial deep learning models, and the initial feature configuration data in the feature configuration data set is, for example, at least part of multiple candidate feature configuration data.
  • For each initial deep learning model, for the candidate feature configuration data corresponding to that initial deep learning model, the verification sample is processed based on the candidate feature configuration data to obtain verification feature data, and the verification feature data is used to validate the trained initial deep learning model.
  • a plurality of verification results corresponding to a plurality of initial deep learning models can be obtained.
  • the verification result includes, for example, the recall rate or precision rate of the initial deep learning model on the verification sample
  • the evaluation condition includes, for example, conditions on the recall rate and precision rate; for example, the evaluation condition is used to evaluate whether the recall rate or precision rate of the verification result reaches a certain threshold.
  • the evaluation condition is related to, for example, the AUC (Area Under Curve) metric
  • the verification result can be evaluated based on the AUC
  • the AUC is an evaluation index.
  • the target network layer type, target network layer structure, and target feature configuration data are respectively determined from the network layer type set, network layer structure set, and feature configuration data set.
  • the verification results are evaluated by the evaluation conditions, so as to determine the target network layer type, target network layer structure, and target feature configuration data, which improves the determination accuracy of the target network layer type, the target network layer structure, and the target feature configuration data.
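  • The selection step can be pictured with a small, hedged sketch: each trained initial deep learning model is stored together with the components it was built from, an AUC-style metric is computed on the verification samples, and the components of the best model that satisfies the evaluation condition become the targets. The record layout and the scikit-learn call are assumptions for illustration; this simplified version also takes all three targets from a single best model, whereas the disclosure notes they may come from different models.

```python
from sklearn.metrics import roc_auc_score  # assumes scikit-learn is available

# Hypothetical verification results: one record per trained initial model.
results = [
    {"layer_type": "A1", "layer_structure": "B1", "feature_config": "C1",
     "y_true": [1, 0, 1, 0], "y_score": [0.9, 0.2, 0.7, 0.4]},
    {"layer_type": "A2", "layer_structure": "B2", "feature_config": "C2",
     "y_true": [1, 0, 1, 0], "y_score": [0.4, 0.5, 0.8, 0.1]},
]

evaluation_condition = {"min_auc": 0.75}  # illustrative threshold

best = None
for record in results:
    auc = roc_auc_score(record["y_true"], record["y_score"])
    if auc >= evaluation_condition["min_auc"] and (best is None or auc > best[0]):
        best = (auc, record)

if best is not None:
    auc, record = best
    target_network_layer_type = record["layer_type"]
    target_network_layer_structure = record["layer_structure"]
    target_feature_config = record["feature_config"]
    print(f"best AUC={auc:.3f}:", target_network_layer_type,
          target_network_layer_structure, target_feature_config)
```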
  • After obtaining the target network layer type, target network layer structure, and target feature configuration data, the model can be retrained based on the target network layer type, target network layer structure, and target feature configuration data to obtain the target deep learning model; see the following operations S309 to S311.
  • In operation S309, a target deep learning model to be trained is obtained based on the target network layer type and the target network layer structure.
  • a target deep learning model is constructed.
  • In operation S310, the second training sample is processed based on the target feature configuration data to obtain second training feature data.
  • the target feature configuration data represents how to process the second training samples used to train the target deep learning model, so as to obtain the second training feature data suitable for training the target deep learning model.
  • In operation S311, the second training feature data is used to train the target deep learning model to be trained to obtain the target deep learning model.
  • the second training sample can be processed based on the target feature configuration data to obtain the second training feature data, and the target deep learning model is trained through the second training feature data.
  • the training process of the target deep learning model is similar to the training process of the initial deep learning model and will not be repeated here.
  • the process of training multiple initial deep learning models can be regarded as an experimental process of searching network layer types, network layer structures, and feature configuration data.
  • the initial deep learning model whose verification result satisfies the evaluation condition can be directly used as the final target deep learning model.
  • the target network layer type, target network layer structure, and target feature configuration data may come from different initial deep learning models.
  • instead of saving the initial deep learning models, the optimal target network layer type, target network layer structure, and target feature configuration data can be saved; then the target deep learning model is rebuilt and trained based on the target network layer type, target network layer structure, and target feature configuration data.
  • the target network layer type, target network layer structure, and target feature configuration data are obtained by training the initial deep learning models, and then the target deep learning model is obtained by retraining based on the target network layer type, target network layer structure, and target feature configuration data, which not only improves the accuracy of the target deep learning model, but also reduces the consumption of data storage space.
  • Fig. 4 schematically shows a schematic diagram of a method for training a deep learning model according to an embodiment of the present disclosure.
  • the configuration file 410 includes, for example, model type data 411 , multiple candidate feature configuration data 412 , and evaluation conditions 413 .
  • the multiple candidate network layer types 420 include, for example, candidate network layer types A1-A4, and the multiple candidate hyperparameters 430, for example, include candidate hyperparameters B1-B4.
  • For example, the candidate network layer type A1 and the candidate hyperparameter B1 are selected as the initial network layer type and initial network layer structure for the initial deep learning model 431. For example, the candidate network layer type A1 includes a fully connected layer and a pooling layer, and the candidate hyperparameter B1 specifies M nodes in the fully connected layer and N nodes in the pooling layer, where both M and N are integers greater than 0.
  • the candidate network layer type A2 and the candidate hyperparameter B2 are selected as the initial network layer type and initial network layer structure for the initial deep learning model 432, respectively.
  • the candidate network layer type A3 and the candidate hyperparameter B3 are selected as the initial network layer type and initial network layer structure for the initial deep learning model 433, respectively.
  • For example, an initial deep learning model 431 is constructed based on candidate network layer type A1 and candidate hyperparameter B1, an initial deep learning model 432 is constructed based on candidate network layer type A2 and candidate hyperparameter B2, and an initial deep learning model 433 is constructed based on candidate network layer type A3 and candidate hyperparameter B3.
  • the initial deep learning models 431 , 432 , 433 need to be trained based on the first training samples 440 .
  • For example, the initial feature configuration data for each initial deep learning model is selected from the plurality of candidate feature configuration data 412. For example, the candidate feature configuration data C1 is selected as the initial feature configuration data for the initial deep learning model 431, the candidate feature configuration data C2 is selected as the initial feature configuration data for the initial deep learning model 432, and the candidate feature configuration data C3 is selected as the initial feature configuration data for the initial deep learning model 433.
  • the first training sample 440 needs to be processed based on the corresponding initial feature configuration data.
  • For example, the first feature type and the first feature dimension are determined based on the initial feature configuration data (C1); that is, the initial feature configuration data (C1) defines the first feature type and the first feature dimension.
  • The first feature type includes, for example, features such as age, gender, and content category.
  • The first feature dimension is, for example, the dimension of a feature vector, and the dimension of a feature vector is, for example, 1*128 dimensions.
  • a first sub-sample is extracted from the first training sample 440 based on the first feature type, for example, the first sub-sample is for content including age, gender, content category and other features.
  • the first sub-sample is processed based on the first feature dimension to obtain the first training feature data 441.
  • the first training feature data 441 is, for example, a feature vector, and the dimension of the feature vector is, for example, 1*128 dimensions.
  • the process of obtaining the first training feature data 442 and the first training feature data 443 is similar to the process of obtaining the first training feature data 441 , and will not be repeated here.
  • the network layer type set 451 includes, for example, initial network layer types A1, A2, and A3
  • the network layer structure set 452 includes, for example, initial network layer structures B1, B2, and B3
  • the feature configuration data set 453 includes, for example, initial feature configuration data C1, C2, and C3.
  • the process is similar to the above content, and will not be repeated here.
  • For example, the target deep learning model 480 is constructed based on the target network layer type 471 (A1) and the target network layer structure 462 (B2). After the target deep learning model 480 is constructed, the target deep learning model 480 needs to be trained based on the second training sample 490.
  • the second training sample 490 is processed based on the target feature configuration data 473 ( C3 ) to obtain the second training feature data 491 .
  • the second feature type and the second feature dimension are determined based on the target feature configuration data 473 (C3).
  • the target feature configuration data 473 (C3) defines, for example, the second feature type and the second feature dimension.
  • The second feature type includes, for example, features such as age and gender.
  • The second feature dimension is, for example, the dimension of the feature vector, and the dimension of the feature vector is, for example, 1*256 dimensions.
  • a second sub-sample is extracted from the second training sample 490 based on the second feature type, for example, the second sub-sample is for content including age, gender and other features.
  • the second sub-sample is processed based on the second feature dimension to obtain the second training feature data 491.
  • the second training feature data 491 is, for example, a feature vector, and the dimension of the feature vector is, for example, 1*256 dimensions.
  • the model can be trained based on the PaddlePaddle deep learning framework and the open-source distributed framework Ray. For example, PaddlePaddle is used to implement model building and model training, and Ray is used to seamlessly switch between local training and cluster training; Ray can automatically schedule available resources for parallel training, improving resource utilization and the degree of training parallelism.
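  • The following sketch is one hypothetical way such an experiment could be launched: Ray Tune samples hyperparameters and schedules trials in parallel, and each trial builds and trains a small PaddlePaddle DNN. The `tune.run`/`tune.report` style assumes an older Ray Tune API (newer releases use `Tuner.fit()` and `ray.train.report`), and the toy data and metric are placeholders; this is not the disclosure's actual framework code.

```python
import paddle
from ray import tune

def train_one_experiment(config):
    """One search trial: build a small DNN from the sampled hyperparameters,
    run a few dummy training steps, and report a metric to Ray Tune."""
    model = paddle.nn.Sequential(
        paddle.nn.Linear(config["feature_dim"], config["hidden_nodes"]),
        paddle.nn.ReLU(),
        paddle.nn.Linear(config["hidden_nodes"], 1),
    )
    opt = paddle.optimizer.Adam(learning_rate=config["lr"],
                                parameters=model.parameters())
    x = paddle.randn([32, config["feature_dim"]])        # stand-in for real samples
    y = paddle.randint(0, 2, [32, 1]).astype("float32")  # stand-in labels
    for _ in range(10):
        loss = paddle.nn.functional.binary_cross_entropy_with_logits(model(x), y)
        loss.backward()
        opt.step()
        opt.clear_grad()
    tune.report(loss=float(loss))  # metric consumed by the search/scheduler

analysis = tune.run(
    train_one_experiment,
    config={
        "feature_dim": tune.choice([128, 256]),
        "hidden_nodes": tune.choice([64, 128, 256]),
        "lr": tune.loguniform(1e-4, 1e-2),
    },
    num_samples=8,  # number of trials Ray schedules onto available resources
)
print(analysis.get_best_config(metric="loss", mode="min"))
```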
  • a configuration file includes two files, a feature configuration file and a training configuration file.
  • the feature configuration file includes, for example, candidate feature configuration data
  • the feature configuration file may also include a processing method for the features, and the processing method includes, for example, normalization, hash operation, and the like.
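  • A minimal sketch of the kind of per-feature processing such a feature configuration file might declare is given below; the entry keys (normalize with mean/std, hash into buckets) are illustrative assumptions rather than the file format used by the disclosure.

```python
# Illustrative feature configuration entries; keys are assumptions.
feature_config = [
    {"name": "age", "process": "normalize", "mean": 35.0, "std": 12.0},
    {"name": "content_category", "process": "hash", "buckets": 1000},
]

def process_feature(raw_value, spec):
    if spec["process"] == "normalize":
        return (float(raw_value) - spec["mean"]) / spec["std"]
    if spec["process"] == "hash":
        # Bucket id that could later index an embedding table.
        return hash(str(raw_value)) % spec["buckets"]
    raise ValueError(f"unknown processing method: {spec['process']}")

sample = {"age": 29, "content_category": "sports"}
processed = {spec["name"]: process_feature(sample[spec["name"]], spec)
             for spec in feature_config}
print(processed)  # e.g. {'age': -0.5, 'content_category': 417}
```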
  • the training configuration file includes data other than features, for example including model type data, evaluation conditions, and the like.
  • the training samples, verification samples, candidate feature configuration data, model structure, hyperparameters, and training resource configuration used in the training process can all be invoked through the configuration file without modifying the framework code, and the experimental training can be started with one click, reducing the technical threshold and training difficulty.
  • the target deep learning model is retrained based on the search results and the second training samples.
  • the model type data in the configuration file defines how to select the initial model type and network layer structure (search direction), and the candidate feature configuration data, for example, defines the feature type search and feature dimension search.
  • The search directions include, for example, hyperparameter search, feature type search, feature dimension search, and model structure search.
  • Feature types include features or combined features that need to be extracted from sample data during model training.
  • Features include, for example, gender and age, and combined features include, for example, a combination of gender and age.
  • the hyperparameter search includes, for example, a search space, a search algorithm, and a scheduler algorithm (scheduling algorithm).
  • the search space includes, for example, methods such as random search, grid search, and uniform distribution sampling.
  • the search space represents which candidate hyperparameters are available for search.
  • Search algorithms include the grid search algorithm, the Bayesian optimization algorithm, OPTUNA optimization, and other algorithms.
  • OPTUNA is a framework for automatic hyperparameter optimization.
  • the search algorithm is used to determine the optimal hyperparameter based on the training results of the candidate hyperparameters.
  • the scheduler algorithm (scheduling algorithm) includes the first-in-first-out (FIFO) algorithm, the ASHA algorithm, and the like.
  • the ASHA algorithm is a parameter tuning algorithm.
  • the scheduler algorithm represents how to schedule computing resources to perform parallel training based on candidate hyperparameters.
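  • Putting the three pieces together, a hedged Ray Tune sketch might declare a search space of candidate hyperparameters, an Optuna-backed search algorithm, and the ASHA scheduler as follows. The import paths and the `tune.report`-style reporting assume a particular Ray version, and the objective is a placeholder rather than a real training run.

```python
from ray import tune
from ray.tune.schedulers import ASHAScheduler
from ray.tune.search.optuna import OptunaSearch  # path assumed for recent Ray releases

# Search space: which candidate hyperparameters are available for search.
search_space = {
    "hidden_nodes": tune.choice([64, 128, 256]),
    "lr": tune.loguniform(1e-4, 1e-2),
}

def trainable(config):
    # Placeholder objective; a real trial would train and validate a model here.
    tune.report(auc=0.5 + 0.001 * config["hidden_nodes"] - config["lr"])

tune.run(
    trainable,
    config=search_space,
    search_alg=OptunaSearch(),                # search algorithm: picks the next candidates
    scheduler=ASHAScheduler(grace_period=1),  # scheduler: stops weak trials early
    metric="auc",
    mode="max",
    num_samples=16,
)
```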
  • Combined features can be searched through models such as AutoCross (automatic feature crossing) and AutoFis (automatic feature interaction selection).
  • the AutoCross model is responsible for screening useful explicit crossover features, such as screening features that improve the training effect of the model.
  • the AutoFis model is responsible for filtering the useless second-order cross features (implicit cross features) in the FM (Factorization Machine) model and the DeepFM model.
  • the explicit cross feature is, for example, the combination or concatenation of multiple features
  • the implicit cross feature is, for example, the dot product of multiple features.
  • For the feature dimension, the AutoDim algorithm and the AutoDis algorithm can be used to search.
  • AutoDim algorithm is an algorithm for automatic dimension optimization
  • AutoDis algorithm is an automatic discretization algorithm for numerical features.
  • the AutoDim algorithm searches out different dimension sizes from different feature dimensions, that is, searches for suitable dimensions for discrete features.
  • the AutoDis algorithm supports continuous feature embedding (discretization of continuous features), and searches for the most suitable dimension size for different continuous features during the training process.
  • Model structure search can learn the weight corresponding to each child architecture (network layer) through a NAS (neural architecture search) model, so as to obtain an optimal model structure. For example, by learning the weights corresponding to multiple candidate network layers, the candidate network layer with a larger weight is used as the final network layer.
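  • The weighted-candidate idea can be illustrated with a small differentiable-NAS-style sketch in PaddlePaddle: a block holds several candidate layers, learns one architecture weight per candidate, and after training keeps the candidate with the largest weight. This is a generic illustration under assumed APIs, not the specific NAS model referenced by the disclosure.

```python
import paddle

class MixedLayer(paddle.nn.Layer):
    """Output is a softmax-weighted sum over candidate layers; after training,
    the candidate with the largest architecture weight is kept."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.candidates = paddle.nn.LayerList([
            paddle.nn.Linear(in_dim, out_dim),                      # candidate 1
            paddle.nn.Sequential(paddle.nn.Linear(in_dim, out_dim),
                                 paddle.nn.ReLU()),                 # candidate 2
        ])
        # One learnable architecture weight per candidate layer.
        self.arch_weights = self.create_parameter(shape=[len(self.candidates)])

    def forward(self, x):
        probs = paddle.nn.functional.softmax(self.arch_weights)
        return sum(p * layer(x) for p, layer in zip(probs, self.candidates))

    def chosen_candidate(self):
        return int(paddle.argmax(self.arch_weights))

layer = MixedLayer(16, 8)
out = layer(paddle.randn([4, 16]))  # joint training would also update arch_weights
print(out.shape, "chosen:", layer.chosen_candidate())
```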
  • The VisualDL tool is a visualization and analysis tool in the PaddlePaddle framework. It uses rich charts to show the influence of different hyperparameters on the experimental results, and helps to understand more intuitively the impact of the search space and search algorithm on the recommendation model.
  • the training process of the model supports batch offline training search and incremental training search.
  • batch offline search training or incremental search training can be selected through configuration.
  • For batch offline search training, the experimental results on the same data set are compared to select the optimal search result.
  • For incremental search training, if the experimental effect of the incremental search is better than that of the original experiment, the original is replaced; otherwise, the original model structure and hyperparameters are kept and training continues.
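  • The replace-or-keep decision for incremental search training reduces to a comparison on the chosen metric, as in the hedged snippet below (the field names and the AUC metric are assumptions).

```python
def incremental_search_step(current_best, new_result):
    """Keep the original model structure and hyperparameters unless the
    incremental search run beats it on the chosen metric (e.g. AUC)."""
    if new_result["auc"] > current_best["auc"]:
        return new_result   # replace: adopt the new structure/hyperparameters
    return current_best     # keep the original and continue training it

current_best = {"auc": 0.81, "structure": "A1/B2", "feature_config": "C3"}
candidate = {"auc": 0.79, "structure": "A2/B1", "feature_config": "C1"}
print(incremental_search_step(current_best, candidate)["structure"])  # A1/B2
```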
  • the training process can be carried out in parallel. For example, some computing resources perform training based on one part of the hyperparameters, model structures, and training samples, and other computing resources perform training based on another part of the hyperparameters, model structures, and training samples, thereby improving training efficiency.
  • Fig. 5 schematically shows a flowchart of a content recommendation method according to an embodiment of the present disclosure.
  • the content recommendation method 500 of the embodiment of the present disclosure may include, for example, operation S510 to operation S540.
  • In operation S510, object feature data for the target object is determined.
  • In operation S520, content feature data for the target content is determined for the target content in the at least one candidate content.
  • In operation S530, the object feature data and the content feature data are input into the target deep learning model to obtain an output result.
  • In operation S540, in response to the output result meeting a preset condition, the target content is recommended to the target object.
  • the above-mentioned initial deep learning model or target deep learning model is suitable for content recommendation scenarios, including but not limited to articles, commodities, and news.
  • the target object is an object that browses content
  • the object feature data includes, for example, the target object's age, gender, historical browsing records, browsed content category, and so on. Any one of multiple candidate contents is taken as the target content, and the content feature data of the target content is determined.
  • the content feature data includes, but not limited to, content category, topic information, and keyword information.
  • the object feature data and content feature data are input into the target deep learning model to obtain an output result, and the output result represents the degree of interest of the target object in the target content.
  • the object feature data and content feature data may also be input into the initial deep learning model to obtain an output result.
  • the initial deep learning model or the target deep learning model can automatically learn the association between object feature data and content feature data. If the output result satisfies the preset condition, it means that the target object is more interested in the target content, and at this time the target content can be recommended to the target object. If the output result does not meet the preset condition, it means that the target object is less interested in the target content, and at this time the target content may not be recommended to the target object.
  • the content recommendation is performed through the initial deep learning model or the target deep learning model, which improves the accuracy and efficiency of content recommendation, and the recommended content meets the needs of the target object and improves the user experience of the target object.
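  • Operations S510 to S540 at serving time can be pictured with the following hedged sketch: build the object and content feature vectors, score each candidate content with the trained model, and recommend the candidates whose output result meets the preset condition. The model stub, feature encodings, and threshold are placeholders, not the trained target deep learning model itself.

```python
import numpy as np

def target_model(object_features: np.ndarray, content_features: np.ndarray) -> float:
    """Stand-in for the trained target deep learning model: returns a score
    interpreted as the target object's degree of interest in the content."""
    x = np.concatenate([object_features, content_features])
    return float(1.0 / (1.0 + np.exp(-x.mean())))  # placeholder sigmoid score

def recommend(object_features, candidate_contents, threshold=0.5):
    """Operations S510-S540 in miniature: score every candidate content and
    recommend those whose output result meets the preset condition."""
    recommended = []
    for content_id, content_features in candidate_contents.items():
        score = target_model(object_features, content_features)
        if score >= threshold:  # preset condition
            recommended.append((content_id, score))
    return sorted(recommended, key=lambda item: item[1], reverse=True)

object_features = np.array([0.3, 0.8, 0.1], dtype=np.float32)  # e.g. encoded age/gender/history
candidates = {"news_1": np.array([0.9, 0.2], dtype=np.float32),
              "item_7": np.array([-0.5, -0.4], dtype=np.float32)}
print(recommend(object_features, candidates))
```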
  • Fig. 6 schematically shows a block diagram of a training device for a deep learning model according to an embodiment of the present disclosure.
  • the training device 600 of the deep learning model of the embodiment of the present disclosure includes, for example, an acquisition module 610, a selection module 620, a first acquisition module 630, a first processing module 640, a first training module 650 and a second acquisition module 660.
  • the obtaining module 610 may be used to obtain a configuration file, wherein the configuration file includes model type data and candidate feature configuration data. According to an embodiment of the present disclosure, the acquiring module 610 may, for example, perform the operation S210 described above with reference to FIG. 2 , which will not be repeated here.
  • the selection module 620 may be configured to select an initial network layer type and an initial network layer structure based on the model type data. According to an embodiment of the present disclosure, the selection module 620 may, for example, perform the operation S220 described above with reference to FIG. 2 , which will not be repeated here.
  • the first obtaining module 630 can be used to obtain an initial deep learning model based on the initial network layer type and the initial network layer structure. According to an embodiment of the present disclosure, the first obtaining module 630 may, for example, perform operation S230 described above with reference to FIG. 2 , which will not be repeated here.
  • the first processing module 640 may be configured to process the first training sample based on the candidate feature configuration data to obtain first training feature data. According to an embodiment of the present disclosure, the first processing module 640 may, for example, execute the operation S240 described above with reference to FIG. 2 , which will not be repeated here.
  • the first training module 650 can be used to train an initial deep learning model using the first training feature data. According to an embodiment of the present disclosure, the first training module 650 may, for example, execute the operation S250 described above with reference to FIG. 2 , which will not be repeated here.
  • the second obtaining module 660 can be used to obtain a target deep learning model based on the trained initial deep learning model. According to an embodiment of the present disclosure, the second obtaining module 660 may, for example, perform the operation S260 described above with reference to FIG. 2 , which will not be repeated here.
  • the trained initial deep learning model includes at least one trained initial deep learning model; the configuration file also includes an evaluation condition; the second obtaining module includes: a first processing submodule, an input submodule, a first determination submodule, and an obtaining submodule.
  • the first processing submodule is used to process verification samples based on candidate feature configuration data to obtain verification feature data; the input submodule is used to input verification feature data into at least one trained initial deep learning model to obtain at least one verification result ;
  • the first determining submodule is used to determine the target network layer type, target network layer structure, and target feature configuration data from the network layer type set, network layer structure set, and feature configuration data set based on at least one verification result and evaluation condition ;
  • the obtaining sub-module is used to obtain the target deep learning model based on the target network layer type, target network layer structure, and target feature configuration data.
  • the network layer type set includes an initial network layer type for the at least one trained initial deep learning model; the network layer structure set includes an initial network layer structure for the at least one trained initial deep learning model; the feature configuration data set includes initial feature configuration data for the at least one trained initial deep learning model, and the initial feature configuration data in the feature configuration data set is at least part of the candidate feature configuration data.
  • the obtaining submodule includes: an obtaining unit, a processing unit and a training unit.
  • the obtaining unit is used to obtain the target deep learning model to be trained based on the target network layer type and the target network layer structure;
  • the processing unit is used to process the second training sample based on the target feature configuration data to obtain the second training feature data;
  • the training unit It is used for using the second training feature data to train the target deep learning model to be trained to obtain the target deep learning model.
  • the candidate feature configuration data includes at least one candidate feature configuration data
  • the first processing module 640 includes: a first selection submodule, a second determination submodule, an extraction submodule and a second processing submodule.
  • the first selection submodule is used to select the initial feature configuration data for the initial deep learning model from the at least one candidate feature configuration data
  • the second determination submodule is used to determine the first feature type and the first feature dimension based on the initial feature configuration data
  • the extraction sub-module is used to extract the first sub-sample from the first training sample based on the first feature type
  • the second processing sub-module is used to process the first sub-sample based on the first feature dimension to obtain the first training feature data.
  • the processing unit includes: a determination subunit, an extraction subunit, and a processing subunit.
  • the determination subunit is used to determine the second feature type and the second feature dimension based on the target feature configuration data;
  • the extraction subunit is used to extract the second subsample from the second training sample based on the second feature type;
  • the processing subunit is used to process the second sub-sample based on the second feature dimension to obtain the second training feature data.
  • the selection module 620 includes: a second selection submodule and a third selection submodule.
  • the second selection submodule is used to select the initial network layer type for the initial deep learning model from at least one candidate network layer type based on the model type data;
  • the third selection submodule is used to select a target hyperparameter from at least one candidate hyperparameter as the initial network layer structure for the initial deep learning model.
  • Fig. 7 schematically shows a block diagram of a content recommendation device according to an embodiment of the present disclosure.
  • the content recommendation device 700 of the embodiment of the present disclosure includes, for example, a first determination module 710 , a second determination module 720 , an input module 730 and a recommendation module 740 .
  • the first determination module 710 may be used to determine object feature data for the target object. According to an embodiment of the present disclosure, the first determining module 710 may, for example, perform the operation S510 described above with reference to FIG. 5 , which will not be repeated here.
  • the second determination module 720 may be configured to determine content feature data for the target content for the target content in at least one candidate content. According to an embodiment of the present disclosure, the second determining module 720 may, for example, perform the operation S520 described above with reference to FIG. 5 , which will not be repeated here.
  • the input module 730 can be used to input the object feature data and the content feature data into the target deep learning model to obtain an output result, wherein the target deep learning model is generated using the above-mentioned deep learning model training device, and the output result represents the degree of interest of the target object in the target content. According to an embodiment of the present disclosure, the input module 730 may, for example, perform the operation S530 described above with reference to FIG. 5 , which will not be repeated here.
  • the recommendation module 740 may be configured to recommend target content to the target object in response to the output result meeting the preset condition. According to an embodiment of the present disclosure, the recommendation module 740 may, for example, perform the operation S540 described above with reference to FIG. 5 , which will not be repeated here.
  • in the technical solution of the present disclosure, the user's authorization or consent is obtained before relevant user data is acquired or used.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 8 is a block diagram of an electronic device for performing deep learning model training and/or content recommendation to implement an embodiment of the present disclosure.
  • FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure.
  • Electronic device 800 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions, are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • the device 800 includes a computing unit 801, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the device 800 can also be stored.
  • the computing unit 801, ROM 802, and RAM 803 are connected to each other through a bus 804.
  • An input/output (I/O) interface 805 is also connected to the bus 804 .
  • A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard or a mouse; an output unit 807, such as various types of displays and speakers; a storage unit 808, such as a magnetic disk or an optical disk; and a communication unit 809, such as a network card, a modem, or a wireless communication transceiver.
  • the communication unit 809 allows the device 800 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
  • the computing unit 801 may be various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any suitable processors, controllers, microcontrollers, and the like.
  • the computing unit 801 executes the various methods and processes described above, such as the deep learning model training method and/or the content recommendation method.
  • the deep learning model training method and/or the content recommendation method may be implemented as a computer software program, which is tangibly embodied in a machine-readable medium, such as the storage unit 808 .
  • part or all of the computer program may be loaded and/or installed on the device 800 via the ROM 802 and/or the communication unit 809.
  • when the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the above-described deep learning model training method and/or content recommendation method can be executed.
  • the computing unit 801 may be configured in any other appropriate way (for example, by means of firmware) to execute a deep learning model training method and/or a content recommendation method.
  • Various implementations of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOC), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program codes for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes can be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable deep learning model training device and/or content recommendation device, so that when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, speech input, or tactile input).
  • the systems and techniques described herein can be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., as a a user computer having a graphical user interface or web browser through which a user can interact with embodiments of the systems and techniques described herein), or including such backend components, middleware components, Or any combination of front-end components in a computing system.
  • the components of the system can be interconnected by any form or medium of digital data communication, eg, a communication network. Examples of communication networks include: Local Area Network (LAN), Wide Area Network (WAN) and the Internet.
  • a computer system may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact through a communication network.
  • the relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • the server can be a cloud server, a server of a distributed system, or a server combined with a blockchain.
  • steps may be reordered, added or deleted using the various forms of flow shown above.
  • Each step described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in the present disclosure can be achieved; no limitation is imposed herein.

Abstract

A deep learning model training method and apparatus, a content recommendation method and apparatus, a device, a medium, and a product, relating to the technical fields of deep learning and intelligent recommendation in artificial intelligence. The deep learning model training method comprises: acquiring a configuration file, wherein the configuration file comprises model type data and candidate feature configuration data (S210); selecting an initial network layer type and an initial network layer structure on the basis of the model type data (S220); obtaining an initial deep learning model on the basis of the initial network layer type and the initial network layer structure (S230); processing a first training sample on the basis of the candidate feature configuration data to obtain first training feature data (S240); training the initial deep learning model by means of the first training feature data (S250); and obtaining a target deep learning model on the basis of the trained initial deep learning model (S260).

Description

Deep learning model training method, content recommendation method and apparatus
This application claims priority to the Chinese patent application with application number 202111618428.9 filed on December 27, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of artificial intelligence technology, in particular to the technical fields of deep learning and intelligent recommendation, and more specifically, to a deep learning model training method, a content recommendation method, an apparatus, an electronic device, a medium, and a program product.
Background
In related technologies, relevant content can be recommended through a deep learning model. However, in order to train a good deep learning model, a large amount of labor and time costs need to be invested, and there is a high technical threshold, resulting in low training efficiency of the deep learning model.
Summary
The present disclosure provides a training method of a deep learning model, a content recommendation method, an apparatus, an electronic device, a storage medium, and a program product.
According to an aspect of the present disclosure, a method for training a deep learning model is provided, including: obtaining a configuration file, wherein the configuration file includes model type data and candidate feature configuration data; selecting an initial network layer type and an initial network layer structure based on the model type data; obtaining an initial deep learning model based on the initial network layer type and the initial network layer structure; processing a first training sample based on the candidate feature configuration data to obtain first training feature data; training the initial deep learning model by using the first training feature data; and obtaining a target deep learning model based on the trained initial deep learning model.
According to an aspect of the present disclosure, a content recommendation method is provided, including: determining object feature data for a target object; determining, for a target content in at least one candidate content, content feature data for the target content; inputting the object feature data and the content feature data into a target deep learning model to obtain an output result, wherein the target deep learning model is generated by using the method according to the present disclosure, and the output result represents a degree of interest of the target object in the target content; and recommending the target content to the target object in response to the output result meeting a preset condition.
According to another aspect of the present disclosure, a training apparatus for a deep learning model is provided, including an acquisition module, a selection module, a first obtaining module, a first processing module, a first training module, and a second obtaining module. The acquisition module is configured to acquire a configuration file, wherein the configuration file includes model type data and candidate feature configuration data; the selection module is configured to select an initial network layer type and an initial network layer structure based on the model type data; the first obtaining module is configured to obtain an initial deep learning model based on the initial network layer type and the initial network layer structure; the first processing module is configured to process a first training sample based on the candidate feature configuration data to obtain first training feature data; the first training module is configured to train the initial deep learning model by using the first training feature data; and the second obtaining module is configured to obtain a target deep learning model based on the trained initial deep learning model.
According to an aspect of the present disclosure, a content recommendation apparatus is provided, including a first determination module, a second determination module, an input module, and a recommendation module. The first determination module is configured to determine object feature data for a target object; the second determination module is configured to determine, for a target content in at least one candidate content, content feature data for the target content; the input module is configured to input the object feature data and the content feature data into a target deep learning model to obtain an output result, wherein the target deep learning model is generated by the apparatus according to the present disclosure, and the output result represents a degree of interest of the target object in the target content; and the recommendation module is configured to recommend the target content to the target object in response to the output result meeting a preset condition.
According to another aspect of the present disclosure, an electronic device is provided, including at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the above deep learning model training method and/or content recommendation method.
According to another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to cause a computer to perform the above deep learning model training method and/or content recommendation method.
According to another aspect of the present disclosure, a computer program product is provided, including a computer program which, when executed by a processor, implements the above deep learning model training method and/or content recommendation method.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Brief Description of the Drawings
The accompanying drawings are used for a better understanding of the present solution and do not constitute a limitation to the present disclosure, in which:
Fig. 1 schematically shows a system architecture for deep learning model training and content recommendation according to an embodiment of the present disclosure;
Fig. 2 schematically shows a flowchart of a method for training a deep learning model according to an embodiment of the present disclosure;
Fig. 3 schematically shows a flowchart of a method for training a deep learning model according to another embodiment of the present disclosure;
Fig. 4 schematically shows a schematic diagram of a method for training a deep learning model according to an embodiment of the present disclosure;
Fig. 5 schematically shows a schematic diagram of a content recommendation method according to an embodiment of the present disclosure;
Fig. 6 schematically shows a block diagram of a training apparatus for a deep learning model according to an embodiment of the present disclosure;
Fig. 7 schematically shows a block diagram of a content recommendation apparatus according to an embodiment of the present disclosure; and
Fig. 8 is a block diagram of an electronic device for performing deep learning model training and/or content recommendation according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, including various details of the embodiments of the present disclosure to facilitate understanding, which should be regarded as merely exemplary. Therefore, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
The terms used herein are for the purpose of describing specific embodiments only and are not intended to limit the present disclosure. The terms "including", "comprising" and the like used herein indicate the presence of the stated features, steps, operations and/or components, but do not exclude the presence or addition of one or more other features, steps, operations or components.
All terms (including technical and scientific terms) used herein have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein should be interpreted as having meanings consistent with the context of this specification, and should not be interpreted in an idealized or overly rigid manner.
Where an expression such as "at least one of A, B and C" is used, it should generally be interpreted in the sense in which those skilled in the art would normally understand the expression (for example, "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B and C).
Fig. 1 schematically shows a system architecture for deep learning model training and content recommendation according to an embodiment of the present disclosure. It should be noted that Fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure can be applied, so as to help those skilled in the art understand the technical content of the present disclosure; it does not mean that the embodiments of the present disclosure cannot be applied to other devices, systems, environments or scenarios.
As shown in Fig. 1, a system architecture 100 according to this embodiment may include clients 101, 102, 103, a network 104 and a server 105. The network 104 is a medium for providing communication links between the clients 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the clients 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the clients 101, 102, 103, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients and social platform software (only as examples).
The clients 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, laptop computers, desktop computers and the like. The clients 101, 102, 103 of the embodiments of the present disclosure may, for example, run applications.
The server 105 may be a server providing various services, such as a background management server (only as an example) that provides support for websites browsed by users using the clients 101, 102, 103. The background management server may analyze and process received data such as user requests, and feed back processing results (such as webpages, information or data obtained or generated according to the user requests) to the clients. In addition, the server 105 may also be a cloud server, that is, the server 105 has a cloud computing function.
It should be noted that the deep learning model training method and/or the content recommendation method provided by the embodiments of the present disclosure may be executed by the server 105. Correspondingly, the deep learning model training apparatus and/or the content recommendation apparatus provided by the embodiments of the present disclosure may be provided in the server 105. The deep learning model training method and/or content recommendation method provided by the embodiments of the present disclosure may also be executed by a server or server cluster that is different from the server 105 and is capable of communicating with the clients 101, 102, 103 and/or the server 105. Correspondingly, the deep learning model training apparatus and/or content recommendation apparatus provided by the embodiments of the present disclosure may also be provided in a server or server cluster that is different from the server 105 and is capable of communicating with the clients 101, 102, 103 and/or the server 105.
For example, the server 105 may receive training samples from the clients 101, 102, 103 through the network 104, train a deep learning model by using the training samples, and then send the trained deep learning model to the clients 101, 102, 103 through the network 104, and the clients may use the trained deep learning model for content recommendation. Alternatively, the server 105 may directly use the deep learning model for content recommendation.
It should be understood that the numbers of clients, networks and servers in Fig. 1 are merely illustrative. There may be any number of clients, networks and servers according to implementation needs.
In the following, the deep learning model training method and the content recommendation method according to exemplary embodiments of the present disclosure are described with reference to Figs. 2 to 5 in conjunction with the system architecture of Fig. 1. The deep learning model training method and the content recommendation method of the embodiments of the present disclosure may be executed, for example, by the server shown in Fig. 1, which is, for example, the same as or similar to the electronic device described below.
Fig. 2 schematically shows a flowchart of a method for training a deep learning model according to an embodiment of the present disclosure.
As shown in Fig. 2, the deep learning model training method 200 of the embodiment of the present disclosure may include, for example, operations S210 to S260.
In operation S210, a configuration file is obtained, where the configuration file includes model type data and candidate feature configuration data.
In operation S220, an initial network layer type and an initial network layer structure are selected based on the model type data.
In operation S230, an initial deep learning model is obtained based on the initial network layer type and the initial network layer structure.
In operation S240, a first training sample is processed based on the candidate feature configuration data to obtain first training feature data.
In operation S250, the initial deep learning model is trained by using the first training feature data.
In operation S260, a target deep learning model is obtained based on the trained initial deep learning model.
For example, the configuration file includes model type data, and the model type data represents, for example, the model type of the initial deep learning model; the model type includes, for example, a deep neural network (DNN) type. After the model type (for example, DNN) of the initial deep learning model is determined based on the model type data, the initial network layer type and the initial network layer structure of the DNN model may be further determined.
For example, the initial network layer type includes layer types such as an attention layer, a fully connected layer and a pooling layer, and the initial network layer type may also represent the connection relationships between the layers. The initial network layer structure represents, for example, the number of nodes in each layer.
For example, the initial network layer type may include multiple candidate initial network layer types, and the initial network layer structure may include multiple candidate initial network layer structures. The initial network layer type and initial network layer structure required by the DNN model may be selected from the multiple initial network layer types and the multiple initial network layer structures based on the model type data; for example, different initial network layer types and different initial network layer structures may be selected in turn to construct initial deep learning models, and the initial deep learning model obtained from each construction is trained.
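For illustration, such an initial deep learning model might be assembled as in the following minimal sketch, which assumes PaddlePaddle's paddle.nn API and, for simplicity, only fully connected layers; the layer names, node counts and input dimension are illustrative assumptions rather than values taken from the disclosure.

```python
import paddle
import paddle.nn as nn

def build_initial_model(layer_types, layer_sizes, input_dim):
    """Assemble a simple DNN from a selected network layer type (a sequence of
    layer names) and network layer structure (node counts per layer)."""
    layers, in_dim = [], input_dim
    for kind, width in zip(layer_types, layer_sizes):
        if kind == "fc":  # fully connected layer; other kinds (attention, pooling)
            layers += [nn.Linear(in_dim, width), nn.ReLU()]  # would be added analogously
            in_dim = width
    layers += [nn.Linear(in_dim, 1), nn.Sigmoid()]  # interest score in [0, 1]
    return nn.Sequential(*layers)

# e.g. a candidate network layer type of two fc layers with 64 and 32 nodes
model = build_initial_model(["fc", "fc"], [64, 32], input_dim=128)
scores = model(paddle.randn([4, 128]))  # 4 samples, 128-dimensional features
print(scores.shape)                     # [4, 1]
```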
For example, the candidate feature configuration data represents the processing method for the first training sample; in other words, it represents the feature types and feature dimensions used to extract feature data from the first training sample. Processing the first training sample based on the candidate feature configuration data yields first training feature data suitable for training the initial deep learning model. In an example, the candidate feature configuration data includes feature types and feature dimensions for the first training sample. The feature types include, for example, features such as age, gender and content category, and the feature dimension is, for example, the dimension of a feature vector, such as 1*128 dimensions or 1*256 dimensions.
For example, when the first initial deep learning model is trained, for the first training sample used to train the model, the age and gender features are selected from features such as age, gender and content category, and the 1*128 dimension is selected from the feature dimensions 1*128 and 1*256; the first training sample is then processed to obtain 1*128-dimensional first training feature data for age and gender.
For example, when the second initial deep learning model is trained, for the first training sample used to train the model, the gender and content category features are selected from features such as age, gender and content category, and the 1*256 dimension is selected from the feature dimensions 1*128 and 1*256; the first training sample is then processed to obtain 1*256-dimensional first training feature data for gender and content category.
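Purely as a sketch of what such candidate feature configuration data might look like, the configurations could be expressed as a list of entries, each naming the feature types to extract and the feature dimension to use; the field names below are assumptions, not defined by the disclosure.

```python
# Hypothetical candidate feature configuration data: each entry names the
# feature types to extract from the first training sample and the feature
# (embedding) dimension of the resulting feature vector.
candidate_feature_configs = [
    {"feature_types": ["age", "gender"], "feature_dim": 128},
    {"feature_types": ["gender", "content_category"], "feature_dim": 256},
]

def select_columns(sample: dict, config: dict) -> list:
    """Keep only the columns named by the configuration's feature types."""
    return [sample[name] for name in config["feature_types"]]

raw_sample = {"age": 30, "gender": "f", "content_category": "sports"}
print(select_columns(raw_sample, candidate_feature_configs[0]))  # [30, 'f']
```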
For example, after an initial deep learning model is constructed, the first training sample may be processed based on the candidate feature configuration data to obtain the first training feature data, and the initial deep learning model may be trained by using the first training feature data. The candidate feature configuration data may also include multiple pieces of candidate feature configuration data; for different initial deep learning models, different candidate feature configuration data may be selected in turn to process the first training samples used to train the corresponding initial deep learning models. After a trained initial deep learning model is obtained, the target deep learning model may be obtained based on the initial deep learning model; for example, the initial deep learning model may be used directly as the target deep learning model, or model construction and model training may be performed again based on the initial deep learning model to obtain the deep learning model.
According to the embodiments of the present disclosure, the model type data and the candidate feature configuration data are defined through a configuration file. When an initial deep learning model is trained, the corresponding initial network layer type and initial network layer structure may be selected based on the configuration file to construct the corresponding initial deep learning model, and the first training sample is processed based on the candidate feature configuration data to obtain the corresponding first training feature data, so that the initial deep learning model is trained based on the first training feature data, and the target deep learning model is then obtained based on the initial deep learning model. It can be understood that constructing the initial neural network and processing the first training sample based on the configuration file makes it possible to automatically and quickly train multiple initial deep learning models, which improves the efficiency of model training and reduces its cost; since everything is driven by the configuration file, no code modification is needed, which lowers the technical threshold of model training.
Fig. 3 schematically shows a flowchart of a method for training a deep learning model according to another embodiment of the present disclosure.
As shown in Fig. 3, the deep learning model training method 300 of the embodiment of the present disclosure may include, for example, operations S301 to S311.
In operation S301, a configuration file is obtained, where the configuration file includes model type data and candidate feature configuration data.
In operation S302, an initial network layer type and an initial network layer structure are selected based on the model type data.
In operation S303, an initial deep learning model is obtained based on the initial network layer type and the initial network layer structure.
In operation S304, a first training sample is processed based on the candidate feature configuration data to obtain first training feature data.
In operation S305, the initial deep learning model is trained by using the first training feature data.
According to the embodiments of the present disclosure, operations S301 to S305 are the same as or similar to the operations of the above-mentioned embodiment, and will not be repeated here. After the trained initial deep learning model is obtained through operations S301 to S305, the target deep learning model is obtained based on the trained initial deep learning model; see operations S306 to S311.
For example, the initial deep learning model includes at least one trained initial deep learning model, and the initial network layer type, initial network layer structure or candidate feature configuration data corresponding to each trained deep learning model in the at least one trained initial deep learning model may be different. The configuration file further includes, for example, an evaluation condition, which is used to evaluate the training effect of the initial deep learning model. The following operations S306 to S308 describe how the target network layer type, target network layer structure and target feature configuration data with a better training effect are obtained by evaluating the initial deep learning models.
In operation S306, a verification sample is processed based on the candidate feature configuration data to obtain verification feature data.
In operation S307, the verification feature data is respectively input into the at least one trained initial deep learning model to obtain at least one verification result.
In operation S308, a target network layer type, a target network layer structure and target feature configuration data are respectively determined from a network layer type set, a network layer structure set and a feature configuration data set based on the at least one verification result and the evaluation condition.
For example, after multiple initial deep learning models are trained, a network layer type set, a network layer structure set and a feature configuration data set corresponding to the multiple initial deep learning models are obtained. The network layer type set includes, for example, the initial network layer types for the multiple trained initial deep learning models, the network layer structure set includes the initial network layer structures for the multiple trained initial deep learning models, and the feature configuration data set includes the initial feature configuration data for the multiple trained initial deep learning models; the initial feature configuration data in the feature configuration data set is, for example, at least part of the multiple pieces of candidate feature configuration data.
For example, for each initial deep learning model, the verification sample is processed based on the candidate feature configuration data corresponding to that initial deep learning model to obtain verification feature data, and the verification feature data is input into the trained initial deep learning model to obtain a verification result. In this way, multiple verification results in one-to-one correspondence with the multiple initial deep learning models can be obtained.
In an example, the verification result includes, for example, the recall rate or precision rate of the initial deep learning model on the verification sample, and the evaluation condition includes, for example, a condition on the recall rate or precision rate; for example, the evaluation condition is used to evaluate whether the recall rate or precision rate of the verification result reaches a certain threshold. In another example, the evaluation condition is related, for example, to the AUC (Area Under Curve) metric, and the verification result may be evaluated based on the AUC, which is an evaluation metric. Based on the verification results and the evaluation condition for the multiple initial deep learning models, the target network layer type, the target network layer structure and the target feature configuration data are respectively determined from the network layer type set, the network layer structure set and the feature configuration data set.
According to the embodiments of the present disclosure, the verification results are evaluated by using the evaluation condition, so that the target network layer type, the target network layer structure and the target feature configuration data with a better training effect are respectively determined from the network layer type set, the network layer structure set and the feature configuration data set, which improves the accuracy of determining the target network layer type, the target network layer structure and the target feature configuration data.
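A minimal sketch of this selection step, assuming each trained initial model reports a single validation AUC and the evaluation condition is a threshold combined with taking the best qualifying trial; the trial records below are hypothetical, and in principle the three target components could also be picked per dimension from different trials, as the disclosure notes.

```python
# Each record describes one trained initial deep learning model: the network
# layer type and structure it was built from, the candidate feature
# configuration it used, and its validation result (here an AUC value).
trials = [
    {"layer_type": "A1", "layer_structure": "B1", "feature_config": "C1", "auc": 0.71},
    {"layer_type": "A2", "layer_structure": "B2", "feature_config": "C2", "auc": 0.74},
    {"layer_type": "A3", "layer_structure": "B3", "feature_config": "C3", "auc": 0.69},
]

def pick_targets(trials, min_auc=0.70):
    """Apply the evaluation condition: keep trials above the AUC threshold and
    take the component choices of the best remaining trial."""
    qualified = [t for t in trials if t["auc"] >= min_auc]
    best = max(qualified, key=lambda t: t["auc"])
    return best["layer_type"], best["layer_structure"], best["feature_config"]

print(pick_targets(trials))  # ('A2', 'B2', 'C2')
```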
After the target network layer type, the target network layer structure and the target feature configuration data are obtained, the model may be retrained based on them to obtain the target deep learning model; see the following operations S309 to S311.
In operation S309, a target deep learning model to be trained is obtained based on the target network layer type and the target network layer structure.
For example, the target deep learning model is constructed based on the target network layer type and the target network layer structure.
In operation S310, a second training sample is processed based on the target feature configuration data to obtain second training feature data.
For example, the target feature configuration data represents how to process the second training sample used for training the target deep learning model, so as to obtain second training feature data suitable for training the target deep learning model.
In operation S311, the target deep learning model to be trained is trained by using the second training feature data to obtain the target deep learning model.
For example, after the target deep learning model is constructed, the second training sample may be processed based on the target feature configuration data to obtain the second training feature data, and the target deep learning model is trained by using the second training feature data; the training process of the target deep learning model is similar to that of the initial deep learning model and will not be repeated here.
According to the embodiments of the present disclosure, the process of training multiple initial deep learning models can be regarded as an experimental process of searching for network layer types, network layer structures and feature configuration data.
In an example, an initial deep learning model whose verification result satisfies the evaluation condition may be used directly as the final target deep learning model.
In another example, the target network layer type, the target network layer structure and the target feature configuration data may come from different initial deep learning models. In order to reduce the consumption of data storage space, the initial deep learning models may not be saved; instead, the better target network layer type, target network layer structure and target feature configuration data are saved, and the target deep learning model is then reconstructed and trained based on them.
It can be understood that first obtaining the target network layer type, the target network layer structure and the target feature configuration data by training the initial deep learning models, and then retraining based on them to obtain the target deep learning model, not only improves the accuracy of the target deep learning model but also reduces the consumption of data storage space.
Fig. 4 schematically shows a schematic diagram of a method for training a deep learning model according to an embodiment of the present disclosure.
As shown in Fig. 4, the configuration file 410 includes, for example, model type data 411, multiple pieces of candidate feature configuration data 412 and an evaluation condition 413.
For example, the multiple candidate network layer types 420 include candidate network layer types A1 to A4, and the multiple candidate hyperparameters 430 include candidate hyperparameters B1 to B4.
Based on the model type data 411, an initial network layer type for an initial deep learning model is selected from the multiple candidate network layer types 420, and a target hyperparameter is randomly selected from the multiple candidate hyperparameters 430 as the initial network layer structure for the initial deep learning model.
Taking the initial deep learning models 431, 432 and 433 as examples, the candidate network layer type A1 and the candidate hyperparameter B1 are selected as the initial network layer type and the initial network layer structure for the initial deep learning model 431, respectively. For example, the candidate network layer type A1 includes a fully connected layer and a pooling layer, and the candidate hyperparameter B1 (the target hyperparameter) specifies that the fully connected layer has M nodes and the pooling layer has N nodes, where M and N are both integers greater than 0. Similarly, the candidate network layer type A2 and the candidate hyperparameter B2 are selected as the initial network layer type and the initial network layer structure for the initial deep learning model 432, respectively, and the candidate network layer type A3 and the candidate hyperparameter B3 are selected as the initial network layer type and the initial network layer structure for the initial deep learning model 433, respectively.
Then, the initial deep learning model 431 is constructed based on the candidate network layer type A1 and the candidate hyperparameter B1, the initial deep learning model 432 is constructed based on the candidate network layer type A2 and the candidate hyperparameter B2, and the initial deep learning model 433 is constructed based on the candidate network layer type A3 and the candidate hyperparameter B3.
After the initial deep learning models 431, 432 and 433 are constructed, they need to be trained based on the first training sample 440.
For example, the initial feature configuration data for each initial deep learning model is selected from the multiple pieces of candidate feature configuration data 412. For example, the candidate feature configuration data C1 is selected as the initial feature configuration data for the initial deep learning model 431, the candidate feature configuration data C2 is selected as the initial feature configuration data for the initial deep learning model 432, and the candidate feature configuration data C3 is selected as the initial feature configuration data for the initial deep learning model 433.
For each initial deep learning model, the first training sample 440 needs to be processed based on the corresponding initial feature configuration data. Taking the initial deep learning model 431 as an example, a first feature type and a first feature dimension are determined based on the initial feature configuration data (C1); for example, the initial feature configuration data (C1) defines the first feature type and the first feature dimension, where the first feature type includes, for example, features such as age, gender and content category, and the first feature dimension is, for example, the dimension of a feature vector, such as 1*128 dimensions.
Then, a first sub-sample is extracted from the first training sample 440 based on the first feature type; the first sub-sample is, for example, content having features such as age, gender and content category. The first sub-sample is processed based on the first feature dimension to obtain the first training feature data 441; the first training feature data 441 is, for example, a feature vector whose dimension is, for example, 1*128.
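One possible way to produce such a 1*128 feature vector from the extracted first sub-sample is sketched below under the assumption of a hash-then-embed scheme using PaddlePaddle's embedding layer; the vocabulary size, hashing rule and pooling by averaging are assumptions, not details given by the disclosure.

```python
import paddle
import paddle.nn as nn

VOCAB_SIZE = 1000   # hypothetical hash-bucket count
EMBED_DIM = 128     # first feature dimension from the configuration (1*128)

embedding = nn.Embedding(VOCAB_SIZE, EMBED_DIM)

def first_training_feature(sub_sample: dict) -> paddle.Tensor:
    """Hash each selected feature value into a bucket, embed it, and average
    the embeddings into a single 1*128 feature vector."""
    buckets = [hash(f"{k}={v}") % VOCAB_SIZE for k, v in sub_sample.items()]
    ids = paddle.to_tensor(buckets, dtype="int64")
    return embedding(ids).mean(axis=0, keepdim=True)  # shape [1, 128]

sub_sample = {"age": "30-39", "gender": "f", "content_category": "sports"}
print(first_training_feature(sub_sample).shape)       # [1, 128]
```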
The process of obtaining the first training feature data 442 and the first training feature data 443 is similar to the process of obtaining the first training feature data 441, and will not be repeated here.
Then, the initial deep learning model 431 is trained by using the first training feature data 441, the initial deep learning model 432 is trained by using the first training feature data 442, and the initial deep learning model 433 is trained by using the first training feature data 443.
After the training of the initial deep learning models 431 to 433 is completed, a network layer type set 451, a network layer structure set 452 and a feature configuration data set 453 for the initial deep learning models 431 to 433 are obtained. The network layer type set 451 includes, for example, the initial network layer types A1, A2 and A3, the network layer structure set 452 includes, for example, the initial network layer structures B1, B2 and B3, and the feature configuration data set 453 includes, for example, the initial feature configuration data C1, C2 and C3.
Then, the target network layer type 471 (A1), the target network layer structure 472 (B2) and the target feature configuration data 473 (C3) are respectively determined from the network layer type set 451, the network layer structure set 452 and the feature configuration data set 453 based on the evaluation condition 413 and the verification sample 460; the process is similar to the above content and will not be repeated here.
Next, the target deep learning model 480 is constructed based on the target network layer type 471 (A1) and the target network layer structure 472 (B2). After the target deep learning model 480 is constructed, it needs to be trained based on the second training sample 490.
For example, the second training sample 490 is processed based on the target feature configuration data 473 (C3) to obtain second training feature data 491. For example, a second feature type and a second feature dimension are determined based on the target feature configuration data 473 (C3); the target feature configuration data 473 (C3) defines, for example, the second feature type and the second feature dimension, where the second feature type includes, for example, features such as age and gender, and the second feature dimension is, for example, the dimension of a feature vector, such as 1*256 dimensions.
Then, a second sub-sample is extracted from the second training sample 490 based on the second feature type; the second sub-sample is, for example, content having features such as age and gender. The second sub-sample is processed based on the second feature dimension to obtain the second training feature data 491; the second training feature data 491 is, for example, a feature vector whose dimension is, for example, 1*256.
Next, the target deep learning model 480 is trained by using the second training feature data 491, and the trained target deep learning model 480 serves as the final deep learning model.
In another example of the present disclosure, the model may be trained based on the PaddlePaddle training framework and the open-source distributed framework Ray. For example, PaddlePaddle is used for model construction and model training, and Ray is used to switch seamlessly between local training and cluster training; Ray can automatically schedule available resources for parallel training, which improves resource utilization and the degree of parallelism of training.
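A minimal sketch of one such training step, written against the PaddlePaddle API, is given below; the data, model shape and hyperparameters are placeholders, and the function is shaped so that a scheduler such as Ray Tune could call it once per candidate configuration (a Tune-side sketch follows the hyperparameter-search discussion below).

```python
import paddle
import paddle.nn as nn

def train_once(config: dict) -> float:
    """Train one candidate model defined by `config` and return its final loss.
    `config` stands in for one point of the search space (hidden size, lr, epochs)."""
    model = nn.Sequential(
        nn.Linear(128, config["hidden"]), nn.ReLU(),
        nn.Linear(config["hidden"], 1), nn.Sigmoid(),
    )
    optimizer = paddle.optimizer.Adam(
        learning_rate=config["lr"], parameters=model.parameters())
    loss_fn = nn.BCELoss()

    features = paddle.randn([256, 128])                       # placeholder training features
    labels = paddle.randint(0, 2, [256, 1]).astype("float32")  # placeholder click labels

    for _ in range(config["epochs"]):
        preds = model(features)
        loss = loss_fn(preds, labels)
        loss.backward()
        optimizer.step()
        optimizer.clear_grad()
    return float(loss)

print(train_once({"hidden": 64, "lr": 1e-3, "epochs": 3}))
```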
For example, the configuration file includes two files: a feature configuration file and a training configuration file. The feature configuration file includes, for example, the candidate feature configuration data, and may also include the processing methods of the features, such as normalization and hash operations. The training configuration file includes data other than the features, for example, the model type data, the evaluation condition, and so on.
The training samples, verification samples, candidate feature configuration data, model structures, hyperparameters and training resource configuration used in the training process can all be invoked by way of the configuration files without modifying the framework code, and experimental training can be started with one click, which lowers the technical threshold and the training difficulty.
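For illustration, the two configuration files might carry contents of the following shape, shown here as Python dictionaries; the keys are assumptions, and in practice they could just as well be YAML or JSON files read by a driver.

```python
# Hypothetical contents of the feature configuration file and the training
# configuration file described above.
feature_config = {
    "candidate_features": [
        {"feature_types": ["age", "gender"], "feature_dim": 128,
         "preprocess": ["normalize"]},
        {"feature_types": ["gender", "content_category"], "feature_dim": 256,
         "preprocess": ["hash"]},
    ],
}

training_config = {
    "model_type": "DNN",
    "evaluation": {"metric": "auc", "threshold": 0.70},
    "search": {"algorithm": "grid_search", "scheduler": "ASHA"},
    "resources": {"workers": 4, "use_gpu": False},
}

def launch(feature_cfg: dict, train_cfg: dict) -> None:
    """A driver would read both files and start the search without code changes."""
    print(f"model type: {train_cfg['model_type']}, "
          f"{len(feature_cfg['candidate_features'])} feature configurations, "
          f"metric: {train_cfg['evaluation']['metric']}")

launch(feature_config, training_config)
```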
For example, in a first step, the configuration file, the first training sample and the verification sample are input to perform automatic training and search of the initial deep learning models; the search results include, for example, hyperparameters, feature types, feature dimensions (embedding dimensions), model structures and so on. In a second step, the target deep learning model is retrained based on the search results and the second training sample.
The model type data in the configuration file defines, for example, how to select the initial model type and the network layer structure (a search direction), and the candidate feature configuration data defines, for example, the feature type search and the feature dimension search. The hyperparameter search, feature type search, feature dimension search and model structure search may be collectively referred to as search directions.
The feature types include the features or combined features that need to be extracted from the sample data during model training; the features include, for example, gender and age, and a combined feature is, for example, a combination of gender and age.
For example, the hyperparameter search includes a search space, a search algorithm and a scheduler algorithm. The search space includes algorithms such as random search, grid search and uniform-distribution sampling, and represents which candidate hyperparameters are available for searching. The search algorithms include grid search, Bayesian optimization, OPTUNA optimization and other algorithms; OPTUNA is a framework for automatic hyperparameter optimization, and the search algorithm is used to determine the optimal hyperparameters based on the training results of the candidate hyperparameters. The scheduler algorithms include the first-in-first-out (FIFO) algorithm, the ASHA algorithm and the like; ASHA is a hyperparameter tuning algorithm, and the scheduler algorithm represents how computing resources are scheduled to perform parallel training based on the candidate hyperparameters.
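A hedged sketch of how such a search space, search algorithm and scheduler might be wired together with Ray Tune follows; the objective function is a stand-in for the real training routine, and the reporting call and module paths vary between Ray versions.

```python
from ray import tune
from ray.tune.schedulers import ASHAScheduler

def objective(config):
    """Stand-in for the real PaddlePaddle training function: it just scores the
    hyperparameter point and reports the result back to Tune."""
    score = 1.0 / (1.0 + abs(config["lr"] - 1e-3)) + config["hidden"] / 1000.0
    tune.report(auc=score)  # newer Ray versions report via session.report / train.report

search_space = {
    "lr": tune.loguniform(1e-4, 1e-1),     # sampled search dimension
    "hidden": tune.choice([32, 64, 128]),  # discrete candidate set
}

analysis = tune.run(
    objective,
    config=search_space,
    num_samples=8,
    scheduler=ASHAScheduler(metric="auc", mode="max"),  # early-stops weak trials
)
print(analysis.get_best_config(metric="auc", mode="max"))
```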
Combined features can be searched by models such as the AutoCross (automatic crossing) algorithm and AutoFis; the AutoCross model is responsible for screening useful explicit cross features, for example, screening features that improve the training effect of the model, and the AutoFis model is responsible for filtering out useless second-order cross features (implicit cross features) in the FM (factorization machine) model and the DeepFM model. An explicit cross feature is, for example, a merge or concatenation of multiple features, and an implicit cross feature is, for example, a dot product of multiple features.
For the feature dimensions, the AutoDim algorithm and the AutoDis algorithm can be used for searching; AutoDim is an automatic dimension optimization algorithm, and AutoDis is an automatic discretization algorithm for numerical features. The AutoDim algorithm searches out different dimension sizes from different feature dimensions, that is, it searches for suitable dimensions for discrete features. The AutoDis algorithm supports embedding of continuous features (discretizing continuous features) and, during training, searches out the most suitable dimension size for different continuous features.
The model structure search can learn the weights corresponding to child architectures (network layers) through a NAS model (a type of compression model), so as to obtain an optimal model structure. For example, by learning the weights corresponding to multiple candidate network layers, the candidate network layer with a larger weight is used as the final network layer.
When performing model search and training experiments, the experimental process and experimental results can be visualized, for example, by means of the VisualDL tool. VisualDL is a visual analysis tool in the PaddlePaddle training framework; it uses rich charts to show the influence of different hyperparameters on the experimental results, so that the influence of the search space and the search algorithm on the recommendation model can be understood more intuitively.
The training process of the model supports batch offline training search and incremental training search; for example, batch offline search training or incremental search training is selected by way of configuration. For batch offline search, the experimental results are compared on the same data set to select the optimal search result. For incremental search training, if the experimental effect of the incremental search is better than that of the original experiment, the original is replaced; otherwise, the original model structure and hyperparameters are retained and training continues.
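The selection logic of the two modes can be sketched in a few lines; the record fields below are hypothetical.

```python
# Illustrative sketch of the two search modes described above.
def pick_batch_offline(results: list) -> dict:
    """Batch offline search: compare all results on the same data set and keep the best."""
    return max(results, key=lambda r: r["auc"])

def pick_incremental(current: dict, candidate: dict) -> dict:
    """Incremental search: replace the current model structure and hyperparameters
    only if the new experiment does better; otherwise keep the original."""
    return candidate if candidate["auc"] > current["auc"] else current

current = {"structure": "A1/B1", "auc": 0.72}
candidate = {"structure": "A2/B2", "auc": 0.74}
print(pick_incremental(current, candidate))  # keeps the better experiment
```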
训练过程可以通过并行的方式进行,例如有些计算资源基于一部分超参数、模型结构、训练样本进行训练,有些计算资源基于另一部分超参数、模型结构、训练样本进行训练,从而提高训练效率。The training process can be carried out in a parallel manner. For example, some computing resources are trained based on a part of hyperparameters, model structure, and training samples, and some computing resources are trained based on another part of hyperparameters, model structure, and training samples, thereby improving training efficiency.
图5示意性示出了根据本公开一实施例的内容推荐方法的流程图。Fig. 5 schematically shows a flowchart of a content recommendation method according to an embodiment of the present disclosure.
如图5所示,本公开实施例的内容推荐方法500例如可以包括操作S510~操作S540。As shown in FIG. 5 , the content recommendation method 500 of the embodiment of the present disclosure may include, for example, operation S510 to operation S540.
在操作S510,确定针对目标对象的对象特征数据。In operation S510, object feature data for the target object is determined.
在操作S520,针对至少一个候选内容中的目标内容,确定针对目标内容的内容特征数据。In operation S520, content feature data for the target content is determined for the target content in the at least one candidate content.
在操作S530，将对象特征数据和内容特征数据输入目标深度学习模型中，得到输出结果。In operation S530, the object feature data and the content feature data are input into the target deep learning model to obtain an output result.
在操作S540,响应于输出结果满足预设条件,向目标对象推荐目标内容。In operation S540, in response to the output result satisfying the preset condition, the target content is recommended to the target object.
示例性地，上文提及的初始深度学习模型或目标深度学习模型适用于内容推荐场景，内容包括但不仅限于文章、商品、新闻。Exemplarily, the initial deep learning model or the target deep learning model mentioned above is suitable for content recommendation scenarios, where the content includes but is not limited to articles, commodities, and news.
例如，目标对象为浏览内容的对象，对象特征数据例如包括目标对象的年龄、性别、历史浏览记录、所浏览的内容类别等等。将多个候选内容中的任意一个作为目标内容，并确定目标内容的内容特征数据，内容特征数据例如包括但不仅限于内容类别、主题信息、关键词信息。For example, the target object is an object that browses content, and the object feature data includes, for example, the target object's age, gender, historical browsing records, browsed content categories, and so on. Any one of multiple candidate contents is taken as the target content, and the content feature data of the target content is determined; the content feature data includes but is not limited to content category, topic information, and keyword information.
将对象特征数据和内容特征数据输入目标深度学习模型中得到输出结果,输出结果表征了目标对象对目标内容的感兴趣程度。在另一示例中,当初始深度学习模型的模型精度符合要求时,也可以将对象特征数据和内容特征数据输入初始深度学习模型中得到输出结果。初始深度学习模型或目标深度学习模型可以自动学习得到对象特征数据和内容特征数据之间的关联。如果输出结果满足预设条件,表示目标对象对目标内容的感兴趣程度较大,此时可以向目标对象推荐目标内容。如果输出结果不满足预设条件,表示目标对象对目标内容的感兴趣程度较小,此时可以不向目标对象推荐目标内容。The object feature data and content feature data are input into the target deep learning model to obtain an output result, and the output result represents the degree of interest of the target object in the target content. In another example, when the model accuracy of the initial deep learning model meets requirements, the object feature data and content feature data may also be input into the initial deep learning model to obtain an output result. The initial deep learning model or the target deep learning model can automatically learn the association between object feature data and content feature data. If the output result satisfies the preset condition, it means that the target object is more interested in the target content, and at this time the target content can be recommended to the target object. If the output result does not meet the preset condition, it means that the target object is less interested in the target content, and at this time the target content may not be recommended to the target object.
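For illustration only, the decision flow of operations S510 to S540 might look like the following sketch, in which the feature-extraction helpers, the scoring callable, and the threshold standing in for the preset condition are all assumptions.

```python
# Minimal sketch of the decision flow in operations S510-S540; the feature
# extraction helpers and the scoring model are placeholders, not the
# disclosed implementation.
def extract_object_features(obj: dict) -> list:
    return [obj.get("age", 0), obj.get("gender", 0)]       # toy object features

def extract_content_features(content: dict) -> list:
    return [content.get("category_id", 0)]                 # toy content features

def recommend(model, target_object, candidate_contents, threshold=0.5):
    object_features = extract_object_features(target_object)
    recommended = []
    for content in candidate_contents:
        content_features = extract_content_features(content)
        score = model(object_features, content_features)   # degree of interest
        if score >= threshold:                              # preset condition
            recommended.append(content)
    return recommended

# Usage with a dummy scoring function standing in for the trained model.
dummy_model = lambda o, c: 0.8
print(recommend(dummy_model, {"age": 30}, [{"category_id": 5}]))
```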
根据本公开的实施例,通过初始深度学习模型或目标深度学习模型进行内容推荐,提高了内容推荐的准确性和效率,推荐的内容满足目标对象的需求,提高目标对象的使用体验。According to the embodiments of the present disclosure, the content recommendation is performed through the initial deep learning model or the target deep learning model, which improves the accuracy and efficiency of content recommendation, and the recommended content meets the needs of the target object and improves the user experience of the target object.
图6示意性示出了根据本公开一实施例的深度学习模型的训练装置的框图。Fig. 6 schematically shows a block diagram of a training device for a deep learning model according to an embodiment of the present disclosure.
如图6所示，本公开实施例的深度学习模型的训练装置600例如包括获取模块610、选择模块620、第一获得模块630、第一处理模块640、第一训练模块650以及第二获得模块660。As shown in FIG. 6, the deep learning model training device 600 of the embodiment of the present disclosure includes, for example, an acquisition module 610, a selection module 620, a first obtaining module 630, a first processing module 640, a first training module 650, and a second obtaining module 660.
获取模块610可以用于获取配置文件,其中,配置文件包括模型类型数据和候选特征配置数据。根据本公开实施例,获取模块610例如可以执行上文参考图2描述的操作S210,在此不再赘述。The obtaining module 610 may be used to obtain a configuration file, wherein the configuration file includes model type data and candidate feature configuration data. According to an embodiment of the present disclosure, the acquiring module 610 may, for example, perform the operation S210 described above with reference to FIG. 2 , which will not be repeated here.
选择模块620可以用于基于模型类型数据,选择初始网络层类型和初始网络层结构。根据本公开实施例,选择模块620例如可以执行上文参考图2描述的操作S220,在此不再赘述。The selection module 620 may be configured to select an initial network layer type and an initial network layer structure based on the model type data. According to an embodiment of the present disclosure, the selection module 620 may, for example, perform the operation S220 described above with reference to FIG. 2 , which will not be repeated here.
第一获得模块630可以用于基于初始网络层类型和初始网络层结构，获得初始深度学习模型。根据本公开实施例，第一获得模块630例如可以执行上文参考图2描述的操作S230，在此不再赘述。The first obtaining module 630 can be used to obtain an initial deep learning model based on the initial network layer type and the initial network layer structure. According to an embodiment of the present disclosure, the first obtaining module 630 may, for example, perform operation S230 described above with reference to FIG. 2, which will not be repeated here.
第一处理模块640可以用于基于候选特征配置数据处理第一训练样本,得到第一训练特征数据。根据本公开实施例,第一处理模块640例如可以执行上文参考图2描述的操作S240,在此不再赘述。The first processing module 640 may be configured to process the first training sample based on the candidate feature configuration data to obtain first training feature data. According to an embodiment of the present disclosure, the first processing module 640 may, for example, execute the operation S240 described above with reference to FIG. 2 , which will not be repeated here.
第一训练模块650可以用于利用第一训练特征数据训练初始深度学习模型。根据本公开实施例,第一训练模块650例如可以执行上文参考图2描述的操作S250,在此不再赘述。The first training module 650 can be used to train an initial deep learning model using the first training feature data. According to an embodiment of the present disclosure, the first training module 650 may, for example, execute the operation S250 described above with reference to FIG. 2 , which will not be repeated here.
第二获得模块660可以用于基于经训练的初始深度学习模型,得到目标深度学习模型。根据本公开实施例,第二获得模块660例如可以执行上文参考图2描述的操作S260,在此不再赘述。The second obtaining module 660 can be used to obtain a target deep learning model based on the trained initial deep learning model. According to an embodiment of the present disclosure, the second obtaining module 660 may, for example, perform the operation S260 described above with reference to FIG. 2 , which will not be repeated here.
根据本公开的实施例，经训练的初始深度学习模型包括至少一个经训练的初始深度学习模型；配置文件还包括评价条件；第二获得模块包括：第一处理子模块、输入子模块、第一确定子模块和获得子模块。第一处理子模块，用于基于候选特征配置数据处理验证样本，得到验证特征数据；输入子模块，用于将验证特征数据分别输入至少一个经训练的初始深度学习模型中，得到至少一个验证结果；第一确定子模块，用于基于至少一个验证结果和评价条件，从网络层类型集合、网络层结构集合、特征配置数据集合中分别确定目标网络层类型、目标网络层结构、目标特征配置数据；获得子模块，用于基于目标网络层类型、目标网络层结构、目标特征配置数据，得到目标深度学习模型。According to an embodiment of the present disclosure, the trained initial deep learning model includes at least one trained initial deep learning model; the configuration file further includes an evaluation condition; and the second obtaining module includes a first processing sub-module, an input sub-module, a first determining sub-module, and an obtaining sub-module. The first processing sub-module is used to process verification samples based on the candidate feature configuration data to obtain verification feature data; the input sub-module is used to input the verification feature data into the at least one trained initial deep learning model respectively to obtain at least one verification result; the first determining sub-module is used to determine, based on the at least one verification result and the evaluation condition, a target network layer type, a target network layer structure, and target feature configuration data from a network layer type set, a network layer structure set, and a feature configuration data set respectively; and the obtaining sub-module is used to obtain the target deep learning model based on the target network layer type, the target network layer structure, and the target feature configuration data.
根据本公开的实施例，网络层类型集合包括针对至少一个经训练的初始深度学习模型的初始网络层类型；网络层结构集合包括针对至少一个经训练的初始深度学习模型的初始网络层结构；特征配置数据集合包括针对至少一个经训练的初始深度学习模型的初始特征配置数据，特征配置数据集合中的初始特征配置数据为候选特征配置数据中的至少部分。According to an embodiment of the present disclosure, the network layer type set includes an initial network layer type for the at least one trained initial deep learning model; the network layer structure set includes an initial network layer structure for the at least one trained initial deep learning model; and the feature configuration data set includes initial feature configuration data for the at least one trained initial deep learning model, where the initial feature configuration data in the feature configuration data set is at least part of the candidate feature configuration data.
根据本公开的实施例，获得子模块包括：获得单元、处理单元和训练单元。获得单元，用于基于目标网络层类型和目标网络层结构，得到待训练目标深度学习模型；处理单元，用于基于目标特征配置数据处理第二训练样本，得到第二训练特征数据；训练单元，用于利用第二训练特征数据训练待训练目标深度学习模型，得到所述目标深度学习模型。According to an embodiment of the present disclosure, the obtaining sub-module includes an obtaining unit, a processing unit, and a training unit. The obtaining unit is used to obtain a target deep learning model to be trained based on the target network layer type and the target network layer structure; the processing unit is used to process a second training sample based on the target feature configuration data to obtain second training feature data; and the training unit is used to train the target deep learning model to be trained with the second training feature data to obtain the target deep learning model.
根据本公开的实施例，候选特征配置数据包括至少一个候选特征配置数据；第一处理模块640包括：第一选择子模块、第二确定子模块、提取子模块和第二处理子模块。第一选择子模块，用于从至少一个候选配置数据中选择针对初始深度学习模型的初始特征配置数据；第二确定子模块，用于基于初始特征配置数据，确定第一特征类型和第一特征维度；提取子模块，用于基于第一特征类型，从第一训练样本中提取第一子样本；第二处理子模块，用于基于第一特征维度处理第一子样本，得到第一训练特征数据。According to an embodiment of the present disclosure, the candidate feature configuration data includes at least one piece of candidate feature configuration data; the first processing module 640 includes a first selection sub-module, a second determining sub-module, an extraction sub-module, and a second processing sub-module. The first selection sub-module is used to select initial feature configuration data for the initial deep learning model from the at least one piece of candidate configuration data; the second determining sub-module is used to determine a first feature type and a first feature dimension based on the initial feature configuration data; the extraction sub-module is used to extract a first sub-sample from the first training sample based on the first feature type; and the second processing sub-module is used to process the first sub-sample based on the first feature dimension to obtain first training feature data.
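As an illustrative sketch of extracting a sub-sample by feature type and processing it to a configured feature dimension, the following uses a simple hashing trick; the field names and the hashing approach are assumptions for the example, not the disclosed processing.

```python
import numpy as np

# Sketch: extract a sub-sample by feature type, then map it to the
# configured feature dimension (here via a one-hot hashing trick).
def build_training_features(samples, feature_type, feature_dim):
    sub_sample = [s[feature_type] for s in samples]          # extract by feature type
    features = np.zeros((len(sub_sample), feature_dim))
    for i, value in enumerate(sub_sample):
        features[i, hash(str(value)) % feature_dim] = 1.0    # map to the configured dimension
    return features

print(build_training_features([{"category": "news"}], "category", 8).shape)  # (1, 8)
```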
根据本公开的实施例，处理单元包括：确定子单元、提取子单元和处理子单元。确定子单元，用于基于目标特征配置数据，确定第二特征类型和第二特征维度；提取子单元，用于基于第二特征类型，从第二训练样本中提取第二子样本；处理子单元，用于基于第二特征维度处理第二子样本，得到第二训练特征数据。According to an embodiment of the present disclosure, the processing unit includes a determination subunit, an extraction subunit, and a processing subunit. The determination subunit is used to determine a second feature type and a second feature dimension based on the target feature configuration data; the extraction subunit is used to extract a second sub-sample from the second training sample based on the second feature type; and the processing subunit is used to process the second sub-sample based on the second feature dimension to obtain second training feature data.
根据本公开的实施例，选择模块620包括：第二选择子模块和第三选择子模块。第二选择子模块，用于基于模型类型数据，从至少一个候选网络层类型中选择针对初始深度学习模型的初始网络层类型；第三选择子模块，用于从至少一个候选超参数中选择目标超参数，作为针对初始深度学习模型的初始网络层结构。According to an embodiment of the present disclosure, the selection module 620 includes a second selection sub-module and a third selection sub-module. The second selection sub-module is used to select, based on the model type data, an initial network layer type for the initial deep learning model from at least one candidate network layer type; and the third selection sub-module is used to select a target hyperparameter from at least one candidate hyperparameter as the initial network layer structure for the initial deep learning model.
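Purely as an illustration of how the modules of training device 600 could be composed in code, the following sketch wires placeholder implementations of modules 610 to 660 together; none of the bodies reflect the actual implementation, and the configuration fields are assumed.

```python
import json

# Illustrative sketch (not the disclosed code) of wiring modules 610-660.
class DeepLearningModelTrainer:
    def acquire(self, config_path):                         # acquisition module 610
        with open(config_path) as f:
            return json.load(f)                             # model type + candidate feature config

    def select(self, config):                               # selection module 620
        return config["model_type"], {"hidden_units": 64}   # assumed structure choice

    def build_initial_model(self, layer_type, structure):   # first obtaining module 630
        return {"type": layer_type, "structure": structure}

    def process(self, samples, config):                     # first processing module 640
        return list(samples)                                # placeholder feature processing

    def train(self, model, features):                       # first training module 650
        return model                                        # placeholder training loop

    def obtain_target_model(self, trained_model):           # second obtaining module 660
        return trained_model

    def run(self, config_path, first_training_samples):
        config = self.acquire(config_path)
        layer_type, structure = self.select(config)
        model = self.build_initial_model(layer_type, structure)
        features = self.process(first_training_samples, config)
        trained = self.train(model, features)
        return self.obtain_target_model(trained)
```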
图7示意性示出了根据本公开一实施例的内容推荐装置的框图。Fig. 7 schematically shows a block diagram of a content recommendation device according to an embodiment of the present disclosure.
如图7所示,本公开实施例的内容推荐装置700例如包括第一确定模块710、第二确定模块720、输入模块730和推荐模块740。As shown in FIG. 7 , the content recommendation device 700 of the embodiment of the present disclosure includes, for example, a first determination module 710 , a second determination module 720 , an input module 730 and a recommendation module 740 .
第一确定模块710可以用于确定针对目标对象的对象特征数据。根据本公开实施例,第一确定模块710例如可以执行上文参考图5描述的操作S510,在此不再赘述。The first determination module 710 may be used to determine object feature data for the target object. According to an embodiment of the present disclosure, the first determining module 710 may, for example, perform the operation S510 described above with reference to FIG. 5 , which will not be repeated here.
第二确定模块720可以用于针对至少一个候选内容中的目标内容,确定针对目标内容的内容特征数据。根据本公开实施例,第二确定模块720例如可以执行上文参考图5描述的操作S520,在此不再赘述。The second determination module 720 may be configured to determine content feature data for the target content for the target content in at least one candidate content. According to an embodiment of the present disclosure, the second determining module 720 may, for example, perform the operation S520 described above with reference to FIG. 5 , which will not be repeated here.
输入模块730可以用于将对象特征数据和内容特征数据输入目标深度学习模型中，得到输出结果，其中，目标深度学习模型采用上述的深度学习模型的训练装置生成，输出结果表征了目标对象对目标内容的感兴趣程度。根据本公开实施例，输入模块730例如可以执行上文参考图5描述的操作S530，在此不再赘述。The input module 730 can be used to input the object feature data and the content feature data into the target deep learning model to obtain an output result, where the target deep learning model is generated by the above deep learning model training device, and the output result represents the degree of interest of the target object in the target content. According to an embodiment of the present disclosure, the input module 730 may, for example, perform the operation S530 described above with reference to FIG. 5, which will not be repeated here.
推荐模块740可以用于响应于输出结果满足预设条件,向目标对象推荐目标内容。根据本公开实施例,推荐模块740例如可以执行上文参考图5描述的操作S540,在此不再赘述。The recommendation module 740 may be configured to recommend target content to the target object in response to the output result meeting the preset condition. According to an embodiment of the present disclosure, the recommendation module 740 may, for example, perform the operation S540 described above with reference to FIG. 5 , which will not be repeated here.
在本公开的技术方案中，所涉及的用户个人信息的收集、存储、使用、加工、传输、提供、公开和应用等处理，均符合相关法律法规的规定，采取了必要保密措施，且不违背公序良俗。In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the user personal information involved all comply with relevant laws and regulations, necessary confidentiality measures have been taken, and public order and good morals are not violated.
在本公开的技术方案中,在获取或采集用户个人信息之前,均获取了用户的授权或同意。In the technical solution of the present disclosure, before acquiring or collecting the user's personal information, the user's authorization or consent is obtained.
根据本公开的实施例,本公开还提供了一种电子设备、一种可读存储介质和一种计算机程序产品。According to the embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
图8是用来实现本公开实施例的用于执行深度学习模型的训练和/或内容推荐的电子设备的框图。FIG. 8 is a block diagram of an electronic device for performing deep learning model training and/or content recommendation to implement an embodiment of the present disclosure.
图8示出了可以用来实施本公开实施例的示例电子设备800的示意性框图。电子设备800旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,个人数字处理、蜂窝电话、智能电话、可穿戴设备和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本公开的实现。FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic device 800 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
如图8所示，设备800包括计算单元801，其可以根据存储在只读存储器(ROM)802中的计算机程序或者从存储单元808加载到随机访问存储器(RAM)803中的计算机程序，来执行各种适当的动作和处理。在RAM 803中，还可存储设备800操作所需的各种程序和数据。计算单元801、ROM 802以及RAM 803通过总线804彼此相连。输入/输出(I/O)接口805也连接至总线804。As shown in FIG. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. Various programs and data required for the operation of the device 800 can also be stored in the RAM 803. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
设备800中的多个部件连接至I/O接口805，包括：输入单元806，例如键盘、鼠标等；输出单元807，例如各种类型的显示器、扬声器等；存储单元808，例如磁盘、光盘等；以及通信单元809，例如网卡、调制解调器、无线通信收发机等。通信单元809允许设备800通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。Multiple components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, etc.; an output unit 807, such as various types of displays, speakers, etc.; a storage unit 808, such as a magnetic disk, an optical disk, etc.; and a communication unit 809, such as a network card, a modem, a wireless communication transceiver, and the like. The communication unit 809 allows the device 800 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
计算单元801可以是各种具有处理和计算能力的通用和/或专用处理组件。计算单元801的一些示例包括但不限于中央处理单元(CPU)、图形处理单元(GPU)、各种专用的人工智能(AI)计算芯片、各种运行机器学习模型算法的计算单元、数字信号处理器(DSP)、以及任何适当的处理器、控制器、微控制器等。计算单元801执行上文所描述的各个方法和处理，例如深度学习模型的训练方法和/或内容推荐方法。例如，在一些实施例中，深度学习模型的训练方法和/或内容推荐方法可被实现为计算机软件程序，其被有形地包含于机器可读介质，例如存储单元808。在一些实施例中，计算机程序的部分或者全部可以经由ROM 802和/或通信单元809而被载入和/或安装到设备800上。当计算机程序加载到RAM 803并由计算单元801执行时，可以执行上文描述的深度学习模型的训练方法和/或内容推荐方法的一个或多个步骤。备选地，在其他实施例中，计算单元801可以通过其他任何适当的方式(例如，借助于固件)而被配置为执行深度学习模型的训练方法和/或内容推荐方法。The computing unit 801 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and so on. The computing unit 801 executes the various methods and processes described above, such as the deep learning model training method and/or the content recommendation method. For example, in some embodiments, the deep learning model training method and/or the content recommendation method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed on the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the deep learning model training method and/or the content recommendation method described above can be executed. Alternatively, in other embodiments, the computing unit 801 may be configured in any other appropriate way (for example, by means of firmware) to execute the deep learning model training method and/or the content recommendation method.
本文中以上描述的系统和技术的各种实施方式可以在数字电子电路系统、集成电路系统、现场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、芯片上系统的系统(SOC)、复杂可编程逻辑设备(CPLD)、计算机硬件、固件、软件、和/或它们的组合中实现。这些各种实施方式可以包括：实施在一个或者多个计算机程序中，该一个或者多个计算机程序可在包括至少一个可编程处理器的可编程系统上执行和/或解释，该可编程处理器可以是专用或者通用可编程处理器，可以从存储系统、至少一个输入装置、和至少一个输出装置接收数据和指令，并且将数据和指令传输至该存储系统、该至少一个输入装置、和该至少一个输出装置。Various implementations of the systems and techniques described above herein can be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOC), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
用于实施本公开的方法的程序代码可以采用一个或多个编程语言的任何组合来编写。这些程序代码可以提供给通用计算机、专用计算机或其他可编程深度学习模型的训练装置和/或内容推荐装置的处理器或控制器，使得程序代码当由处理器或控制器执行时使流程图和/或框图中所规定的功能/操作被实施。程序代码可以完全在机器上执行、部分地在机器上执行，作为独立软件包部分地在机器上执行且部分地在远程机器上执行或完全在远程机器或服务器上执行。Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable deep learning model training device and/or content recommendation device, so that when executed by the processor or controller the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
为了提供与用户的交互，可以在计算机上实施此处描述的系统和技术，该计算机具有：用于向用户显示信息的显示装置(例如，CRT(阴极射线管)或者LCD(液晶显示器)监视器)；以及键盘和指向装置(例如，鼠标或者轨迹球)，用户可以通过该键盘和该指向装置来将输入提供给计算机。其它种类的装置还可以用于提供与用户的交互；例如，提供给用户的反馈可以是任何形式的传感反馈(例如，视觉反馈、听觉反馈、或者触觉反馈)；并且可以用任何形式(包括声输入、语音输入或者触觉输入)来接收来自用户的输入。To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, speech input, or tactile input).
可以将此处描述的系统和技术实施在包括后台部件的计算系统(例如，作为数据服务器)、或者包括中间件部件的计算系统(例如，应用服务器)、或者包括前端部件的计算系统(例如，具有图形用户界面或者网络浏览器的用户计算机，用户可以通过该图形用户界面或者该网络浏览器来与此处描述的系统和技术的实施方式交互)、或者包括这种后台部件、中间件部件、或者前端部件的任何组合的计算系统中。可以通过任何形式或者介质的数字数据通信(例如，通信网络)来将系统的部件相互连接。通信网络的示例包括：局域网(LAN)、广域网(WAN)和互联网。The systems and techniques described herein can be implemented in a computing system that includes a back-end component (e.g., a data server), or a computing system that includes a middleware component (e.g., an application server), or a computing system that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
计算机系统可以包括客户端和服务器。客户端和服务器一般远离彼此并且通常通过通信网络进行交互。通过在相应的计算机上运行并且彼此具有客户端-服务器关系的计算机程序来产生客户端和服务器的关系。服务器可以是云服务器,也可以为分布式系统的服务器,或者是结合了区块链的服务器。A computer system may include clients and servers. Clients and servers are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, a server of a distributed system, or a server combined with a blockchain.
应该理解,可以使用上面所示的各种形式的流程,重新排序、增加或删除步骤。例如,本公开中记载的各步骤可以并行地执行也可以顺序地执行也可以不同的次序执行,只要能够实现本公开公开的技术方案所期望的结果,本文在此不进行限制。It should be understood that steps may be reordered, added or deleted using the various forms of flow shown above. For example, each step described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in the present disclosure can be achieved, no limitation is imposed herein.
上述具体实施方式,并不构成对本公开保护范围的限制。本领域技术人员应该明白的是,根据设计要求和其他因素,可以进行各种修改、组合、子组合和替代。任何在本公开的精神和原则之内所作的修改、等同替换和改进等,均应包含在本公开保护范围之内。The specific implementation manners described above do not limit the protection scope of the present disclosure. It should be apparent to those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made depending on design requirements and other factors. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present disclosure shall be included within the protection scope of the present disclosure.

Claims (19)

  1. 一种深度学习模型的训练方法,包括:A training method for a deep learning model, comprising:
    获取配置文件,其中,所述配置文件包括模型类型数据和候选特征配置数据;Obtain a configuration file, wherein the configuration file includes model type data and candidate feature configuration data;
    基于所述模型类型数据,选择初始网络层类型和初始网络层结构;Selecting an initial network layer type and an initial network layer structure based on the model type data;
    基于所述初始网络层类型和所述初始网络层结构,获得初始深度学习模型;Obtain an initial deep learning model based on the initial network layer type and the initial network layer structure;
    基于所述候选特征配置数据处理第一训练样本,得到第一训练特征数据;Processing a first training sample based on the candidate feature configuration data to obtain first training feature data;
    利用所述第一训练特征数据训练所述初始深度学习模型;以及training the initial deep learning model using the first training feature data; and
    基于经训练的初始深度学习模型,得到目标深度学习模型。Based on the trained initial deep learning model, a target deep learning model is obtained.
  2. 根据权利要求1所述的方法，其中，所述经训练的初始深度学习模型包括至少一个经训练的初始深度学习模型；所述配置文件还包括评价条件；所述基于经训练的初始深度学习模型，得到目标深度学习模型包括：The method according to claim 1, wherein the trained initial deep learning model comprises at least one trained initial deep learning model; the configuration file further comprises an evaluation condition; and the obtaining a target deep learning model based on the trained initial deep learning model comprises:
    基于所述候选特征配置数据处理验证样本,得到验证特征数据;Processing verification samples based on the candidate feature configuration data to obtain verification feature data;
    将所述验证特征数据分别输入所述至少一个经训练的初始深度学习模型中,得到至少一个验证结果;respectively inputting the verification feature data into the at least one trained initial deep learning model to obtain at least one verification result;
    基于所述至少一个验证结果和所述评价条件,从网络层类型集合、网络层结构集合、特征配置数据集合中分别确定目标网络层类型、目标网络层结构、目标特征配置数据;以及Based on the at least one verification result and the evaluation condition, determine the target network layer type, target network layer structure, and target feature configuration data from the network layer type set, network layer structure set, and feature configuration data set, respectively; and
    基于所述目标网络层类型、所述目标网络层结构、所述目标特征配置数据,得到所述目标深度学习模型。The target deep learning model is obtained based on the target network layer type, the target network layer structure, and the target feature configuration data.
  3. 根据权利要求2所述的方法,其中:The method of claim 2, wherein:
    所述网络层类型集合包括针对所述至少一个经训练的初始深度学习模型的初始网络层类型;the set of network layer types includes an initial network layer type for the at least one trained initial deep learning model;
    所述网络层结构集合包括针对所述至少一个经训练的初始深度学习模型的初始网络层结构;The set of network layer structures includes an initial network layer structure for the at least one trained initial deep learning model;
    所述特征配置数据集合包括针对所述至少一个经训练的初始深度学习模型的初始特征配置数据,所述特征配置数据集合中的初始特征配置数据为所述候选特征配置数据中的至少部分。The feature configuration data set includes initial feature configuration data for the at least one trained initial deep learning model, and the initial feature configuration data in the feature configuration data set is at least part of the candidate feature configuration data.
  4. 根据权利要求2或3所述的方法,其中,所述基于所述目标网络层类型、所述目标网络层结构、所述目标特征配置数据,得到所述目标深度学习模型包括:The method according to claim 2 or 3, wherein said obtaining said target deep learning model based on said target network layer type, said target network layer structure, and said target feature configuration data comprises:
    基于所述目标网络层类型和所述目标网络层结构,得到待训练目标深度学习模型;Based on the target network layer type and the target network layer structure, a target deep learning model to be trained is obtained;
    基于所述目标特征配置数据处理第二训练样本,得到第二训练特征数据;以及processing a second training sample based on the target feature configuration data to obtain second training feature data; and
    利用所述第二训练特征数据训练所述待训练目标深度学习模型,得到所述目标深度学习模型。Using the second training feature data to train the target deep learning model to be trained to obtain the target deep learning model.
  5. 根据权利要求1所述的方法,其中,所述候选特征配置数据包括至少一个候选特征配置数据;所述基于所述候选特征配置数据处理第一训练样本,得到第一训练特征数据包括:The method according to claim 1, wherein the candidate feature configuration data comprises at least one candidate feature configuration data; the processing of the first training sample based on the candidate feature configuration data to obtain the first training feature data comprises:
    从所述至少一个候选配置数据中选择针对所述初始深度学习模型的初始特征配置数据;selecting initial feature configuration data for the initial deep learning model from the at least one candidate configuration data;
    基于所述初始特征配置数据,确定第一特征类型和第一特征维度;determining a first feature type and a first feature dimension based on the initial feature configuration data;
    基于所述第一特征类型,从所述第一训练样本中提取第一子样本;以及extracting a first subsample from the first training sample based on the first feature type; and
    基于所述第一特征维度处理所述第一子样本,得到第一训练特征数据。Processing the first sub-sample based on the first feature dimension to obtain first training feature data.
  6. 根据权利要求4所述的方法,其中,所述基于所述目标特征配置数据处理第二训练样本,得到第二训练特征数据包括:The method according to claim 4, wherein said processing the second training sample based on the target feature configuration data to obtain the second training feature data comprises:
    基于所述目标特征配置数据,确定第二特征类型和第二特征维度;determining a second feature type and a second feature dimension based on the target feature configuration data;
    基于所述第二特征类型,从所述第二训练样本中提取第二子样本;以及extracting a second subsample from the second training sample based on the second feature type; and
    基于所述第二特征维度处理所述第二子样本,得到第二训练特征数据。Processing the second sub-sample based on the second feature dimension to obtain second training feature data.
  7. 根据权利要求1所述的方法,其中,所述基于所述模型类型数据,选择初始网络层类型和初始网络层结构包括:The method according to claim 1, wherein said selecting an initial network layer type and an initial network layer structure based on said model type data comprises:
    基于所述模型类型数据,从至少一个候选网络层类型中选择针对初始深度学习模型的初始网络层类型;以及selecting an initial network layer type for an initial deep learning model from at least one candidate network layer type based on the model type data; and
    从至少一个候选超参数中选择目标超参数,作为针对初始深度学习模型的初始网络层结构。A target hyperparameter is selected from at least one candidate hyperparameter as an initial network layer structure for the initial deep learning model.
  8. 一种内容推荐方法,包括:A content recommendation method comprising:
    确定针对目标对象的对象特征数据;determining object characteristic data for the target object;
    针对至少一个候选内容中的目标内容,确定针对所述目标内容的内容特征数据;For target content in at least one candidate content, determine content characteristic data for the target content;
    将所述对象特征数据和所述内容特征数据输入目标深度学习模型中,得到输出结果,其中,所述目标深度学习模型采用如权利要求1-7中任意一项所述的方法生成,所述输出结果表征了所述目标对象对所述目标内容的感兴趣程度;以及Input the object feature data and the content feature data into the target deep learning model to obtain an output result, wherein the target deep learning model is generated by the method according to any one of claims 1-7, and the The output result characterizes the degree of interest of the target object in the target content; and
    响应于所述输出结果满足预设条件,向所述目标对象推荐所述目标内容。In response to the output result meeting a preset condition, recommending the target content to the target object.
  9. 一种深度学习模型的训练装置,包括:A training device for a deep learning model, comprising:
    获取模块,用于获取配置文件,其中,所述配置文件包括模型类型数据和候选特征配置数据;An acquisition module, configured to acquire a configuration file, wherein the configuration file includes model type data and candidate feature configuration data;
    选择模块,用于基于所述模型类型数据,选择初始网络层类型和初始网络层结构;A selection module, configured to select an initial network layer type and an initial network layer structure based on the model type data;
    第一获得模块,用于基于所述初始网络层类型和所述初始网络层结构,获得初始深度学习模型;A first obtaining module, configured to obtain an initial deep learning model based on the initial network layer type and the initial network layer structure;
    第一处理模块,用于基于所述候选特征配置数据处理第一训练样本,得到第一训练特征数据;A first processing module, configured to process a first training sample based on the candidate feature configuration data to obtain first training feature data;
    第一训练模块,用于利用所述第一训练特征数据训练所述初始深度学习模型;以及A first training module, configured to use the first training feature data to train the initial deep learning model; and
    第二获得模块,用于基于经训练的初始深度学习模型,得到目标深度学习模型。The second obtaining module is used to obtain a target deep learning model based on the trained initial deep learning model.
  10. 根据权利要求9所述的装置,其中,所述经训练的初始深度学习模型包括至少一个经训练的初始深度学习模型;所述配置文件还包括评价条件;所述第二获得模块包括:The device according to claim 9, wherein the trained initial deep learning model includes at least one trained initial deep learning model; the configuration file also includes evaluation conditions; the second obtaining module includes:
    第一处理子模块,用于基于所述候选特征配置数据处理验证样本,得到验证特征数据;The first processing submodule is used to process verification samples based on the candidate feature configuration data to obtain verification feature data;
    输入子模块,用于将所述验证特征数据分别输入所述至少一个经训练的初始深度学习模型中,得到至少一个验证结果;The input submodule is used to respectively input the verification feature data into the at least one trained initial deep learning model to obtain at least one verification result;
    第一确定子模块，用于基于所述至少一个验证结果和所述评价条件，从网络层类型集合、网络层结构集合、特征配置数据集合中分别确定目标网络层类型、目标网络层结构、目标特征配置数据；以及a first determining sub-module, configured to determine, based on the at least one verification result and the evaluation condition, a target network layer type, a target network layer structure, and target feature configuration data from a network layer type set, a network layer structure set, and a feature configuration data set, respectively; and
    获得子模块,用于基于所述目标网络层类型、所述目标网络层结构、所述目标特征配置数据,得到所述目标深度学习模型。The obtaining submodule is used to obtain the target deep learning model based on the target network layer type, the target network layer structure, and the target feature configuration data.
  11. 根据权利要求10所述的装置,其中:The apparatus of claim 10, wherein:
    所述网络层类型集合包括针对所述至少一个经训练的初始深度学习模型的初始网络层类型;the set of network layer types includes an initial network layer type for the at least one trained initial deep learning model;
    所述网络层结构集合包括针对所述至少一个经训练的初始深度学习模型的初始网络层结构;The set of network layer structures includes an initial network layer structure for the at least one trained initial deep learning model;
    所述特征配置数据集合包括针对所述至少一个经训练的初始深度学习模型的初始特征配置数据,所述特征配置数据集合中的初始特征配置数据为所述候选特征配置数据中的至少部分。The feature configuration data set includes initial feature configuration data for the at least one trained initial deep learning model, and the initial feature configuration data in the feature configuration data set is at least part of the candidate feature configuration data.
  12. 根据权利要求10或11所述的装置,其中,所述获得子模块包括:The device according to claim 10 or 11, wherein the obtaining submodule comprises:
    获得单元,用于基于所述目标网络层类型和所述目标网络层结构,得到待训练目标深度学习模型;An obtaining unit, configured to obtain a target deep learning model to be trained based on the target network layer type and the target network layer structure;
    处理单元,用于基于所述目标特征配置数据处理第二训练样本,得到第二训练特征数据;以及a processing unit, configured to process a second training sample based on the target feature configuration data to obtain second training feature data; and
    训练单元,用于利用所述第二训练特征数据训练所述待训练目标深度学习模型,得到所述目标深度学习模型。A training unit, configured to use the second training feature data to train the target deep learning model to be trained to obtain the target deep learning model.
  13. 根据权利要求9所述的装置,其中,所述候选特征配置数据包括至少一个候选特征配置数据;所述第一处理模块包括:The apparatus according to claim 9, wherein the candidate feature configuration data comprises at least one candidate feature configuration data; the first processing module comprises:
    第一选择子模块,用于从所述至少一个候选配置数据中选择初始特征配置数据;A first selection submodule, configured to select initial feature configuration data from the at least one candidate configuration data;
    第二确定子模块,用于基于所述初始特征配置数据,确定第一特征类型和第一特征维度;A second determining submodule, configured to determine a first feature type and a first feature dimension based on the initial feature configuration data;
    提取子模块,用于基于所述第一特征类型,从所述第一训练样本中提取第一子样本;以及an extracting submodule, configured to extract a first subsample from the first training sample based on the first feature type; and
    第二处理子模块,用于基于所述第一特征维度处理所述第一子样本,得到第一训练特征数据。The second processing submodule is configured to process the first sub-sample based on the first feature dimension to obtain first training feature data.
  14. 根据权利要求12所述的装置,其中,所述处理单元包括:The apparatus according to claim 12, wherein the processing unit comprises:
    确定子单元,用于基于所述目标特征配置数据,确定第二特征类型和第二特征维度;a determining subunit, configured to determine a second feature type and a second feature dimension based on the target feature configuration data;
    提取子单元,用于基于所述第二特征类型,从所述第二训练样本中提取第二子样本;以及an extracting subunit for extracting a second subsample from the second training sample based on the second feature type; and
    处理子单元,用于基于所述第二特征维度处理所述第二子样本,得到第二训练特征数据。A processing subunit, configured to process the second sub-sample based on the second feature dimension to obtain second training feature data.
  15. 根据权利要求9所述的装置,其中,所述选择模块包括:The apparatus of claim 9, wherein the selection module comprises:
    第二选择子模块,用于基于所述模型类型数据,从至少一个候选网络层类型中选择针对初始深度学习模型的初始网络层类型;以及A second selection submodule, configured to select an initial network layer type for an initial deep learning model from at least one candidate network layer type based on the model type data; and
    第三选择子模块,用于从至少一个候选超参数中选择目标超参数,作为针对初始深度学习模型的初始网络层结构。The third selection submodule is used to select a target hyperparameter from at least one candidate hyperparameter as the initial network layer structure for the initial deep learning model.
  16. 一种内容推荐装置,包括:A content recommendation device, comprising:
    第一确定模块,用于确定针对目标对象的对象特征数据;The first determination module is used to determine the object feature data for the target object;
    第二确定模块,用于针对至少一个候选内容中的目标内容,确定针对所述目标内容的内容特征数据;A second determining module, configured to determine, for target content in at least one candidate content, content characteristic data for the target content;
    输入模块，用于将所述对象特征数据和所述内容特征数据输入目标深度学习模型中，得到输出结果，其中，所述目标深度学习模型采用如权利要求9-15中任意一项所述的装置生成，所述输出结果表征了所述目标对象对所述目标内容的感兴趣程度；以及an input module, configured to input the object feature data and the content feature data into a target deep learning model to obtain an output result, wherein the target deep learning model is generated by using the apparatus according to any one of claims 9-15, and the output result characterizes the degree of interest of the target object in the target content; and
    推荐模块,用于响应于所述输出结果满足预设条件,向所述目标对象推荐所述目标内容。A recommending module, configured to recommend the target content to the target object in response to the output result meeting a preset condition.
  17. 一种电子设备,包括:An electronic device comprising:
    至少一个处理器;以及at least one processor; and
    与所述至少一个处理器通信连接的存储器;其中,a memory communicatively coupled to the at least one processor; wherein,
    所述存储器存储有可被所述至少一个处理器执行的指令，所述指令被所述至少一个处理器执行，以使所述至少一个处理器能够执行权利要求1-8中任一项所述的方法。The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method according to any one of claims 1-8.
  18. 一种存储有计算机指令的非瞬时计算机可读存储介质,其中,所述计算机指令用于使所述计算机执行权利要求1-8中任一项所述的方法。A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to execute the method according to any one of claims 1-8.
  19. 一种计算机程序产品,包括计算机程序,所述计算机程序在被处理器执行时实现根据权利要求1-8中任一项所述的方法。A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-8.
PCT/CN2022/106805 2021-12-27 2022-07-20 Deep learning model training method and apparatus, and content recommendation method and apparatus WO2023124029A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111618428.9A CN114329201B (en) 2021-12-27 2021-12-27 Training method of deep learning model, content recommendation method and device
CN202111618428.9 2021-12-27

Publications (1)

Publication Number Publication Date
WO2023124029A1 true WO2023124029A1 (en) 2023-07-06

Family

ID=81014934

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/106805 WO2023124029A1 (en) 2021-12-27 2022-07-20 Deep learning model training method and apparatus, and content recommendation method and apparatus

Country Status (2)

Country Link
CN (1) CN114329201B (en)
WO (1) WO2023124029A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114329201B (en) * 2021-12-27 2023-08-11 北京百度网讯科技有限公司 Training method of deep learning model, content recommendation method and device
CN114968412B (en) * 2022-06-20 2024-02-02 中国平安财产保险股份有限公司 Configuration file generation method, device, equipment and medium based on artificial intelligence
CN115456168B (en) * 2022-09-05 2023-08-25 北京百度网讯科技有限公司 Training method of reinforcement learning model, energy consumption determining method and device
CN115660064B (en) * 2022-11-10 2023-09-29 北京百度网讯科技有限公司 Model training method based on deep learning platform, data processing method and device
CN115906921B (en) * 2022-11-30 2023-11-21 北京百度网讯科技有限公司 Training method of deep learning model, target object detection method and device
CN116151215B (en) * 2022-12-28 2023-12-01 北京百度网讯科技有限公司 Text processing method, deep learning model training method, device and equipment
CN117112640B (en) * 2023-10-23 2024-02-27 腾讯科技(深圳)有限公司 Content sorting method and related equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325541A (en) * 2018-09-30 2019-02-12 北京字节跳动网络技术有限公司 Method and apparatus for training pattern
CN113052328A (en) * 2021-04-02 2021-06-29 上海商汤科技开发有限公司 Deep learning model production system, electronic device, and storage medium
CN113469358A (en) * 2021-07-05 2021-10-01 北京市商汤科技开发有限公司 Neural network training method and device, computer equipment and storage medium
WO2021233342A1 (en) * 2020-05-19 2021-11-25 华为技术有限公司 Neural network construction method and system
CN113723615A (en) * 2020-12-31 2021-11-30 京东城市(北京)数字科技有限公司 Training method and device of deep reinforcement learning model based on hyper-parametric optimization
CN113761348A (en) * 2021-02-26 2021-12-07 北京沃东天骏信息技术有限公司 Information recommendation method and device, electronic equipment and storage medium
CN114329201A (en) * 2021-12-27 2022-04-12 北京百度网讯科技有限公司 Deep learning model training method, content recommendation method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228794B (en) * 2017-12-29 2020-03-31 三角兽(北京)科技有限公司 Information management apparatus, information processing apparatus, and automatic replying/commenting method
CN111552884A (en) * 2020-05-13 2020-08-18 腾讯科技(深圳)有限公司 Method and apparatus for content recommendation
CN112492390A (en) * 2020-11-20 2021-03-12 海信视像科技股份有限公司 Display device and content recommendation method
CN113469067B (en) * 2021-07-05 2024-04-16 北京市商汤科技开发有限公司 Document analysis method, device, computer equipment and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325541A (en) * 2018-09-30 2019-02-12 北京字节跳动网络技术有限公司 Method and apparatus for training pattern
WO2021233342A1 (en) * 2020-05-19 2021-11-25 华为技术有限公司 Neural network construction method and system
CN113723615A (en) * 2020-12-31 2021-11-30 京东城市(北京)数字科技有限公司 Training method and device of deep reinforcement learning model based on hyper-parametric optimization
CN113761348A (en) * 2021-02-26 2021-12-07 北京沃东天骏信息技术有限公司 Information recommendation method and device, electronic equipment and storage medium
CN113052328A (en) * 2021-04-02 2021-06-29 上海商汤科技开发有限公司 Deep learning model production system, electronic device, and storage medium
CN113469358A (en) * 2021-07-05 2021-10-01 北京市商汤科技开发有限公司 Neural network training method and device, computer equipment and storage medium
CN114329201A (en) * 2021-12-27 2022-04-12 北京百度网讯科技有限公司 Deep learning model training method, content recommendation method and device

Also Published As

Publication number Publication date
CN114329201A (en) 2022-04-12
CN114329201B (en) 2023-08-11

Similar Documents

Publication Publication Date Title
WO2023124029A1 (en) Deep learning model training method and apparatus, and content recommendation method and apparatus
US11080340B2 (en) Systems and methods for classifying electronic information using advanced active learning techniques
US20180276553A1 (en) System for querying models
US20190362222A1 (en) Generating new machine learning models based on combinations of historical feature-extraction rules and historical machine-learning models
US20210374542A1 (en) Method and apparatus for updating parameter of multi-task model, and storage medium
WO2023109059A1 (en) Method for determining fusion parameter, information recommendation method, and model training method
US10606910B2 (en) Ranking search results using machine learning based models
CN114861889B (en) Deep learning model training method, target object detection method and device
US20220300543A1 (en) Method of retrieving query, electronic device and medium
EP4134900A2 (en) Method and apparatus for recommending content, method and apparatus for training ranking model, device, and storage medium
JP2023031322A (en) Question and answer processing method, training method for question and answer model, apparatus, electronic device, storage medium and computer program
CN114036322A (en) Training method for search system, electronic device, and storage medium
US11645540B2 (en) Deep graph de-noise by differentiable ranking
CN111191825A (en) User default prediction method and device and electronic equipment
WO2023040220A1 (en) Video pushing method and apparatus, and electronic device and storage medium
US20220198358A1 (en) Method for generating user interest profile, electronic device and storage medium
CN110852078A (en) Method and device for generating title
CN113612777A (en) Training method, traffic classification method, device, electronic device and storage medium
CN115700548A (en) Method, apparatus and computer program product for user behavior prediction
CN114066278B (en) Method, apparatus, medium, and program product for evaluating article recall
US10740403B1 (en) Systems and methods for identifying ordered sequence data
US20230004774A1 (en) Method and apparatus for generating node representation, electronic device and readable storage medium
US20230386237A1 (en) Classification method and apparatus, electronic device and storage medium
US20230147798A1 (en) Search method, computing device and storage medium
US20230044508A1 (en) Data labeling processing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913318

Country of ref document: EP

Kind code of ref document: A1