WO2020220692A1 - Deep neural network and its training - Google Patents

Deep neural network and its training

Info

Publication number
WO2020220692A1
Authority
WO
WIPO (PCT)
Prior art keywords
training
document
network
parameters
prediction
Application number
PCT/CN2019/125028
Inventor
曹雪智
祝升
汪非易
汤彪
谢睿
王仲远
Original Assignee
北京三快在线科技有限公司
Application filed by 北京三快在线科技有限公司
Publication of WO2020220692A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9038 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a deep neural network for ranking learning and its training method, device, electronic equipment and storage medium.
  • LTR: Learning To Rank.
  • Ranking learning is a supervised learning approach that trains a ranking scoring model on labeled training samples to evaluate the relevance between user requests and retrieved documents, so that search results can be ranked reasonably.
  • By model structure, ranking models can be divided into linear models, tree models, deep learning models, and combinations of these; deep learning models are currently the mainstream models for ranking learning.
  • The global evaluation index evaluates whether the model reasonably estimates the relevance between each user request and the retrieved documents. It is usually measured by global AUC (Area Under the ROC Curve) and RMSE (Root Mean Squared Error).
  • The list evaluation index evaluates whether the ranking produced by the final model is reasonable. It is usually measured by MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain).
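  • As a hedged illustration of a list evaluation index, the sketch below computes NDCG for a single ranked list using the common exponential-gain form, DCG = sum over positions i of (2^rel_i - 1) / log2(i + 1). The source does not say which gain or discount it uses, so that choice and the function names `dcg` and `ndcg` are assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance labels given in ranked order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

  • A list that already places the most relevant documents first scores 1.0; pushing a relevant document down the list lowers the score.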
  • By training method, ranking learning can be divided into three types: the single-document method (Pointwise), the document-pair method (Pairwise), and the document list method (Listwise).
  • The existing single-document method is optimized for the global evaluation index and achieves good training results on it, but its performance on the list evaluation index is often significantly inferior to that of the document list method.
  • The existing document list method is optimized for the list evaluation index and achieves good training results on it. However, this training method can only obtain information from list data that contains clicks, so the information in the large number of search logs without clicks cannot be used by the model. Because it considers only the relative ranking within a list, it also cannot give an absolute relevance estimate for a specified (user request, retrieved document) pair, so it performs relatively poorly on the global evaluation index. Therefore, a model trained with the existing single-document method or document list method cannot achieve a good global evaluation index and a good list evaluation index at the same time.
  • The embodiments of the present application provide a deep neural network for ranking learning, together with its training method, device, electronic equipment, and storage medium, to improve both the list evaluation index and the global evaluation index of the model.
  • An embodiment of the present application provides a deep neural network for ranking learning, including: an input layer network for modeling input features to obtain underlying features; a hidden layer network for modeling the underlying features to extract high-order features; and a prediction layer network that includes a single-document prediction sub-network, a document list prediction sub-network, a single-document prediction node, and a document list prediction node. The single-document prediction sub-network scores and predicts the high-order features based on the single-document method and outputs the prediction result through the single-document prediction node. The document list prediction sub-network scores and predicts the high-order features based on the document list method and outputs the prediction result through the document list prediction node.
  • An embodiment of the present application provides a training method of a deep neural network for ranking learning, including: organizing training data into a first training sample corresponding to the single-document method and a second training sample corresponding to the document list method; randomly initializing the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document list prediction sub-network; and, according to the first training sample and the second training sample, training the deep neural network by alternately using the single-document method and the document list method, so as to update the parameters of the prediction layer network corresponding to the current training method, the parameters of the hidden layer network, and the parameters of the input layer network, until training is completed and a multi-objective ranking learning model is obtained.
  • An embodiment of the application provides a training device for a deep neural network for ranking learning, including: a sample organization module for organizing training data into a first training sample corresponding to the single-document method and a second training sample corresponding to the document list method; a network parameter initialization module for randomly initializing the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document list prediction sub-network; and an alternate training module for training the deep neural network by alternately using the single-document method and the document list method according to the first training sample and the second training sample, so as to update the parameters of the prediction layer network corresponding to the current training method, the parameters of the hidden layer network, and the parameters of the input layer network, until training is completed and a multi-objective ranking learning model is obtained.
  • An embodiment of the present application also discloses an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the training method of the deep neural network for ranking learning described in the embodiments of the present application.
  • An embodiment of the application provides a non-volatile computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the steps of the training method of the deep neural network for ranking learning disclosed in the embodiments of the application are implemented.
  • The deep neural network for ranking learning disclosed in the embodiments of the present application includes a single-document prediction sub-network and a document list prediction sub-network in the prediction layer network. The single-document prediction sub-network scores and predicts the high-order features based on the single-document method and outputs the prediction result through the single-document prediction node; the document list prediction sub-network scores and predicts the high-order features based on the document list method and outputs the prediction result through the document list prediction node. As a result, the single-document method and the document list method share network information in the underlying network and complement each other, while each keeps its own information in the high-level network and retains its own characteristics. This can improve the global evaluation index and the list evaluation index at the same time.
  • FIG. 1 is a structural diagram of a deep neural network for ranking learning provided in Embodiment 1 of the present application.
  • Fig. 2 is a graph of AUC evaluation curves corresponding to three different training methods in an embodiment of the present application.
  • FIG. 3 is a graph of NDCG evaluation curves corresponding to three different training methods in an embodiment of the present application.
  • FIG. 4 is a flowchart of a training method of a deep neural network for ranking learning provided in the second embodiment of the present application.
  • FIG. 5 is an AUC evaluation curve diagram of the alternate training of the deep neural network in the embodiment of the present application and the training of the traditional model using the single-document method.
  • FIG. 6 is an NDCG evaluation curve diagram of the alternate training of the deep neural network in the embodiment of the application and the training of the traditional model using the document list method.
  • Fig. 7 is a flowchart of a training method of a deep neural network for ranking learning provided in the third embodiment of the present application.
  • Fig. 8 is a flowchart of alternate training in an embodiment of the present application.
  • Fig. 9 is a flowchart of a training method of a deep neural network for ranking learning provided in the fourth embodiment of the present application.
  • Fig. 10 is a schematic structural diagram of a training device for a deep neural network for ranking learning provided in the fifth embodiment of the present application.
  • Fig. 11 is a schematic diagram of an electronic device for a deep neural network for ranking learning provided by an embodiment of the present application.
  • the deep neural network for ranking learning includes an input layer network 110, a hidden layer network 120 and a prediction layer network 130.
  • the input layer network 110 is used to model the input features to obtain the bottom layer features.
  • the hidden layer network 120 is used to model the underlying features to extract high-level features.
  • the prediction layer network 130 includes a single-document prediction sub-network 131, a document list prediction sub-network 132, a single-document prediction node 133, and a document list prediction node 134.
  • The single-document prediction sub-network 131 scores and predicts the high-level features based on the single-document method and outputs the prediction result through the single-document prediction node 133. The document list prediction sub-network 132 scores and predicts the high-level features based on the document list method and outputs the prediction result through the document list prediction node 134.
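  • The structure above can be sketched as a small network with shared input and hidden layers and two separate scoring heads. This is a minimal illustration, not the patented implementation: the class name `TwoHeadRanker`, the layer sizes, the tanh activation, the uniform initialization range, and the node names `"pointwise"` and `"listwise"` are all assumptions.

```python
import math
import random

def dense(inputs, weights, activation=None):
    """Fully connected layer; `weights` holds one row per output unit."""
    outputs = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return [activation(v) for v in outputs] if activation else outputs

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class TwoHeadRanker:
    """Shared input and hidden layers with one scoring head per training method."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = random.Random(seed)
        self.input_layer = random_matrix(n_hidden, n_features, rng)   # shared
        self.hidden_layer = random_matrix(n_hidden, n_hidden, rng)    # shared
        self.pointwise_head = random_matrix(1, n_hidden, rng)  # single-document sub-network
        self.listwise_head = random_matrix(1, n_hidden, rng)   # document list sub-network

    def high_level_features(self, features):
        low = dense(features, self.input_layer, math.tanh)
        return dense(low, self.hidden_layer, math.tanh)

    def score(self, features, node):
        """Return the score read from the requested prediction node."""
        head = self.pointwise_head if node == "pointwise" else self.listwise_head
        return dense(self.high_level_features(features), head)[0]
```

  • Both heads read the same high-level features, so `model.score(x, "pointwise")` and `model.score(x, "listwise")` differ only through the parameters of their own sub-network.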
  • Deep neural networks used for ranking learning include models such as DNN (Deep Neural Networks), DeepFM, Wide&Deep, and PNN (Product-based Neural Network).
  • DeepFM consists of two parts, an FM (Factorization Machine) part and a DNN (Deep Neural Networks) part, which are responsible for extracting low-order features and high-order features respectively.
  • In Wide&Deep, Wide refers to a wide linear model and Deep refers to a deep neural network (Deep Neural Networks). The aim is for the trained model to have both memorization and generalization capabilities.
  • PNN holds that the cross-feature representation learned when embedding features are fed into an MLP (Multi-Layer Perceptron) is not sufficient. It proposes a product layer, that is, a DNN structure that expresses feature crossing through multiplication.
  • the aforementioned deep neural networks all include input layer networks, hidden layer networks and prediction layer networks.
  • the input layer network at the bottom layer models the bottom layer features, including the vector embedding representation of discrete features, the numerical transformation of continuous features, and normalization.
  • the hidden layer network in the middle models the interrelationship between features and extracts high-level features from it.
  • the high-level prediction layer network uses the high-order features modeled by the network to make scoring predictions.
  • In this embodiment, the prediction layer network includes a single-document prediction sub-network and a document list prediction sub-network, so the single-document method and the document list method can be trained alternately. The two methods share network information in the low-level network (input layer network and hidden layer network) and complement each other, while in the high-level network (prediction layer network) each keeps its own information and retains its own characteristics.
  • FIG. 2 is a graph of AUC evaluation curves corresponding to three different training methods in an embodiment of the present application, and FIG. 3 is a graph of the corresponding NDCG evaluation curves. Curve 1 indicates training with the single-document method alone, curve 2 indicates training with the document list method alone, and curve 3 indicates alternate training with the single-document method and the document list method. All three use the same model (i.e., a traditional ranking learning model, such as a DNN model). The abscissa is the number of training epochs; within one epoch, the training data is divided into multiple training batches for training.
  • The alternate-training curve quickly converges to the single-method training curve. This shows that the models trained by the single-document method and the document list method are broadly similar in their parameters and differ substantially only in those parameters that can be trained quickly. In a neural network, the parameters of the high-level network can be trained quickly, while the parameters of the low-level network cannot. It can therefore be concluded that the two training methods can share the low-level network, while in the high-level network each has its own characteristics.
  • Accordingly, the embodiment of the present application lets the single-document method and the document list method share the input layer network and the hidden layer network, while each has its own prediction sub-network (the single-document prediction sub-network and the document list prediction sub-network), thereby forming a multi-objective ranking learning model based on both the single-document method and the document list method.
  • The deep neural network disclosed in the embodiments of the application includes a single-document prediction sub-network and a document list prediction sub-network in the prediction layer network. The single-document prediction sub-network scores and predicts the high-level features based on the single-document method and outputs the prediction result through the single-document prediction node; the document list prediction sub-network scores and predicts the high-level features based on the document list method and outputs the prediction result through the document list prediction node. As a result, the two methods share network information in the underlying network and complement each other, while each keeps its own information in the high-level network and retains its own characteristics, so the global evaluation index and the list evaluation index can be improved at the same time.
  • This embodiment discloses a method for training a deep neural network for ranking learning, and the deep neural network is a deep neural network for ranking learning disclosed in the embodiments of the application. As shown in FIG. 4, the method includes step 410 to step 430.
  • Step 410: Organize the training data into a first training sample corresponding to the single-document method and a second training sample corresponding to the document list method.
  • The first training sample includes a user request and one document in the recall list of that user request. The second training sample includes the user request and all documents in the recall list of that user request.
  • The input layer network and hidden layer network in the deep neural network perform feature extraction on each (user request, retrieved document 1/2/.../N) two-tuple, and the prediction layer network then scores and predicts the retrieved document.
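  • A minimal sketch of step 410, assuming the raw training data is a mapping from each user request to its recall list with click labels. The dictionary layout, the field names, and the function name `build_samples` are hypothetical and chosen only for illustration.

```python
def build_samples(search_log):
    """Turn {request: [(document, clicked), ...]} into pointwise and listwise samples."""
    pointwise, listwise = [], []
    for request, recalled in search_log.items():
        # First training samples: one (request, document) pair each.
        for document, clicked in recalled:
            pointwise.append({"request": request, "document": document, "label": clicked})
        # Second training samples: the request together with its whole recall list.
        listwise.append({
            "request": request,
            "documents": [doc for doc, _ in recalled],
            "labels": [clicked for _, clicked in recalled],
        })
    return pointwise, listwise
```

  • The same log therefore yields one second training sample per request and one first training sample per recalled document.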
  • Step 420: Randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network. The parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document list prediction sub-network.
  • When the network parameters of the deep neural network are initialized, all network parameters and the embedding representations of discrete features can be initialized randomly, for example with the Xavier method.
  • Feature embedding converts data into a fixed-size feature representation (a vector), usually of lower dimension, to facilitate processing and computation (such as computing distances).
  • For example, a model trained on speech signals for speaker recognition may convert a speech segment into a numeric vector such that another segment from the same speaker lies at a small distance (e.g., Euclidean distance) from that vector.
  • The dimensionality reduction performed by feature embedding is analogous to a fully connected layer without an activation function: the dimension is reduced by multiplying by the weight matrix of the embedding layer.
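  • The analogy with a fully connected layer can be checked directly: multiplying a one-hot vector by the embedding weight matrix gives the same result as looking up one row. The function names below are illustrative assumptions.

```python
def embed_lookup(weights, index):
    """Embedding lookup: return row `index` of the embedding weight matrix."""
    return list(weights[index])

def embed_matmul(weights, index):
    """Same result computed as a one-hot vector times the weight matrix, no activation."""
    one_hot = [1.0 if i == index else 0.0 for i in range(len(weights))]
    dim = len(weights[0])
    return [sum(one_hot[i] * weights[i][j] for i in range(len(weights)))
            for j in range(dim)]
```

  • Here a discrete feature with three possible values and a 3 x 2 weight matrix is reduced from a 3-dimensional one-hot vector to a 2-dimensional embedding.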
  • the Xavier method is a very effective neural network initialization method, which can make the variance of each layer output as equal as possible.
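  • A minimal sketch of Xavier initialization, assuming the uniform variant, which draws each weight from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)) so that the weight variance is 2 / (fan_in + fan_out). The source does not say whether the uniform or the normal variant is used; the function name is an assumption.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=random):
    """Sample a fan_out x fan_in weight matrix from U(-limit, limit),
    with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]
```

  • Scaling the range by the layer's fan-in and fan-out is what keeps the output variance roughly equal from layer to layer.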
  • Step 430: According to the first training sample and the second training sample, train the deep neural network by alternately using the single-document method and the document list method, so as to update the parameters of the prediction layer network corresponding to the current training method, the parameters of the hidden layer network, and the parameters of the input layer network, until training is completed and a multi-objective ranking learning model is obtained.
  • The alternate training of the deep neural network can proceed as follows. First, select one or a certain number of samples from the first training sample and train the deep neural network with the single-document method: obtain the output result from the single-document prediction node and, according to the output result, use backpropagation to update in turn the parameters of the single-document prediction sub-network in the prediction layer network, the parameters of the hidden layer network, and the parameters of the input layer network. Then, select one or a certain number of samples from the second training sample: the input layer network and the hidden layer network perform feature extraction, the document list prediction sub-network scores and predicts the extracted high-level features, the output result is obtained from the document list prediction node, and backpropagation is used according to the output result to update in turn the parameters of the document list prediction sub-network in the prediction layer network, the parameters of the hidden layer network, and the parameters of the input layer network. Next, select samples from the first training sample again: the single-document prediction sub-network performs score prediction, the output result is obtained from the single-document prediction node, and backpropagation is used according to the output result to update in turn the parameters of the single-document prediction sub-network, the parameters of the hidden layer network, and the parameters of the input layer network.
  • the single document method and the document list method are used alternately to train the deep neural network until the training is completed. After training for one or more rounds, a multi-objective ranking learning model can be obtained.
  • Fig. 5 shows the AUC evaluation curves of the deep neural network of the embodiment under alternate training and of the traditional model trained with the single-document method. As shown in Fig. 5, the abscissa is the number of training samples, curve 4 is the AUC curve of the deep neural network of the embodiment under alternate training, and curve 5 is the AUC curve of the traditional model (such as a DNN model) trained with the single-document method. When the curves converge, the AUC of the deep neural network in the embodiment of this application is higher than the AUC of the traditional model trained with the single-document method. Therefore, the training method of the deep neural network for ranking learning in the embodiment of the present application improves the global evaluation index compared with the traditional model.
  • FIG. 6 shows the NDCG evaluation curves of the deep neural network of the embodiment under alternate training and of the traditional model trained with the document list method. As shown in FIG. 6, the abscissa is the number of training samples, curve 6 is the NDCG curve of the deep neural network of the embodiment under alternate training, and the other curve is the NDCG curve of the traditional model (such as a DNN model) trained with the document list method. When the curves converge, the NDCG of the deep neural network in the embodiment of the application is higher than the NDCG of the traditional model trained with the document list method. Therefore, the training method of the deep neural network for ranking learning in the embodiment of the present application improves the list evaluation index compared with the traditional model.
  • In the training method disclosed in this embodiment, the training data is organized into a first training sample corresponding to the single-document method and a second training sample corresponding to the document list method, and, according to the first training sample and the second training sample, the deep neural network is trained alternately with the single-document method and the document list method to obtain a multi-objective ranking learning model. Because the two methods are trained alternately, they share network information in the underlying network and complement each other, while each keeps its own information in the high-level network and retains its own characteristics. This improves the global evaluation index and the list evaluation index at the same time, and thereby the accuracy of the ranking learning model.
  • After the multi-objective ranking learning model is obtained, the method further includes: upon receiving a user request, obtaining a recall list and determining a target scene according to the user request; determining, according to the target scene, the prediction node from which the output result of the multi-objective ranking learning model is obtained; and organizing the user request and the recall list into input features corresponding to that prediction node, inputting the input features into the multi-objective ranking learning model, and obtaining the output result from that prediction node.
  • When performing offline evaluation or online scoring, the prediction node corresponding to the document list method or the single-document method should be chosen according to the characteristics of the scene. For example, when the target scene emphasizes the head of the list, such as search ranking, the prediction node corresponding to the document list method is selected for prediction; when the target scene is browsing-type advertisement recommendation, the prediction node corresponding to the single-document method is selected. Selecting the prediction node according to the target scene gives a better prediction result.
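  • The scene-based choice of prediction node can be sketched as a simple lookup. The scene identifiers, node names, and function name below are hypothetical; the source gives only search ranking and browsing-type advertisement recommendation as examples.

```python
# Hypothetical scene names, based on the two examples in the text.
LISTWISE_SCENES = {"search_ranking"}
POINTWISE_SCENES = {"ad_recommendation"}

def select_prediction_node(target_scene):
    """Map a target scene to the prediction node whose output should be used."""
    if target_scene in LISTWISE_SCENES:
        return "document_list_node"
    if target_scene in POINTWISE_SCENES:
        return "single_document_node"
    raise ValueError(f"no prediction node configured for scene {target_scene!r}")
```

  • Raising an error for an unknown scene, rather than silently defaulting to one node, is a design choice of this sketch and not something the source specifies.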
  • This embodiment discloses a method for training a deep neural network for ranking learning, and the deep neural network is a deep neural network for ranking learning disclosed in the embodiments of the application. As shown in FIG. 7, the method includes step 710 to step 740.
  • Step 710: Organize the training data into a first training sample corresponding to the single-document method and a second training sample corresponding to the document list method.
  • Step 720: Randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network. The parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document list prediction sub-network.
  • Step 730: Divide the first training sample and the second training sample into multiple training batches, where each training batch includes multiple first training samples or multiple second training samples.
  • Dividing the first training sample and the second training sample into multiple training batches includes: organizing the first training samples into first training batches according to a first number; organizing the second training samples into second training batches according to a second number; and randomly arranging the first training batches and the second training batches to obtain the multiple training batches.
  • The choice of the first number and the second number depends on the data set and on the training machine. The first number can be set equal to the product of the second number and the average number of documents displayed per user request, which balances the two training objectives.
  • Every first number of first training samples is organized into one first training batch, giving multiple first training batches; likewise, every second number of second training samples is organized into one second training batch, giving multiple second training batches. The first training batches and second training batches are then shuffled together so that they are randomly arranged, giving multiple mixed training batches.
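  • A minimal sketch of step 730. It assumes each second training sample is a dictionary with a `"documents"` list and tags each batch with its training mode; the function name, the tags, and the rounding of the first number are assumptions for illustration.

```python
import random

def make_mixed_batches(pointwise, listwise, list_batch_size, seed=0):
    """Batch each sample type separately, then shuffle the batches together."""
    avg_docs = sum(len(s["documents"]) for s in listwise) / len(listwise)
    # First number = second number x average documents per request (rounded).
    point_batch_size = max(1, round(list_batch_size * avg_docs))
    batches = [("pointwise", pointwise[i:i + point_batch_size])
               for i in range(0, len(pointwise), point_batch_size)]
    batches += [("listwise", listwise[i:i + list_batch_size])
                for i in range(0, len(listwise), list_batch_size)]
    random.Random(seed).shuffle(batches)
    return batches
```

  • Each batch contains samples of a single type, so the training mode of a batch can be read from its tag, and both batch types cover roughly the same number of documents.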
  • Step 740: According to the multiple training batches, train the deep neural network by alternately using the single-document method and the document list method, so as to update the parameters of the prediction layer network corresponding to the current training method, the parameters of the hidden layer network, and the parameters of the input layer network, until training is completed and a multi-objective ranking learning model is obtained.
  • A training batch can be selected sequentially or randomly from the multiple training batches. The deep neural network is trained with the training method corresponding to that batch, and the parameters of the prediction layer network corresponding to the current training method, the parameters of the hidden layer network, and the parameters of the input layer network are updated by back propagation, until training is completed and a multi-objective ranking learning model is obtained.
  • Completing training on one training batch of training samples can be called one training round.
  • The training batch in this embodiment may be selected randomly. The first selected batch may be a first training batch, in which case the deep neural network is trained with the single-document method; the second selected batch may again be a first training batch, in which case the deep neural network is still trained with the single-document method. Therefore, the alternate training described in this embodiment can train the deep neural network with the single-document method for one or more rounds and then with the document list method for one or more rounds, or train it with the document list method for one or more rounds and then with the single-document method for one or more rounds.
  • Fig. 8 is a flowchart of alternate training in an embodiment of the present application. As shown in Fig. 8, training the deep neural network by alternately using the single-document method and the document list method according to the multiple training batches, so as to update the parameters of the prediction layer network corresponding to the current training mode, the parameters of the hidden layer network, and the parameters of the input layer network until training is completed and a multi-objective ranking learning model is obtained, includes the following steps.
  • Step 741: Randomly select a training batch from the multiple training batches, and determine the current training mode based on the training samples in the training batch.
  • A training batch is randomly selected from the multiple training batches, and the organization of the training samples in that batch determines whether the current training mode is the single-document mode or the document list mode. If a training sample in the batch includes a user request and one document in the recall list of that user request, the current training mode is the single-document mode. If a training sample in the batch includes a user request and all documents in the recall list of that user request, the current training mode is the document list mode.
  • Step 742: If the current training mode is the single-document mode, train the deep neural network in the single-document mode based on the training batch, obtain a first output result from the single-document prediction node, and, based on the first output result, update the parameters of the single-document prediction sub-network, the parameters of the hidden layer network, and the parameters of the input layer network by back propagation.
  • When the current training mode is the single-document mode, the training samples in the training batch are input to the deep neural network. The input layer network models the training samples with the input layer parameters left by the previous training step (single-document mode, document list mode, or initialization) to obtain the underlying features. The hidden layer network models the relationships among the underlying features with the hidden layer parameters left by the previous training step (single-document mode or document list mode) to extract high-level features. The single-document prediction sub-network in the prediction layer network scores and predicts the high-level features and outputs the first output result through the single-document prediction node. Based on the first output result and the real result corresponding to the training samples, the parameters of the single-document prediction sub-network, the parameters of the hidden layer network, and the parameters of the input layer network are updated by back propagation.
  • Step 743: If the current training mode is the document-list mode, train the deep neural network in the document-list mode based on the training batch, obtain the second output result from the document-list prediction node, and, based on the second output result, update the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network by back propagation.
  • The training samples in the training batch are input to the deep neural network. The input layer network models the training samples using the input-layer parameters from the previous training step (single-document mode, document-list mode, or initialization) to obtain the underlying features. The hidden layer network, using the hidden-layer parameters from the previous training step (single-document mode or document-list mode), models the relationships among the underlying features to extract high-order features. The document-list prediction sub-network in the prediction layer network scores the high-order features and outputs the second output result through the document-list prediction node. Based on the second output result and the real result corresponding to the training samples, back propagation updates the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network.
  • During training, the list evaluation index is the optimization target: when gradients are computed in back propagation, the change in the list evaluation index is used to weight the gradients, which are then propagated back.
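One common way to weight gradients by the change in a list metric is the LambdaRank-style scheme sketched below, which scales each pairwise gradient by the NDCG change that swapping the two documents would cause. The text does not name a specific scheme, so the choice of NDCG, the gain `2**label - 1`, the logistic pairwise term, and the function names are all assumptions made for illustration.

```python
import math
from typing import List


def dcg_discount(rank: int) -> float:
    """Position discount for a 0-based rank."""
    return 1.0 / math.log2(rank + 2)


def list_metric_weighted_gradients(scores: List[float], labels: List[int]) -> List[float]:
    """Gradient of a pairwise loss w.r.t. each score, weighted by |delta NDCG|."""
    n = len(scores)
    gains = [2 ** label - 1 for label in labels]
    ideal = sum(g * dcg_discount(r) for r, g in enumerate(sorted(gains, reverse=True)))
    grads = [0.0] * n
    if ideal == 0:
        return grads  # no relevant document: the list carries no ranking signal
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank = {doc: r for r, doc in enumerate(order)}
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where document i is more relevant than j
            # Change in NDCG if documents i and j swapped positions.
            delta = abs((gains[i] - gains[j])
                        * (dcg_discount(rank[i]) - dcg_discount(rank[j]))) / ideal
            # Logistic pairwise term: large when j is scored above i.
            rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
            grads[i] -= rho * delta  # a descent step raises the score of i
            grads[j] += rho * delta  # and lowers the score of j
    return grads
```

These per-document gradients would then be propagated back through the document-list prediction sub-network, the hidden layer network, and the input layer network in the usual way.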
  • Step 744: Determine whether training is complete; if not, execute step 741 again; if so, execute step 745.
  • Step 745: End training and obtain the multi-objective ranking learning model.
  • The multi-objective ranking learning model is a learning model that covers both the single-document mode and the document-list mode.
  • The deep neural network training method disclosed in this embodiment divides the first training samples and the second training samples into multiple training batches, each holding either multiple first training samples or multiple second training samples. Based on these batches, the deep neural network is trained alternately in the single-document mode and the document-list mode to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained. This realizes alternating training of the single-document mode and the document-list mode, and training on batches rather than individual samples improves training speed.
  • This embodiment discloses a method for training a deep neural network for ranking learning, and the deep neural network is a deep neural network for ranking learning disclosed in the embodiments of the application. As shown in FIG. 9, the method includes step 910 to step 980.
  • Step 910: Organize the training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode.
  • Step 920: Randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network.
  • The parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document-list prediction sub-network.
  • Step 930: Randomly arrange the first training samples and the second training samples to obtain a training sample set.
  • The first training samples and the second training samples are randomly arranged together to obtain the training sample set.
  • Step 940: Randomly select a training sample from the training sample set, and determine the current training mode based on the training sample.
  • A training sample is randomly selected from the training sample set, and the current training mode (single-document mode or document-list mode) is determined from it. If the training sample includes a user request and one document from that request's recall list, the current training mode is the single-document mode. If the training sample includes a user request and all documents in that request's recall list, the current training mode is the document-list mode.
  • Step 950: If the current training mode is the single-document mode, train the deep neural network in the single-document mode based on the training sample, obtain the first output result from the single-document prediction node, and, based on the first output result, update the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network by back propagation.
  • The training sample is input to the deep neural network. The input layer network models the training sample using the input-layer parameters from the previous training step (single-document mode, document-list mode, or initialization) to obtain the underlying features. The hidden layer network, using the hidden-layer parameters from the previous training step (single-document mode or document-list mode), models the relationships among the underlying features to extract high-order features. The single-document prediction sub-network in the prediction layer network scores the high-order features and outputs the first output result through the single-document prediction node. Based on the first output result and the real result corresponding to the training sample, back propagation updates the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network.
  • Step 960: If the current training mode is the document-list mode, train the deep neural network in the document-list mode based on the training sample, obtain the second output result from the document-list prediction node, and, based on the second output result, update the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network by back propagation.
  • The training sample is input to the deep neural network. The input layer network models the training sample using the input-layer parameters from the previous training step (single-document mode, document-list mode, or initialization) to obtain the underlying features. The hidden layer network, using the hidden-layer parameters from the previous training step (single-document mode or document-list mode), models the relationships among the underlying features to extract high-order features. The document-list prediction sub-network in the prediction layer network scores the high-order features and outputs the second output result through the document-list prediction node. Based on the second output result and the real result corresponding to the training sample, back propagation updates the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network.
  • During training, the list evaluation index is the optimization target: when gradients are computed in back propagation, the change in the list evaluation index is used to weight the gradients, which are then propagated back.
  • Step 970: Determine whether training is complete; if not, execute step 940 again; if so, execute step 980.
  • Step 980: End training and obtain the multi-objective ranking learning model.
  • The multi-objective ranking learning model is a learning model that covers both the single-document mode and the document-list mode.
  • In the training method disclosed in this embodiment, the first training samples and the second training samples are randomly ordered to obtain a training sample set, a training sample is randomly selected from the set, and the current training mode is determined from that sample. If the current training mode is the single-document mode, the deep neural network is trained in the single-document mode based on the training sample, the first output result is obtained from the single-document prediction node, and back propagation based on the first output result updates the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network. If the current training mode is the document-list mode, the deep neural network is trained in the document-list mode based on the training sample, the second output result is obtained from the document-list prediction node, and back propagation based on the second output result updates the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network. The operations of selecting a training sample and training on it are repeated until training is complete and a multi-objective ranking learning model is obtained. This realizes alternating training of the single-document mode and the document-list mode, which can improve the global evaluation index and the list evaluation index at the same time.
  • The device 1000 includes:
  • a sample organization module 1010, configured to organize the training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode;
  • a network parameter initialization module 1020, configured to randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document-list prediction sub-network;
  • an alternate training module 1030, configured to train the deep neural network alternately in the single-document mode and the document-list mode according to the first training samples and the second training samples, so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.
  • Optionally, the alternate training module 1030 includes: a training batch dividing unit, configured to divide the first training samples and the second training samples into multiple training batches, where each training batch includes multiple first training samples or multiple second training samples; and an alternate training unit, configured to train the deep neural network alternately in the single-document mode and the document-list mode according to the multiple training batches, so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.
  • Optionally, the alternate training unit includes: a training batch selection subunit, configured to randomly select a training batch from the multiple training batches and determine the current training mode based on the training samples in that batch; a single-document training subunit, configured to, if the current training mode is the single-document mode, train the deep neural network in the single-document mode based on the training batch, obtain the first output result from the single-document prediction node, and update the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network by back propagation based on the first output result; a document-list training subunit, configured to, if the current training mode is the document-list mode, train the deep neural network in the document-list mode based on the training batch, obtain the second output result from the document-list prediction node, and update the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network by back propagation based on the second output result; and an alternate training control subunit, configured to repeat the operations of selecting a training batch and training the deep neural network on the selected batch until training is complete and a multi-objective ranking learning model is obtained.
  • Optionally, the training batch dividing unit is specifically configured to: organize the first training samples into multiple first training batches according to a first number; organize the second training samples into multiple second training batches according to a second number; and randomly arrange the multiple first training batches and the multiple second training batches to obtain the multiple training batches.
  • Optionally, the first number is equal to the product of the second number and the average number of documents displayed per user request.
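The batch division just described can be sketched in a few lines. The function names, the mode tags attached to each batch, and the rounding of the first number are illustrative assumptions; only the relation between the first number and the second number comes from the text.

```python
import random
from typing import List, Sequence, Tuple


def chunk(samples: Sequence, size: int) -> List[list]:
    """Split samples into consecutive batches of at most `size` items."""
    return [list(samples[i:i + size]) for i in range(0, len(samples), size)]


def build_training_batches(first_samples: Sequence,
                           second_samples: Sequence,
                           second_number: int,
                           avg_docs_per_request: float,
                           seed: int = 0) -> Tuple[int, List[Tuple[str, list]]]:
    """Batch each sample type separately, then shuffle the batches together."""
    # first number = second number x average documents displayed per request,
    # so both batch types cover roughly the same number of documents.
    first_number = max(1, round(second_number * avg_docs_per_request))
    batches = [("single_document", b) for b in chunk(first_samples, first_number)]
    batches += [("document_list", b) for b in chunk(second_samples, second_number)]
    random.Random(seed).shuffle(batches)
    return first_number, batches
```

With 2 lists per document-list batch and an average of 5 displayed documents per request, a single-document batch holds 10 samples, so each batch type contributes a comparable amount of training signal per step.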
  • Optionally, the alternate training module 1030 includes: a sample arranging unit, configured to randomly arrange the first training samples and the second training samples to obtain a training sample set; a sample selection unit, configured to randomly select a training sample from the training sample set and determine the current training mode based on that sample; a single-document training unit, configured to, if the current training mode is the single-document mode, train the deep neural network in the single-document mode based on the training sample, obtain the first output result from the single-document prediction node, and update the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network by back propagation based on the first output result; a document-list training unit, configured to, if the current training mode is the document-list mode, train the deep neural network in the document-list mode based on the training sample, obtain the second output result from the document-list prediction node, and update the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network by back propagation based on the second output result; and an alternate training control unit, configured to repeat the operations of selecting a training sample and training the deep neural network on the selected sample until training is complete and a multi-objective ranking learning model is obtained.
  • Optionally, the first training sample includes a user request and one document from that request's recall list.
  • The second training sample includes a user request and all documents in that request's recall list.
  • Optionally, the device further includes: a target scene determining module, configured to obtain a recall list when a user request is received and determine the target scene according to the user request;
  • a prediction node determining module, configured to determine, according to the target scene, the prediction node from which the output result of the multi-objective ranking learning model is obtained;
  • an output result obtaining unit, configured to organize the user request and the recall list into the input features corresponding to the prediction node, input the input features into the multi-objective ranking learning model, and obtain the output result from the prediction node.
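A minimal sketch of scene-based node selection follows. The scene names, the lookup table, and the `model(request, document, node)` callable are hypothetical placeholders; the pairing of search ranking with the document-list node and browsing-style ad recommendation with the single-document node follows the example given in the description.

```python
from typing import Callable, List, Tuple

# Hypothetical scene names mapped to the prediction node used for scoring.
PREDICTION_NODE_BY_SCENE = {
    "search_ranking": "document_list",       # head of the list matters most
    "ad_recommendation": "single_document",  # browsing-style recommendation
}


def select_prediction_node(target_scene: str) -> str:
    """Return the prediction node configured for a target scene."""
    try:
        return PREDICTION_NODE_BY_SCENE[target_scene]
    except KeyError:
        raise ValueError(f"no prediction node configured for scene {target_scene!r}") from None


def rank_recall_list(model: Callable[[str, str, str], float],
                     request: str,
                     recall_list: List[str],
                     target_scene: str) -> Tuple[str, List[str]]:
    """Score every recalled document through the chosen node and sort by score."""
    node = select_prediction_node(target_scene)
    scored = [(doc, model(request, doc, node)) for doc in recall_list]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return node, [doc for doc, _ in scored]
```

Because both nodes sit on the same trained model, switching scenes changes only which output is read, not which model is deployed.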
  • The deep neural network training device provided by the embodiment of this application is used to implement the steps of the deep neural network training method described in the embodiments of this application.
  • For the specific implementation of each module of the device, refer to the corresponding steps; the details are not repeated here.
  • The sample organization module 1010 organizes the training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode. The alternate training module 1030, based on the first training samples and the second training samples, trains the deep neural network alternately in the single-document mode and the document-list mode to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete. Because the two modes are trained alternately, they share network information in the lower-level network and complement each other, while each keeps its information to itself in the higher-level network. This improves the global evaluation index and the list evaluation index at the same time.
  • An embodiment of the present application also discloses an electronic device.
  • The electronic device includes a memory 1102, a processor 1101, and a computer program stored on the memory 1102 and executable on the processor 1101. The electronic device may also include an interface 1103 and an internal bus 1104.
  • The processor 1101, the interface 1103, and the memory 1102 are connected to each other through the internal bus 1104.
  • When the processor 1101 executes the computer program, the deep neural network training method described in the embodiments of the present application is implemented.
  • the electronic device may be a PC, a server, a mobile terminal, a personal digital assistant, a tablet computer, etc.
  • An embodiment of the present application also discloses a non-volatile computer-readable storage medium on which a computer program is stored.
  • When the program is executed by a processor, the steps of the method for training a deep neural network described in the second embodiment of the present application are implemented.
  • Each implementation can be realized by software plus a necessary general-purpose hardware platform, and of course can also be realized by hardware.
  • The above technical solutions can be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and which includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in each embodiment or in some parts of an embodiment.

Abstract

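Provided are a deep neural network and a training method, device, electronic device, and storage medium therefor. The deep neural network includes: an input layer network for modeling input features to obtain underlying features; a hidden layer network for modeling the underlying features to extract high-order features; and a prediction layer network including a single-document prediction sub-network, a document-list prediction sub-network, a single-document prediction node, and a document-list prediction node. The single-document prediction sub-network scores the high-order features in the single-document mode and outputs the prediction result through the single-document prediction node; the document-list prediction sub-network scores the high-order features in the document-list mode and outputs the prediction result through the document-list prediction node.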

Description

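Deep neural network and its training
Technical field
This application relates to the field of artificial intelligence technology, and in particular to a deep neural network for ranking learning and a training method, device, electronic device, and storage medium therefor.
Background
Ranking learning (Learning To Rank, LTR) is a typical application of machine learning to search ranking. It is a core algorithm for recommendation, search, and advertising, and has an important effect on user experience, among other things. Ranking learning uses supervised learning: a ranking scoring model is trained on labeled training samples to evaluate the relevance between a user request and each retrieved document, so that search results can be ranked sensibly. By model structure, ranking models can be divided into linear models, tree models, deep learning models, and combinations of these; deep learning models are currently the mainstream for ranking learning.
In ranking learning, the commonly used evaluation metrics fall into two categories: global evaluation metrics and list evaluation metrics. Global evaluation metrics assess whether the model estimates the relevance between every user request and retrieved document sensibly, and are usually measured with global AUC (Area Under the ROC Curve) and RMSE (Root Mean Squared Error). List evaluation metrics assess whether the final ranking produced by the model is sensible, and are usually measured with MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain).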
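To make the distinction between the two metric families concrete, the sketch below computes two global metrics (AUC and RMSE over all request and document pairs pooled together) and one list metric (NDCG for a single ranked list). The formulas are the standard textbook definitions; the function names and the gain `2**label - 1` are choices made for this illustration and are not taken from the source.

```python
import math
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Global AUC: fraction of (positive, negative) pairs ordered correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(labels: Sequence[float], scores: Sequence[float]) -> float:
    """Root mean squared error between labels and predicted scores."""
    return math.sqrt(sum((y - s) ** 2 for y, s in zip(labels, scores)) / len(labels))


def ndcg(ranked_labels: List[int]) -> float:
    """NDCG of one list, given the labels in the order the model ranked them."""
    def dcg(labels: List[int]) -> float:
        return sum((2 ** label - 1) / math.log2(r + 2) for r, label in enumerate(labels))
    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0
```

AUC and RMSE depend on the absolute scores across all requests, whereas NDCG depends only on the order within one list, which is why a model can do well on one family and poorly on the other.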
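In terms of training mode, ranking learning falls into three categories: the single-document mode (pointwise), the document-pair mode (pairwise), and the document-list mode (listwise). The existing single-document mode optimizes global evaluation metrics and trains well on them, but its performance on list evaluation metrics is often clearly worse than that of the document-list mode. The existing document-list mode optimizes list evaluation metrics and trains well on them. However, it can only learn from list data that contains clicks, so the information in the large volume of search logs without clicks cannot be used by the model. Moreover, because it considers only the relative order within a list, it cannot give an absolute relevance estimate for a given pair of user request and retrieved document, and so it performs relatively poorly on global evaluation metrics. A model trained with the existing single-document mode or document-list mode alone therefore cannot achieve good global evaluation metrics and good list evaluation metrics at the same time.
Summary
The embodiments of this application provide a deep neural network for ranking learning and a training method, device, electronic device, and storage medium therefor, so as to improve both the list evaluation metrics and the global evaluation metrics of the model.
An embodiment of this application provides a deep neural network for ranking learning, including: an input layer network for modeling input features to obtain underlying features; a hidden layer network for modeling the underlying features to extract high-order features; and a prediction layer network including a single-document prediction sub-network, a document-list prediction sub-network, a single-document prediction node, and a document-list prediction node. The single-document prediction sub-network scores the high-order features in the single-document mode and outputs the prediction result through the single-document prediction node; the document-list prediction sub-network scores the high-order features in the document-list mode and outputs the prediction result through the document-list prediction node.
An embodiment of this application provides a method for training a deep neural network for ranking learning, including: organizing training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode; randomly initializing the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document-list prediction sub-network; and, according to the first training samples and the second training samples, training the deep neural network alternately in the single-document mode and the document-list mode so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.
An embodiment of this application provides a device for training a deep neural network for ranking learning, including: a sample organization module, configured to organize training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode; a network parameter initialization module, configured to randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document-list prediction sub-network; and an alternate training module, configured to train the deep neural network alternately in the single-document mode and the document-list mode according to the first training samples and the second training samples, so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.
An embodiment of this application also discloses an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor. When the processor executes the computer program, the method for training a deep neural network for ranking learning described in the embodiments of this application is implemented.
An embodiment of this application provides a non-volatile computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the steps of the method for training a deep neural network for ranking learning disclosed in the embodiments of this application are implemented.
The deep neural network for ranking learning disclosed in the embodiments of this application includes, in its prediction layer network, a single-document prediction sub-network and a document-list prediction sub-network. The single-document prediction sub-network scores the high-order features in the single-document mode and outputs the prediction result through the single-document prediction node; the document-list prediction sub-network scores the high-order features in the document-list mode and outputs the prediction result through the document-list prediction node. As a result, the two modes share network information in the lower-level network and complement each other, while each keeps its information to itself in the higher-level network and retains its own characteristics, so the global evaluation metrics and the list evaluation metrics can be improved at the same time.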
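Brief description of the drawings
FIG. 1 is a structural diagram of the deep neural network for ranking learning provided in Embodiment 1 of this application.
FIG. 2 shows the AUC evaluation curves for three different training modes in an embodiment of this application.
FIG. 3 shows the NDCG evaluation curves for three different training modes in an embodiment of this application.
FIG. 4 is a flowchart of the method for training a deep neural network for ranking learning provided in Embodiment 2 of this application.
FIG. 5 shows the AUC evaluation curves for alternating training of the deep neural network in an embodiment of this application and for single-document training of a conventional model.
FIG. 6 shows the NDCG evaluation curves for alternating training of the deep neural network in an embodiment of this application and for document-list training of a conventional model.
FIG. 7 is a flowchart of the method for training a deep neural network for ranking learning provided in Embodiment 3 of this application.
FIG. 8 is a flowchart of the alternating training in an embodiment of this application.
FIG. 9 is a flowchart of the method for training a deep neural network for ranking learning provided in Embodiment 4 of this application.
FIG. 10 is a schematic structural diagram of the device for training a deep neural network for ranking learning provided in Embodiment 5 of this application.
FIG. 11 is a schematic diagram of the electronic device for the deep neural network for ranking learning provided in an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in those embodiments. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.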
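Embodiment 1
This embodiment discloses a deep neural network for ranking learning. As shown in FIG. 1, the deep neural network for ranking learning includes an input layer network 110, a hidden layer network 120, and a prediction layer network 130.
The input layer network 110 models the input features to obtain underlying features. The hidden layer network 120 models the underlying features to extract high-order features. The prediction layer network 130 includes a single-document prediction sub-network 131, a document-list prediction sub-network 132, a single-document prediction node 133, and a document-list prediction node 134. The single-document prediction sub-network 131 scores the high-order features in the single-document mode and outputs the prediction result through the single-document prediction node 133; the document-list prediction sub-network 132 scores the high-order features in the document-list mode and outputs the prediction result through the document-list prediction node 134.
Deep neural networks used for ranking learning include models such as DNN (Deep Neural Networks), DeepFM, Wide&Deep, and PNN (Product-based Neural Network). DeepFM has two parts, an FM (Factorization Machine) and a DNN, which extract low-order features and high-order features respectively. In the Wide&Deep model, "Wide" refers to a wide (generalized) linear model and "Deep" refers to a deep neural network; the aim is for the trained model to have both memorization ability and generalization ability. PNN holds that the cross-feature representation learned after embeddings are fed into an MLP (Multi-Layer Perceptron) is insufficient, and introduces the idea of a product layer: a DNN structure that expresses feature crossing through multiplication-based operations.
All of the above deep neural networks include an input layer network, a hidden layer network, and a prediction layer network. The input layer network at the bottom models the underlying features, including vector embeddings of discrete features and numerical transformation and normalization of continuous features. The hidden layer network in the middle models the relationships among features and extracts high-order features from them. The prediction layer network at the top uses the high-order features modeled by the network for score prediction. In the embodiments of this application, the prediction layer network includes a single-document prediction sub-network and a document-list prediction sub-network. When this deep neural network is trained, the single-document mode and the document-list mode can be used alternately, so that the two modes share network information in the lower-level network (the input layer network and the hidden layer network) and complement each other, while each keeps its information to itself in the higher-level network (the prediction layer network) and can retain its own characteristics.
FIG. 2 shows the AUC evaluation curves for three different training modes in an embodiment of this application, and FIG. 3 shows the corresponding NDCG evaluation curves. In both figures, curve 1 represents training with the single-document mode alone, curve 2 represents training with the document-list mode alone, and curve 3 represents alternating single-document and document-list training of the same model (that is, a conventional ranking learning model such as a DNN model). The horizontal axis is the number of training rounds (epochs). Within one round, the training data is divided into multiple training batches. As FIG. 2 and FIG. 3 show, the alternating-training curve converges quickly onto the curve of a single training mode. This indicates that the models trained by the two modes have broadly similar parameters and differ substantially only in parameters that can be trained quickly. In a model, the parameters that can be trained quickly are those of the higher-level network, while the parameters of the lower-level network are not easily trained quickly. It follows that the two training modes are highly similar in the lower-level network and have different characteristics in the higher-level network. Based on this property, the embodiments of this application have the two modes share the input layer network and the hidden layer network while each has its own prediction sub-network (the single-document prediction sub-network and the document-list prediction sub-network), forming a multi-objective ranking learning model based on both the single-document mode and the document-list mode.
The deep neural network disclosed in the embodiments of this application includes, in its prediction layer network, a single-document prediction sub-network and a document-list prediction sub-network. The single-document prediction sub-network scores the high-order features in the single-document mode and outputs the prediction result through the single-document prediction node; the document-list prediction sub-network scores the high-order features in the document-list mode and outputs the prediction result through the document-list prediction node. The two modes thus share network information in the lower-level network and complement each other, while each keeps its information to itself in the higher-level network and retains its own characteristics, so the global evaluation metrics and the list evaluation metrics can be improved at the same time.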
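The shared-trunk, two-head structure of Embodiment 1 can be sketched in plain Python as below. Everything concrete here is an assumption made for illustration: the class name `MultiObjectiveRanker`, one dense layer per network, ReLU activations, a sigmoid on the single-document output (to read it as a click-through rate), and the small uniform random initialization.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def random_matrix(rng: random.Random, rows: int, cols: int) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MultiObjectiveRanker:
    """Shared input and hidden layers with two prediction sub-networks."""

    def __init__(self, n_features: int, n_low: int, n_high: int, seed: int = 0):
        rng = random.Random(seed)
        self.input_w, self.input_b = random_matrix(rng, n_low, n_features), [0.0] * n_low
        self.hidden_w, self.hidden_b = random_matrix(rng, n_high, n_low), [0.0] * n_high
        # One linear layer per head stands in for each prediction sub-network.
        self.single_doc_w, self.single_doc_b = random_matrix(rng, 1, n_high), [0.0]
        self.doc_list_w, self.doc_list_b = random_matrix(rng, 1, n_high), [0.0]

    def high_order_features(self, features: Vector) -> Vector:
        """Shared part: input layer network followed by hidden layer network."""
        low = relu(dense(features, self.input_w, self.input_b))
        return relu(dense(low, self.hidden_w, self.hidden_b))

    def predict(self, features: Vector, node: str) -> float:
        """Read the score from the requested prediction node."""
        high = self.high_order_features(features)
        if node == "single_document":
            logit = dense(high, self.single_doc_w, self.single_doc_b)[0]
            return 1.0 / (1.0 + math.exp(-logit))  # click probability
        if node == "document_list":
            return dense(high, self.doc_list_w, self.doc_list_b)[0]  # ranking score
        raise ValueError(f"unknown prediction node {node!r}")
```

Both nodes consume the same high-order features, so any update to the input or hidden layers made while training one head is immediately visible to the other.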
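Embodiment 2
This embodiment discloses a method for training a deep neural network for ranking learning, where the deep neural network is the deep neural network for ranking learning disclosed in the embodiments of this application. As shown in FIG. 4, the method includes steps 410 to 430.
Step 410: Organize the training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode.
The same training data is copied into two identical sets. One set is organized into first training samples for the single-document mode, and the other into second training samples for the document-list mode. A first training sample includes a user request and one document from that request's recall list; a second training sample includes a user request and all documents in that request's recall list.
A tuple (user request, retrieved document, clicked or not) forms one first training sample. The pair (user request, retrieved document) is the input to the deep neural network: the input layer network and the hidden layer network extract features from it, and the prediction layer network then predicts the click-through rate of the retrieved document. The clicked-or-not value is the label of the training data; it is compared with the model's prediction to compute the loss function, which guides the direction of training. Finally, the retrieved documents are ranked by the click-through rate the model predicts.
A tuple (user request, retrieved documents 1/2/…/N, clicked or not for documents 1/2/…/N) forms one second training sample, where N is the total number of retrieved documents. The input layer network and the hidden layer network extract features from (user request, retrieved documents 1/2/…/N), and the prediction layer network then scores the retrieved documents. The goal is to optimize the list evaluation metric of the document list ranked by these scores.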
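As a rough illustration of step 410, the sketch below builds both sample types from the same click log. The log layout (one `(request, document, clicked)` record per displayed document) and the function name `organize_samples` are assumptions, and the sketch further assumes that equal request identifiers denote the same request.

```python
from typing import Dict, List, Tuple

LogRecord = Tuple[str, str, int]  # (user request, retrieved document, clicked or not)


def organize_samples(click_log: List[LogRecord]):
    """Build single-document and document-list samples from the same log."""
    # First training samples: one (request, document, label) tuple per record.
    first_samples = [(request, doc, clicked) for request, doc, clicked in click_log]

    # Second training samples: all documents of a request grouped into one tuple.
    grouped: Dict[str, Tuple[List[str], List[int]]] = {}
    for request, doc, clicked in click_log:
        docs, labels = grouped.setdefault(request, ([], []))
        docs.append(doc)
        labels.append(clicked)
    second_samples = [(request, docs, labels) for request, (docs, labels) in grouped.items()]
    return first_samples, second_samples
```

For a log with two documents shown for `q1` and one for `q2`, this yields three first training samples and two second training samples, both derived from identical data.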
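Step 420: Randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document-list prediction sub-network.
When the deep neural network is trained, its network parameters are initialized. All network parameters and the embedding representations of discrete features can be initialized randomly, for example with the Xavier method. Feature embedding converts data (reducing its dimensionality) into a fixed-size feature representation (a vector) that is convenient to process and compute with (for example, to compute distances). For instance, a model trained on speech signals for speaker recognition can convert a speech segment into a numeric vector such that another segment from the same speaker lies at a small distance (for example, Euclidean distance) from the original vector. Dimensionality reduction by feature embedding can be likened to a fully connected layer without an activation function, which reduces dimensionality through the weight matrix of the embedding layer. The Xavier method is an effective neural network initialization method that keeps the output variance of each layer as equal as possible.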
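For reference, the uniform variant of Xavier (Glorot) initialization draws each weight from a symmetric interval whose width depends on the layer's fan-in and fan-out. The sketch below shows that rule for a single weight matrix; the function name and the list-of-lists layout are choices made for this example, and the text does not say which Xavier variant is used.

```python
import math
import random
from typing import List


def xavier_uniform(fan_in: int, fan_out: int, rng: random.Random) -> List[List[float]]:
    """Weight matrix (fan_out rows, fan_in columns) drawn from U(-limit, +limit).

    limit = sqrt(6 / (fan_in + fan_out)) gives each weight a variance of
    2 / (fan_in + fan_out), which keeps layer output variances comparable.
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)] for _ in range(fan_out)]
```

The same routine could be applied to every layer of the input, hidden, and prediction networks, including both prediction sub-networks.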
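Step 430: According to the first training samples and the second training samples, train the deep neural network alternately in the single-document mode and the document-list mode so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.
Alternating training of the deep neural network can proceed as follows. One sample or a certain number of samples is taken from the first training samples and the network is trained in the single-document mode: the output result is obtained from the single-document prediction node and, based on it, back propagation updates in turn the parameters of the single-document prediction sub-network (within the prediction layer network), of the hidden layer network, and of the input layer network. Next, one sample or a certain number of samples is taken from the second training samples and the network is trained in the document-list mode: features are extracted from the second training samples using the input-layer and hidden-layer parameters updated by the preceding single-document step, the extracted high-order features are scored by the document-list prediction sub-network, the output result is obtained from the document-list prediction node, and back propagation updates in turn the parameters of the document-list prediction sub-network, of the hidden layer network, and of the input layer network. Training then returns to the single-document mode: features are extracted from the first training samples using the input-layer and hidden-layer parameters updated by the document-list step, the extracted high-order features are scored by the single-document prediction sub-network, the output result is obtained from the single-document prediction node, and back propagation updates in turn the parameters of the single-document prediction sub-network, of the hidden layer network, and of the input layer network. The two modes alternate in this way until one round of training is complete. Training for one or more rounds in this manner yields the multi-objective ranking learning model.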
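The alternation described above comes down to two decisions per step: which sample source to draw from, and which parameter groups the step may update. The sketch below isolates those two decisions; the group names and function names are hypothetical, and the actual forward and backward passes are left out.

```python
from itertools import zip_longest
from typing import Iterable, Iterator, Tuple

# Parameter groups shared by both training modes.
SHARED_GROUPS = ("input_layer", "hidden_layer")
# Prediction sub-network owned by each mode.
HEAD_GROUP = {
    "single_document": "single_document_subnetwork",
    "document_list": "document_list_subnetwork",
}


def parameter_groups_to_update(mode: str) -> Tuple[str, ...]:
    """Shared layers are always updated; only the current mode's head is."""
    return SHARED_GROUPS + (HEAD_GROUP[mode],)


def alternating_schedule(first_samples: Iterable,
                         second_samples: Iterable) -> Iterator[Tuple[str, object]]:
    """Yield (mode, sample), alternating strictly while both sources last."""
    missing = object()
    for single, listwise in zip_longest(first_samples, second_samples, fillvalue=missing):
        if single is not missing:
            yield "single_document", single
        if listwise is not missing:
            yield "document_list", listwise
```

A training loop would iterate over `alternating_schedule(...)`, run the forward pass through the node for the yielded mode, and apply gradients only to the groups returned by `parameter_groups_to_update(mode)`, leaving the other head untouched.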
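FIG. 5 shows the AUC evaluation curves for alternating training of the deep neural network in an embodiment of this application and for single-document training of a conventional model. In FIG. 5, the horizontal axis is the number of training samples, curve 4 is the AUC curve of the deep neural network in the embodiment under alternating training, and curve 5 is the AUC curve of a conventional model (such as a DNN model) trained in the single-document mode. At convergence, the AUC of the deep neural network in the embodiment is higher than that of the conventional model trained in the single-document mode. The training approach for the deep neural network for ranking learning in the embodiments of this application therefore improves the global evaluation metric relative to the conventional model.
FIG. 6 shows the NDCG evaluation curves for alternating training of the deep neural network in an embodiment of this application and for document-list training of a conventional model. In FIG. 6, the horizontal axis is the number of training samples, curve 6 is the NDCG curve of the deep neural network in the embodiment under alternating training, and curve 7 is the NDCG curve of a conventional model (such as a DNN model) trained in the document-list mode. At convergence, the NDCG of the deep neural network in the embodiment is higher than that of the conventional model trained in the document-list mode. The training approach for the deep neural network for ranking learning in the embodiments of this application therefore improves the list evaluation metric relative to the conventional model.
The training method disclosed in the embodiments of this application organizes the training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode, and trains the deep neural network alternately in the two modes according to these samples, so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained. Because the two modes are trained alternately, they share network information in the lower-level network and complement each other, while each keeps its information to itself in the higher-level network and retains its own characteristics. The global evaluation metrics and the list evaluation metrics can thus be improved at the same time, which improves the accuracy of the ranking learning model.
On the basis of the above technical solution, after the multi-objective ranking learning model is obtained, the method further includes: when a user request is received, obtaining a recall list and determining a target scene according to the user request; determining, according to the target scene, the prediction node from which the output result of the multi-objective ranking learning model is obtained; and organizing the user request and the recall list into the input features corresponding to that prediction node, inputting the input features into the multi-objective ranking learning model, and obtaining the output result from the prediction node.
For offline evaluation or online scoring, the prediction node for the document-list mode or for the single-document mode should be chosen according to the characteristics of the scene. For example, when the target scene emphasizes the head of the list, as in search ranking, the prediction node for the document-list mode is used; when the target scene is browsing-style advertisement recommendation, the prediction node for the single-document mode is used. Selecting the prediction node according to the target scene in this way yields better prediction results.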
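Embodiment 3
This embodiment discloses a method for training a deep neural network for ranking learning, where the deep neural network is the deep neural network for ranking learning disclosed in the embodiments of this application. As shown in FIG. 7, the method includes steps 710 to 740.
Step 710: Organize the training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode.
Step 720: Randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document-list prediction sub-network.
Step 730: Divide the first training samples and the second training samples into multiple training batches, where each training batch includes multiple first training samples or multiple second training samples.
In some embodiments of this application, dividing the first training samples and the second training samples into multiple training batches includes: organizing the first training samples into first training batches according to a first number; organizing the second training samples into second training batches according to a second number; and randomly arranging the first training batches and the second training batches to obtain the multiple training batches.
The first number and the second number are chosen according to the data set and the hardware available for training. The first number can be set equal to the product of the second number and the average number of documents displayed per user request, so that the two training objectives are balanced.
Each group of the first number of first training samples is organized into a first training batch, giving multiple first training batches. Each group of the second number of second training samples is organized into a second training batch, giving multiple second training batches. The first training batches and the second training batches are then shuffled together so that they are randomly arranged, giving the mixed set of training batches.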
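Step 740: According to the multiple training batches, train the deep neural network alternately in the single-document mode and the document-list mode so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.
A training batch can be selected from the multiple training batches either in order or at random, and the deep neural network is trained in the training mode corresponding to that batch, with back propagation updating the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and the multi-objective ranking learning model is obtained. Completing training on the samples of one training batch may be called one round of training.
In this embodiment the selection of training batches may be random. For example, the first batch selected may be a first training batch, in which case the deep neural network is trained in the single-document mode; the second batch selected may again be a first training batch, in which case the single-document mode is used again. The alternating training described in this embodiment may therefore train the deep neural network for one or more rounds in the single-document mode and then for one or more rounds in the document-list mode, or train it for one or more rounds in the document-list mode and then for one or more rounds in the single-document mode.
FIG. 8 is a flowchart of the alternating training in an embodiment of this application. As shown in FIG. 8, training the deep neural network alternately in the single-document mode and the document-list mode according to the multiple training batches, so as to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network until training is complete and a multi-objective ranking learning model is obtained, includes the following steps.
Step 741: Randomly select a training batch from the multiple training batches, and determine the current training mode based on the training samples in that batch.
A training batch is randomly selected from the multiple training batches, and the current training mode (single-document mode or document-list mode) is determined from how the training samples in that batch are organized. If each training sample in the batch includes a user request and one document from that request's recall list, the current training mode is the single-document mode. If each training sample in the batch includes a user request and all documents in that request's recall list, the current training mode is the document-list mode.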
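Step 742: If the current training mode is the single-document mode, train the deep neural network in the single-document mode based on the training batch, obtain the first output result from the single-document prediction node, and, based on the first output result, update the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network by back propagation.
If the current training mode is the single-document mode, the training samples in the training batch are input to the deep neural network. The input layer network models the training samples using the input-layer parameters from the previous training step (single-document mode, document-list mode, or initialization) to obtain the underlying features. The hidden layer network, using the hidden-layer parameters from the previous training step (single-document mode or document-list mode), models the relationships among the underlying features to extract high-order features. The single-document prediction sub-network in the prediction layer network scores the high-order features and outputs the first output result through the single-document prediction node. Based on the first output result and the real result corresponding to the training samples, back propagation updates the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network.
Step 743: If the current training mode is the document-list mode, train the deep neural network in the document-list mode based on the training batch, obtain the second output result from the document-list prediction node, and, based on the second output result, update the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network by back propagation.
If the current training mode is the document-list mode, the training samples in the training batch are input to the deep neural network. The input layer network models the training samples using the input-layer parameters from the previous training step (single-document mode, document-list mode, or initialization) to obtain the underlying features. The hidden layer network, using the hidden-layer parameters from the previous training step (single-document mode or document-list mode), models the relationships among the underlying features to extract high-order features. The document-list prediction sub-network in the prediction layer network scores the high-order features and outputs the second output result through the document-list prediction node. Based on the second output result and the real result corresponding to the training samples, back propagation updates the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network. During training, the list evaluation metric is the optimization target: when gradients are computed in back propagation, the change in the list evaluation metric is used to weight the gradients, which are then propagated back.
Step 744: Determine whether training is complete; if not, execute step 741 again; if so, execute step 745.
Step 745: End training and obtain the multi-objective ranking learning model.
Whether training is complete is determined by checking whether the global evaluation metric and the list evaluation metric have converged. If both have converged, training is determined to be complete; training ends and the multi-objective ranking learning model is obtained. If either the global evaluation metric or the list evaluation metric has not yet converged, training is determined to be incomplete and steps 741 to 744 are executed again until training is complete. Here, the multi-objective ranking learning model is a learning model that covers both the single-document mode and the document-list mode.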
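The stopping rule requires both metric families to have converged, but the text does not say how convergence is measured. The sketch below uses one simple, assumed criterion: a metric counts as converged when its last few recorded values stay within a small tolerance. The window size, the tolerance, and both function names are illustrative.

```python
from typing import Sequence


def has_converged(history: Sequence[float], window: int = 3, tolerance: float = 1e-3) -> bool:
    """True when the last `window + 1` values differ by at most `tolerance`."""
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    return max(recent) - min(recent) <= tolerance


def training_finished(global_metric_history: Sequence[float],
                      list_metric_history: Sequence[float],
                      window: int = 3,
                      tolerance: float = 1e-3) -> bool:
    """Training ends only when BOTH the global and the list metric have converged."""
    return (has_converged(global_metric_history, window, tolerance)
            and has_converged(list_metric_history, window, tolerance))
```

In a loop following steps 741 to 745, the histories would be appended after each evaluation (for example, AUC for the global metric and NDCG for the list metric), and `training_finished` would decide between returning to step 741 and proceeding to step 745.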
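The training method disclosed in this embodiment divides the first training samples and the second training samples into multiple training batches, each holding either multiple first training samples or multiple second training samples. Based on these batches, the deep neural network is trained alternately in the single-document mode and the document-list mode to update the parameters of the prediction layer network corresponding to the current training mode, of the hidden layer network, and of the input layer network, until training is complete and a multi-objective ranking learning model is obtained. This realizes alternating training of the single-document mode and the document-list mode, and training on batches rather than individual samples improves training speed.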
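Embodiment 4
This embodiment discloses a method for training a deep neural network for ranking learning, where the deep neural network is the deep neural network for ranking learning disclosed in the embodiments of this application. As shown in FIG. 9, the method includes steps 910 to 980.
Step 910: Organize the training data into first training samples corresponding to the single-document mode and second training samples corresponding to the document-list mode.
Step 920: Randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the single-document prediction sub-network and the parameters of the document-list prediction sub-network.
Step 930: Randomly arrange the first training samples and the second training samples to obtain a training sample set.
The first training samples and the second training samples are randomly arranged together to obtain the training sample set.
Step 940: Randomly select a training sample from the training sample set, and determine the current training mode based on the training sample.
A training sample is randomly selected from the training sample set, and the current training mode (single-document mode or document-list mode) is determined from it. If the training sample includes a user request and one document from that request's recall list, the current training mode is the single-document mode. If the training sample includes a user request and all documents in that request's recall list, the current training mode is the document-list mode.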
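Step 950: If the current training mode is the single-document mode, train the deep neural network in the single-document mode based on the training sample, obtain the first output result from the single-document prediction node, and, based on the first output result, update the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network by back propagation.
If the current training mode is the single-document mode, the training sample is input to the deep neural network. The input layer network models the training sample using the input-layer parameters from the previous training step (single-document mode, document-list mode, or initialization) to obtain the underlying features. The hidden layer network, using the hidden-layer parameters from the previous training step (single-document mode or document-list mode), models the relationships among the underlying features to extract high-order features. The single-document prediction sub-network in the prediction layer network scores the high-order features and outputs the first output result through the single-document prediction node. Based on the first output result and the real result corresponding to the training sample, back propagation updates the parameters of the single-document prediction sub-network, the hidden layer network, and the input layer network.
Step 960: If the current training mode is the document-list mode, train the deep neural network in the document-list mode based on the training sample, obtain the second output result from the document-list prediction node, and, based on the second output result, update the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network by back propagation.
If the current training mode is the document-list mode, the training sample is input to the deep neural network. The input layer network models the training sample using the input-layer parameters from the previous training step (single-document mode, document-list mode, or initialization) to obtain the underlying features. The hidden layer network, using the hidden-layer parameters from the previous training step (single-document mode or document-list mode), models the relationships among the underlying features to extract high-order features. The document-list prediction sub-network in the prediction layer network scores the high-order features and outputs the second output result through the document-list prediction node. Based on the second output result and the real result corresponding to the training sample, back propagation updates the parameters of the document-list prediction sub-network, the hidden layer network, and the input layer network. During training, the list evaluation metric is the optimization target: when gradients are computed in back propagation, the change in the list evaluation metric is used to weight the gradients, which are then propagated back.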
步骤970,判断训练是否完成,若否,则再次执行步骤940,若是,则执行步骤980。
步骤980,结束训练,得到多目标排序学习模型。
通过判断全局评价指标和列表评价指标是否收敛来判断训练是否完成,若全局评价指标和列表评价指标均收敛,则确定训练完成,结束训练,得到多目标排序学习模型,若全局评价指标或列表评价指标还没有收敛,则确定训练没有完成,再次执行步骤940-步骤970,直至训练完成。其中,多目标排序学习模型是指包括单文档方式和文档列表方式的学习模型。
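The listwise step weights gradients by the change in the list evaluation metric. One common way to do this is the LambdaRank-style scheme sketched below, which is an assumption here: this passage names neither the metric nor the exact weighting, so NDCG, the pairwise logistic loss, and the names `dcg_gain` and `lambda_gradients` are all illustrative choices.

```python
import math


def dcg_gain(label, rank):
    """DCG contribution of a document with relevance `label` at 0-based `rank`."""
    return (2 ** label - 1) / math.log2(rank + 2)


def lambda_gradients(scores, labels):
    """Gradient of a pairwise loss w.r.t. each score, weighted by |delta NDCG|.

    For every pair where document i is more relevant than document j, the
    pairwise logistic gradient is scaled by how much NDCG would change if
    the two documents swapped positions in the current ranking.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: -scores[i])
    rank = {doc: pos for pos, doc in enumerate(order)}
    ideal = sum(dcg_gain(l, r)
                for r, l in enumerate(sorted(labels, reverse=True)))
    grads = [0.0] * n
    if ideal == 0:          # no relevant document: nothing to learn from
        return grads
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue
            # Change in NDCG if documents i and j swapped ranks.
            delta = abs(dcg_gain(labels[i], rank[i]) + dcg_gain(labels[j], rank[j])
                        - dcg_gain(labels[i], rank[j]) - dcg_gain(labels[j], rank[i])) / ideal
            # d/ds_i of log(1 + exp(-(s_i - s_j))) is -rho.
            rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
            grads[i] -= delta * rho
            grads[j] += delta * rho
    return grads
```

These are gradients of a loss, so a descent step `score -= lr * grad` raises the score of the more relevant document in each pair. In a full network the values would be back-propagated from the listwise prediction node through the shared layers.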
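In the training method for a deep neural network disclosed in this embodiment, the first training samples and the second training samples are randomly arranged to obtain a training sample set, one training sample is randomly selected from the set, and the current training mode is determined from that sample. If the current training mode is the pointwise mode, the deep neural network is trained on the sample in the pointwise mode, a first output result is obtained from the pointwise prediction node, and back propagation based on the first output result updates the parameters of the pointwise prediction sub-network, the hidden layer network, and the input layer network. If the current training mode is the listwise mode, the deep neural network is trained on the sample in the listwise mode, a second output result is obtained from the listwise prediction node, and back propagation based on the second output result updates the parameters of the listwise prediction sub-network, the hidden layer network, and the input layer network. The operations of selecting a training sample and training on it are repeated until training is complete and the multi-objective ranking learning model is obtained. This achieves alternating training in the pointwise and listwise modes and can improve the global evaluation metric and the list evaluation metric at the same time.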
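The alternating scheme summarized above can be illustrated with a deliberately small pure-Python sketch. Several simplifications are assumptions and not part of the disclosure: the input and hidden layer networks are collapsed into one shared tanh layer, each prediction sub-network is a single linear head, the pointwise loss is logistic, the listwise loss is a softmax cross-entropy standing in for the metric-weighted gradient, each sample carries an explicit mode tag, and training runs for a fixed number of steps instead of testing convergence.

```python
import math
import random


def forward(shared_w, head_w, x):
    """Shared layer (tanh) followed by one prediction head."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in shared_w]
    score = sum(v * h for v, h in zip(head_w, hidden))
    return hidden, score


def accumulate(g_shared, g_head, head_w, x, hidden, dscore):
    """Add one document's gradient contributions, given dLoss/dScore."""
    for k, h in enumerate(hidden):
        g_head[k] += dscore * h
        dpre = dscore * head_w[k] * (1.0 - h * h)   # back through tanh
        for d, xi in enumerate(x):
            g_shared[k][d] += dpre * xi


def train(samples, dim, hidden_dim=4, steps=2000, lr=0.1, seed=0):
    """samples: list of (mode, docs, labels); listwise labels must not all be 0."""
    rng = random.Random(seed)
    shared_w = [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                for _ in range(hidden_dim)]
    heads = {m: [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
             for m in ("pointwise", "listwise")}
    updates = {"pointwise": 0, "listwise": 0}
    for _ in range(steps):
        mode, docs, labels = rng.choice(samples)   # the sample decides the mode
        head = heads[mode]
        outs = [forward(shared_w, head, x) for x in docs]
        if mode == "pointwise":
            # Logistic loss on the single document.
            dscores = [1.0 / (1.0 + math.exp(-outs[0][1])) - labels[0]]
        else:
            # Softmax cross-entropy over the whole list.
            top = max(s for _, s in outs)
            exps = [math.exp(s - top) for _, s in outs]
            z, total = sum(exps), sum(labels)
            dscores = [e / z - y / total for e, y in zip(exps, labels)]
        g_shared = [[0.0] * dim for _ in range(hidden_dim)]
        g_head = [0.0] * hidden_dim
        for x, (hidden, _), ds in zip(docs, outs, dscores):
            accumulate(g_shared, g_head, head, x, hidden, ds)
        # Only the head for the current mode changes; the shared layer always does.
        for k in range(hidden_dim):
            head[k] -= lr * g_head[k]
            for d in range(dim):
                shared_w[k][d] -= lr * g_shared[k][d]
        updates[mode] += 1
    return shared_w, heads, updates
```

The point of the sketch is the parameter-sharing pattern: every step updates the shared layer, while each head is touched only by samples of its own mode.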
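Embodiment 5
This embodiment discloses a training apparatus for a deep neural network for ranking learning, the deep neural network being the deep neural network for ranking learning disclosed in the embodiments of this application. As shown in FIG. 10, the apparatus 1000 includes:
a sample organization module 1010, configured to organize the training data into first training samples corresponding to the pointwise (single-document) mode and second training samples corresponding to the listwise (document-list) mode;
a network parameter initialization module 1020, configured to randomly initialize the parameters of the input layer network, the hidden layer network, and the prediction layer network of the deep neural network, where the parameters of the prediction layer network include the parameters of the pointwise prediction sub-network and the parameters of the listwise prediction sub-network;
an alternating training module 1030, configured to train the deep neural network by alternately using the pointwise mode and the listwise mode according to the first training samples and the second training samples, so as to update the parameters of the prediction layer network that correspond to the current training mode, the parameters of the hidden layer network, and the parameters of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.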
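Optionally, the alternating training module 1030 includes: a training batch division unit, configured to divide the first training samples and the second training samples into multiple training batches, where each training batch includes multiple first training samples or multiple second training samples; and an alternating training unit, configured to train the deep neural network by alternately using the pointwise mode and the listwise mode according to the multiple training batches, so as to update the parameters of the prediction layer network that correspond to the current training mode, the parameters of the hidden layer network, and the parameters of the input layer network, until training is complete and the multi-objective ranking learning model is obtained.
Optionally, the alternating training unit includes: a training batch selection sub-unit, configured to randomly select one training batch from the multiple training batches and determine the current training mode based on the training samples in that batch; a pointwise training sub-unit, configured to, if the current training mode is the pointwise mode, train the deep neural network on the training batch in the pointwise mode, obtain a first output result from the pointwise prediction node, and use back propagation based on the first output result to update the parameters of the pointwise prediction sub-network, the hidden layer network, and the input layer network; a listwise training sub-unit, configured to, if the current training mode is the listwise mode, train the deep neural network on the training batch in the listwise mode, obtain a second output result from the listwise prediction node, and use back propagation based on the second output result to update the parameters of the listwise prediction sub-network, the hidden layer network, and the input layer network; and an alternating training control sub-unit, configured to repeat the operations of selecting a training batch and training the deep neural network on the selected batch until training is complete and the multi-objective ranking learning model is obtained.
Optionally, the training batch division unit is specifically configured to: organize the first training samples into multiple first training batches according to a first quantity; organize the second training samples into multiple second training batches according to a second quantity; and randomly arrange the multiple first training batches and the multiple second training batches to obtain the multiple training batches.
Optionally, the first quantity is equal to the product of the second quantity and the average number of documents displayed per user request.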
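A minimal sketch of the batch division described above follows. The function name `make_batches`, the explicit mode tag stored with each batch, and the rounding of the first quantity are assumptions for illustration; the text itself only fixes the relation between the two batch sizes and the random arrangement.

```python
import random


def make_batches(first_samples, second_samples, second_size,
                 avg_docs_per_request, seed=0):
    """Split samples into single-mode batches and shuffle the batches.

    first_size = second_size * avg_docs_per_request, as stated in the text;
    rounding to an integer is an assumption made here.
    """
    first_size = max(1, round(second_size * avg_docs_per_request))

    def chunk(items, size, mode):
        # Each batch holds samples of one mode only.
        return [(mode, items[i:i + size]) for i in range(0, len(items), size)]

    batches = (chunk(first_samples, first_size, "pointwise")
               + chunk(second_samples, second_size, "listwise"))
    random.Random(seed).shuffle(batches)   # random arrangement of all batches
    return batches
```

A listwise sample holds a whole recall list, so with this ratio a pointwise batch and a listwise batch cover roughly the same number of documents. That reading is an inference from the arithmetic, not something the text states.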
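Optionally, the alternating training module 1030 includes: a sample arrangement unit, configured to randomly arrange the first training samples and the second training samples to obtain a training sample set; a sample selection unit, configured to randomly select one training sample from the training sample set and determine the current training mode based on that sample; a pointwise training unit, configured to, if the current training mode is the pointwise mode, train the deep neural network on the training sample in the pointwise mode, obtain a first output result from the pointwise prediction node, and use back propagation based on the first output result to update the parameters of the pointwise prediction sub-network, the hidden layer network, and the input layer network; a listwise training unit, configured to, if the current training mode is the listwise mode, train the deep neural network on the training sample in the listwise mode, obtain a second output result from the listwise prediction node, and use back propagation based on the second output result to update the parameters of the listwise prediction sub-network, the hidden layer network, and the input layer network; and an alternating training control unit, configured to repeat the operations of selecting a training sample and training the deep neural network on the selected sample until training is complete and the multi-objective ranking learning model is obtained.
Optionally, the first training sample includes a user request and one document from that user request's recall list, and the second training sample includes a user request and all documents in that user request's recall list.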
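The two sample shapes can be made concrete with a short sketch. The input layout (a request paired with a list of `(document, label)` tuples) and the dictionary keys are hypothetical, chosen only to show how the same training data yields both kinds of sample.

```python
def organize_samples(training_data):
    """Build pointwise (first) and listwise (second) training samples.

    training_data: list of (request, [(document, label), ...]) pairs,
    a hypothetical layout used only for this illustration.
    """
    first, second = [], []
    for request, recalled in training_data:
        # One first sample per (request, single document) pair.
        for doc, label in recalled:
            first.append({"request": request, "docs": [doc], "labels": [label]})
        # One second sample per request, holding the whole recall list.
        second.append({
            "request": request,
            "docs": [doc for doc, _ in recalled],
            "labels": [label for _, label in recalled],
        })
    return first, second
```

For a request with a recall list of n documents this produces n first samples and one second sample.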
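Optionally, for use after the multi-objective ranking learning model is obtained, the apparatus further includes: a target scenario determination module, configured to obtain a recall list when a user request is received and to determine a target scenario according to the user request; a prediction node determination module, configured to determine, according to the target scenario, the prediction node from which the output result of the multi-objective ranking learning model is to be obtained; and an output result acquisition unit, configured to organize the user request and the recall list into the input features corresponding to that prediction node, input the features into the multi-objective ranking learning model, and obtain the output result from the prediction node.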
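At serving time, the flow above amounts to choosing a prediction node by scenario and reading scores from it. The sketch below is only an illustration: `detect_scenario`, the `scenario_to_node` mapping, the `scorers` callables, and the final sort by score are all hypothetical, since this passage does not say which scenario maps to which node or how the output result is consumed.

```python
def rank_with_model(request, recall_list, detect_scenario,
                    scenario_to_node, scorers):
    """Pick a prediction node for the request's scenario and rank the recall list.

    scorers["pointwise"](request, doc)   -> one score for one document
    scorers["listwise"](request, docs)   -> one score per document in the list
    """
    scenario = detect_scenario(request)
    node = scenario_to_node[scenario]
    if node == "pointwise":
        # Pointwise input features: one (request, document) pair at a time.
        scores = [scorers[node](request, doc) for doc in recall_list]
    else:
        # Listwise input features: the request with the whole recall list.
        scores = scorers[node](request, recall_list)
    order = sorted(range(len(recall_list)), key=lambda i: scores[i], reverse=True)
    return node, [recall_list[i] for i in order]
```

Because both nodes sit on the same trained network, switching scenarios changes only which output is read, not which model is loaded.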
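The training apparatus for a deep neural network provided in the embodiments of this application is used to implement the steps of the training method for a deep neural network described in the embodiments of this application. For the specific implementation of each module of the apparatus, refer to the corresponding step; details are not repeated here.
In the training apparatus for a deep neural network disclosed in the embodiments of this application, the sample organization module 1010 organizes the training data into first training samples corresponding to the pointwise mode and second training samples corresponding to the listwise mode, and the alternating training module 1030 trains the deep neural network by alternately using the pointwise mode and the listwise mode according to the first and second training samples, updating the parameters of the prediction layer network that correspond to the current training mode, the parameters of the hidden layer network, and the parameters of the input layer network until training is complete. Because the two modes are trained alternately, they share network information in the lower layers of the network, where they complement each other, and keep their information separate in the upper layers, where each retains its own characteristics. As a result, the global evaluation metric and the list evaluation metric can be improved at the same time.
Correspondingly, as shown in FIG. 11, an embodiment of this application further discloses an electronic device. At the hardware level, the electronic device includes a memory 1102, a processor 1101, and a computer program stored in the memory 1102 and executable on the processor 1101. The electronic device may further include an interface 1103 and an internal bus 1104, with the processor 1101, the interface 1103, and the memory 1102 connected to one another through the internal bus 1104. When the processor 1101 executes the computer program, it implements the training method for a deep neural network described in the embodiments of this application. The electronic device may be a PC, a server, a mobile terminal, a personal digital assistant, a tablet computer, or the like.
An embodiment of this application further discloses a non-volatile computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the training method for a deep neural network described in Embodiment 2 of this application.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and the parts that are the same or similar across embodiments may be read with reference to one another. Because the apparatus embodiment is substantially similar to the method embodiments, its description is relatively brief; for related details, refer to the corresponding parts of the method embodiments.
The deep neural network for ranking learning and its training method, apparatus, electronic device, and storage medium provided in the embodiments of this application have been described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the embodiments above is intended only to help in understanding the method of this application and its core idea. A person of ordinary skill in the art may, following the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting this application.
From the description of the implementations above, a person skilled in the art can clearly understand that each implementation may be realized by software together with a necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the technical solution above, in essence or in the part that contributes over the prior art, may be embodied as a software product. The computer software product may be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in certain parts of the embodiments.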

Claims (12)

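  1. A deep neural network for ranking learning, characterized by comprising:
    an input layer network, configured to model input features to obtain low-level features;
    a hidden layer network, configured to model the low-level features to extract high-order features;
    a prediction layer network, comprising a pointwise (single-document) prediction sub-network, a listwise (document-list) prediction sub-network, a pointwise prediction node, and a listwise prediction node, wherein the pointwise prediction sub-network is configured to score the high-order features in a pointwise mode and output a pointwise prediction result through the pointwise prediction node, and the listwise prediction sub-network is configured to score the high-order features in a listwise mode and output a listwise prediction result through the listwise prediction node.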
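  2. A training method for the deep neural network according to claim 1, characterized by comprising:
    organizing training data into first training samples corresponding to the pointwise mode and second training samples corresponding to the listwise mode;
    randomly initializing parameters of the input layer network, parameters of the hidden layer network, and parameters of the prediction layer network of the deep neural network, wherein the parameters of the prediction layer network include parameters of the pointwise prediction sub-network and parameters of the listwise prediction sub-network;
    training the deep neural network by alternately using the pointwise mode and the listwise mode according to the first training samples and the second training samples, so as to update the parameters of the prediction layer network that correspond to a current training mode, the parameters of the hidden layer network, and the parameters of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.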
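  3. The method according to claim 2, characterized in that obtaining the multi-objective ranking learning model comprises:
    dividing the first training samples and the second training samples into multiple training batches, wherein each training batch includes multiple first training samples or multiple second training samples;
    training the deep neural network by alternately using the pointwise mode and the listwise mode according to the multiple training batches, so as to update the parameters of the prediction layer network that correspond to the current training mode, the parameters of the hidden layer network, and the parameters of the input layer network, until training is complete and the multi-objective ranking learning model is obtained.
  4. The method according to claim 3, characterized in that obtaining the multi-objective ranking learning model comprises:
    randomly selecting one training batch from the multiple training batches, and determining the current training mode based on the training samples in the training batch;
    if the current training mode is the pointwise mode, training the deep neural network on the training batch in the pointwise mode, obtaining a first output result from the pointwise prediction node, and using back propagation based on the first output result to update the parameters of the pointwise prediction sub-network, the parameters of the hidden layer network, and the parameters of the input layer network;
    if the current training mode is the listwise mode, training the deep neural network on the training batch in the listwise mode, obtaining a second output result from the listwise prediction node, and using back propagation based on the second output result to update the parameters of the listwise prediction sub-network, the parameters of the hidden layer network, and the parameters of the input layer network;
    repeating the above operations of selecting a training batch and training the deep neural network on the selected training batch, until training is complete and the multi-objective ranking learning model is obtained.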
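  5. The method according to claim 3, characterized in that dividing the multiple first training samples and the multiple second training samples into the multiple training batches comprises:
    organizing the first training samples into multiple first training batches according to a first quantity;
    organizing the second training samples into multiple second training batches according to a second quantity;
    randomly arranging the multiple first training batches and the multiple second training batches to obtain the multiple training batches.
  6. The method according to claim 5, characterized in that the first quantity is equal to the product of the second quantity and the average number of documents displayed per user request.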
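  7. The method according to claim 2, characterized in that obtaining the multi-objective ranking learning model comprises:
    randomly arranging the first training samples and the second training samples to obtain a training sample set;
    randomly selecting one training sample from the training sample set, and determining the current training mode based on the training sample;
    if the current training mode is the pointwise mode, training the deep neural network on the training sample in the pointwise mode, obtaining a first output result from the pointwise prediction node, and using back propagation based on the first output result to update the parameters of the pointwise prediction sub-network, the parameters of the hidden layer network, and the parameters of the input layer network;
    if the current training mode is the listwise mode, training the deep neural network on the training sample in the listwise mode, obtaining a second output result from the listwise prediction node, and using back propagation based on the second output result to update the parameters of the listwise prediction sub-network, the parameters of the hidden layer network, and the parameters of the input layer network;
    repeating the above operations of selecting a training sample and training the deep neural network on the selected training sample, until training is complete and the multi-objective ranking learning model is obtained.
  8. The method according to claim 2, characterized in that the first training sample includes a user request and one document from that user request's recall list, and the second training sample includes a user request and all documents in that user request's recall list.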
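  9. The method according to claim 2, characterized in that, after the multi-objective ranking learning model is obtained, the method further comprises:
    upon receiving a user request, obtaining a recall list, and determining a target scenario according to the user request;
    determining, according to the target scenario, the prediction node from which an output result of the multi-objective ranking learning model is to be obtained;
    organizing the user request and the recall list into input features corresponding to the prediction node, inputting the input features into the multi-objective ranking learning model, and obtaining the output result from the prediction node.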
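  10. A training apparatus for the deep neural network according to claim 1, characterized by comprising:
    a sample organization module, configured to organize training data into first training samples corresponding to the pointwise mode and second training samples corresponding to the listwise mode;
    a network parameter initialization module, configured to randomly initialize parameters of the input layer network, parameters of the hidden layer network, and parameters of the prediction layer network of the deep neural network, wherein the parameters of the prediction layer network include parameters of the pointwise prediction sub-network and parameters of the listwise prediction sub-network;
    an alternating training module, configured to train the deep neural network by alternately using the pointwise mode and the listwise mode according to the first training samples and the second training samples, so as to update the parameters of the prediction layer network that correspond to a current training mode, the parameters of the hidden layer network, and the parameters of the input layer network, until training is complete and a multi-objective ranking learning model is obtained.
  11. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the training method for a deep neural network according to any one of claims 2 to 9.
  12. A non-volatile computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, causes the processor to implement the steps of the training method for a deep neural network according to any one of claims 2 to 9.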
PCT/CN2019/125028 2019-04-30 2019-12-13 Deep neural network and training thereof WO2020220692A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910364257.8A CN110222838B (zh) 2019-04-30 2019-04-30 Document ranking method, apparatus, electronic device and storage medium
CN201910364257.8 2019-04-30

Publications (1)

Publication Number Publication Date
WO2020220692A1 true WO2020220692A1 (zh) 2020-11-05

Family

ID=67820231

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/125028 WO2020220692A1 (zh) 2019-04-30 2019-12-13 Deep neural network and training thereof

Country Status (2)

Country Link
CN (1) CN110222838B (zh)
WO (1) WO2020220692A1 (zh)


Also Published As

Publication number Publication date
CN110222838B (zh) 2021-07-20
CN110222838A (zh) 2019-09-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19927481

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19927481

Country of ref document: EP

Kind code of ref document: A1