CN111931058A - Sequence recommendation method and system based on adaptive network depth - Google Patents

Publication number
CN111931058A
Authority
CN
China
Prior art keywords
network
sequence
residual block
sequence recommendation
model
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010835626.XA
Other languages
Chinese (zh)
Other versions
CN111931058B (en)
Inventor
陈磊
杨敏
原发杰
李成明
姜青山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Application filed by Shenzhen Institute of Advanced Technology of CAS
Priority to CN202010835626.XA
Publication of CN111931058A
Application granted
Publication of CN111931058B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Abstract

The invention discloses a sequence recommendation method and system based on adaptive network depth. The method comprises the following steps: constructing a sequence recommendation model that has a plurality of dilated convolution residual blocks as a main network and a policy network for managing the depth of the main network; training the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped; and inputting the historical browsing sequence of the user to receive recommendations into the trained sequence recommendation model, and determining the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at the subsequent moment. The invention can adaptively adjust the depth of the main network by means of the policy network and can provide fast and accurate recommendation service for the user.

Description

Sequence recommendation method and system based on adaptive network depth
Technical Field
The invention relates to the technical field of sequence recommendation, in particular to a sequence recommendation method and system based on adaptive network depth.
Background
Recommendation systems have been a very active and fast-moving research field in recent years, attracting attention because of their wide range of application scenarios and large commercial value. A recommendation system is defined as using an e-commerce website to provide commodity information and suggestions to customers, helping them decide which products to purchase and simulating sales personnel in helping them complete the purchase process; personalized recommendation recommends information and commodities of interest to a customer according to the customer's characteristics and purchasing behavior. The sequence recommendation system is an important branch of recommendation systems. It aims to make accurate recommendations by analyzing a user's historical browsing sequence, and has long been a research problem of interest to both academia and industry.
Take the commonly used sequence recommendation model NextItNet as an example. It combines a dilated (hole) convolutional neural network with a residual network and can model a user's historical browsing sequence well, so it provides better recommendation service and performs very well in sequence recommendation systems.
The model structure of NextItNet is shown in FIG. 1. It is generally formed by stacking a plurality of dilated convolution residual blocks with the same structure. A user's historical browsing sequence is input into the network for modeling, a user preference representation is obtained after the last dilated convolution residual block, and finally the item to recommend to the user at the next moment is predicted through a Softmax classifier.
The output of a dilated convolution residual block in NextItNet is expressed as:
$$X_{l+1} = X_l + F(X_l)$$
That is, the output $X_{l+1}$ of each dilated convolution residual block is its input $X_l$ plus the result $F(X_l)$ of the block's processing. The processing $F(X_l)$ passes the input sequentially through dilated convolution layer 1 (Dilated Conv1), layer normalization layer 1 (Layer Norm1), ReLU activation layer 1 (ReLU1), dilated convolution layer 2 (Dilated Conv2), layer normalization layer 2 (Layer Norm2), and ReLU activation layer 2 (ReLU2), and outputs the result.
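The residual block described above can be illustrated with a minimal pure-Python sketch. It is not the NextItNet implementation: the function names, the use of plain lists, the same dilation for both convolutions, causal left zero-padding, and a layer normalization without learnable gain or bias are all assumptions made for illustration.

```python
import math

def dilated_causal_conv(x, weights, dilation):
    """1-D causal dilated convolution (illustrative helper, not from the patent).

    x: list of n feature vectors, each a list of d floats.
    weights: list of k matrices (d x d), one per kernel tap.
    Output position t reads inputs t, t - dilation, t - 2 * dilation, ...
    Positions before the start of the sequence are treated as zeros.
    """
    n, d, k = len(x), len(x[0]), len(weights)
    out = []
    for t in range(n):
        y = [0.0] * d
        for j in range(k):
            src = t - (k - 1 - j) * dilation
            if src < 0:
                continue  # implicit zero padding on the left
            for o in range(d):
                y[o] += sum(weights[j][o][i] * x[src][i] for i in range(d))
        out.append(y)
    return out

def layer_norm(x, eps=1e-5):
    """Normalize each position's feature vector (no learnable gain or bias)."""
    out = []
    for v in x:
        mean = sum(v) / len(v)
        var = sum((a - mean) ** 2 for a in v) / len(v)
        out.append([(a - mean) / math.sqrt(var + eps) for a in v])
    return out

def relu(x):
    return [[max(0.0, a) for a in v] for v in x]

def residual_block(x, w1, w2, dilation):
    """Compute X_{l+1} = X_l + F(X_l) with F = conv, LN, ReLU, conv, LN, ReLU."""
    h = relu(layer_norm(dilated_causal_conv(x, w1, dilation)))
    h = relu(layer_norm(dilated_causal_conv(h, w2, dilation)))
    return [[a + b for a, b in zip(xv, hv)] for xv, hv in zip(x, h)]
```

Because the block only adds F(X_l) to its input, leaving the block out reduces to the identity mapping, which is what makes skipping individual blocks possible without changing tensor shapes.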
However, when an existing sequence recommendation model is used for recommendation service, it suffers from a large number of model parameters, high computational overhead, and long inference time. For example, NextItNet needs to stack a large number of dilated convolution residual blocks to achieve good results, so the model has many parameters, and every input user history browsing sequence must pass through the complete model before a prediction is output. As a result, the trained model is difficult to deploy in practical applications: the computational cost is high, inference takes a long time, and the actual requirements of users are difficult to meet.
Disclosure of Invention
The present invention aims to overcome the above-mentioned drawbacks of the prior art by providing a sequence recommendation method and system based on adaptive network depth, which improves the efficiency of recommendation service by adaptively adjusting the depth of a sequence recommendation model.
According to a first aspect of the present invention, a sequence recommendation method based on adaptive network depth is provided. The method comprises the following steps:
constructing a sequence recommendation model, wherein the sequence recommendation model has a plurality of dilated convolution residual blocks as a main network and a policy network for managing the depth of the main network;
training the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
inputting the historical browsing sequence of the user to receive recommendations into the trained sequence recommendation model, and determining the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at the subsequent moment.
In one embodiment, the sequence recommendation model is trained according to the following steps:
training the main network on the sample set, with a set first loss function as the objective, to obtain a pre-trained main network;
and training the pre-trained main network and the policy network end to end on the sample set, with a set second loss function as the objective, to obtain a decision indication representing whether each dilated convolution residual block in the main network is retained or skipped.
In one embodiment, the first loss function and the second loss function are identical, and both are set as the cross entropy between the correct item and the predicted item.
In one embodiment, the policy network includes a dilated convolution residual block. It takes the user history browsing sequence $X = \{x_1, x_2, \ldots, x_{n-1}\}$ as input and generates a decision indication sequence $\{a_1, a_2, \ldots, a_N\}$ through Gumbel-softmax sampling, where N is the number of dilated convolution residual blocks in the main network, and each decision indication guides the main network, before it enters the corresponding dilated convolution residual block, to either retain or skip that block.
In one embodiment, each dilated convolution residual block includes a plurality of stacked dilated convolution layers, layer normalization layers, and activation layers.
In one embodiment, the main network is the NextItNet model.
According to a second aspect of the present invention, an adaptive network depth based sequence recommendation system is provided. The system comprises:
a model construction unit, configured to construct a sequence recommendation model, wherein the sequence recommendation model has a plurality of dilated convolution residual blocks as a main network and a policy network for managing the depth of the main network;
a model training unit, configured to train the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
a sequence recommendation unit, configured to input the historical browsing sequence of the user to receive recommendations into the trained sequence recommendation model, and to determine the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at the subsequent moment.
Compared with the prior art, the invention has the advantage that the network depth can be selected adaptively according to the input user history browsing sequence. Through learning, the policy network outputs a decision sequence for each input user sequence, which determines which dilated convolution residual blocks in the main network are retained and which are skipped. This achieves adaptive network depth, reduces the computational cost of the model, significantly improves overall inference speed, and allows fast and accurate recommendation service to be provided to users.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of the structure of the existing NextItNet model;
FIG. 2 is a flow diagram of a sequence recommendation method based on adaptive network depth in accordance with one embodiment of the present invention;
FIG. 3 is a schematic diagram of a sequence recommendation model based on adaptive network depth according to an embodiment of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The sequence recommendation model based on adaptive network depth (also called Adaptive-NextItNet) can adaptively select the network depth according to the input user history browsing sequence. Through learning, the policy network outputs a decision sequence for each input user sequence to determine which dilated convolution residual blocks in the main network should be retained and which should be skipped, thereby achieving adaptive network depth.
Specifically, referring to fig. 2, the sequence recommendation method based on adaptive network depth of the embodiment includes the following steps:
Step S210, constructing a sequence recommendation model, wherein the sequence recommendation model has a plurality of dilated convolution residual blocks as a main network and a policy network for managing the depth of the main network.
In this step, taking the NextItNet model as an example of the main network, the overall Adaptive-NextItNet model is built. As shown in FIG. 3, the overall model structure of Adaptive-NextItNet is divided into a NextItNet main network and a policy network.
The NextItNet main network is formed by stacking N (N ≥ 2) dilated convolution residual blocks with the same structure. A user history browsing sequence is input into the network for modeling, a user preference representation is obtained after the last dilated convolution residual block, and finally the item to recommend to the user at the next moment is predicted through a classifier.
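The final prediction step can be sketched as follows, assuming (this is not stated in the text) that the classifier scores each candidate item by a dot product between the user preference representation and a per-item weight vector and then applies a softmax. The function name `recommend_top_k` and its parameters are hypothetical.

```python
import math

def recommend_top_k(preference, item_weights, k=3):
    """Rank items by softmax probability given a user preference vector.

    preference: list of d floats (output of the last residual block).
    item_weights: one list of d floats per candidate item (assumed classifier weights).
    Returns the k most probable (item_index, probability) pairs.
    """
    scores = [sum(p * w for p, w in zip(preference, row)) for row in item_weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]
```

In a deployed system the item with the highest probability would be the recommendation for the next moment; returning several candidates is a common variation.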
In the embodiment of FIG. 3, the policy network is a lightweight dilated convolutional neural network, similar in structure to the NextItNet model but containing only one dilated convolution residual block. Because the policy network is lightweight, decision indications for the main network can be learned without increasing the training time. Briefly, in the present invention the policy network is used to learn input-dependent decisions, which are sampled from a discrete distribution output by the lightweight neural network, and the depth of the main network is adjusted according to the input sample.
Step S220, training the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped.
After the overall Adaptive-NextItNet model is built, the NextItNet main network is first pre-trained on the training data so that it achieves good model performance. For example, the input is a user's historical browsing sequence, the output is the item to recommend to the user at the next moment, and the loss function is the cross entropy between the correct item and the predicted item. The total loss of the pre-training process is expressed as:
$$\mathrm{Loss} = -\sum_{i=1}^{T} \hat{y}_i \log(y_i)$$
where $\hat{y}_i$ is the correct item label, $y_i$ is the predicted item label, and T is the total number of training samples.
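As a small worked sketch of the cross entropy between the correct item and the predicted item, assume each training sample has exactly one correct item (a one-hot label) and the model outputs one raw score per candidate item. Under that assumption the loss for a sample is the negative log of the probability assigned to the correct item. The helper names below are illustrative only.

```python
import math

def softmax(logits):
    """Convert raw item scores into probabilities."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_loss(batch_logits, targets):
    """Sum over the T samples of -log p(correct item).

    batch_logits: T lists of item scores.
    targets: T indices of the correct item (one-hot labels, an assumption).
    """
    loss = 0.0
    for logits, target in zip(batch_logits, targets):
        loss -= math.log(softmax(logits)[target])
    return loss
```

With four equally scored items the loss for one sample is log(4); raising the score of the correct item lowers the loss, which is the signal used in both the pre-training and the joint training stages.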
After pre-training, the NextItNet main network has good feature extraction capability. To achieve adaptive network depth, the policy network and the NextItNet main network in the Adaptive-NextItNet model are then trained jointly. For example, assume the input user history browsing sequence is $X = \{x_1, x_2, \ldots, x_{n-1}\}$. After it passes through the policy network, which contains only one dilated convolution residual block, a decision sequence $\{a_1, a_2, \ldots, a_N\}$ is generated through Gumbel-softmax sampling, where N is the number of dilated convolution residual blocks in the NextItNet main network. Each decision action guides the NextItNet main network, before it enters the corresponding dilated convolution residual block, to either retain or skip that block. For example, each decision takes one of two values: 1 means the dilated convolution residual block is retained and 0 means it is skipped. Finally, after the last dilated convolution residual block of the NextItNet main network, the item to recommend to the user at the next moment is predicted through a classifier.
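The sampling of the decision sequence can be sketched as below. The sketch assumes that the policy network emits, for each of the N blocks, a pair of logits `[skip_logit, keep_logit]`, and that the hard 0/1 decision is taken as the larger of the two relaxed probabilities; the text specifies only that Gumbel-softmax sampling is used, so these details, the temperature parameter, and all function names are assumptions.

```python
import math
import random

def gumbel_noise(rng):
    """Draw one sample from the standard Gumbel distribution."""
    u = rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep both logarithms finite
    return -math.log(-math.log(u))

def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = [(z + gumbel_noise(rng)) / temperature for z in logits]
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

def sample_decisions(block_logits, temperature=1.0, rng=None):
    """Map N pairs [skip_logit, keep_logit] to decisions a_1..a_N in {0, 1}."""
    rng = rng or random.Random()
    decisions = []
    for logits in block_logits:
        probs = gumbel_softmax(logits, temperature, rng)
        decisions.append(1 if probs[1] > probs[0] else 0)  # 1 = retain, 0 = skip
    return decisions
```

Lower temperatures push the relaxed sample closer to a one-hot vector, while the added Gumbel noise keeps the sampling stochastic during training so that both choices are explored.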
Because the policy network and the NextItNet main network in the Adaptive-NextItNet model are differentiable everywhere, gradients can propagate smoothly and end-to-end joint training is possible. The total loss function of the joint training process may be set to be the same as, or different from, the loss function of the main network pre-training process. For example, the total loss function of the joint training process is still set to the cross entropy between the correct item and the predicted item, expressed as:
$$\mathrm{Loss} = -\sum_{i=1}^{T} \hat{y}_i \log(y_i)$$
where $\hat{y}_i$ is the correct item label, $y_i$ is the predicted item label, and T is the total number of training samples.
Through the above process, training of the overall model, including the policy network and the NextItNet main network, is completed, and the overall model can be used for subsequent deployment.
Step S230, inputting the historical browsing sequence of the user to receive recommendations into the trained sequence recommendation model, and determining the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at the subsequent moment.
This step performs inference with adaptive network depth, using the trained sequence recommendation model to provide fast and accurate recommendation service for the user.
In practical applications, performing sequence recommendation corresponds to one test pass of the model. Given a user's historical browsing sequence, the trained model adaptively selects a network depth according to the input, performs inference, and finds the items the user is most likely to be interested in at the next moment, providing fast and accurate recommendation service for the user.
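Inference with adaptive depth can be sketched as a loop over the residual blocks in which a block is evaluated only when its decision is 1. The sketch assumes that a skipped block reduces to the identity mapping, which follows from the residual form of the block output; the function `adaptive_forward` and the representation of each block as a callable are hypothetical.

```python
def adaptive_forward(x, blocks, decisions):
    """Run only the residual blocks whose decision is 1.

    x: list of floats (a flattened hidden state, for simplicity).
    blocks: list of callables, each returning F_l(x) as a list of floats.
    decisions: list of 0/1 values produced by the policy network.
    Returns the final hidden state and the number of blocks executed.
    """
    executed = 0
    for block, keep in zip(blocks, decisions):
        if keep == 1:
            fx = block(x)  # F_l(X_l) is computed only for retained blocks
            x = [a + b for a, b in zip(x, fx)]
            executed += 1
        # keep == 0: the block is never evaluated, so X_{l+1} = X_l
    return x, executed
```

The saving comes from never calling the skipped blocks: the cost of the main network then grows with the number of retained blocks rather than with N, at the price of one pass through the lightweight policy network.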
It is to be noted that those skilled in the art can make appropriate changes or modifications to the above-described embodiments without departing from the spirit and scope of the present invention, for example by adopting a policy network with a different structure or by setting other types of loss functions. As another example, the training process may be performed offline in the cloud or on a server. As yet another example, another model having a plurality of dilated convolution residual blocks may replace the NextItNet model as the main network.
Correspondingly, the invention also provides a sequence recommendation system based on adaptive network depth, which is used for implementing one or more aspects of the method. For example, the system includes: a model construction unit, configured to construct a sequence recommendation model, wherein the sequence recommendation model has a plurality of dilated convolution residual blocks as a main network and a policy network for managing the depth of the main network; a model training unit, configured to train the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped; and a sequence recommendation unit, configured to input the historical browsing sequence of the user to receive recommendations into the trained sequence recommendation model, and to determine the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at the subsequent moment.
To verify the effectiveness and advancement of the invention, extensive experiments were carried out on MovieLens, a public data set in the field of sequence recommendation. The experimental results show that the method currently achieves the best results in terms of model computation overhead, inference time, and model performance, can provide fast and accurate recommendation service for users, is well suited to deployment in sequence recommendation systems, and has significant practical value and broad application prospects. For example, the invention can recommend items a user may be interested in according to the user's attributes (such as gender, age, education level, region, and occupation) and the user's past behaviors in the system (such as browsing, clicking, searching, purchasing, and collecting).
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by using state information of the computer-readable program instructions to customize an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), and the electronic circuit can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (8)

1. A sequence recommendation method based on adaptive network depth, comprising the following steps:
constructing a sequence recommendation model, wherein the sequence recommendation model comprises a main network formed of a plurality of dilated convolution residual blocks and a policy network for managing the depth of the main network;
training the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
inputting the historical browsing sequence of a user to be recommended into the trained sequence recommendation model, and determining the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item recommended to the user at the subsequent time.
2. The method of claim 1, wherein the sequence recommendation model is trained according to the following steps:
training the main network on the sample set, with a set first loss function as the objective, to obtain a pre-trained main network;
and training the pre-trained main network and the policy network end to end on the sample set, with a set second loss function as the objective, to obtain the decision indications representing whether each dilated convolution residual block in the main network is retained or skipped.
3. The method of claim 2, wherein the first and second loss functions are identical, each set as the cross entropy between the correct item and the predicted item.
4. The method of claim 1, wherein the policy network comprises a dilated convolution residual block, takes the user's historical browsing sequence X = {x_1, x_2, ..., x_{n-1}} as input, and generates a decision indication sequence {a_1, a_2, ..., a_N} through Gumbel-softmax sampling, where N is the number of dilated convolution residual blocks in the main network, and each decision indication directs the main network, before it enters the corresponding dilated convolution residual block, to either retain or skip that block.
5. The method of claim 1, wherein each dilated convolution residual block comprises a plurality of stacked dilated convolution layers, a layer normalization layer, and an activation layer.
6. The method of claim 1, wherein the main network is a NextItNet model.
7. An adaptive network depth based sequence recommendation system comprising:
a model construction unit, configured to construct a sequence recommendation model, wherein the sequence recommendation model comprises a main network formed of a plurality of dilated convolution residual blocks and a policy network for managing the depth of the main network;
a model training unit, configured to train the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
a sequence recommendation unit, configured to input the historical browsing sequence of a user to be recommended into the trained sequence recommendation model, determine the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, and output a prediction of the item recommended to the user at the subsequent time.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
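
As an illustration of the keep/skip mechanism recited in claims 1 and 4, the following is a minimal sketch, not the patented implementation. It assumes the policy network has already produced a pair of (skip, keep) logits for each residual block; the function names, the toy residual block, and the logit values are all hypothetical. Only the hard inference-time decision is shown; the differentiable relaxation used during end-to-end training is omitted.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Sample relaxed one-hot probabilities with the Gumbel-softmax trick."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1), clamped away from 0 and 1
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    peak = max(noisy)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_decisions(block_logits, tau=1.0, seed=0):
    """Return one decision per block: 1 = keep the block, 0 = skip it."""
    rng = random.Random(seed)
    decisions = []
    for skip_logit, keep_logit in block_logits:
        probs = gumbel_softmax([skip_logit, keep_logit], tau, rng)
        decisions.append(1 if probs[1] >= probs[0] else 0)
    return decisions


def toy_residual_block(hidden, scale):
    """Hypothetical stand-in for a dilated convolution residual block: h + f(h)."""
    return [h + scale * math.tanh(h) for h in hidden]


def forward(hidden, block_scales, decisions):
    """Run the main network, bypassing every block whose decision is 0."""
    for scale, keep in zip(block_scales, decisions):
        if keep:
            hidden = toy_residual_block(hidden, scale)
        # a skipped block acts as the identity, so hidden passes through unchanged
    return hidden


# Hypothetical logits for N = 3 blocks, ordered as (skip, keep).
decisions = sample_decisions([(-50.0, 50.0), (50.0, -50.0), (-50.0, 50.0)])
output = forward([0.5, -1.0], [1.0, 0.5, 2.0], decisions)
```

Because a skipped block reduces to the identity mapping of its residual connection, skipping changes the effective depth of the main network without changing the shape of the hidden state passed to later blocks.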
CN202010835626.XA 2020-08-19 2020-08-19 Sequence recommendation method and system based on self-adaptive network depth Active CN111931058B (en)

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
CN202010835626.XA CN111931058B (en) 2020-08-19 2020-08-19 Sequence recommendation method and system based on self-adaptive network depth

Publications (2)

Publication Number Publication Date
CN111931058A (en) 2020-11-13
CN111931058B (en) 2024-01-05

Family

ID=73305848

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202010835626.XA Active CN111931058B (en) 2020-08-19 2020-08-19 Sequence recommendation method and system based on self-adaptive network depth

Country Status (1)

Country Link
CN (1) CN111931058B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180330194A1 (en) * 2017-05-15 2018-11-15 Siemens Aktiengesellschaft Training an rgb-d classifier with only depth data and privileged information
US20190114511A1 (en) * 2017-10-16 2019-04-18 Illumina, Inc. Deep Learning-Based Techniques for Training Deep Convolutional Neural Networks
WO2019079200A1 (en) * 2017-10-16 2019-04-25 Illumina, Inc. Deep learning-based aberrant splicing detection
CN110334661A (en) * 2019-07-09 2019-10-15 国网江苏省电力有限公司扬州供电分公司 Infrared power transmission and transformation abnormal heating point target detecting method based on deep learning
CN111159542A (en) * 2019-12-12 2020-05-15 中国科学院深圳先进技术研究院 Cross-domain sequence recommendation method based on self-adaptive fine-tuning strategy
WO2020135193A1 (en) * 2018-12-27 2020-07-02 深圳Tcl新技术有限公司 Deep neural network-based video recommendation method and system, and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
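李太松; 贺泽宇; 王冰; 颜永红; 唐向红: "Sequence stream recommendation algorithm based on recurrent temporal convolutional network", Computer Science (计算机科学), no. 03 *
李淑芝; 余乐陶; 邓小鸿; 李志军: "Neural network recommendation model combining Skip-gram and a weighted loss function", Computer Engineering and Applications (计算机工程与应用), no. 19 *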

Also Published As

Publication number Publication date
CN111931058B (en) 2024-01-05

Similar Documents

Publication Title
CN110366734B (en) Optimizing neural network architecture
CN110807515B (en) Model generation method and device
US11593642B2 (en) Combined data pre-process and architecture search for deep learning models
US11907675B2 (en) Generating training datasets for training neural networks
US11151324B2 (en) Generating completed responses via primal networks trained with dual networks
US10936950B1 (en) Processing sequential interaction data
WO2019152929A1 (en) Regularized neural network architecture search
CN112507209B (en) Sequence recommendation method for knowledge distillation based on land moving distance
CN111931054A (en) Sequence recommendation method and system based on improved residual error structure
CN111368973B (en) Method and apparatus for training a super network
US20220172038A1 (en) Automated deep learning architecture selection for time series prediction with user interaction
US11281867B2 (en) Performing multi-objective tasks via primal networks trained with dual networks
US11928699B2 (en) Auto-discovery of reasoning knowledge graphs in supply chains
CN111967941A (en) Method for constructing sequence recommendation model and sequence recommendation method
US11250602B2 (en) Generating concept images of human poses using machine learning models
WO2024051707A1 (en) Recommendation model training method and apparatus, and resource recommendation method and apparatus
US11100407B2 (en) Building domain models from dialog interactions
CN113366510A (en) Performing multi-objective tasks via trained raw network and dual network
CN111931058B (en) Sequence recommendation method and system based on self-adaptive network depth
US20230134798A1 (en) Reasonable language model learning for text generation from a knowledge graph
US20220358388A1 (en) Machine learning with automated environment generation
US20220245460A1 (en) Adaptive self-adversarial negative sampling for graph neural network training
US20220171985A1 (en) Item recommendation with application to automated artificial intelligence
CN111931057B (en) Self-adaptive output sequence recommendation method and system
US20200184261A1 (en) Collaborative deep learning model authoring tool

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant