CN111931058B - Sequence recommendation method and system based on self-adaptive network depth

Publication number
CN111931058B
CN111931058B CN202010835626.XA CN202010835626A CN111931058B CN 111931058 B CN111931058 B CN 111931058B CN 202010835626 A CN202010835626 A CN 202010835626A CN 111931058 B CN111931058 B CN 111931058B
Authority
CN
China
Prior art keywords
network
sequence
convolution residual
residual block
trained
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010835626.XA
Other languages
Chinese (zh)
Other versions
CN111931058A (en
Inventor
陈磊
杨敏
原发杰
李成明
姜青山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Abstract

The invention discloses a sequence recommendation method and system based on adaptive network depth. The method comprises: constructing a sequence recommendation model that has a plurality of dilated convolution residual blocks serving as a main network and a policy network for managing the depth of the main network; training the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, the policy network outputting, for each of the plurality of dilated convolution residual blocks, a decision indication representing whether that block is retained or skipped; and inputting the historical browsing sequence of a user to receive recommendations into the trained sequence recommendation model, determining the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, and outputting a prediction of the item to recommend to the user at the subsequent moment. By using the policy network to adaptively adjust the depth of the main network, the invention can provide fast and accurate recommendation service for users.

Description

Sequence recommendation method and system based on self-adaptive network depth
Technical Field
The invention relates to the technical field of sequence recommendation, in particular to a sequence recommendation method and system based on self-adaptive network depth.
Background
Recommendation systems have been an intensively researched and rapidly developing field in recent years, attracting attention for their wide range of application scenarios and large commercial value. A recommendation system is defined as one that uses an e-commerce website to provide product information and suggestions to customers, helps users decide which products to purchase, and simulates sales staff in helping customers complete the purchasing process. Personalized recommendation recommends information and products of interest to users according to their interest characteristics and purchasing behavior. Sequence recommendation is an important branch of recommendation systems. It aims to make accurate recommendations by analyzing a user's historical browsing sequence, and it has long been a research topic of interest to both academia and industry.
Taking the commonly used sequence recommendation model NextItNet as an example, the model combines a dilated convolutional neural network with a residual network and can better model a user's historical browsing sequence, thereby providing better recommendation service and performing very well in sequence recommendation systems.
The model structure of NextItNet is shown in FIG. 1. It is formed by stacking a plurality of dilated convolution residual blocks of identical structure. A user's historical browsing sequence is input into the network for modeling, a representation of the user's preferences is obtained after the last dilated convolution residual block, and a Softmax classifier finally predicts the item to recommend to the user at the next moment.
The output of a dilated convolution residual block in NextItNet is expressed as:
$$X_{l+1} = X_l + F(X_l)$$
That is, the output $X_{l+1}$ of each dilated convolution residual block is the sum of its input $X_l$ and the result $F(X_l)$ of processing that input. $F(X_l)$ is computed by passing $X_l$ in sequence through dilated convolution layer 1 (Dilated Conv 1), layer normalization layer 1 (Layer Norm 1), ReLU activation layer 1 (ReLU 1), dilated convolution layer 2 (Dilated Conv 2), layer normalization layer 2 (Layer Norm 2), and ReLU activation layer 2 (ReLU 2).
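The residual update described above can be sketched in plain Python. This is only an illustration under stated assumptions: the causal (left-padded) convolution, the use of one dilation value for both convolutions, the layer normalization without learned scale and shift, and all function and parameter names are hypothetical choices, not details given in this text.

```python
import math

def dilated_causal_conv(x, weights, dilation):
    """x: list of T vectors of length C. weights[k] is a C x C matrix applied
    to the input k * dilation steps in the past (assumed causal padding)."""
    T, C = len(x), len(x[0])
    out = []
    for t in range(T):
        acc = [0.0] * C
        for k, w in enumerate(weights):
            s = t - k * dilation
            if s < 0:
                continue  # positions before the sequence start count as zeros
            for i in range(C):
                acc[i] += sum(w[i][j] * x[s][j] for j in range(C))
        out.append(acc)
    return out

def layer_norm(x, eps=1e-5):
    """Normalize each position over its channels (no learned scale or shift)."""
    out = []
    for v in x:
        mean = sum(v) / len(v)
        var = sum((a - mean) ** 2 for a in v) / len(v)
        out.append([(a - mean) / math.sqrt(var + eps) for a in v])
    return out

def relu(x):
    return [[max(0.0, a) for a in v] for v in x]

def residual_block(x, w1, w2, dilation):
    """Returns x + F(x), where F is conv -> norm -> ReLU applied twice."""
    h = relu(layer_norm(dilated_causal_conv(x, w1, dilation)))
    h = relu(layer_norm(dilated_causal_conv(h, w2, dilation)))
    return [[a + b for a, b in zip(xv, hv)] for xv, hv in zip(x, h)]
```

Because the block adds F(x) to its input, a block whose branch contributes nothing simply passes its input through unchanged, which is what makes skipping a block a well-defined operation.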
However, when existing sequence recommendation models are used for recommendation service, they suffer from a large number of model parameters, high computational cost, and long inference time. For example, NextItNet needs to stack a large number of dilated convolution residual blocks to perform well, which results in a very large number of parameters, and every input user browsing sequence must pass through the complete model to produce a prediction. As a result, the trained model is difficult to deploy in practical applications: computation is expensive, inference is slow, and users' practical needs are hard to meet.
Disclosure of Invention
The object of the invention is to overcome the above defects of the prior art by providing a sequence recommendation method and system based on adaptive network depth, which improve the efficiency of recommendation service by adaptively adjusting the depth of the sequence recommendation model.
According to a first aspect of the present invention, there is provided a sequence recommendation method based on adaptive network depth. The method comprises the following steps:
constructing a sequence recommendation model, wherein the sequence recommendation model has a plurality of dilated convolution residual blocks serving as a main network and a policy network for managing the depth of the main network;
training the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
and inputting the historical browsing sequence of a user to receive recommendations into the trained sequence recommendation model, and determining the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at a subsequent moment.
In one embodiment, the sequence recommendation model is trained according to the following steps:
training the main network on a sample set, with a set first loss function as the objective, to obtain a pre-trained main network;
and jointly training the pre-trained main network and the policy network end to end on the sample set, with a set second loss function as the objective, to obtain decision indications representing whether each dilated convolution residual block in the main network is retained or skipped.
In one embodiment, the first and second loss functions are identical, each being set to the cross entropy between the correct item and the predicted item.
In one embodiment, the policy network includes a dilated convolution residual block. It takes the user's historical browsing sequence $X = \{x_1, x_2, \ldots, x_{n-1}\}$ as input and generates, by Gumbel-softmax sampling, a sequence of decision indications $\{a_1, a_2, \ldots, a_N\}$, where $N$ is the number of dilated convolution residual blocks in the main network, and each decision indication directs the main network, before it enters the corresponding dilated convolution residual block, to either retain or skip that block.
In one embodiment, each dilated convolution residual block includes a plurality of stacked dilated convolution layers, layer normalization layers, and activation layers.
In one embodiment, the main network is a NextItNet model.
According to a second aspect of the present invention, there is provided a sequence recommendation system based on adaptive network depth. The system comprises:
a model construction unit, configured to construct a sequence recommendation model, wherein the sequence recommendation model has a plurality of dilated convolution residual blocks serving as a main network and a policy network for managing the depth of the main network;
a model training unit, configured to train the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
a sequence recommendation unit, configured to input the historical browsing sequence of a user to receive recommendations into the trained sequence recommendation model, determine the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, and output a prediction of the item to recommend to the user at a subsequent moment.
Compared with the prior art, the invention has the advantage that the network depth is selected adaptively according to the input user browsing sequence. A policy network is learned which, for each input user sequence, outputs a decision sequence that determines which dilated convolution residual blocks in the main network are retained and which are skipped. Adapting the network depth in this way reduces the computational cost of the model, markedly improves overall inference speed, and allows fast and accurate recommendation service to be provided to users.
Other features of the present invention and its advantages will become apparent from the following detailed description of exemplary embodiments of the invention, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of a conventional NextItNet model structure;
FIG. 2 is a flow chart of a sequence recommendation method based on adaptive network depth according to one embodiment of the invention;
FIG. 3 is a schematic diagram of a sequence recommendation model based on adaptive network depth according to one embodiment of the invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of exemplary embodiments may have different values.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
The sequence recommendation model based on adaptive network depth (also called Adaptive-NextItNet) selects the network depth adaptively according to the input user browsing sequence. It learns a policy network which, for each input user sequence, outputs a decision sequence that determines which dilated convolution residual blocks in the main network are retained and which are skipped, thereby achieving adaptive network depth.
Specifically, referring to FIG. 2, the sequence recommendation method based on adaptive network depth of this embodiment includes the following steps:
Step S210: a sequence recommendation model is constructed, wherein the sequence recommendation model has a plurality of dilated convolution residual blocks serving as a main network and a policy network for managing the depth of the main network.
In this step, the overall Adaptive-NextItNet model is built with the NextItNet model as the main network. The overall structure of Adaptive-NextItNet is shown in FIG. 3. The model consists of a NextItNet main network and a policy network.
The NextItNet main network is formed by stacking N (N ≥ 2) dilated convolution residual blocks of identical structure. A user's historical browsing sequence is input into the network for modeling, a representation of the user's preferences is obtained after the last dilated convolution residual block, and a classifier finally predicts the item to recommend to the user at the next moment.
In the embodiment of FIG. 3, the policy network is a lightweight dilated convolutional neural network, similar in structure to the NextItNet model but containing only one dilated convolution residual block. Because the policy network is lightweight, decision indications for the main network can be learned with little additional training time. In short, the policy network is used to learn input-dependent decisions, which are sampled from a discrete distribution output by the lightweight neural network, so that the depth of the main network is adjusted according to the input sample.
Step S220: the sequence recommendation model is trained on a sample set, with a set loss function as the objective, to obtain a trained main network, and for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped.
After the overall Adaptive-NextItNet model is built, the NextItNet main network is first pre-trained on the training data so that it reaches good model performance. For example, the input is a user's historical browsing sequence, the output is the item to recommend to the user at the next moment, and the loss function is the cross entropy between the correct item and the predicted item. The total loss of the pre-training process is expressed as:
where the loss compares the correct item label with the predicted item label $y_i$ of each sample, and $T$ is the total number of training samples.
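The loss formula itself is not reproduced in this text. As a minimal sketch, assuming the standard softmax cross entropy averaged over the T training samples (the averaging, the use of raw classifier scores as input, and the function names are assumptions), the pre-training loss could be computed as follows:

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of item scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(batch_logits, targets):
    """Mean negative log-probability assigned to the correct next item.

    batch_logits: T lists of item scores, one list per training sample.
    targets: T indices of the correct next item.
    """
    loss = 0.0
    for logits, target in zip(batch_logits, targets):
        probs = softmax(logits)
        loss -= math.log(probs[target])
    return loss / len(targets)
```

For instance, a classifier that scores four candidate items equally yields a loss of log 4 for any target, and the loss falls toward zero as more probability mass moves onto the correct item.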
Once pre-training is complete, the NextItNet main network has good feature extraction capability. To achieve adaptive network depth, the policy network and the NextItNet main network in the Adaptive-NextItNet model are then trained jointly. For example, assume the input user browsing sequence is $X = \{x_1, x_2, \ldots, x_{n-1}\}$. It is passed through the policy network, which contains only one dilated convolution residual block, and Gumbel-softmax sampling then produces a decision sequence $\{a_1, a_2, \ldots, a_N\}$, where $N$ is the number of dilated convolution residual blocks in the NextItNet main network. Each decision action directs the NextItNet main network, before it enters a dilated convolution residual block, to either retain or skip that block. For example, each decision takes one of two values: 1 means the block is retained, and 0 means it is skipped. A representation of the user's preferences is obtained after the last dilated convolution residual block of the NextItNet main network, and a classifier finally predicts the item to recommend to the user at the next moment.
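The decision step described above can be sketched as follows. The sketch assumes the policy network emits one pair of scores (skip, keep) per residual block. The temperature parameter, the clamping of the uniform sample, and all names here are hypothetical, and the residual blocks are stood in for by arbitrary callables.

```python
import math
import random

def gumbel_softmax_decisions(logits, temperature=1.0, rng=random):
    """logits: N pairs [skip_score, keep_score], one pair per residual block.
    Returns N hard decisions (1 = retain, 0 = skip) and the soft keep weights."""
    decisions, soft_keep = [], []
    for pair in logits:
        noisy = []
        for z in pair:
            # Clamp the uniform sample so both logarithms stay finite.
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            gumbel = -math.log(-math.log(u))
            noisy.append((z + gumbel) / temperature)
        m = max(noisy)
        exps = [math.exp(v - m) for v in noisy]
        p_keep = exps[1] / sum(exps)
        soft_keep.append(p_keep)
        decisions.append(1 if p_keep >= 0.5 else 0)
    return decisions, soft_keep

def forward_with_decisions(x, blocks, decisions):
    """Apply only the blocks whose decision is 1; skipped blocks pass x through."""
    for block, a in zip(blocks, decisions):
        if a == 1:
            x = block(x)
    return x
```

With strongly separated scores the sampled decision is effectively deterministic, while scores that are close together let the Gumbel noise explore both options during training.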
Because the policy network and the NextItNet main network in the Adaptive-NextItNet model are differentiable everywhere, gradients can propagate smoothly and end-to-end joint training is possible. The total loss function of the joint training process may be the same as, or different from, the loss function of the main network pre-training process. For example, the total loss of joint training is again set to the cross entropy between the correct item and the predicted item, expressed as:
where the loss compares the correct item label with the predicted item label $y_i$ of each sample, and $T$ is the total number of training samples.
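The text states that decisions are sampled with Gumbel-softmax and that the whole model is differentiable, but it does not spell out the sampling rule. As a hedged sketch of the standard Gumbel-softmax relaxation, the following could apply to block $l$. The policy scores $\pi_{l,k}$, the temperature $\tau$, and the gated form of the residual update are assumptions not given in the source.

```latex
g_{l,k} = -\log\big(-\log u_{l,k}\big), \qquad u_{l,k} \sim \mathrm{Uniform}(0, 1), \qquad k \in \{0, 1\}

\tilde{a}_{l,k} = \frac{\exp\big((\pi_{l,k} + g_{l,k}) / \tau\big)}
                       {\sum_{j \in \{0,1\}} \exp\big((\pi_{l,j} + g_{l,j}) / \tau\big)}

a_l = \arg\max_{k \in \{0,1\}} \big(\pi_{l,k} + g_{l,k}\big)

X_{l+1} = X_l + a_l \, F(X_l)
```

With $a_l = 0$ the update reduces to $X_{l+1} = X_l$, so block $l$ is skipped, and with $a_l = 1$ it is the ordinary residual update. One common choice, assumed here rather than taken from the source, is to use the hard $a_l$ in the forward pass and the soft $\tilde{a}_{l,1}$ for the gradient, which keeps the joint training end to end.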
Through the above process, training of the whole model, including the policy network and the NextItNet main network, is completed, and the model can be used for subsequent deployment.
Step S230: the historical browsing sequence of a user to receive recommendations is input into the trained sequence recommendation model, and the dilated convolution residual blocks to be skipped are determined according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at a subsequent moment.
In this step, inference is performed with adaptive network depth, and the trained sequence recommendation model is used to provide fast and accurate recommendation service for the user.
In practical application, sequence recommendation corresponds to the test phase of the model. Given a user's historical browsing sequence, the trained model adaptively selects the network depth according to the input, performs inference, and finds the item the user is most likely to be interested in at the next moment, thereby providing fast and accurate recommendation service.
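The inference path can be illustrated with a small sketch in which skipped blocks are never executed, which is where the saving in computation comes from. It assumes that at inference the decision for each block is taken deterministically from the larger of the two policy scores. The source does not say whether sampling noise is kept at this stage, and `policy`, `blocks`, and `classifier` are hypothetical callables standing in for the trained components.

```python
def infer(history, policy, blocks, classifier):
    """Return the predicted next item, the per-block decisions, and the
    number of residual blocks that were actually executed."""
    logits = policy(history)  # N (skip_score, keep_score) pairs
    decisions = [1 if keep >= skip else 0 for skip, keep in logits]
    hidden = history
    executed = 0
    for block, a in zip(blocks, decisions):
        if a:
            hidden = block(hidden)  # retained block: run it
            executed += 1
        # skipped block: hidden passes through unchanged, no computation
    scores = classifier(hidden)
    next_item = max(range(len(scores)), key=scores.__getitem__)
    return next_item, decisions, executed
```

The `executed` count gives a direct measure of the effective depth used for one input, so different users' sequences can be compared by how many blocks they actually required.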
It should be noted that those skilled in the art may make appropriate changes or modifications to the above-described embodiments, for example by using policy networks of different structures or by setting other types of loss functions, without departing from the spirit and scope of the present invention. As another example, the training process may be performed offline in the cloud or on a server. As yet another example, another model having a plurality of dilated convolution residual blocks may be used as the main network in place of the NextItNet model.
Correspondingly, the invention also provides a sequence recommendation system based on adaptive network depth, which implements one or more aspects of the above method. For example, the system includes: a model construction unit, configured to construct a sequence recommendation model that has a plurality of dilated convolution residual blocks serving as a main network and a policy network for managing the depth of the main network; a model training unit, configured to train the sequence recommendation model with a set loss function as the objective to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped; and a sequence recommendation unit, configured to input the historical browsing sequence of a user to receive recommendations into the trained sequence recommendation model, determine the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, and output a prediction of the item to recommend to the user at a subsequent moment.
To verify the effectiveness and advancement of the invention, extensive experiments were carried out on MovieLens, a public dataset in the field of sequence recommendation. The experimental results show that the method performs best in terms of model computational cost, inference time, and model performance, can provide fast and accurate recommendation service for users, is well suited to deployment in sequence recommendation systems, and has significant practical value and broad application prospects. For example, with the invention, items of possible interest may be recommended to a user based on the user's attributes (e.g., gender, age, education, region, occupation) and the user's past behavior in the system (e.g., browsing, clicking, searching, purchasing, and adding to favorites).
The present invention may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized with state information of the computer readable program instructions, and this electronic circuitry can execute the computer readable program instructions to implement aspects of the present invention.
Various aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.
The foregoing description of embodiments of the invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvements in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (6)

1. A sequence recommendation method based on adaptive network depth, comprising the following steps:
constructing a sequence recommendation model, wherein the sequence recommendation model comprises a main network formed of a plurality of dilated (hole) convolution residual blocks and a policy network for controlling the depth of the main network;
training the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
inputting the historical browsing sequence of a user to be served recommendations into the trained sequence recommendation model, and determining the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, so as to output a prediction of the item to recommend to the user at the subsequent moment;
wherein training the sequence recommendation model comprises:
training the main network on the sample set, with a set first loss function as the objective, to obtain a pre-trained main network;
jointly training the pre-trained main network and the policy network end to end on the sample set, with a set second loss function as the objective, to obtain a decision indication representing retention or skipping for each dilated convolution residual block in the main network;
wherein the policy network comprises a dilated (hole) convolution residual block that takes the user's historical browsing sequence X = {x_1, x_2, ..., x_{n-1}} as input and generates, by Gumbel-softmax sampling, a decision indication sequence {a_1, a_2, ..., a_N}, where N is the number of dilated convolution residual blocks in the main network, and each decision indication directs the main network, before it enters the corresponding dilated convolution residual block, to either retain or skip that block.
2. The method of claim 1, wherein the first loss function and the second loss function are the same, each being the cross entropy between the correct item and the predicted item.
3. The method of claim 1, wherein each dilated (hole) convolution residual block comprises a plurality of stacked dilated convolution layers, layer normalization layers, and activation layers.
4. The method of claim 1, wherein the main network is a NextItNet model.
5. A sequence recommendation system based on adaptive network depth, comprising:
a model construction unit, configured to construct a sequence recommendation model, wherein the sequence recommendation model comprises a main network formed of a plurality of dilated (hole) convolution residual blocks and a policy network for controlling the depth of the main network;
a model training unit, configured to train the sequence recommendation model on a sample set, with a set loss function as the objective, to obtain a trained main network, wherein for each of the plurality of dilated convolution residual blocks the policy network outputs a decision indication representing whether that block is retained or skipped;
a sequence recommendation unit, configured to input the historical browsing sequence of a user to be served recommendations into the trained sequence recommendation model, determine the dilated convolution residual blocks to be skipped according to the decision indications of the policy network, and output a prediction of the item to recommend to the user at the subsequent moment;
wherein training the sequence recommendation model comprises:
training the main network on the sample set, with a set first loss function as the objective, to obtain a pre-trained main network;
jointly training the pre-trained main network and the policy network end to end on the sample set, with a set second loss function as the objective, to obtain a decision indication representing retention or skipping for each dilated convolution residual block in the main network;
wherein the policy network comprises a dilated (hole) convolution residual block that takes the user's historical browsing sequence X = {x_1, x_2, ..., x_{n-1}} as input and generates, by Gumbel-softmax sampling, a decision indication sequence {a_1, a_2, ..., a_N}, where N is the number of dilated convolution residual blocks in the main network, and each decision indication directs the main network, before it enters the corresponding dilated convolution residual block, to either retain or skip that block.
6. A computer readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 4.
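
The keep-or-skip mechanism recited in claims 1 and 5 can be illustrated with a minimal sketch. It is not the patented implementation: all function names are hypothetical, the per-block logits are passed in directly rather than produced by a policy network, the sequence has a single channel, and the layer normalization of claim 3 is omitted. It only shows Gumbel-softmax sampling turning two logits per block into a retain (1) or skip (0) decision, and a stack of dilated convolution residual blocks in which skipped blocks pass their input through unchanged.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Return a relaxed one-hot sample (probabilities) over the logits."""
    # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # clamp to avoid log(0)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_decisions(block_logits, tau=1.0, rng=None):
    """Map (keep_logit, skip_logit) pairs to hard decisions: 1 keep, 0 skip."""
    rng = rng or random.Random()
    decisions = []
    for keep_logit, skip_logit in block_logits:
        p_keep, p_skip = gumbel_softmax([keep_logit, skip_logit], tau, rng)
        decisions.append(1 if p_keep >= p_skip else 0)
    return decisions


def dilated_causal_conv(x, kernel, dilation):
    """1-D causal convolution whose taps are spaced `dilation` steps apart."""
    k = len(kernel)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - (k - 1 - j) * dilation
            if idx >= 0:  # positions before the sequence start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x, kernel, dilation):
    """Residual connection around a dilated convolution followed by ReLU."""
    conv = dilated_causal_conv(x, kernel, dilation)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]


def forward(x, blocks, decisions):
    """Apply only the blocks marked 1; a skipped block leaves h unchanged."""
    h = list(x)
    for (kernel, dilation), keep in zip(blocks, decisions):
        if keep:
            h = residual_block(h, kernel, dilation)
    return h


if __name__ == "__main__":
    blocks = [([0.5, 0.5], 1), ([0.5, 0.5], 2)]  # (kernel, dilation) per block
    logits = [(2.0, -1.0), (-1.0, 2.0)]          # hypothetical policy outputs
    decisions = sample_decisions(logits, tau=1.0, rng=random.Random(0))
    print(decisions, forward([1.0, 2.0, 3.0, 4.0], blocks, decisions))
```

In a trainable version the relaxed probabilities, rather than the hard 0/1 values, would typically carry gradients back to the policy network; the sketch covers only the forward decision path.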
CN202010835626.XA 2020-08-19 2020-08-19 Sequence recommendation method and system based on self-adaptive network depth Active CN111931058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010835626.XA CN111931058B (en) 2020-08-19 2020-08-19 Sequence recommendation method and system based on self-adaptive network depth

Publications (2)

Publication Number Publication Date
CN111931058A CN111931058A (en) 2020-11-13
CN111931058B true CN111931058B (en) 2024-01-05

Family

ID=73305848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010835626.XA Active CN111931058B (en) 2020-08-19 2020-08-19 Sequence recommendation method and system based on self-adaptive network depth

Country Status (1)

Country Link
CN (1) CN111931058B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019079200A1 (en) * 2017-10-16 2019-04-25 Illumina, Inc. Deep learning-based aberrant splicing detection
CN110334661A (en) * 2019-07-09 2019-10-15 国网江苏省电力有限公司扬州供电分公司 Infrared power transmission and transformation abnormal heating point target detecting method based on deep learning
CN111159542A (en) * 2019-12-12 2020-05-15 中国科学院深圳先进技术研究院 Cross-domain sequence recommendation method based on self-adaptive fine-tuning strategy
WO2020135193A1 (en) * 2018-12-27 2020-07-02 深圳Tcl新技术有限公司 Deep neural network-based video recommendation method and system, and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180330194A1 (en) * 2017-05-15 2018-11-15 Siemens Aktiengesellschaft Training an rgb-d classifier with only depth data and privileged information
EP3622521A1 (en) * 2017-10-16 2020-03-18 Illumina, Inc. Deep convolutional neural networks for variant classification

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
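Sequence stream recommendation algorithm based on recurrent temporal convolutional network; 李太松; 贺泽宇; 王冰; 颜永红; 唐向红; Computer Science (计算机科学) (03); full text *
Neural network recommendation model combining Skip-gram and a weighted loss function; 李淑芝; 余乐陶; 邓小鸿; 李志军; Computer Engineering and Applications (计算机工程与应用) (19); full text *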

Also Published As

Publication number Publication date
CN111931058A (en) 2020-11-13

Similar Documents

Publication Publication Date Title
CN110366734B (en) Optimizing neural network architecture
CN111931057B (en) Self-adaptive output sequence recommendation method and system
US11669744B2 (en) Regularized neural network architecture search
CN108520470B (en) Method and apparatus for generating user attribute information
CN110807515A (en) Model generation method and device
CN111931054B (en) Sequence recommendation method and system based on improved residual error structure
CN111708876B (en) Method and device for generating information
CN111406264A (en) Neural architecture search
US11423307B2 (en) Taxonomy construction via graph-based cross-domain knowledge transfer
CN116010684A (en) Article recommendation method, device and storage medium
CN111967941B (en) Method for constructing sequence recommendation model and sequence recommendation method
CN112507209B (en) Sequence recommendation method for knowledge distillation based on earth mover's distance
CN110688528A (en) Method, apparatus, electronic device, and medium for generating classification information of video
CN111950593A (en) Method and device for recommending model training
US20220366257A1 (en) Small and Fast Video Processing Networks via Neural Architecture Search
CN111340220A (en) Method and apparatus for training a predictive model
CN118119954A (en) Hint adjustment using one or more machine learning models
JP2024500459A (en) Multi-level multi-objective automatic machine learning
CN116452263A (en) Information recommendation method, device, equipment, storage medium and program product
CN112182281B (en) Audio recommendation method, device and storage medium
WO2024152686A1 (en) Method and apparatus for determining recommendation index of resource information, device, storage medium and computer program product
CN110782016A (en) Method and apparatus for optimizing neural network architecture search
WO2024051707A1 (en) Recommendation model training method and apparatus, and resource recommendation method and apparatus
JP7361121B2 (en) Performing multi-objective tasks via primary network trained with dual network
CN117056595A (en) Interactive project recommendation method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant