CN113129108A - Product recommendation method and device based on Double DQN algorithm - Google Patents

Product recommendation method and device based on Double DQN algorithm

Info

Publication number
CN113129108A
Authority
CN
China
Prior art keywords
product
basic information
historical
target user
products
Prior art date
Legal status
Granted
Application number
CN202110452994.0A
Other languages
Chinese (zh)
Other versions
CN113129108B (en)
Inventor
王光臣
张衡
张盼盼
王宇
潘宇光
Current Assignee
Shandong University
Original Assignee
Shandong University
Priority date
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202110452994.0A priority Critical patent/CN113129108B/en
Publication of CN113129108A publication Critical patent/CN113129108A/en
Application granted granted Critical
Publication of CN113129108B publication Critical patent/CN113129108B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods


Abstract

The invention discloses a product recommendation method and system based on the Double DQN algorithm, comprising the following steps: acquiring basic information of a target user; inputting the basic information of the target user into a trained Double DQN model, which outputs a predicted satisfaction degree for each product; and sorting the products in descending order of predicted satisfaction and recommending the sorted products to the target user. The method fully analyzes not only the personal information of the user, such as personal risk preference and income, but also information about the products themselves, such as historical purchase data and purchase satisfaction, so that the most suitable products are recommended to the user.

Description

Product recommendation method and device based on Double DQN algorithm
Technical Field
The invention relates to the technical field of product recommendation, in particular to a product recommendation method and device based on a Double DQN algorithm.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
In recent years, with the rapid development of Internet technology, product recommendation systems have advanced quickly and are now widely used in services such as e-commerce and financial product recommendation.
Current product recommendation methods are generally based on user information: they analyze data such as a user's risk preference to compute a similarity between the user and a product, and recommend products according to that similarity. However, existing methods do not fully analyze the information of the products a user purchases, such as a product's historical purchase data and price changes, and therefore do not achieve accurate recommendation of products to the customers who need them.
Consequently, existing product recommendation methods and devices are not well designed; they neither meet users' requirements nor provide users with a satisfactory experience.
Disclosure of Invention
In order to solve the defects of the prior art, the invention provides a product recommendation method and device based on a Double DQN algorithm.
In a first aspect, the invention provides a product recommendation method based on Double DQN algorithm;
the product recommendation method based on the Double DQN algorithm comprises the following steps:
acquiring basic information of a target user;
processing the basic information of the target user and extracting the characteristics of the basic information;
inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product;
sorting the products according to the sequence of the predicted satisfaction degrees from high to low, and recommending the sorted products to a target user;
the deep reinforcement learning model refers to a Double DQN algorithm.
In a second aspect, the invention provides a product recommendation device based on Double DQN algorithm;
product recommendation device based on Double DQN algorithm includes:
an acquisition module configured to: acquiring basic information of a target user;
a feature extraction module configured to: processing the basic information of the target user and extracting the characteristics of the basic information;
a prediction module configured to: inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product;
a recommendation module configured to: sorting the products according to the sequence of the predicted satisfaction degrees from high to low, and recommending the sorted products to a target user;
the deep reinforcement learning model refers to a Double DQN algorithm.
In a third aspect, the present invention further provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs are stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first aspect.
In a fourth aspect, the present invention also provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the method of the first aspect.
Compared with the prior art, the invention has the beneficial effects that: the method not only utilizes personal information of the user, such as personal risk preference, income condition and the like, but also fully utilizes information of the product, such as historical purchase data of the product, purchase satisfaction degree of the product and the like, so as to recommend the most suitable product to the user.
According to the method, the Double DQN (Double Deep Q-Network) algorithm from deep reinforcement learning is applied to product recommendation, and the data of the products are fully analyzed with this algorithm, so that products with higher user satisfaction are recommended.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and together with the description serve to explain the invention without limiting it.
FIG. 1 is a flowchart illustrating an implementation of a product recommendation method based on a Double DQN algorithm according to the present invention;
FIG. 2 is a diagram of a reinforcement learning framework according to an embodiment of the present invention.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and "comprising", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
As shown in fig. 1, the method for recommending a product based on a Double DQN algorithm includes:
s101: acquiring basic information of a target user;
s102: processing the basic information of the target user and extracting the characteristics of the basic information;
s103: inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product;
s104: sorting the products according to the sequence of the predicted satisfaction degrees from high to low, and recommending the sorted products to a target user;
the deep reinforcement learning model refers to a Double DQN algorithm.
Further, the step S101: acquiring basic information of a target user; the method specifically comprises the following steps:
acquiring average monthly income, historical product purchasing times, historical product purchasing frequency, historical purchased product risk level and historical purchased product price fluctuation data of a target user.
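As a data-structure sketch, the acquired fields might be grouped into a record such as the following; the field names are hypothetical and chosen only to mirror the list above.

from dataclasses import dataclass, field
from typing import List

@dataclass
class UserBasicInfo:
    avg_monthly_income: float                 # average monthly income
    purchase_count: int                       # number of historical product purchases
    purchase_frequency: float                 # historical product purchase frequency
    risk_levels: List[int] = field(default_factory=list)           # risk levels of historically purchased products
    price_fluctuations: List[float] = field(default_factory=list)  # price fluctuation data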
Further, the S102: processing the basic information of the target user and extracting the characteristics of the basic information; the method specifically comprises the following steps:
and performing feature extraction through a convolutional neural network.
Further, the step S103: inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product; the training step comprises:
constructing a training set, wherein the training set consists of users' basic information with the known historical purchase satisfaction of products;
preprocessing the basic information of the users in the training set, and taking the state characteristics of the basic information of the users and the historical purchase satisfaction of known products obtained after preprocessing as input values of a deep reinforcement learning model; and training the model to obtain a trained deep reinforcement learning model.
Further, preprocessing the basic information of the users in the training set specifically comprises:
dividing the average monthly income, historical product purchase count, historical product purchase frequency, risk level of historically purchased products, and price fluctuation data of the users in the training set into N time units, obtaining a plurality of segmented data s_t. The time unit can be chosen according to the time dimension of the data; for example, one time unit may be set to one month, and the subscript t denotes the time point, recording the time interval of the data represented by the state;
performing feature extraction on all the segmented data within the same time unit through a convolutional neural network (CNN) to obtain a monthly average income feature, a historical product purchase frequency feature, a feature of the risk level of historically purchased products, and a price fluctuation data feature;
serially concatenating the monthly average income feature, the historical product purchase frequency feature, the risk level feature of historically purchased products, and the price fluctuation data feature to obtain the state feature χ(s_t) corresponding to that time unit; the state features of all time units are obtained in the same way.
It should be understood that, because the users' average monthly income, historical product purchase counts, historical product purchase frequencies, risk levels of historically purchased products, and price fluctuations in the training set are large in volume and variety, the various input data need to be preprocessed so that features can be extracted from all the data and its dimensionality reduced. Each segmented datum s_t passes through a deep neural network for feature extraction, and the extracted feature is denoted χ(s_t). Many kinds of feature extraction networks can be used; for example, a suitable feature extraction network can be chosen so that the extracted state feature χ(s_t) of s_t is a multi-dimensional vector. If several kinds of historical data are considered, the extracted features may be a combination of vectors, for example in the form of a matrix.
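A minimal PyTorch sketch of this preprocessing is given below, assuming each data stream is a 1-D time series segmented into time units; the per-stream CNNs and all layer sizes are illustrative assumptions, since the network shape is not fixed above.

import torch
import torch.nn as nn

class StateFeatureExtractor(nn.Module):
    """Extract chi(s_t) for one time unit: per-stream CNNs plus serial concatenation."""
    def __init__(self, n_streams: int = 5, feat_per_stream: int = 8):
        super().__init__()
        # One small 1-D CNN per data stream (income, purchase count, purchase
        # frequency, risk level, price fluctuation) -- sizes are illustrative.
        self.cnns = nn.ModuleList([
            nn.Sequential(nn.Conv1d(1, feat_per_stream, kernel_size=3, padding=1),
                          nn.ReLU(),
                          nn.AdaptiveAvgPool1d(1))
            for _ in range(n_streams)])

    def forward(self, streams):
        # streams: list of n_streams tensors, each (batch, 1, samples per time unit)
        feats = [cnn(x).squeeze(-1) for cnn, x in zip(self.cnns, streams)]
        return torch.cat(feats, dim=-1)   # serial concatenation -> chi(s_t)

For example, with five streams of 30 daily samples each, StateFeatureExtractor()([torch.randn(1, 1, 30) for _ in range(5)]) yields a 40-dimensional state feature χ(s_t).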
Further, the step S103: inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product; the method specifically comprises the following steps:
inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of the product, wherein the prediction satisfaction degree of the product is a value obtained through an optimal Q value function of a Double DQN algorithm.
The training principle of the Double DQN algorithm in deep reinforcement learning is now described in detail, including how the optimal Q-value function Q*(χ(s_t), a) is obtained for all state features χ(s_t):
As shown in FIG. 2, at each time t the current state of the agent is characterized by the state feature χ(s_t); the agent performs an operation a_t, obtains a reward r_t from the environment, and observes a new state feature χ(s_{t+1}).
The goal of agent learning is to select a strategy π that maximizes the expected total reward, where a strategy π is defined as the sequence of actions a_t taken at each instant t, i.e. π = {a_t, a_{t+1}, a_{t+2}, ..., a_T}, with T a set terminal time;
maximizing the expected reward means maximizing the expected cumulative discounted future reward, i.e. maximizing
E[r_t + γ r_{t+1} + γ^2 r_{t+2} + ... + γ^(T-t) r_T], where 0 ≤ γ ≤ 1 is the discount rate.
The value of taking action a under state feature χ(s) and thereafter following strategy π is recorded as
Q^π(χ(s), a) = E[r_t + γ r_{t+1} + γ^2 r_{t+2} + ... + γ^(T-t) r_T | χ(s_t) = χ(s), a_t = a],
which represents the expected total reward over all possible decision sequences that follow policy π after executing operation a, starting from the state feature χ(s).
The optimal Q-value function is defined at the same time:
Q*(χ(s), a) = max_π Q^π(χ(s), a) = max_π E[r_t + γ r_{t+1} + γ^2 r_{t+2} + ... + γ^(T-t) r_T | χ(s_t) = χ(s), a_t = a],
which represents the expected total reward when, after performing operation a under state feature χ(s), decisions are made according to the optimal strategy.
The optimal Q-value function Q*(χ(s), a) under each state feature χ(s) is obtained iteratively, using the Bellman equation:
Q*(χ(s), a) = E[r_t + γ max_{a'} Q*(χ(s'), a') | χ(s_t) = χ(s), a_t = a].
Thus, following the above equation, Q*(χ(s), a) is estimated with a function approximator Q(χ(s), a; θ), iterating θ by stochastic gradient descent (SGD). In Double DQN the online network (parameters θ) selects the next action while the target network (parameters θ^-) evaluates it, giving the target
y_t = r_t + γ Q(χ(s_{t+1}), argmax_{a'} Q(χ(s_{t+1}), a'; θ); θ^-)
and the update
θ ← θ + α (y_t - Q(χ(s_t), a_t; θ)) ∇_θ Q(χ(s_t), a_t; θ),
where α is the learning rate. The target parameters θ^- are updated once every k steps, i.e. every k steps
θ^- ← θ,
and at all other steps θ^- remains unchanged.
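The update above can be sketched as follows in PyTorch; this is a minimal illustration of the Double DQN step under assumed hyperparameters (γ, the SGD learning rate and the sync period k are not fixed above), with q_net being any network mapping state features to per-operation Q values.

import copy
import torch
import torch.nn as nn

def make_double_dqn_step(q_net: nn.Module, gamma: float = 0.99,
                         lr: float = 1e-3, k: int = 100):
    """Return a closure performing one Double DQN SGD step on a batch."""
    target_net = copy.deepcopy(q_net)                 # theta^-
    opt = torch.optim.SGD(q_net.parameters(), lr=lr)
    step = 0

    def train_step(x, a, r, x_next, done):
        nonlocal step
        with torch.no_grad():
            # Online net (theta) selects the action; target net (theta^-) evaluates it.
            a_star = q_net(x_next).argmax(dim=1, keepdim=True)
            q_next = target_net(x_next).gather(1, a_star).squeeze(1)
            y = r + gamma * (1.0 - done) * q_next     # y_t as defined above
        q = q_net(x).gather(1, a.unsqueeze(1)).squeeze(1)
        loss = nn.functional.mse_loss(q, y)
        opt.zero_grad(); loss.backward(); opt.step()  # SGD step on theta
        step += 1
        if step % k == 0:                             # theta^- <- theta every k steps
            target_net.load_state_dict(q_net.state_dict())
        return loss.item()

    return train_step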
Given the set A of allowed operations, which in this embodiment may include, but is not limited to, purchasing a product, not purchasing it, and selling an already-owned product, the reward in the deep reinforcement learning model is the satisfaction obtained each time one of these operations is performed. The satisfaction can be set in various ways; for example, how it is set may be determined from the user's personal information, such as risk preference.
Under this training principle of the deep reinforcement learning model, the final θ* can be obtained by iterating with the features extracted from the data in the training set; the corresponding Q(χ(s), a; θ*) = Q*(χ(s), a) is then the optimal Q-value function.
Therefore, once the optimal Q-value function Q*(χ(s_t), a) under each state feature χ(s_t) is known, it suffices to select, under state feature χ(s_t), the operation a for which Q*(χ(s_t), a) is largest; this is the optimal operation a* that maximizes the future cumulative satisfaction under state feature χ(s_t). Adopting the optimal operation a* in every state χ(s_t) maximizes the future satisfaction of the product. The totality of all optimal operations is called the optimal strategy π*, recorded as
π*(χ(s_t)) = argmax_a Q*(χ(s_t), a).
Once the optimal strategy π* is obtained, executing π* on the product in simulation yields the predicted maximum satisfaction of the product:
obtaining the optimal strategy π* means that the optimal operation a*_t = π*(χ(s_t)) under each state feature χ(s_t) is known. It is then only necessary to simulate the operation process of each product and, for the state s_t at each transaction, adopt the optimal transaction operation π*(χ(s_t)) corresponding to χ(s_t); this yields the predicted maximum satisfaction of the product. During the simulated transactions, a terminal period T, operation times, an operation count and the like can be set for the product. For example, the final transaction period may be set to six months, i.e. the cumulative total predicted satisfaction of the product over six months is simulated. It can also be specified that an operation may be performed once every five days, with the data features of the five days before the operation day serving as the current state feature χ(s_t); taking, for the state s_t at each operation, the optimal transaction operation π*(χ(s_t)) then yields the predicted maximum satisfaction of the product, which is precisely the predicted satisfaction output for the product in step S103.
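The simulated execution just described amounts to a greedy rollout, sketched below. Here env_step is a hypothetical stand-in for the simulated transaction process (it returns the satisfaction of an operation and the next state feature), and the horizon corresponds to, e.g., one operation every five days over six months.

import torch

def predicted_max_satisfaction(q_net, x0, env_step, horizon: int) -> float:
    """Roll out the greedy optimal policy and accumulate satisfaction."""
    total, x = 0.0, x0                          # x0: initial chi(s_t), shape (1, feat_dim)
    with torch.no_grad():
        for _ in range(horizon):
            a_star = int(q_net(x).argmax(dim=1))    # a* = argmax_a Q*(chi(s_t), a)
            satisfaction, x = env_step(x, a_star)   # hypothetical simulator step
            total += satisfaction
    return total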
Further, S104 sorts the products in descending order of predicted satisfaction and recommends the sorted products to the target user; specifically:
the sorting may be performed either by directly comparing the predicted satisfaction degrees obtained in S103 or by using the relative recommendation rate of each product. There are many ways to calculate a relative recommendation rate from the simulated maximum satisfaction of each product obtained in S103. For example, in this embodiment, suppose the predicted satisfaction degrees of three products 1, 2 and 3 are the constants 18, 15 and 13, respectively. The relative recommendation rate can be calculated by taking the lowest simulated satisfaction as the standard, i.e. the simulated satisfaction of product 3, whose recommendation rate is 1; the relative recommendation rate of product 1 is then 18 ÷ 13 ≈ 1.38 and that of product 2 is 15 ÷ 13 ≈ 1.15. The same calculation applies when there are more products.
The invention aims to remedy the above technical defects, and provides a product recommendation method and device based on the Double DQN algorithm.
Example two
The invention provides a product recommendation device based on Double DQN algorithm;
product recommendation device based on Double DQN algorithm includes:
an acquisition module configured to: acquiring basic information of a target user;
a feature extraction module configured to: processing the basic information of the target user and extracting the characteristics of the basic information;
a prediction module configured to: inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product;
a recommendation module configured to: sorting the products according to the sequence of the predicted satisfaction degrees from high to low, and recommending the sorted products to a target user;
the deep reinforcement learning model refers to a Double DQN algorithm.
It should be noted here that the above-mentioned obtaining module, feature extraction module, prediction module and recommendation module correspond to steps S101 to S104 in the first embodiment; the modules share the same examples and application scenarios as the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should also be noted that, as part of a system, the modules described above may be implemented in a computer system such as a set of computer-executable instructions.
In the foregoing embodiments, the descriptions of the embodiments have different emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The proposed system can be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, the division of the above-described modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules may be combined or integrated into another system, or some features may be omitted, or not executed.
EXAMPLE III
The present embodiment also provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein, a processor is connected with the memory, the one or more computer programs are stored in the memory, and when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first embodiment.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), an off-the-shelf field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The method in the first embodiment may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in storage media well known in the art, such as RAM, flash memory, ROM, PROM or EPROM, and registers. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware. To avoid repetition, details are not described here again.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
Example four
The present embodiments also provide a computer-readable storage medium for storing computer instructions, which when executed by a processor, perform the method of the first embodiment.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A product recommendation method based on the Double DQN algorithm, characterized by comprising the following steps:
acquiring basic information of a target user;
processing the basic information of the target user and extracting the characteristics of the basic information;
inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product;
sorting the products according to the sequence of the predicted satisfaction degrees from high to low, and recommending the sorted products to a target user;
the deep reinforcement learning model refers to a Double DQN algorithm.
2. The method for recommending products based on Double DQN algorithm of claim 1, wherein basic information of a target user is obtained; the method specifically comprises the following steps:
acquiring average monthly income, historical product purchasing times, historical product purchasing frequency, historical purchased product risk level and historical purchased product price fluctuation data of a target user.
3. The method of claim 1, wherein the basic information of the target user is processed to extract its features; the method specifically comprises the following steps:
and performing feature extraction through a convolutional neural network.
4. The method of claim 1, wherein the features representing the basic information of the target user are input into the trained deep reinforcement learning model to obtain the predicted satisfaction of each product; the training step comprises:
constructing a training set, wherein the training set consists of users' basic information with the known historical purchase satisfaction of products;
and preprocessing the basic information of the users in the training set, taking the state characteristics of the basic information of the users obtained after preprocessing and the historical purchase satisfaction degree of the known products as input values of a deep reinforcement learning model, and training the model to obtain the trained deep reinforcement learning model.
5. The method for recommending products based on the Double DQN algorithm of claim 1, wherein preprocessing the basic information of the users in the training set specifically comprises:
dividing the average monthly income, historical product purchase count, historical product purchase frequency, risk level of historically purchased products and price fluctuation data of the users in the training set into N time units to obtain a plurality of segmented data s_t, the subscript t denoting the time point and recording the time interval of the data represented by the state;
performing feature extraction on all the segmented data in the same time unit through a convolutional neural network (CNN) to obtain a monthly average income feature, a historical product purchase frequency feature, a feature of the risk level of historically purchased products and a price fluctuation data feature;
serially concatenating the monthly average income feature, the historical product purchase frequency feature, the risk level feature of historically purchased products and the price fluctuation data feature to obtain the state feature χ(s_t) corresponding to the same time unit, the state features of all time units being obtained in the same way.
6. The method of claim 1, wherein the features representing the basic information of the target user are input into the trained deep reinforcement learning model to obtain the predicted satisfaction of each product; the method specifically comprises the following steps:
inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of the product, wherein the prediction satisfaction degree of the product is a value obtained through an optimal Q value function of a Double DQN algorithm.
7. A product recommendation device based on the Double DQN algorithm, characterized by comprising:
an acquisition module configured to: acquiring basic information of a target user;
a feature extraction module configured to: processing the basic information of the target user and extracting the characteristics of the basic information;
a prediction module configured to: inputting the characteristics representing the basic information of the target user into the trained deep reinforcement learning model to obtain the prediction satisfaction degree of each product;
a recommendation module configured to: sorting the products according to the sequence of the predicted satisfaction degrees from high to low, and recommending the sorted products to a target user;
the deep reinforcement learning model refers to a Double DQN algorithm.
8. The Double DQN algorithm-based product recommendation device of claim 7, wherein,
the preprocessing of the basic information of the users in the training set specifically includes:
dividing the average monthly income, historical product purchase count, historical product purchase frequency, risk level of historically purchased products and price fluctuation data of the users in the training set into N time units to obtain a plurality of segmented data s_t, the subscript t denoting the time point and recording the time interval of the data represented by the state;
performing feature extraction on all the segmented data in the same time unit through a convolutional neural network (CNN) to obtain a monthly average income feature, a historical product purchase frequency feature, a feature of the risk level of historically purchased products and a price fluctuation data feature;
serially concatenating the monthly average income feature, the historical product purchase frequency feature, the risk level feature of historically purchased products and the price fluctuation data feature to obtain the state feature χ(s_t) corresponding to the same time unit,
the state features of all time units being obtained in the same way.
9. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs being stored in the memory, the processor executing the one or more computer programs stored in the memory when the electronic device is running, to cause the electronic device to perform the method of any of the preceding claims 1-6.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of any one of claims 1 to 6.
CN202110452994.0A 2021-04-26 2021-04-26 Product recommendation method and device based on Double DQN algorithm Active CN113129108B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110452994.0A CN113129108B (en) 2021-04-26 2021-04-26 Product recommendation method and device based on Double DQN algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110452994.0A CN113129108B (en) 2021-04-26 2021-04-26 Product recommendation method and device based on Double DQN algorithm

Publications (2)

Publication Number Publication Date
CN113129108A 2021-07-16
CN113129108B (en) 2023-05-30

Family

ID=76780002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110452994.0A Active CN113129108B (en) 2021-04-26 2021-04-26 Product recommendation method and device based on Double DQN algorithm

Country Status (1)

Country Link
CN (1) CN113129108B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114581249A (en) * 2022-03-22 2022-06-03 山东大学 Financial product recommendation method and system based on investment risk bearing capacity assessment
CN114581249B (en) * 2022-03-22 2024-05-31 山东大学 Financial product recommendation method and system based on investment risk bearing capacity assessment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711871A (en) * 2018-12-13 2019-05-03 北京达佳互联信息技术有限公司 A kind of potential customers determine method, apparatus, server and readable storage medium storing program for executing
CN110263244A (en) * 2019-02-14 2019-09-20 腾讯科技(深圳)有限公司 Content recommendation method, device, storage medium and computer equipment
CN110598120A (en) * 2019-10-16 2019-12-20 信雅达系统工程股份有限公司 Behavior data based financing recommendation method, device and equipment
CN110866791A (en) * 2019-11-25 2020-03-06 恩亿科(北京)数据科技有限公司 Commodity pushing method and device, storage medium and electronic equipment
CN111898032A (en) * 2020-08-13 2020-11-06 腾讯科技(深圳)有限公司 Information recommendation method and device based on artificial intelligence, electronic equipment and storage medium
CN112045680A (en) * 2020-09-02 2020-12-08 山东大学 Cloth stacking robot control system and control method based on behavior cloning
CN112291284A (en) * 2019-07-22 2021-01-29 中国移动通信有限公司研究院 Content pushing method and device and computer readable storage medium


Also Published As

Publication number Publication date
CN113129108B (en) 2023-05-30

Similar Documents

Publication Publication Date Title
CN111008858B (en) Commodity sales prediction method and system
CN106485562B (en) Commodity information recommendation method and system based on user historical behaviors
CN111553759A (en) Product information pushing method, device, equipment and storage medium
CN110647696B (en) Business object sorting method and device
Salehinejad et al. Customer shopping pattern prediction: A recurrent neural network approach
CN109034941B (en) Product recommendation method and device, computer equipment and storage medium
US11770407B2 (en) Methods and apparatuses for defending against data poisoning attacks in recommender systems
CN111738780A (en) Method and system for recommending object
CN116915710A (en) Traffic early warning method, device, equipment and readable storage medium
CN115423538A (en) Method and device for predicting new product sales data, storage medium and electronic equipment
Jiang et al. Intertemporal pricing via nonparametric estimation: Integrating reference effects and consumer heterogeneity
CN117422306A (en) Cross-border E-commerce risk control method and system based on dynamic neural network
CN113129108A (en) Product recommendation method and device based on Double DQN algorithm
CN113313562B (en) Product data processing method and device, computer equipment and storage medium
CN110659701A (en) Information processing method, information processing apparatus, electronic device, and medium
CN109767263A (en) Business revenue data predication method, device, computer equipment and storage medium
CN115713389A (en) Financial product recommendation method and device
JP2023017701A (en) Machine learning method, training method, forecasting system, and non-transitory computer readable medium
CN114897607A (en) Data processing method and device for product resources, electronic equipment and storage medium
CN114693428A (en) Data determination method and device, computer readable storage medium and electronic equipment
CN113743440A (en) Information processing method and device and storage medium
CN113554099A (en) Method and device for identifying abnormal commercial tenant
Van Calster et al. Profit-oriented sales forecasting: a comparison of forecasting techniques from a business perspective
CN117216379A (en) Method and device for selecting combined recruitment suppliers in context and electronic equipment
CN111563760A (en) Prediction method, medium, device and computing equipment of total volume of trades

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant