CN110135951B - Game commodity recommendation method and device and readable storage medium - Google Patents

Game commodity recommendation method and device and readable storage medium

Info

Publication number
CN110135951B
CN110135951B (application CN201910406926.3A; published as CN110135951A)
Authority
CN
China
Prior art keywords
player
game
attribute
commodity
learning algorithm
Prior art date
Legal status (assumption, not a legal conclusion)
Active
Application number
CN201910406926.3A
Other languages
Chinese (zh)
Other versions
CN110135951A (en)
Inventor
杜鑫
Current Assignee (listed assignee may be inaccurate)
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date (assumption, not a legal conclusion)
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd
Priority to CN201910406926.3A
Publication of CN110135951A
Application granted
Publication of CN110135951B
Legal status: Active
Anticipated expiration

Classifications

    • G06F18/22 Physics > Computing > Electric digital data processing > Pattern recognition > Analysing > Matching criteria, e.g. proximity measures
    • G06F18/23 Physics > Computing > Electric digital data processing > Pattern recognition > Analysing > Clustering techniques
    • G06Q30/0631 Physics > Computing > ICT specially adapted for administrative, commercial, financial, managerial or supervisory purposes > Commerce > Buying, selling or leasing transactions > Electronic shopping [e-shopping] > Item recommendations

Abstract

In the game commodity recommendation method, device, and readable storage medium of this disclosure, the player's current state set is formed by acquiring the attribute feature vectors of the game commodity currently browsed by the player and the player's own feature vector. The current state set is input into a reinforcement learning algorithm model, which calls the attribute prediction matrix set corresponding to the player's feature vector and outputs a prediction feature vector for each attribute. The game commodities matched with the attribute prediction feature vectors are then recommended. Because the reinforcement learning algorithm model jointly considers the historical game commodities browsed by the player and the currently browsed game commodity, the recommended game commodities can meet the player's real needs.

Description

Game commodity recommendation method and device and readable storage medium
Technical Field
Embodiments of this application relate to the technical field of computers, and in particular to a game commodity recommendation method and device and a readable storage medium.
Background
With the development of computer technology, data analysis makes it possible to provide users with more accurate commodity recommendation services. In the field of games in particular, and unlike general commodities, the attributes of game commodities are highly diversified, which makes it difficult to recommend game commodities to players precisely.
In the prior art, game commodities are generally recommended to players based on a clustering algorithm: a distance algorithm measures the distance between the game commodity the player is currently browsing and every other game commodity, and the game commodities most similar to the current one are recommended.
However, although clustering keeps the overall attributes of the recommended game commodities highly similar to those of the currently browsed commodity, game commodity attributes are diversified and a player often cares most about one particular sub-attribute. That is, clustering-based recommendation does not take into account the sub-attribute of the current commodity that the player is actually interested in, so the recommended game commodities may not match the player's real needs.
Disclosure of Invention
To solve the above problems, the present invention provides a game commodity recommendation method, device, and readable storage medium.
In one aspect, the present invention provides a method for recommending game merchandise, including:
acquiring attribute feature vectors of current game commodities browsed by a player and feature vectors of the player to form a current state set of the player;
inputting the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model calls an attribute prediction matrix set corresponding to the current state set of the player and outputs each attribute prediction feature vector; wherein the attribute prediction matrix set is determined by the reinforcement learning algorithm model from each attribute feature vector of historical game commodities browsed by the player;
and taking the game commodity matched with each attribute prediction feature vector as a recommended game commodity and recommending the recommended game commodity.
In an alternative embodiment, before inputting the current state set of the player into the reinforcement learning algorithm model, the method further includes:
judging whether the player triggers a recommendation request for game commodities;
and if so, executing the step of inputting the current state set of the player into a reinforcement learning algorithm model.
In an optional implementation, when the player has not triggered a recommendation request for game commodities, the method further includes:
acquiring the player's behavior toward the current game commodity, and calling the player's previous state set, where the previous state set includes the attribute feature vectors of the previous game commodity browsed by the player;
inputting the player's previous state set and current state set into the reinforcement learning algorithm model, so that the model uses the behavior toward the current game commodity as a model reward and updates the attribute prediction matrix set corresponding to the player.
In an alternative embodiment, the inputting the previous state set and the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model uses the behavior of the current game commodity as a model reward, and updating the attribute prediction matrix set corresponding to the player in the reinforcement learning algorithm model includes:
determining a corresponding reward value of the behavior of the current game commodity in a preset reward function;
updating the probability matrix of each attribute in the attribute prediction matrix set corresponding to the player using the update formula

Q_new(s, α) = (1 - lr) · Q(s, α) + lr · [R + γ · max Q(α, α′)]

where Q_new(s, α) is the updated probability value when the feature vector of the previous game commodity is s and the feature vector of the current game commodity is α; Q(s, α) is the current probability value for that pair; max Q(α, α′) is the largest probability value, taken over the candidate next attribute feature vectors α′, in the row of the probability matrix Q indexed by α; lr is a preset algorithm learning rate; R is the reward value; and γ is a preset discount factor.
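As a minimal, hypothetical sketch (states and actions are reduced to integer row/column indices, which the patent does not specify), the update formula above can be written as:

```python
# Hypothetical tabular sketch of the update formula; indices stand in for
# the attribute feature vectors s and α.
def update_q(Q, s, a, reward, lr=0.1, gamma=0.9):
    """Q_new(s, a) = (1 - lr) * Q(s, a) + lr * (R + gamma * max Q(a, a'))."""
    Q[s][a] = (1 - lr) * Q[s][a] + lr * (reward + gamma * max(Q[a]))
    return Q

Q = [[0.0] * 3 for _ in range(3)]
update_q(Q, s=0, a=1, reward=1.0)
# Q[0][1] -> 0.1  (0.9 * 0 + 0.1 * (1.0 + 0.9 * 0))
```

Note that, as in the formula, the bootstrap term maxes over the row indexed by α (the current commodity's feature vector) rather than over a separate next state.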
In an optional implementation manner, the inputting the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model invokes a set of attribute prediction matrices corresponding to the current state set of the player, and outputs each attribute prediction feature vector includes:
calling the corresponding attribute prediction matrix set according to the player's feature vector in the current state set, where the attribute prediction matrix set includes a probability matrix for each attribute;
and aiming at each attribute feature vector of the current game commodity, performing prediction processing by using a corresponding probability matrix to obtain each attribute prediction feature vector.
In one optional implementation, recommending the game commodity matched with each attribute prediction feature vector as a recommended game commodity includes:
and taking the attribute prediction feature vectors as constraint conditions, and obtaining recommended game commodities in a preset game commodity library by utilizing the constraint conditions so as to recommend the recommended game commodities.
In one optional implementation, taking the attribute prediction feature vectors as constraint conditions and obtaining recommended game commodities from a preset game commodity library using the constraint conditions includes:
taking the attribute prediction feature vectors as constraint conditions and acquiring the weight of each prediction feature vector;
and obtaining recommended game commodities in a preset game commodity library according to each constraint condition and the corresponding weight.
In still another aspect, the present invention provides a game commodity recommendation apparatus, including:
the interactive module is used for acquiring attribute feature vectors of current game commodities browsed by the player and feature vectors of the player to form a current state set of the player;
the processing module is used for inputting the player's current state set into a reinforcement learning algorithm model, so that the model calls the attribute prediction matrix set corresponding to the current state set and outputs each attribute prediction feature vector; wherein the attribute prediction matrix set is determined by the reinforcement learning algorithm model from the attribute feature vectors of historical game commodities browsed by the player;
the interaction module is also used for taking the game commodity matched with each attribute prediction feature vector as a recommended game commodity and recommending the recommended game commodity.
In one optional implementation, before inputting the player's current state set into the reinforcement learning algorithm model, the processing module is further configured to execute that input step when the player triggers a recommendation request for game commodities;
when the player has not triggered a recommendation request, the processing module acquires the player's behavior toward the current game commodity and calls the player's previous state set, where the previous state set includes the attribute feature vectors of the previous game commodity browsed by the player; it then inputs the previous state set and the current state set into the reinforcement learning algorithm model, so that the model uses the behavior toward the current game commodity as a model reward and updates the attribute prediction matrix set corresponding to the player.
In still another aspect, the present invention provides a game commodity recommendation apparatus, including: a memory, a processor, and a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any of the preceding claims.
In a final aspect, the invention provides a readable storage medium having stored thereon a computer program that is executed by a processor to perform the method of any of the preceding claims.
In the game commodity recommendation method, device, and readable storage medium of this disclosure, the player's current state set is formed by acquiring the attribute feature vectors of the game commodity currently browsed by the player and the player's own feature vector; the current state set is input into a reinforcement learning algorithm model, which calls the attribute prediction matrix set corresponding to the player's current state set and outputs each attribute prediction feature vector, the matrix set having been determined by the model from the attribute feature vectors of the historical game commodities browsed by the player; and the game commodities matched with the attribute prediction feature vectors are recommended. Because the reinforcement learning algorithm model jointly considers the historical game commodities browsed by the player and the currently browsed game commodity, it recommends game commodities that can meet the player's real needs.
Drawings
Certain embodiments of the disclosure are shown in the accompanying drawings and described in more detail below. The drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
FIG. 1 is a schematic diagram of a network architecture on which the present invention is based;
FIG. 2 is a schematic flowchart of a game commodity recommendation method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a game commodity recommendation method according to a second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a game commodity recommendation device according to a third embodiment of the present invention;
FIG. 5 is a hardware schematic diagram of a game commodity recommendation device according to a fourth embodiment of the present invention.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Detailed Description
Embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present application. It should be understood that the drawings and embodiments of the present application are for illustration purposes only and are not intended to limit the scope of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the embodiments of the application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
With the development of computer technology, data analysis makes it possible to provide users with more accurate commodity recommendation services. In the field of games in particular, and unlike general commodities, the attributes of game commodities are highly diversified, which makes it difficult to recommend game commodities to players precisely.
In the prior art, game commodities are generally recommended to players based on a clustering algorithm: a distance algorithm measures the distance between the game commodity the player is currently browsing and every other game commodity, and the game commodities most similar to the current one are recommended.
However, although clustering keeps the overall attributes of the recommended game commodities highly similar to those of the currently browsed commodity, game commodity attributes are diversified and a player often cares most about one particular sub-attribute. That is, clustering-based recommendation does not take into account the sub-attribute of the current commodity that the player is actually interested in, so the recommended game commodities may not match the player's real needs.
Of course, in other prior art, a model may be built from the player's historical browsing records to directly predict the attributes of game commodities that may interest the player and thereby determine the recommended commodities. However, because game commodity attributes are diversified and the recommendation system interacts with the player's browsing behavior, such a historical-browsing model cannot be updated well over time, and the recommendation effect is poor.
To solve the above problems, the present invention provides a game commodity recommendation method, device, and readable storage medium. FIG. 1 is a schematic diagram of the network architecture on which the invention is based; as shown in FIG. 1, the architecture includes at least a game commodity recommending device 1 and a terminal 2.
The game commodity recommending device 1 may be a server or a server cluster deployed in the cloud, used to store data and to process it according to preset processing logic.
The terminal 2 may specifically be a hardware device on which a player can play games, such as a smartphone, tablet computer, desktop computer, or smart game console. A game client may be installed on the terminal 2, or a game interface may be provided on it, so that the player can trigger game operations on the client or interface, including but not limited to controlling a game character, browsing game commodities, and purchasing game commodities.
The game commodity recommending apparatus 1 and the terminal 2 can be connected by wireless communication or wired communication to perform data interaction.
Fig. 2 is a schematic flow chart of a method for recommending game commodities, according to an embodiment of the present invention, as shown in fig. 2, the method for recommending game commodities includes:
Step 101: acquiring the attribute feature vectors of the game commodity currently browsed by the player and the player's own feature vector to form the player's current state set;
Step 102: inputting the player's current state set into a reinforcement learning algorithm model, so that the model calls the attribute prediction matrix set corresponding to the current state set and outputs each attribute prediction feature vector;
wherein the attribute prediction matrix set is determined by the reinforcement learning algorithm model from the attribute feature vectors of historical game commodities browsed by the player;
Step 103: recommending the game commodities matched with each attribute prediction feature vector as recommended game commodities.
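The three steps above can be sketched end to end as follows; the stub model, field names, and library layout are invented for illustration and are not specified by the patent:

```python
# Minimal end-to-end sketch of steps 101-103 (all names are illustrative).
class StubModel:
    """Stands in for the reinforcement learning algorithm model: it returns
    a stored attribute prediction for the player's type."""
    def __init__(self, predictions):
        self.predictions = predictions

    def predict(self, state):
        return self.predictions[tuple(state["player"])]

def recommend_commodities(player_vec, commodity_attrs, model, library):
    state = {"player": player_vec, "commodity": commodity_attrs}   # step 101
    predicted = model.predict(state)                               # step 102
    return [c for c in library                                     # step 103
            if all(c["attrs"].get(k) == v for k, v in predicted.items())]

model = StubModel({(1, 0): {"damage": (20, 25)}})
library = [{"id": 1, "attrs": {"damage": (20, 25)}},
           {"id": 2, "attrs": {"damage": (26, 50)}}]
recs = recommend_commodities([1, 0], {"damage": (26, 50)}, model, library)
# recs -> [{"id": 1, "attrs": {"damage": (20, 25)}}]
```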
The method of this embodiment is executed by the game commodity recommending apparatus shown in FIG. 1. The reinforcement learning algorithm model described in this embodiment may take various forms; in particular, an improved Q-learning algorithm model may be used.
Specifically, this embodiment provides the player with a more accurate game commodity recommendation service. First, the game commodity recommending device acquires the attribute feature vectors of the game commodity currently browsed by the player and the player's own feature vector, forming the player's current state set.
Data about the player and the commodity currently browsed are collected by the terminal; that is, the terminal records game data such as the player's ID, identity, and in-game behavior, and sends it to the game commodity recommending device for analysis.
For the obtained game data, the game commodity recommending device analyzes the player's profile with a user portrait system to obtain the player's own feature vector. Through portrait analysis the player is labeled, so that each player can be described by a number of tags. These tags may reflect the player's basic characteristics, such as sex, age group, and in-game consumption level; the player's interests or personality, such as favorite game types and favorite commodity types; or behavior characteristics related to gameplay, such as the player's operation level, game attitude, and operation style. That is, portrait analysis of the player's game data yields portrait tags in different feature dimensions, from which the player's own feature vector is obtained. The portrait analysis itself can be realized with a conventional portrait analysis algorithm, which the invention does not limit.
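A minimal sketch of turning portrait tags into a player feature vector might look like this; the tag vocabulary and the multi-hot encoding are illustrative assumptions, not the patent's method:

```python
# Illustrative tag vocabulary; the patent does not fix one.
TAG_VOCAB = ["male", "female", "age_18_25", "age_26_35",
             "high_spender", "likes_weapons", "likes_fashion"]

def player_feature_vector(tags):
    """Multi-hot encoding of a player's portrait tags over a fixed vocabulary."""
    return [1 if t in tags else 0 for t in TAG_VOCAB]

vec = player_feature_vector({"male", "age_18_25", "likes_weapons"})
# vec -> [1, 0, 1, 0, 0, 1, 0]
```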
In addition, the game commodity recommending device analyzes the game data according to the player's browsing behavior to obtain each attribute feature vector of the currently browsed game commodity.
Here, game commodities generally refer to in-game items such as character weapons, armor, potions, and fashion outfits. Different game commodities generally have different attributes. For a weapon, the attributes are typically damage value, attack range, special skills, element attributes, cooling duration, and pricing information; for fashion outfits, they are typically applicable location, applicable duration, price information, applicable character position, and applicable character occupation; for a potion, they may be a damage or recovery value, duration, element attribute, drug-resistance attribute, and so on.
That is, because game commodities are diverse, the attributes of each commodity differ. In the present invention, the game commodity recommending device acquires data about the commodity the player is currently browsing and derives its attribute feature vectors. For a weapon, for example, the damage value, attack range, special skill, element attribute, cooling duration, and price information are distinct attributes, and the attribute information of each can be described by a feature vector; for instance, a damage value of 20 to 25 can be represented as [20, 25]. Note that this is only one implementation provided by the invention; those skilled in the art may choose the specific attributes and their vector representations according to the game commodity and game type, which the invention does not limit.
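For illustration only (every attribute name and value beyond the damage example is an assumption), the per-attribute feature vectors and the resulting state set might be represented as:

```python
# Hypothetical per-attribute feature vectors for a weapon commodity.
weapon = {
    "damage": [20, 25],      # damage value range 20-25, as in the text
    "attack_range": [5],     # illustrative
    "cooldown": [1.5],       # illustrative, seconds
    "price": [300],          # illustrative
}
# The current state set pairs these with the player's own feature vector.
state = {"player": [1, 0, 1, 0], "commodity": weapon}
```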
When the game commodity recommending device has obtained the player's feature vector and each attribute feature vector of the currently browsed game commodity, it constructs the player's current state set from these vectors so that the reinforcement learning algorithm model can process it.
And then, inputting the current state set of the player into a reinforcement learning algorithm model so that the reinforcement learning algorithm model calls an attribute prediction matrix set corresponding to the current state set of the player and outputs each attribute prediction characteristic vector.
In the recommendation method adopted by the invention, an improved Q-learning algorithm model is used to process the current state set and output the prediction feature vector of each attribute of game commodities the player may be interested in.
Specifically, in the improved Q-learning algorithm model, different attribute prediction matrix sets can be preset for different types of players. Player types are distinguished by the players' own feature vectors: for any two players, if their own feature vectors are the same, their types are the same, and the same attribute prediction matrix set is used for both. Meanwhile, the attribute prediction matrix set is determined by the reinforcement learning algorithm model from the attribute feature vectors of the historical game commodities browsed by the player, and reflects the player's preference for each attribute while browsing game commodities over a period of time.
Therefore, according to the player's own feature vector, the game commodity recommending device calls the attribute prediction matrix set matched with or associated with that feature vector from a number of pre-stored attribute prediction matrix sets, and uses it to process each attribute feature vector in the player's current state set. That is: the corresponding attribute prediction matrix set is called according to the player's feature vector, where the set contains a probability matrix for each attribute; then, for each attribute feature vector of the current game commodity, prediction is performed with the corresponding probability matrix to obtain each attribute prediction feature vector.
It should be noted that any attribute prediction matrix set may consist of the probability matrices of all attributes, each probability matrix corresponding to one attribute of a game commodity; for example, the damage-value probability matrix is used to compute the attribute prediction feature vector of the damage-value attribute.
In addition, the probability matrix in this embodiment may be a two-dimensional matrix: the coordinate in one direction indexes the attribute feature vector of the current game commodity, the coordinate in the other direction indexes the predicted or recommended attribute prediction feature vector, and the element at each coordinate pair is a probability value.
Table 1 shows a damage-value probability matrix. As shown in Table 1 below, when the damage value of the current weapon is [20,25], the probability that the player next triggers effective browsing behavior on a commodity with damage value [20,25] is 0.6, with damage value [1,24] it is 0.1, and with damage value [26,50] it is 0.3.
TABLE 1
Current damage value | Next: [20,25] | Next: [1,24] | Next: [26,50]
[20,25]              | 0.6           | 0.1          | 0.3
[1,24]               | 0.2           | 0.8          | —
By using each probability matrix, the attribute prediction feature vector with the highest probability value can be output for each attribute. At this point, the game commodity recommending device has obtained the attribute prediction feature vector corresponding to each attribute of the current game commodity.
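The lookup-and-argmax step described above can be sketched as follows. This is an illustrative Python sketch: the dict-of-dicts layout, bucket labels, and function name are assumptions rather than anything specified in the patent, while the probabilities in the [20,25] row come from Table 1 and the [1,24] row from the worked example later in the text.

```python
# Illustrative damage value probability matrix stored as a dict of rows.
# Each row maps the current attribute bucket to the probability of each
# candidate next bucket, as in Table 1.
damage_matrix = {
    "[20,25]": {"[20,25]": 0.6, "[1,24]": 0.1, "[26,50]": 0.3},
    "[1,24]":  {"[20,25]": 0.2, "[1,24]": 0.8},
}

def predict_attribute(matrix, current_bucket):
    """Return the next-attribute bucket with the highest probability,
    together with that probability value (the argmax of the row)."""
    row = matrix[current_bucket]
    best = max(row, key=row.get)
    return best, row[best]

bucket, prob = predict_attribute(damage_matrix, "[20,25]")
# For the Table 1 row above this selects "[20,25]" with probability 0.6.
```

The same lookup would be repeated once per attribute (damage value, element, price, ...), yielding one attribute prediction feature vector per probability matrix in the set.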
Then, the game commodity recommending device takes the attribute prediction feature vectors as constraint conditions, and obtains recommended game commodities from a preset game commodity library by using the constraint conditions so as to recommend the recommended game commodities. Specifically, in the game commodity library, each game commodity can be classified, screened and searched according to the attribute. Therefore, the obtained attribute prediction feature vectors can be used as constraint conditions, namely query conditions, so that the game commodity recommending device can query and filter in the game commodity library to obtain recommended game commodities meeting the constraint conditions.
In a preferred embodiment, the game commodity library may not contain a game commodity that satisfies all of the constraint conditions. In this case, the game commodity recommending device may take the attribute prediction feature vectors as constraint conditions, obtain a weight for each attribute prediction feature vector, and obtain a recommended game commodity from the preset game commodity library according to each constraint condition and its corresponding weight. Specifically, if the probability value corresponding to the attribute prediction feature vector of the damage value is 0.2, the probability value corresponding to the attribute prediction feature vector of the element attribute is 0.9, and the probability value corresponding to the attribute prediction feature vector of the price information is 0.8, the weight of the damage value attribute prediction feature vector, which has the lower probability value, may be reduced or set to zero, so that the recommended game commodity satisfies the attribute prediction feature vectors corresponding to the attributes with higher probability values, that is, higher weights.
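The weighted-constraint fallback can be sketched as below. The library items, attribute names, and scoring rule (summing the weights of satisfied constraints and keeping the best-scoring item) are illustrative assumptions; only the weights 0.2 / 0.9 / 0.8 mirror the probability values in the example above.

```python
# Hypothetical commodity library; the attribute values are invented for
# illustration only.
library = [
    {"name": "sword_a", "damage": "[20,25]", "element": "fire", "price": "low"},
    {"name": "sword_b", "damage": "[26,50]", "element": "fire", "price": "low"},
]

# Each constraint pairs the predicted attribute value with its weight,
# here taken directly from the prediction probabilities in the text.
constraints = {
    "damage":  ("[20,25]", 0.2),
    "element": ("fire",    0.9),
    "price":   ("low",     0.8),
}

def recommend(library, constraints):
    """Return the item whose satisfied constraints carry the most weight,
    so an item missing only a low-weight constraint can still win."""
    def score(item):
        return sum(w for attr, (want, w) in constraints.items()
                   if item[attr] == want)
    return max(library, key=score)
```

Here "sword_a" satisfies all three constraints (score 1.9) while "sword_b" misses only the low-weight damage constraint (score 1.7), so the low-probability damage prediction does not veto an otherwise good match.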
In the method for recommending game commodities provided by this embodiment, each attribute feature vector of the current game commodity browsed by the player and the feature vector of the player are acquired to form the current state set of the player; the current state set of the player is input into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model calls the attribute prediction matrix set corresponding to the feature vector of the player and outputs each attribute prediction feature vector, the attribute prediction matrix set being determined by the reinforcement learning algorithm model according to each attribute feature vector of the historical game commodities browsed by the player; and the game commodity matched with each attribute prediction feature vector is taken as a recommended game commodity and recommended. In this way, when recommending a game commodity for the player, the reinforcement learning algorithm model comprehensively considers both the historical game commodities browsed by the player and the currently browsed game commodity, so that game commodities meeting the player's real needs can be recommended.
Fig. 3 is a schematic flow chart of a method for recommending game commodities, according to a second embodiment of the present invention, as shown in fig. 3, the method for recommending game commodities includes:
step 201, acquiring attribute feature vectors of current game commodities browsed by a player and feature vectors of the player to form a current state set of the player;
step 202, judging whether the player triggers a recommendation request for game commodities;
if yes, go to step 203, otherwise go to step 205.
Step 203, inputting the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model calls the attribute prediction matrix set corresponding to the feature vector of the player and outputs each attribute prediction feature vector;
the attribute prediction matrix set is determined by the reinforcement learning algorithm model according to each attribute feature vector of historical game commodities browsed by a player;
step 204, taking the game commodity matched with each attribute prediction feature vector as a recommended game commodity and recommending the recommended game commodity;
step 205, acquiring the behavior of a player on the current game commodity, and calling the last state set of the player; wherein, the previous state set comprises the attribute feature vectors of the previous game commodity browsed by the player;
and step 206, inputting the last state set and the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model takes the behavior of the current game commodity as a model reward, and updates an attribute prediction matrix set corresponding to the characteristic vector of the player in the reinforcement learning algorithm model.
The execution subject of the method for recommending a game commodity according to the present embodiment is the game commodity recommending apparatus 1 shown in fig. 1.
First, the game commodity recommending apparatus 1 acquires each attribute feature vector of the current game commodity browsed by the player and the feature vector of the player, and forms the current state set of the player. The specific implementation is similar to that of the foregoing embodiment and is not repeated here.
Unlike the previous embodiment, in the second embodiment, it is further determined whether the player triggers a recommendation request for game merchandise.
That is, if and only if the player sends a recommendation request for game commodities to the game commodity recommending device through a terminal does the device input the current state set of the player into the reinforcement learning algorithm model, so that the reinforcement learning algorithm model calls the attribute prediction matrix set corresponding to the feature vector of the player and outputs each attribute prediction feature vector; the game commodity matched with each attribute prediction feature vector is then taken as a recommended game commodity and recommended. The recommendation process in the second embodiment is similar to that of the foregoing embodiment and is not repeated here.
Further, unlike the foregoing embodiment, when the player does not trigger a recommendation request for a game commodity, the game commodity recommendation device acquires the behavior of the player for the current game commodity and invokes the last state set of the player; wherein, the previous state set comprises the attribute feature vectors of the previous game commodity browsed by the player. Specifically, similar to the current state set, the attribute feature vectors of the last game item browsed by the player are included in the last state set of the player.
And the game commodity recommending device inputs the last state set and the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model takes the behavior of the current game commodity as model reward, and updates an attribute prediction matrix set corresponding to the characteristic vector of the player in the reinforcement learning algorithm model.
Specifically, the Q-learning algorithm model is a reinforcement learning algorithm based on a reward mechanism. The player is regarded as the environment in which the model acts: if the player clicks on or purchases a game commodity recommended by the recommending device, the algorithm model of the recommending device receives a reward. The goal of the recommending device is to optimize the recommendation strategy of the Q-learning algorithm model so as to obtain the maximum cumulative reward.
Further, a reward function may be defined for each attribute of the game commodity, for example:
f_reward(S) = 100, if the recommended game commodity is ordered by the player;
f_reward(S) = 1, if the recommended game commodity is only viewed for hover details.
In the reward function, f_reward(S) is the reward value of attribute S; for example, the reward value is 100 when the recommended game commodity is ordered by the player, and 1 when the recommended game commodity is only viewed for hover details.
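A minimal sketch of such a per-attribute reward function, assuming string labels for the behaviors and a default reward of 0 for behaviors the text does not enumerate:

```python
def f_reward(behavior):
    """Reward value for a player behavior on a recommended game commodity.
    The values 100 (order placed) and 1 (hover details viewed) follow the
    text; the 0 default for other behaviors is an assumption."""
    rewards = {
        "place_order": 100,        # player ordered the recommended commodity
        "view_hover_details": 1,   # player only viewed the hover details
    }
    return rewards.get(behavior, 0)
```

A separate function of this shape could be defined per attribute if different attributes should be rewarded differently.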
Therefore, the obtained previous state set, current state set and reward function of the player can be used for updating the attribute prediction matrix set corresponding to the player in the reinforcement learning algorithm model.
In an optional implementation manner, inputting the previous state set and the current state set of the player into the reinforcement learning algorithm model, so that the reinforcement learning algorithm model takes the behavior on the current game commodity as a model reward and updates the attribute prediction matrix set corresponding to the feature vector of the player in the reinforcement learning algorithm model, may be carried out as follows:
First, the reward value corresponding to the behavior on the current game commodity is determined from the preset reward function described above, which is not repeated here. The behavior on the current game commodity may specifically include viewing hover details, viewing recommended item details, placing an order, and other cases.
Updating the probability matrix of each attribute in the attribute prediction matrix set in the reinforcement learning algorithm model by using an updating formula, wherein the updating formula is as follows:
Q_new(s, α) = (1 - lr)·Q(s, α) + lr·[R + γ·maxQ(α, α')];
wherein Q_new(s, α) represents the updated probability value when the attribute feature vector of the previous game commodity is s and the attribute feature vector of the current game commodity is α, Q(s, α) represents the probability value before updating for the same pair, maxQ(α, α') represents the maximum probability value in the probability matrix Q among the probability values of the candidate attribute prediction feature vectors α' of the next game commodity, lr is a preset algorithm learning rate, R is the reward value, and γ is a preset discount factor.
Taking the probability matrix shown in table 1 as an example, if the attribute feature vector in the current state set is [20,25], and the attribute feature vector in the previous state set is [1,24], then:
Q_new([1,24], [20,25]) = (1 - lr)·Q([1,24], [20,25]) + lr·[R + γ·Q([1,24], [1,24])];
namely, Q_new([1,24], [20,25]) = (1 - lr)·0.2 + lr·[R + γ·0.8];
That is, in the updated Table 1, the probability value at the position where the attribute feature vector is [1,24] and the attribute prediction feature vector is [20,25] becomes (1 - lr)·0.2 + lr·(R + γ·0.8).
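The update can be checked numerically with a short sketch. The values Q([1,24],[20,25]) = 0.2 and the bracketed 0.8 come from the worked example above; lr = 0.1, R = 1 and γ = 0.9 are illustrative choices, since the text leaves these parameters free.

```python
def q_update(q_sa, max_q_next, reward, lr, gamma):
    """Q_new(s, a) = (1 - lr) * Q(s, a) + lr * (reward + gamma * max_q_next),
    the standard Q-learning update used in the text."""
    return (1 - lr) * q_sa + lr * (reward + gamma * max_q_next)

q_new = q_update(q_sa=0.2, max_q_next=0.8, reward=1, lr=0.1, gamma=0.9)
# (1 - 0.1) * 0.2 + 0.1 * (1 + 0.9 * 0.8) = 0.18 + 0.172 = 0.352
```

With lr = 0, the probability value is left unchanged; with lr = 1, it is replaced entirely by the reward-plus-discounted-estimate term, matching the usual role of the learning rate.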
In this way, the probability values in the probability matrix can be updated rapidly, so that the recommending device can recommend more accurate game commodities to the player.
In the method for recommending game commodities provided by this embodiment, each attribute feature vector of the current game commodity browsed by the player and the feature vector of the player are acquired to form the current state set of the player; the current state set of the player is input into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model calls the attribute prediction matrix set corresponding to the feature vector of the player and outputs each attribute prediction feature vector, the attribute prediction matrix set being determined by the reinforcement learning algorithm model according to each attribute feature vector of the historical game commodities browsed by the player; and the game commodity matched with each attribute prediction feature vector is taken as a recommended game commodity and recommended. In this way, when recommending a game commodity for the player, the reinforcement learning algorithm model comprehensively considers both the historical game commodities browsed by the player and the currently browsed game commodity, so that game commodities meeting the player's real needs can be recommended.
Fig. 4 is a schematic structural diagram of a game commodity recommendation device according to a third embodiment of the present invention, and as shown in fig. 4, the game commodity recommendation device includes:
the interactive module 10 is configured to obtain attribute feature vectors of current game commodities browsed by a player and feature vectors of the player, and form a current state set of the player;
the processing module 20 is configured to input the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model calls the attribute prediction matrix set corresponding to the feature vector of the player and outputs each attribute prediction feature vector;
the interaction module 10 is further configured to take the game commodity matched with each attribute prediction feature vector as a recommended game commodity and recommend the recommended game commodity.
In an optional implementation manner, the apparatus for recommending game commodities further includes a determining module, configured to determine whether the player triggers a request for recommending game commodities before inputting the current state set of the player into the reinforcement learning algorithm model;
if so, processing module 20 performs the step of inputting the set of current states of the player into a reinforcement learning algorithm model.
In an alternative embodiment, when the player does not trigger a recommendation request for game merchandise, the processing module 20 is further configured to: acquiring the behavior of a player on current game commodities, and calling a last state set of the player; wherein, the previous state set comprises the attribute feature vectors of the previous game commodity browsed by the player; inputting the last state set and the current state set of the player into a reinforcement learning algorithm model so that the reinforcement learning algorithm model takes the behavior of the current game commodity as a model reward, and updating an attribute prediction matrix set corresponding to the characteristic vector of the player in the reinforcement learning algorithm model.
In an optional implementation manner, the processing module 20 is specifically configured to: determine the reward value corresponding to the behavior on the current game commodity in a preset reward function; and update the probability matrix of each attribute in the attribute prediction matrix set corresponding to the player by using an update formula, the update formula being Q_new(s, α) = (1 - lr)·Q(s, α) + lr·[R + γ·maxQ(α, α')]; wherein Q_new(s, α) represents the updated probability value when the attribute feature vector of the previous game commodity is s and the attribute feature vector of the current game commodity is α, Q(s, α) represents the probability value before updating for the same pair, maxQ(α, α') represents the maximum probability value in the probability matrix Q among the probability values of the candidate attribute prediction feature vectors α' of the next game commodity, lr is a preset algorithm learning rate, R is the reward value, and γ is a preset discount factor.
In an optional implementation manner, the processing module 20 is specifically configured to: calling a corresponding attribute prediction matrix set according to the characteristic vector of the player; wherein, the attribute prediction matrix set comprises a probability matrix of each attribute; and aiming at each attribute feature vector of the current game commodity, performing prediction processing by using a corresponding probability matrix to obtain each attribute prediction feature vector.
In an optional implementation manner, the processing module 20 is further configured to use the attribute prediction feature vectors as constraints, and obtain recommended game commodities in a preset game commodity library by using the constraints, so that the recommended game commodities are recommended by the interaction module 10.
In an optional implementation manner, the processing module 20 is specifically configured to use the attribute prediction feature vectors as constraint conditions, and obtain a weight of each prediction feature vector; and obtaining recommended game commodities in a preset game commodity library according to each constraint condition and the corresponding weight.
The invention provides a game commodity recommending device, which is characterized in that a current state set of a player is formed by acquiring attribute feature vectors of current game commodities browsed by the player and feature vectors of the player; inputting the current state set of the player into a reinforcement learning algorithm model so as to enable the reinforcement learning algorithm model to call an attribute prediction matrix set corresponding to the characteristic vector of the player and output each attribute prediction characteristic vector; and recommending the game commodity matched with each attribute prediction feature vector as a recommended game commodity, so that the characteristics of the player and the attributes of the current game commodity browsed by the player are fully considered when the game commodity is recommended for the player, and the game commodity meeting the current requirement can be accurately recommended for the player.
Fig. 5 is a hardware schematic diagram of a game commodity recommendation device according to a fourth embodiment of the present invention. As shown in fig. 5, the game commodity recommendation device includes: a memory 41, a processor 42, and a computer program stored on the memory 41 and executable on the processor 42; the processor 42 performs the method of the above embodiments when executing the computer program.
The present invention also provides a readable storage medium comprising a program which, when run on a terminal, causes the terminal to perform the method of any of the above embodiments.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a complex programmable logic device (CPLD), and the like.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (10)

1. A method for recommending game merchandise, comprising:
acquiring attribute feature vectors of current game commodities browsed by a player and feature vectors of the player to form a current state set of the player;
inputting the current state set of the player into a reinforcement learning algorithm model so that the reinforcement learning algorithm model calls an attribute prediction matrix set corresponding to the current state set of the player and outputs attribute prediction characteristic vectors; the attribute prediction matrix set is determined by the reinforcement learning algorithm model according to each attribute feature vector of historical game commodities browsed by a player;
taking the game commodity matched with each attribute prediction feature vector as a recommended game commodity and recommending;
inputting the current state set of the player into a reinforcement learning algorithm model, so that the reinforcement learning algorithm model calls an attribute prediction matrix set corresponding to the current state set of the player, and outputs each attribute prediction feature vector, wherein the method comprises the following steps:
calling a corresponding attribute prediction matrix set according to the characteristic vector of the player in the current state set; wherein, the attribute prediction matrix set comprises a probability matrix of each attribute;
and aiming at each attribute feature vector of the current game commodity, performing prediction processing by using a corresponding probability matrix to obtain each attribute prediction feature vector.
2. The method for recommending game commodities according to claim 1, wherein before said inputting the current state set of the player into a reinforcement learning algorithm model, the method further comprises:
judging whether the player triggers a recommendation request for game commodities;
and if so, executing the step of inputting the current state set of the player into a reinforcement learning algorithm model.
3. The method of recommending game merchandise according to claim 2, wherein when said player does not trigger a request for recommending game merchandise, said method of recommending game merchandise further comprises:
acquiring the behavior of a player on current game commodities, and calling a last state set of the player; wherein, the previous state set comprises the attribute feature vectors of the previous game commodity browsed by the player;
inputting the last state set and the current state set of the player into a reinforcement learning algorithm model so that the reinforcement learning algorithm model takes the behavior of the current game commodity as a model reward, and updating an attribute prediction matrix set corresponding to the player in the reinforcement learning algorithm model.
4. A method for recommending game items according to claim 3, wherein said inputting a previous state set and a current state set of said player into a reinforcement learning algorithm model, so that said reinforcement learning algorithm model uses said behavior on said current game items as a model reward, and updating a set of attribute prediction matrices corresponding to said player in said reinforcement learning algorithm model, comprises:
determining a corresponding reward value of the behavior of the current game commodity in a preset reward function;
updating the probability matrix of each attribute in the attribute prediction matrix set corresponding to the player by using an updating formula, wherein the updating formula is Q_new(s, α) = (1 - lr)·Q(s, α) + lr·[R + γ·maxQ(α, α')];
wherein Q_new(s, α) represents the updated probability value when the attribute feature vector of the previous game commodity is s and the attribute feature vector of the current game commodity is α, Q(s, α) represents the probability value before updating for the same pair, maxQ(α, α') represents the maximum probability value in the probability matrix Q among the probability values of the candidate attribute prediction feature vectors α' of the next game commodity, lr is a preset algorithm learning rate, R is the reward value, and γ is a preset discount factor.
5. The method of recommending a game commodity according to claim 1, wherein said recommending a game commodity that matches each of the attribute prediction feature vectors as a recommended game commodity comprises:
and taking the attribute prediction feature vectors as constraint conditions, and obtaining recommended game commodities in a preset game commodity library by utilizing the constraint conditions so as to recommend the recommended game commodities.
6. The method of claim 5, wherein the step of obtaining a recommended game commodity from a preset game commodity library by using the attribute prediction feature vectors as constraints comprises:
taking the attribute prediction characteristic vectors as constraint conditions, and acquiring the weight of each prediction characteristic vector;
and obtaining recommended game commodities in a preset game commodity library according to each constraint condition and the corresponding weight.
7. A game item recommendation device, comprising:
the interactive module is used for acquiring attribute feature vectors of current game commodities browsed by the player and feature vectors of the player to form a current state set of the player;
the processing module is used for inputting the current state set of the player into a reinforcement learning algorithm model so that the reinforcement learning algorithm model calls an attribute prediction matrix set corresponding to the current state set of the player and outputs attribute prediction characteristic vectors; the attribute prediction matrix set is determined by the reinforcement learning algorithm model according to each attribute feature vector of historical game commodities browsed by a player;
the interaction module is also used for taking the game commodity matched with each attribute prediction feature vector as a recommended game commodity and recommending the recommended game commodity;
the processing module is specifically used for calling a corresponding attribute prediction matrix set according to the characteristic vector of the player in the current state set; wherein, the attribute prediction matrix set comprises a probability matrix of each attribute;
and aiming at each attribute feature vector of the current game commodity, performing prediction processing by using a corresponding probability matrix to obtain each attribute prediction feature vector.
8. The game commodity recommendation device according to claim 7, wherein before the current state set of the player is input into the reinforcement learning algorithm model, the processing module is further configured to judge whether the player triggers a recommendation request for game commodities, and to perform the step of inputting the current state set of the player into the reinforcement learning algorithm model when the player triggers the recommendation request;
the processing module acquires the behavior of a player on the current game commodity when the player does not trigger a recommendation request for the game commodity, and calls a last state set of the player; wherein, the previous state set comprises the attribute feature vectors of the previous game commodity browsed by the player; inputting the last state set and the current state set of the player into a reinforcement learning algorithm model so that the reinforcement learning algorithm model takes the behavior of the current game commodity as a model reward, and updating an attribute prediction matrix set corresponding to the player in the reinforcement learning algorithm model.
9. A game item recommendation device, comprising: a memory, a processor, and a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-6.
10. A readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN201910406926.3A 2019-05-15 2019-05-15 Game commodity recommendation method and device and readable storage medium Active CN110135951B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910406926.3A CN110135951B (en) 2019-05-15 2019-05-15 Game commodity recommendation method and device and readable storage medium

Publications (2)

Publication Number Publication Date
CN110135951A CN110135951A (en) 2019-08-16
CN110135951B true CN110135951B (en) 2021-07-27

Family

ID=67574516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910406926.3A Active CN110135951B (en) 2019-05-15 2019-05-15 Game commodity recommendation method and device and readable storage medium

Country Status (1)

Country Link
CN (1) CN110135951B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110570287B (en) * 2019-09-27 2022-02-08 网易(杭州)网络有限公司 Virtual commodity recommendation method, device, system and server
CN110782288A (en) * 2019-10-25 2020-02-11 广州凌鑫达实业有限公司 Cloud computing aggregate advertisement data processing method, device, equipment and medium
CN112712161B (en) * 2019-10-25 2023-02-24 上海哔哩哔哩科技有限公司 Data generation method and system
CN113440859B (en) * 2021-07-04 2023-05-16 王禹豪 Game item value generation and detection method, device and storage medium
CN113509727B (en) * 2021-07-09 2024-06-04 网易(杭州)网络有限公司 Method and device for displaying props in game, electronic equipment and medium
CN113763038A (en) * 2021-08-23 2021-12-07 广州快批信息科技有限公司 Method, device and system for promotion management of service
CN114612126A (en) * 2021-12-08 2022-06-10 江苏众亿国链大数据科技有限公司 Information pushing method based on big data

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102201026A (en) * 2010-03-23 2011-09-28 上海美你德软件有限公司 Method and system for recommending information to players in virtual environment
CN102902691B (en) * 2011-07-28 2016-09-07 上海拉手信息技术有限公司 Recommend method and system
CN105469263A (en) * 2014-09-24 2016-04-06 阿里巴巴集团控股有限公司 Commodity recommendation method and device
CN108230057A (en) * 2016-12-09 2018-06-29 阿里巴巴集团控股有限公司 A kind of intelligent recommendation method and system
CN107145506B (en) * 2017-03-22 2020-11-06 无锡中科富农物联科技有限公司 Improved content-based agricultural commodity recommendation method
CN107515909B (en) * 2017-08-11 2020-05-19 深圳市云网拜特科技有限公司 Video recommendation method and system
CN108304440B (en) * 2017-11-01 2021-08-31 腾讯科技(深圳)有限公司 Game pushing method and device, computer equipment and storage medium
CN109471963A (en) * 2018-09-13 2019-03-15 广州丰石科技有限公司 A kind of proposed algorithm based on deeply study
CN109543840B (en) * 2018-11-09 2023-01-10 北京理工大学 Dynamic recommendation system design method based on multidimensional classification reinforcement learning

Also Published As

Publication number Publication date
CN110135951A (en) 2019-08-16

Similar Documents

Publication Publication Date Title
CN110135951B (en) Game commodity recommendation method and device and readable storage medium
US9082086B2 (en) Adaptively learning a similarity model
CN110008973B (en) Model training method, method and device for determining target user based on model
CN110008397B (en) Recommendation model training method and device
US20120296776A1 (en) Adaptive interactive search
US10672055B2 (en) Method and system for presenting personalized products based on digital signage for electronic commerce
CN111260449B (en) Model training method, commodity recommendation device and storage medium
JP7130991B2 (en) ADVERTISING DISPLAY SYSTEM, DISPLAY DEVICE, ADVERTISING OUTPUT DEVICE, PROGRAM AND ADVERTISING DISPLAY METHOD
CN110598120A (en) Behavior data based financing recommendation method, device and equipment
US20150235264A1 (en) Automatic entity detection and presentation of related content
CN111651669A (en) Information recommendation method and device, electronic equipment and computer-readable storage medium
CN113077317A (en) Item recommendation method, device and equipment based on user data and storage medium
CN109961351A (en) Information recommendation method, device, storage medium and computer equipment
CN111861605A (en) Business object recommendation method
CN115907868A (en) Advertisement delivery analysis method and device
CN114820123A (en) Group purchase commodity recommendation method, device, equipment and storage medium
CN111651679A (en) Recommendation method and device based on reinforcement learning
CN111915414A (en) Method and device for displaying target object sequence to target user
CN109299378B (en) Search result display method and device, terminal and storage medium
US10740815B2 (en) Searching device, searching method, recording medium, and program
CN110851708A (en) Negative sample extraction method and device, computer equipment and storage medium
CN112015970A (en) Product recommendation method, related equipment and computer storage medium
US20220207584A1 (en) Learning device, computer-readable information storage medium, and learning method
US20130146654A1 (en) Fireworks information systems and methods
KR102354982B1 (en) Method and apparatus for providing clothing platform service based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant