CN110825956A - Information flow recommendation method and device, computer equipment and storage medium - Google Patents

Information flow recommendation method and device, computer equipment and storage medium

Info

Publication number
CN110825956A
Authority
CN
China
Prior art keywords
user
candidate content
recommendation
candidate
information flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910873490.9A
Other languages
Chinese (zh)
Inventor
陈辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN201910873490.9A priority Critical patent/CN110825956A/en
Publication of CN110825956A publication Critical patent/CN110825956A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application belong to the technical field of intelligent big data recommendation and relate to an information flow recommendation method comprising the following steps: acquiring user behavior data according to a recommendation request of a user, wherein the user behavior data records the user's behavior characteristics on multiple types of candidate content; screening second candidate content from first candidate content in a candidate content pool according to the user behavior data, wherein the first candidate content comprises multiple types of candidate content; splicing the user's behavior characteristics on the multiple types of candidate content with the low-dimensional vector of the second candidate content to form input features; inputting the input features into a preset recommendation scoring model that supports the input features, and acquiring a scoring result for the second candidate content output by the recommendation scoring model; and selecting recommended content from the second candidate content according to the scoring result. With the method, a separate recommendation model does not need to be trained for each content type, which simplifies online deployment of the recommendation model.

Description

Information flow recommendation method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of intelligent big data recommendation technologies, and in particular, to an information flow recommendation method and apparatus, a computer device, and a storage medium.
Background
With the development of internet technology, networks have become an important source of information, and in recent years the amount of online information has grown geometrically. To deliver information quickly and in a targeted way to the users who are interested in it, website systems, internet community systems and the like recommend information flows (generally comprising information, questions and answers, topics, encyclopedia entries, posts and the like) to users based on estimates of user behavior, so that users can conveniently find the information flows they are interested in.
Many information flow recommendation algorithms exist, of which content recommendation is the mainstream. However, existing content recommendation algorithms model the user within a single type of content. When there are multiple types of candidate content, a model must be built for each type. Because each model only sees one type of content, the user's behavior data cannot be fully used, which may reduce recommendation accuracy; and because deploying multiple models online is complicated, the recommendation system as a whole becomes complicated.
Disclosure of Invention
An embodiment of the application aims to provide an information flow recommendation method, an information flow recommendation device, computer equipment and a storage medium, so as to solve the problem in the prior art that a model must be built for each content type, which makes the recommendation system complicated.
In order to solve the above technical problem, an embodiment of the present application provides an information flow recommendation method, which adopts the following technical solutions:
acquiring user behavior data according to a recommendation request of a user, wherein the user behavior data records behavior characteristics of the user on various types of candidate contents;
screening out second candidate content from first candidate content in a candidate content pool according to the user behavior data, wherein the first candidate content comprises multiple types of candidate content;
splicing the user's behavior characteristics on the multiple types of candidate content with the low-dimensional vector of the second candidate content to form input features;
inputting the input features into a pre-trained recommended scoring model, and acquiring a scoring result of the second candidate content output by the recommended scoring model;
and selecting recommended content from the second candidate content according to the score.
Further, the step of obtaining the user behavior data according to the recommendation request of the user includes:
acquiring a user ID carried in the recommendation request;
acquiring behavior data of a user corresponding to the user ID from a user behavior database;
analyzing the behavior data to obtain user behavior characteristics, wherein the user behavior characteristics comprise click sequences of different types of contents by the user.
Further, the step of "screening out second candidate content from the first candidate content in the candidate content pool according to the user behavior data" includes:
acquiring a user portrait label according to the user behavior data;
recalling second candidate content from the first candidate content according to the user portrait label.
Further, the step of splicing the user's behavior characteristics on the multiple types of candidate content with the low-dimensional vector of the second candidate content to form the input features includes:
splicing the user's click sequences on the different types of content with the low-dimensional vector of the second candidate content to form a two-dimensional matrix, wherein the data of the two-dimensional matrix is the input feature.
Further, the step of inputting the input features into a pre-trained recommendation score model and obtaining the score result of the second candidate content output by the recommendation score model includes:
inputting the data of the two-dimensional matrix as input features into a gated recurrent unit (GRU) deep neural network model, wherein the GRU deep neural network model supports the input features;
and receiving the predicted click rate of each second candidate content output by the GRU deep neural network model.
Further, the recommendation scoring model is trained by:
acquiring a Feeds exposure click log, and extracting click sequence samples of information streams of various types of contents from the click log and corresponding information stream samples of various types;
mapping each type of information flow sample to a low-dimensional vector space through a neural network language model to obtain a low-dimensional vector sample of each information flow;
splicing the click sequence sample of the information flow with the low-dimensional vector sample of the information flow to obtain a two-dimensional matrix of the information flow, and taking the data of the two-dimensional matrix of the information flow as training data;
training a GRU deep neural network model by using the training data to obtain an initial model;
evaluating the initial model through AUC to obtain an evaluation score;
and if the evaluation score does not reach a preset value, iteratively updating the initial model until the evaluation score reaches the preset value to obtain a recommended scoring model.
Further, after the step of selecting the recommended content from the second candidate content according to the scoring result, the method further includes: monitoring, in real time, the number of times the user behavior data is updated;
and when the number of updates reaches a preset number, re-acquiring the historical behavior data of the user and retraining the recommendation scoring model.
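The retraining trigger described above can be sketched as a simple counter. This is only an illustration: the class name `RetrainMonitor`, the parameter `preset_times`, and the choice to reset the counter after each trigger are assumptions, not details given in the application.

```python
class RetrainMonitor:
    """Counts behavior-data updates and signals when retraining is due."""

    def __init__(self, preset_times):
        if preset_times <= 0:
            raise ValueError("preset_times must be positive")
        self.preset_times = preset_times  # hypothetical preset number of updates
        self.update_count = 0

    def record_update(self):
        """Register one update; return True when the preset count is reached."""
        self.update_count += 1
        if self.update_count >= self.preset_times:
            # Assumption: counting starts over once retraining is triggered.
            self.update_count = 0
            return True
        return False


monitor = RetrainMonitor(preset_times=3)
signals = [monitor.record_update() for _ in range(7)]
```

Each `True` in `signals` marks a point at which the historical behavior data would be re-acquired and the scoring model retrained.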
In order to solve the above technical problem, an embodiment of the present application further provides an information flow recommendation apparatus, where the information flow recommendation apparatus includes:
the data acquisition module is used for acquiring user behavior data according to a recommendation request of a user, wherein the user behavior data records behavior information of the user on various types of candidate contents;
the candidate content screening module is used for screening out second candidate content from first candidate content in a candidate content pool according to the user behavior data, wherein the first candidate content comprises multiple types of candidate content;
the splicing module is used for splicing the behavior characteristics of the multiple types of candidate contents of the user with the low-dimensional vector of the second candidate content to form input characteristics;
the scoring module is used for inputting the input features into a pre-trained recommended scoring model and acquiring a scoring result of the second candidate content output by the recommended scoring model;
and the recommending module is used for selecting recommended content from the second candidate content according to the score.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the information flow recommendation method as described above when executing the computer program.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:
the computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the information flow recommendation method as set forth above.
Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:
the user's behavior characteristics on multiple types of candidate content are combined with the low-dimensional vectors of the candidate content and then input into the recommendation model, so that the user's behavior data on all types of content can be fully used and the user's interests can be described more accurately, which improves recommendation accuracy. In addition, because the recommendation model is trained on the low-dimensional vectors of multiple types of candidate content together with the user's behavior characteristics, all types of candidate content can be analyzed jointly according to the user's behavior characteristics. A separate recommendation model therefore does not need to be trained for each type, which simplifies online deployment of the recommendation model.
Drawings
In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of an information flow recommendation method according to the present application;
FIG. 3 is a diagram of mapping candidate content to a low-dimensional vector space through a neural network language model in the present application;
FIG. 4 is a schematic block diagram of one embodiment of an information flow recommendation device according to the present application;
FIG. 5 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, both explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that the information flow recommendation method provided in the embodiment of the present application is generally executed by a server, and accordingly, the information flow recommendation apparatus is generally disposed in the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow diagram of one embodiment of an information flow recommendation method in accordance with the present application is shown. The information flow recommendation method comprises the following steps:
Step 201, obtaining user behavior data according to a recommendation request of a user, wherein the user behavior data records behavior characteristics of the user on multiple types of candidate contents.
In this embodiment, the electronic device (for example, the server shown in fig. 1) on which the information flow recommendation method operates may be connected to the terminal device by a wired connection or a wireless connection. It should be noted that the wireless connection may include, but is not limited to, a 3G/4G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (ultra-wideband) connection, and other wireless connection means now known or developed in the future.
In this embodiment, when a user requests to open or refresh a page at a terminal, the terminal may initiate a recommendation request and then send the recommendation request to a recommendation system. The recommendation request carries a user ID, the behavior data of the user can be obtained from a user behavior database according to the user ID, and the behavior data is analyzed to obtain the behavior characteristics of the user on various types of candidate contents.
In practical applications, terminals are configured with different input components (such as a mouse, a keyboard, or a touch screen), so the request initiation entries differ from terminal to terminal, as do the operations triggered in them. Such operations include, but are not limited to, clicking, moving, dragging, and sliding.
For example, when the terminal is a smart phone, the request initiation entry may be a session page presented on the touch screen of the smart phone. The session page shows multiple pieces of recommended candidate information, and the user may pull down the session page to make the terminal initiate an information recommendation request, so that the candidate information returned by the server is updated and shown in the session page. The pull-down is the operation triggered in the request initiation entry.
In this embodiment, step 201 includes:
acquiring a user ID carried in the recommendation request;
acquiring behavior data of a user corresponding to the user ID from a user behavior database;
analyzing the behavior data to obtain user behavior characteristics, wherein the user behavior characteristics comprise click sequences of different types of contents by the user.
In practical application, the user behavior database records the user's behavior information on different types of information flows. The behavior information may include a user ID, a behavior category, the time at which the behavior occurred, an identifier of the behavior object, and the like. The behavior categories may include click rate, click volume, visit rate, visited module, favoriting and following, and posting inquiries or comments on the behavior object.
In this embodiment, the click sequences of different types of content by the user may be extracted from the behavior data as the behavior characteristics of the user.
Specifically, the behavior information may be extracted from the user behavior data, and the user's click rate or click volume on information flows of each type of content may then be obtained from the behavior information, so as to obtain the user's click sequences on information flows of different types of content (such as information, questions and answers, topics, encyclopedia entries, and posts) as the behavior features of the user.
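As an illustration of step 201, the following sketch groups one user's click records by content type and orders them by time. The record layout (`user_id`, `behavior`, `time`, `content_type`, `object_id`) and the cap `max_len` are hypothetical; the application only states that such fields exist, not how they are stored.

```python
from collections import defaultdict


def build_click_sequences(behavior_records, user_id, max_len=10):
    """Group a user's clicks by content type, oldest first, keeping the latest max_len."""
    clicks = [
        r for r in behavior_records
        if r["user_id"] == user_id and r["behavior"] == "click"
    ]
    clicks.sort(key=lambda r: r["time"])
    sequences = defaultdict(list)
    for r in clicks:
        sequences[r["content_type"]].append(r["object_id"])
    return {ctype: ids[-max_len:] for ctype, ids in sequences.items()}


# Hypothetical rows from a user behavior database.
records = [
    {"user_id": "u1", "behavior": "click", "time": 3, "content_type": "information", "object_id": "a3"},
    {"user_id": "u1", "behavior": "click", "time": 1, "content_type": "information", "object_id": "a1"},
    {"user_id": "u1", "behavior": "favorite", "time": 2, "content_type": "topic", "object_id": "t9"},
    {"user_id": "u1", "behavior": "click", "time": 2, "content_type": "question_answer", "object_id": "q7"},
    {"user_id": "u2", "behavior": "click", "time": 1, "content_type": "topic", "object_id": "t1"},
]
sequences = build_click_sequences(records, "u1")
```

Non-click behaviors and other users' records are ignored, so `sequences` holds one ordered click list per content type for user `u1`.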
Step 202, according to the user behavior data, a second candidate content is screened from first candidate contents in a candidate content pool, wherein the first candidate contents comprise multiple types of candidate contents.
This step screens out, from a large volume of data, the small portion of content that matches the user. In practical application, the candidate content pool in this step may be stored in an online cache and serve as the candidate content pool for all users of the online recommendation system. The candidate content in this pool (i.e., the first candidate content) may itself be screened from the total candidate content pool of the recommendation system; this preliminary screening of the big data may use any known data screening algorithm and is not described in detail in this embodiment. The total candidate content pool includes all types of candidate content.
Taking the Ping An Jin Guan Jia ("Gold Manager") application (APP) as an example, in the information flow recommendation system of the Jin Guan Jia APP, a content library composed of all structured articles (including information, questions and answers, topics, encyclopedia entries, posts, and the like) serves as a large total candidate content pool (on the order of hundreds of thousands of items). The recommendation system screens part of the candidate content from the large total candidate content pool through a preset screening algorithm to build a smaller candidate content pool (on the order of tens of thousands of items), which is stored in an online cache and serves as the candidate content pool for all users of the online recommendation system. The candidate content pool may include multiple types of candidate content, such as information, questions and answers, topics, encyclopedia entries, and posts.
In this embodiment, step 202 includes:
acquiring a user portrait label according to the user behavior data;
recalling second candidate content from the first candidate content according to the user portrait label.
Specifically, the user portrait label may include data of multiple dimensions such as user demographic attributes (gender, academic calendar, etc.), historical behaviors, short-term behaviors, interest content, personal preference, etc., which is a basis for making personalized recommendations for the user.
In practical application, there are many kinds of recall strategies, and a suitable one can be selected according to the actual situation. A commonly used approach is based on an inverted index:
first, an inverted index is maintained offline. The key of the inverted index can be an article attribute such as category, topic, keyword, or information source, or a result of collaborative filtering; click rate, timeliness, relevance and the like need to be taken into account when ordering the entries;
then, when a recommendation request is received, content is quickly recalled by truncating the inverted lists according to the user's portrait labels, so that the small portion of content matching the user is efficiently screened from a large content library.
The recall in this step is a second round of screening based on the portrait of a particular user, and the content that passes it is the second candidate content.
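The inverted-index recall can be sketched as follows. This is a minimal illustration under stated assumptions: the `score` field stands in for whatever offline ordering combines click rate, timeliness and relevance, and `per_label_limit` is a hypothetical truncation length; neither is specified in the application.

```python
from collections import defaultdict


def build_inverted_index(articles):
    """Map each key (category, topic, keyword, ...) to article IDs, best score first."""
    postings = defaultdict(list)
    for art in articles:
        for key in art["keys"]:
            postings[key].append((art["score"], art["id"]))
    return {
        key: [aid for _, aid in sorted(entries, key=lambda e: (-e[0], e[1]))]
        for key, entries in postings.items()
    }


def recall(index, portrait_labels, per_label_limit=2):
    """Truncate each matching inverted list and merge, keeping first-seen order."""
    recalled, seen = [], set()
    for label in portrait_labels:
        for aid in index.get(label, [])[:per_label_limit]:
            if aid not in seen:
                seen.add(aid)
                recalled.append(aid)
    return recalled


# Hypothetical first candidate content with offline scores.
articles = [
    {"id": "a1", "keys": ["insurance", "health"], "score": 0.9},
    {"id": "a2", "keys": ["insurance"], "score": 0.7},
    {"id": "a3", "keys": ["insurance", "finance"], "score": 0.8},
    {"id": "a4", "keys": ["finance"], "score": 0.6},
]
index = build_inverted_index(articles)
second_candidates = recall(index, ["insurance", "finance"])
```

Building the index is the offline part; `recall` is the cheap online part that runs per request, which is why truncation happens there.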
Step 203, the behavior characteristics of the multiple types of candidate contents of the user are spliced with the low-dimensional vector of the second candidate content to form input characteristics.
In this embodiment, the click sequences of the multiple types of candidate contents and the low-dimensional vector of the second candidate content may be spliced to form a two-dimensional matrix, and data of the two-dimensional matrix is the input feature.
In this embodiment, both the total candidate pool of the recommendation system and the candidate pool in the online cache include low-dimensional vectors of the candidate content. One way to obtain the low-dimensional vectors is to map the words of the candidate content into a low-dimensional vector space through a neural network language model, so that each content item is represented by an embedding vector.
The splicing in this step is vector splicing, and its result is a set of two-dimensional data; Table 1 describes the two-dimensional matrix.
In one possible scenario, the candidate content types in the candidate content pool include information, question and answer, topic, encyclopedia, and post. The user's content click sequences are used as features, and each article of the second candidate content (information, question and answer, topic, encyclopedia entry, or post, all of which are published in the form of articles) has already been mapped to an embedding vector (as shown in FIG. 3). Thus, the input features that are formed include: an information click sequence, a question-and-answer click sequence, a topic click sequence, an encyclopedia click sequence, and a post click sequence; when the candidate information flow is an article, the embedding vector is that of the candidate article's content. The main features are shown in Table 1.
TABLE 1
[Table 1 is provided as an image in the original publication: Figure BDA0002203578380000081]
Take the information click sequence and the article embeddings as an example. If the user's information click sequence obtained in step 201 is a 10 × 1 data sequence, and the embedding vector of each article (i.e., the second candidate content recalled from the candidate pool in step 202) has 256 dimensions (see Table 1), then the information click sequence and the article embedding vectors form a 10 × 256 two-dimensional matrix.
As can be seen from the above table, the user's behavior on each content type can be represented as a feature of a recommendation scoring model that can be trained to incorporate the user's features on all content types.
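The application does not spell out how a 10 × 1 click sequence and 256-dimensional embeddings combine into a 10 × 256 matrix. The sketch below assumes one possible reading, in which each clicked article ID is replaced by that article's embedding and short sequences are padded with zero rows. The function names, the zero padding, and the pseudo-random stand-in for the language-model embedding are all assumptions made for illustration.

```python
import random

EMBED_DIM = 256  # embedding dimension given in the text
SEQ_LEN = 10     # click-sequence length used in the example


def make_embedding(article_id, dim=EMBED_DIM):
    """Stand-in for a language-model embedding: a deterministic pseudo-random vector."""
    rng = random.Random(article_id)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def splice(click_sequence, embeddings, seq_len=SEQ_LEN, dim=EMBED_DIM):
    """Stack the embeddings of the clicked articles into a seq_len x dim matrix.

    Assumption: sequences shorter than seq_len are left-padded with zero rows.
    """
    rows = [embeddings[aid] for aid in click_sequence[-seq_len:]]
    padding = [[0.0] * dim for _ in range(seq_len - len(rows))]
    return padding + rows


clicked = ["info_%d" % i for i in range(7)]
embeddings = {aid: make_embedding(aid) for aid in clicked}
matrix = splice(clicked, embeddings)
```

With seven clicks, `matrix` has three zero rows followed by seven embedding rows, so every user yields a matrix of the same 10 × 256 shape.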
Step 204, inputting the input features into a pre-trained recommendation scoring model, and acquiring a scoring result of the second candidate content output by the recommendation scoring model.
In this embodiment, step 204 includes:
inputting data of the two-dimensional matrix into a GRU (Gated Recurrent Unit) deep neural network model as input features;
and receiving the predicted click rate of each second candidate content output by the GRU deep neural network model.
In practical application, the GRU deep neural network model needs to be trained in advance. The training samples, namely click sequence samples for each type of content and embedding vector samples of the corresponding content, are obtained from historical user behavior data such as Feeds (information flow) exposure click logs.
In this embodiment, the input is the data of a two-dimensional matrix whose features are the click sequences for each type of content and the low-dimensional vector of the second candidate content currently being scored, and the output is the predicted click rate of each second candidate content, which is the scoring result of that second candidate content.
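To make the scoring step concrete, the following is a minimal single-layer GRU forward pass in plain Python that reads a two-dimensional matrix row by row and emits a value between 0 and 1. It is a sketch, not the model of the application: the class name `TinyGRUScorer`, the layer sizes, the sigmoid output head, and the random untrained weights are all assumptions, and the state update uses one of the two common GRU conventions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyGRUScorer:
    """Single-layer GRU with a sigmoid output; weights are random, not trained."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One (W, U, b) triple per gate: update z, reset r, candidate state n.
        self.params = {
            g: (init_matrix(hidden_dim, input_dim, rng),
                init_matrix(hidden_dim, hidden_dim, rng),
                [0.0] * hidden_dim)
            for g in ("z", "r", "n")
        }
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        self.b_out = 0.0

    def _gate(self, name, x, h, act):
        w, u, b = self.params[name]
        return [act(a + c + d) for a, c, d in zip(matvec(w, x), matvec(u, h), b)]

    def step(self, x, h):
        z = self._gate("z", x, h, sigmoid)
        r = self._gate("r", x, h, sigmoid)
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        n = self._gate("n", x, gated_h, math.tanh)
        # Interpolate between the previous state and the candidate state.
        return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

    def predict(self, matrix):
        """Run the GRU over the matrix rows and return a score in (0, 1)."""
        h = [0.0] * self.hidden_dim
        for row in matrix:
            h = self.step(row, h)
        logit = sum(w * hi for w, hi in zip(self.w_out, h)) + self.b_out
        return sigmoid(logit)


model = TinyGRUScorer(input_dim=4, hidden_dim=3)
matrix = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.2, 0.5]]
score = model.predict(matrix)
```

In a real system the input dimension would match the embedding size (256 in the text) and the weights would come from training; a deep learning framework would normally replace this hand-written cell.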
Step 205, selecting recommended content from the second candidate content according to the scoring result.
Specifically, step 205 may include:
sorting the predicted click rates;
and selecting the second candidate content with the ranking meeting the preset condition as the recommended content.
In practical application, the preset condition can be set according to actual needs. For example, the predicted click rate of each second candidate content may be ranked from high to low, and the second candidate content ranked at the top two may be selected to generate a recommendation list and recommend the recommendation list to the user.
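The example of ranking by predicted click rate and taking the top two can be sketched as below. The function name, the `top_n` parameter, and the tie-breaking by content ID are assumptions added for illustration.

```python
def select_recommendations(predicted_ctr, top_n=2):
    """Rank candidates by predicted click rate (high to low) and keep the top_n."""
    # Ties are broken by content ID so the result is deterministic (an assumption).
    ranked = sorted(predicted_ctr.items(), key=lambda kv: (-kv[1], kv[0]))
    return [cid for cid, _ in ranked[:top_n]]


# Hypothetical predicted click rates for four second candidate contents.
scores = {"a1": 0.12, "a3": 0.31, "a4": 0.27, "a7": 0.05}
recommended = select_recommendations(scores)
```

Other preset conditions, such as a minimum predicted click rate, could replace the fixed `top_n` cut-off.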
In this embodiment, before step 201, the method may further include the following steps:
mapping all first candidate contents in the candidate pool to a low-dimensional vector space through a neural network language model.
In practical applications, content of every type is composed of words, and the words can be mapped into a low-dimensional vector space through a neural network language model, so that each content item is represented by an embedding (as shown in fig. 3). The neural network language model can be trained on a large-scale general news corpus (on the order of tens of millions). The embedding dimension (i.e., the vector dimension) is currently set to 256 and is chosen manually according to application needs.
In this embodiment, before step 201, the method may further include the following steps:
and training the recommendation scoring model.
Specifically, the step of training the recommendation score model may include:
acquiring a Feeds exposure click log, and extracting click sequence samples of information streams of various types of contents from the click log and corresponding information stream samples of various types;
mapping each type of information flow sample to a low-dimensional vector space through a neural network language model to obtain a low-dimensional vector sample of each information flow;
splicing the click sequence sample of the information flow with the low-dimensional vector sample of the information flow to obtain a two-dimensional matrix of the information flow, and taking the data of the two-dimensional matrix of the information flow as training data;
training a GRU deep neural network model by using the training data to obtain an initial model;
evaluating the initial model through AUC to obtain an evaluation score;
and if the evaluation score does not reach a preset value, iteratively updating the initial model until the evaluation score reaches the preset value to obtain the recommendation scoring model.
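The evaluate-and-iterate part of these training steps can be sketched as follows. This is a minimal illustration under stated assumptions: `auc_score` uses the pairwise definition of AUC, while `train_until_target`, its `max_rounds` safeguard, and the callback-style interface are hypothetical and stand in for whatever GRU training routine is actually used.

```python
def auc_score(labels, scores):
    """AUC: probability that a random positive outranks a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(positives) * len(negatives))


def train_until_target(train_one_round, predict, eval_x, eval_y, target_auc, max_rounds=50):
    """Repeat training rounds until the evaluation AUC reaches target_auc or rounds run out."""
    auc = auc_score(eval_y, [predict(x) for x in eval_x])
    rounds = 0
    while auc < target_auc and rounds < max_rounds:
        train_one_round()  # one iterative update of the model
        rounds += 1
        auc = auc_score(eval_y, [predict(x) for x in eval_x])
    return auc, rounds
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch; a sort-based computation would be preferable on a large evaluation set.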
Specifically, the step of obtaining the historical behavior data of the user and extracting the user behavior data and the related information flows from the historical behavior data may be implemented as follows:
acquiring a Feeds exposure click log, and extracting from the click log the click sequences for information flows of each content type, together with the corresponding information flows of each type.
In practical applications, a log entry is generated for each action that the user performs on an information flow in the web interface, and the log is stored on a designated server. There are many kinds of user behavior logs, and different logs record different information. For example, a Feeds exposure click log records exposure and click information of Feeds, such as the information flow ID, exposure time, click time, operating system, browser, and user ID. By accessing the server, the user's click sequence for information flows of each content type and the clicked information flows can be obtained.
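A minimal sketch of extracting click sequences from such a log is given below. The record layout is an assumption: the field names `user_id`, `stream_id`, `exposure_time`, and `click_time` mirror the fields listed above, while `content_type` and the use of `None` for "exposed but not clicked" are hypothetical additions for this example.

```python
from collections import defaultdict


def extract_click_sequences(log_records):
    """Group clicked information flows by (user, content type), ordered by click time."""
    clicks = defaultdict(list)
    for rec in log_records:
        if rec.get("click_time") is None:  # exposure without a click
            continue
        key = (rec["user_id"], rec["content_type"])
        clicks[key].append((rec["click_time"], rec["stream_id"]))
    # Sort each group chronologically and keep only the information flow IDs.
    return {key: [sid for _, sid in sorted(events)] for key, events in clicks.items()}


sample_log = [
    {"user_id": "u1", "stream_id": "a9", "content_type": "article", "exposure_time": 100, "click_time": 130},
    {"user_id": "u1", "stream_id": "v2", "content_type": "video", "exposure_time": 105, "click_time": None},
    {"user_id": "u1", "stream_id": "a3", "content_type": "article", "exposure_time": 90, "click_time": 95},
    {"user_id": "u2", "stream_id": "v2", "content_type": "video", "exposure_time": 110, "click_time": 118},
]
click_sequences = extract_click_sequences(sample_log)
```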
Specifically, the step of "constructing the training data of the recommendation model to be trained according to the user behavior data and the information flow" may include:
mapping each information flow to a low-dimensional vector space through a neural network language model to obtain a low-dimensional vector of each information flow;
and splicing the click sequence and the low-dimensional vector of the information flow to obtain a two-dimensional matrix, and taking the data of the two-dimensional matrix as training data.
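One possible reading of the splicing step is sketched below: each clicked item in the sequence is replaced by its low-dimensional vector, the sequence is padded to a fixed length, and the vector of the information flow being scored is appended as the last row. This layout, the zero padding, and the names `splice_features` and `max_len` are assumptions for illustration; the application only states that the result is a two-dimensional matrix.

```python
def splice_features(click_sequence, candidate_id, vectors, max_len=5):
    """Build a (max_len + 1) x dim matrix: padded click-history rows, then the candidate row."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    dim = len(vectors[candidate_id])
    # Keep only the most recent max_len clicks, oldest first.
    history = [vectors[item] for item in click_sequence[-max_len:]]
    # Left-pad short histories with zero rows so every matrix has the same shape.
    padding = [[0.0] * dim for _ in range(max_len - len(history))]
    return padding + history + [vectors[candidate_id]]


# Hypothetical 2-dimensional vectors keep the example readable.
toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
matrix = splice_features(["a", "b"], "c", toy_vectors, max_len=3)
```

A fixed number of rows makes the matrices easy to batch for a recurrent model.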
In this embodiment, the recommendation model can be evaluated by AUC (area under the ROC curve).
The step of optimizing the recommendation model according to the evaluation result and repeating the training and evaluation steps, until a usable recommendation model whose evaluation result reaches the standard is obtained, comprises:
and when the evaluation score does not reach the preset value, iteratively updating the recommendation model until the evaluation score reaches the preset value.
In this embodiment, after the step of selecting the recommended content from the second candidate content according to the scoring result, the method further includes: monitoring the number of updates of the user behavior data in real time;
and when the number of updates reaches a preset number, re-acquiring the historical behavior data of the user and re-training the recommendation scoring model.
In practical applications, the step of training the recommendation scoring model may be repeated according to how the user behavior data has been updated (for example, when the number of updates reaches a preset number), so that the recommendation model is retrained periodically and its recommendation accuracy is improved.
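The update-count trigger can be sketched with a small counter object. The class name `RetrainMonitor`, the callback interface, and resetting the counter after each retraining are assumptions for this example and are not taken from the application.

```python
class RetrainMonitor:
    """Count behavior-data updates and trigger retraining at a preset number."""

    def __init__(self, threshold, retrain):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold  # preset number of updates
        self.retrain = retrain      # callback that re-acquires data and retrains
        self.count = 0

    def record_update(self):
        """Register one update; return True if retraining was triggered."""
        self.count += 1
        if self.count >= self.threshold:
            self.retrain()
            self.count = 0  # assumed: start counting again after retraining
            return True
        return False


retrain_calls = []
monitor = RetrainMonitor(threshold=3, retrain=lambda: retrain_calls.append("retrained"))
triggered = [monitor.record_update() for _ in range(7)]
```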
In this embodiment, the user's click sequences on various types of content are used as the user's behavior features and spliced with the low-dimensional vectors of the candidate content. In practical applications, other behavior features can also be adopted, such as sequences of the number of favorites or the number of visits for different types of content. When a different behavior feature is spliced with the low-dimensional vectors of the candidate content, the recommendation model only needs to be trained with training samples of the corresponding type.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a Read-Only Memory (ROM), or may be a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
With further reference to fig. 4, as an implementation of the method shown in fig. 2, the present application provides an embodiment of an information flow recommendation apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 4, the information flow recommendation apparatus 400 according to this embodiment includes:
a data obtaining module 401, configured to obtain user behavior data according to a recommendation request of a user, where the user behavior data records behavior information of the user on multiple types of candidate content;
a candidate content screening module 402, configured to screen a second candidate content from first candidate contents in a candidate content pool according to the user behavior data, where the first candidate content includes multiple types of candidate contents;
a splicing module 403, configured to splice the user's behavior features on multiple types of candidate content with the low-dimensional vector of the second candidate content to form an input feature;
a scoring module 404, configured to input the input feature into a pre-trained recommendation scoring model, and obtain a scoring result of the second candidate content output by the recommendation scoring model;
a recommending module 405, configured to select recommended content from the second candidate content according to the scoring result.
In this embodiment, the data obtaining module includes:
the ID obtaining submodule is used for obtaining the user ID carried in the recommendation request;
the behavior data acquisition submodule is used for acquiring the behavior data of the user corresponding to the user ID from a user behavior database;
and the behavior characteristic obtaining submodule is used for analyzing the behavior data to obtain the behavior characteristics of the user, and the behavior characteristics of the user comprise click sequences of the user on different types of contents.
In this embodiment, the candidate content screening module 402 includes:
the portrait label acquisition sub-module is used for acquiring a user portrait label according to the user behavior data;
and the recall submodule is used for recalling second candidate content from the first candidate content according to the user portrait label.
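A minimal sketch of label-based recall follows. Deriving portrait labels from per-label click counts with a `min_clicks` cutoff, and recalling any candidate that shares at least one label with the user, are assumptions for illustration; the application does not specify how labels are derived or matched, and the function and field names are hypothetical.

```python
def build_portrait_tags(click_counts_by_tag, min_clicks=2):
    """Derive user portrait labels: keep labels clicked at least min_clicks times (assumed rule)."""
    return {tag for tag, count in click_counts_by_tag.items() if count >= min_clicks}


def recall_candidates(candidate_pool, portrait_tags):
    """Recall candidates that share at least one label with the user portrait."""
    return [c["id"] for c in candidate_pool if portrait_tags & set(c["tags"])]


# Hypothetical first candidate contents of several types, each carrying labels.
pool = [
    {"id": "c1", "tags": ["insurance", "health"]},
    {"id": "c2", "tags": ["sports"]},
    {"id": "c3", "tags": ["health"]},
]
tags = build_portrait_tags({"health": 5, "sports": 1, "finance": 2})
second_candidates = recall_candidates(pool, tags)
```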
In this embodiment, the splicing module 403 is further configured to splice the user's click sequences on different contents with the low-dimensional vector of the second candidate content to form a two-dimensional matrix, where the data of the two-dimensional matrix is the input feature.
In this embodiment, the scoring module 404 includes:
the input module is used for inputting the data of the two-dimensional matrix, as the input feature, into a gated recurrent unit (GRU) deep neural network model;
and the receiving module is used for receiving the predicted click rate of each second candidate content output by the GRU deep neural network model.
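To make the scoring step concrete, the sketch below runs a single-layer GRU over the rows of a two-dimensional input matrix and maps the final hidden state to a predicted click rate with a sigmoid output. It uses NumPy, random untrained weights, and one common formulation of the GRU update; the layer sizes, parameter names, and the sigmoid output layer are assumptions, and the model actually used in the application may be deeper or structured differently.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_gru_params(input_dim, hidden_dim, seed=0, scale=0.1):
    """Random, untrained weights (illustrative only)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = rng.normal(0.0, scale, (hidden_dim, input_dim))
        params["U" + gate] = rng.normal(0.0, scale, (hidden_dim, hidden_dim))
        params["b" + gate] = np.zeros(hidden_dim)
    params["w_out"] = rng.normal(0.0, scale, hidden_dim)
    params["b_out"] = 0.0
    return params


def gru_predict_ctr(matrix, params):
    """Feed the matrix row by row through a GRU cell and return a value in (0, 1)."""
    h = np.zeros(params["Uz"].shape[0])
    for x in np.asarray(matrix, dtype=float):
        z = sigmoid(params["Wz"] @ x + params["Uz"] @ h + params["bz"])
        r = sigmoid(params["Wr"] @ x + params["Ur"] @ h + params["br"])
        h_candidate = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h) + params["bh"])
        h = (1.0 - z) * h + z * h_candidate  # blend old state and candidate state
    return float(sigmoid(params["w_out"] @ h + params["b_out"]))


params = init_gru_params(input_dim=4, hidden_dim=8)
feature_matrix = np.ones((6, 4))  # hypothetical 6 x 4 input feature matrix
predicted_ctr = gru_predict_ctr(feature_matrix, params)
```

In practice each second candidate content would get its own matrix, and the resulting predicted click rates would be ranked to choose the recommended content.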
The information flow recommendation apparatus 400 further includes:
and the training module is used for training the recommendation scoring model.
The training module comprises:
the information flow acquisition submodule is used for acquiring historical behavior data of a user and extracting the user behavior data and related information flow from the historical behavior data;
the training data construction submodule is used for constructing training data of a recommendation model to be trained according to the user behavior data and the information flow;
the training submodule is used for inputting the training data into the recommendation model to be trained for training to obtain a recommendation scoring model;
and the evaluation submodule is used for evaluating the recommendation scoring model.
The information flow recommendation apparatus of this embodiment splices the user's behavior features on various types of candidate content with the low-dimensional vectors of the candidate content and inputs the result into the recommendation model. In this way, the user's behavior data on various types of content can be fully utilized, the user's interests can be described more accurately, and the recommendation accuracy is improved. Moreover, because the recommendation model is trained with the low-dimensional vectors of various types of candidate content together with the user's behavior features, various types of candidate content can be analyzed jointly according to the user's behavior features; there is no need to train a separate recommendation model for each type, which simplifies the online deployment of the recommendation model.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 5, fig. 5 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 5 comprises a memory 51, a processor 52, and a network interface 53 that are communicatively connected to each other via a system bus. It is noted that only a computer device 5 having components 51-53 is shown, but it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
The memory 51 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 51 may be an internal storage unit of the computer device 5, such as a hard disk or a memory of the computer device 5. In other embodiments, the memory 51 may also be an external storage device of the computer device 5, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a flash Card (FlashCard), or the like, provided on the computer device 5. Of course, the memory 51 may also comprise both an internal storage unit of the computer device 5 and an external storage device thereof. In this embodiment, the memory 51 is generally used for storing an operating system installed in the computer device 5 and various application software, such as program codes of an information flow recommendation method. Further, the memory 51 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 52 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 52 is typically used to control the overall operation of the computer device 5. In this embodiment, the processor 52 is configured to execute the program code stored in the memory 51 or to process data, for example, to execute the program code of the information flow recommendation method.
The network interface 53 may comprise a wireless network interface or a wired network interface, and the network interface 53 is generally used for establishing communication connections between the computer device 5 and other electronic devices.
The present application further provides another embodiment, namely a computer-readable storage medium storing an information flow recommendation program, which is executable by at least one processor to cause the at least one processor to perform the steps of the information flow recommendation method as described above.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
It is to be understood that the embodiments described above are only some, not all, of the embodiments of the present application, and that the appended drawings illustrate preferred embodiments of the application without limiting its scope. This application can be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be understood thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of their technical features. All equivalent structures made by using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, are within the protection scope of the present application.

Claims (10)

1. An information flow recommendation method, comprising the steps of:
acquiring user behavior data according to a recommendation request of a user, wherein the user behavior data records behavior characteristics of the user on various types of candidate contents;
screening out second candidate content from first candidate content in a candidate content pool according to the user behavior data, wherein the first candidate content comprises multiple types of candidate content;
splicing the user's behavior characteristics on the multiple types of candidate contents with the low-dimensional vector of the second candidate content to form input features;
inputting the input features into a pre-trained recommendation scoring model, and acquiring a scoring result of the second candidate content output by the recommendation scoring model;
and selecting recommended content from the second candidate content according to the scoring result.
2. The information flow recommendation method according to claim 1, wherein the step of obtaining user behavior data according to the recommendation request of the user comprises:
acquiring a user ID carried in the recommendation request;
acquiring behavior data of a user corresponding to the user ID from a user behavior database;
analyzing the behavior data to obtain user behavior characteristics, wherein the user behavior characteristics comprise the user's click sequences on different types of contents.
3. The information flow recommendation method of claim 1, wherein the step of filtering out second candidate content from first candidate content in a candidate content pool according to the user behavior data comprises:
acquiring a user portrait label according to the user behavior data;
recalling second candidate content from the first candidate content according to the user portrait label.
4. The information flow recommendation method according to claim 2, wherein the step of splicing the user's behavior characteristics on the multiple types of candidate contents with the low-dimensional vector of the second candidate content to form the input features comprises:
splicing the user's click sequences on different contents with the low-dimensional vector of the second candidate content to form a two-dimensional matrix, wherein the data of the two-dimensional matrix is the input feature.
5. The information flow recommendation method according to claim 4, wherein the step of inputting the input features into a pre-trained recommendation scoring model and obtaining the scoring result of the second candidate content output by the recommendation scoring model comprises:
inputting the data of the two-dimensional matrix as input features into a gated recurrent unit (GRU) deep neural network model;
and receiving the predicted click rate of each second candidate content output by the GRU deep neural network model as the scoring result.
6. The information flow recommendation method of claim 1, wherein the recommendation scoring model is trained by:
acquiring a Feeds exposure click log, and extracting from the click log click sequence samples for information flows of each content type, together with the corresponding information flow samples of each type;
mapping each type of information flow sample to a low-dimensional vector space through a neural network language model to obtain a low-dimensional vector sample of each information flow;
splicing the click sequence sample of the information flow with the low-dimensional vector sample of the information flow to obtain a two-dimensional matrix of the information flow, and taking the data of the two-dimensional matrix of the information flow as training data;
training a GRU deep neural network model by using the training data to obtain an initial model;
evaluating the initial model through AUC to obtain an evaluation score;
and if the evaluation score does not reach a preset value, iteratively updating the initial model until the evaluation score reaches the preset value to obtain the recommendation scoring model.
7. The information flow recommendation method according to claim 6, wherein after the step of selecting the recommended content from the second candidate content according to the scoring result, the method further comprises: monitoring the number of updates of the user behavior data in real time;
and when the number of updates reaches a preset number, re-acquiring the historical behavior data of the user and re-training the recommendation scoring model.
8. An information flow recommendation apparatus, comprising:
the data acquisition module is used for acquiring user behavior data according to a recommendation request of a user, wherein the user behavior data records behavior information of the user on various types of candidate contents;
the candidate content screening module is used for screening out second candidate content from first candidate content in a candidate content pool according to the user behavior data, wherein the first candidate content comprises multiple types of candidate content;
the splicing module is used for splicing the user's behavior characteristics on the multiple types of candidate contents with the low-dimensional vector of the second candidate content to form input features;
the scoring module is used for inputting the input features into a pre-trained recommendation scoring model and acquiring a scoring result of the second candidate content output by the recommendation scoring model;
and the recommending module is used for selecting recommended content from the second candidate content according to the scoring result.
9. A computer device comprising a memory in which a computer program is stored and a processor which, when executing the computer program, carries out the steps of the information flow recommendation method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the information flow recommendation method according to any one of claims 1 to 7.
CN201910873490.9A 2019-09-17 2019-09-17 Information flow recommendation method and device, computer equipment and storage medium Pending CN110825956A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910873490.9A CN110825956A (en) 2019-09-17 2019-09-17 Information flow recommendation method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910873490.9A CN110825956A (en) 2019-09-17 2019-09-17 Information flow recommendation method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110825956A true CN110825956A (en) 2020-02-21

Family

ID=69547926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910873490.9A Pending CN110825956A (en) 2019-09-17 2019-09-17 Information flow recommendation method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110825956A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200221