CN116521972B - Information prediction method, device, electronic equipment and storage medium - Google Patents

Information prediction method, device, electronic equipment and storage medium

Info

Publication number
CN116521972B
CN116521972B (grant) · CN202210062089A (application)
Authority
CN
China
Prior art keywords
item
sequence
feature
predicted
items
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210062089.9A
Other languages
Chinese (zh)
Other versions
CN116521972A (en)
Inventor
王刘鄞
段焕中
路彦雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210062089.9A priority Critical patent/CN116521972B/en
Publication of CN116521972A publication Critical patent/CN116521972A/en
Application granted granted Critical
Publication of CN116521972B publication Critical patent/CN116521972B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Algebra (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application discloses an information prediction method, an information prediction device, electronic equipment and a storage medium; the embodiment of the application can acquire the object characteristics of the object to be predicted and the item characteristics of each item in the item sequence of the current session of the object to be predicted; determining fusion characteristics of each item based on the item characteristics and the position of the item in the item sequence, and determining session interest characteristics of the object to be predicted under the current session based on the fusion characteristics; calculating the attention characteristic of each item based on the object characteristic, the item characteristic and the association relation among the items in the item sequence corresponding to the object to be predicted, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic; and predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted. The scheme can effectively improve the accuracy of information prediction.

Description

Information prediction method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an information prediction method, an information prediction device, an electronic device, and a storage medium.
Background
With the rapid development of the internet, new information emerges endlessly. Information resources on the internet expand exponentially, and these massive information resources are heterogeneous, multi-sourced, distributed, and so on. It is therefore very important to be able to accurately predict a user's operations based on the user's session information and to recommend content of interest to the user in a targeted manner. However, in existing schemes, the learning of user session information is relatively one-dimensional, and the prediction accuracy is low.
Disclosure of Invention
The embodiment of the application provides an information prediction method, an information prediction device, electronic equipment and a storage medium, which can effectively improve the accuracy of information prediction.
The embodiment of the application provides an information prediction method, which comprises the following steps:
Acquiring object characteristics of an object to be predicted and item characteristics of each item in an item sequence of a current session of the object to be predicted;
Determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of an object to be predicted under the current session based on the fusion characteristics;
calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic;
And predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted to obtain a prediction result of the object to be predicted.
Correspondingly, the embodiment of the application also provides an information prediction device, which comprises:
the acquisition unit is used for acquiring object characteristics of the object to be predicted and item characteristics of each item in the item sequence of the current session of the object to be predicted;
A sequence unit, configured to determine a fusion feature of each item based on the item feature of each item and the position of the item in the item sequence, and determine a session interest feature of the object to be predicted under the current session based on the fusion feature;
The conversion unit is used for calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic;
And the prediction unit is used for predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted to obtain a prediction result of the object to be predicted.
Optionally, in some embodiments, the sequential unit may include a position subunit and a fusion subunit, as follows:
The position subunit is used for determining the position characteristics of each item based on the position of each item in the item sequence;
and the fusion subunit is used for fusing the item characteristics of each item in the item sequence of the current session and the item position characteristics corresponding to each item to obtain the fusion characteristics of each item.
Optionally, in some embodiments, the location subunit may be specifically configured to obtain a location index of each item in the sequence of items of the current session, where the location index is generated based on an operation time of the item; and vector mapping is carried out on the position indexes to obtain item position features corresponding to each item.
Optionally, in some embodiments, the sequential unit may include a first enhancement subunit, a second enhancement subunit, and a sequential subunit, as follows:
The first enhancement subunit is configured to determine a first enhancement feature of each item in the sequence of items based on the fusion feature of each item in the sequence of items and a correlation between items in the sequence of items;
The second enhancement subunit is configured to determine a second enhancement feature of each item in the sequence of items based on the first enhancement feature of the last item in the sequence of items, the fusion feature of each item in the sequence of items, and the correlation between items in the sequence of items;
And the sequence subunit is used for fusing the first enhancement characteristic of the last item in the item sequence with the second enhancement characteristic of the last item in the item sequence to obtain the session interest characteristic of the object to be predicted under the current session.
Optionally, in some embodiments, the first enhancer unit may be specifically configured to calculate, based on the fusion feature of each item in the sequence of items, a first correlation between the items in the sequence of items, to obtain a first correlation feature corresponding to each item; and enhancing the first relevance feature of each item to obtain a first enhancement feature of each item in the sequence of items.
Optionally, in some embodiments, the second enhancer unit may be specifically configured to calculate, based on a fusion feature of a first enhancement feature of a last item in the sequence of items and each item in the sequence of items, a second correlation between items in the sequence of items, to obtain a second correlation feature corresponding to each item; and enhancing the second relatedness characteristic of each item to obtain a second enhancement characteristic of each item in the sequence of items.
Optionally, in some embodiments, the conversion unit may include a construction subunit, a calculation subunit, and an aggregation subunit, as follows:
The construction subunit is configured to construct a directed graph based on an association relationship between items in the item sequence of the current session, where each node in the directed graph corresponds to an item, and two nodes connected by each edge represent a pair of items that are clicked by an object to be predicted in the current session;
the computing subunit is used for computing attention weights among nodes with directed edges in the directed graph, and determining the attention characteristics of each item in the item sequence according to the attention weights;
And the aggregation subunit is used for aggregating the attention characteristics of each item in the item sequence to obtain the item conversion characteristics of the object to be predicted under the current session.
Optionally, in some embodiments, the computing subunit may be specifically configured to initialize each node in the directed graph node based on the object feature and the item feature of the object to be predicted, to obtain an initialized item feature of each node; and calculating the attention weight among the nodes with directed edges in the directed graph based on the attention mechanism, and updating the initialized project characteristics of the nodes according to the attention weight to obtain the attention characteristics of each project in the project sequence.
Optionally, in some embodiments, the obtaining unit may include an obtaining subunit and a searching subunit, as follows:
The obtaining subunit is configured to obtain current session information of an object to be predicted, where the current session information includes a sequence of items of the object to be predicted in a current session;
The searching subunit is configured to search the object feature of the object to be predicted and the item feature of each item in the item sequence of the current session of the object to be predicted from a graph embedding database, where the graph embedding database is obtained by graph embedding a heterogeneous graph, and the heterogeneous graph is constructed based on the social relationship of the object and the item sequence in the historical session of the object.
Optionally, in some embodiments, the information prediction apparatus may further include a building unit, where the building unit may specifically be configured to obtain a sample object set, historical session information of each sample object in the sample object set, and social relationship information between sample objects; constructing a heterogeneous graph of the sample object set based on the social relation information and the historical session information, wherein nodes in the heterogeneous graph represent objects or projects, and edges between the nodes represent association relations between projects, between the objects and between the projects and the objects; determining object features and item features of the sample object set based on the heterogeneous map; object features and item features of the sample object set are stored in a graph-embedded database.
Optionally, in some embodiments, the establishing unit may be specifically configured to map each node in the heterogeneous graph to obtain a node feature of each node; for each node, calculate the attention weights of the neighbor nodes of the node through an attention mechanism based on the initial node features of the node and of its neighbor nodes; perform feature fusion based on the attention weights and the node features of each neighbor node to obtain the influence features of all neighbor nodes of the node on the node; and fuse the node features of the node with the influence features corresponding to the node, and process the fused features based on a multi-layer perceptron mechanism to obtain the final node features of the node, where the final node features of nodes corresponding to items are the item features, and the final node features of nodes corresponding to objects are the object features.
Optionally, in some embodiments, the prediction unit may be specifically configured to obtain a project characteristic of a last project of the object to be predicted in the project sequence of the current session; fusing the object characteristics, the session interest characteristics, the project conversion characteristics and the project characteristics of the last project of the object to be predicted to obtain the current session characteristics of the object to be predicted; and predicting the next item of the object to be predicted by utilizing a trained prediction model based on the current session characteristics of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
Optionally, in some embodiments, the information prediction apparatus may further include a training unit, where the training unit may specifically be configured to obtain an object feature of the sample object, and an item feature of each item in the sequence of items of the current session of the sample object; determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the sample object under the current session based on the fusion characteristics of each item in the item sequence; calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the sample object under the current session based on the attention characteristic of each item; predicting the next item of the sample object based on the session interest feature and the item conversion feature of the sample object to obtain a prediction result of the sample object; and training a preset prediction model by using the prediction result and the real result of the sample object to obtain a trained prediction model.
Optionally, in some embodiments, the session interest feature includes a first session interest feature and a second session interest feature, and the item conversion feature includes a first item conversion feature and a second item conversion feature;
In the process of determining the fusion characteristic of each item based on the item characteristic of each item and the position of the item in the item sequence and determining the session interest characteristic of the sample object under the current session based on the fusion characteristic of each item in the item sequence, the method further comprises: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first session interest feature and a second session interest feature;
the method comprises the steps of calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation among the items in the item sequence, and determining the item conversion characteristic of the sample object in the current session based on the attention characteristic of each item, and further comprises the following steps: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first project conversion feature and a second project conversion feature;
The training unit may be specifically configured to predict a next item of the sample object based on an object feature, an item feature, a first session interest feature, and a first item conversion feature of the sample object, to obtain a first prediction probability of the sample object; predicting the next item of the sample object based on the object feature, item feature, second session interest feature and second item conversion feature of the sample object to obtain a second prediction probability of the sample object; calculating a difference between the first prediction probability and the second prediction probability, and calculating a similarity of the first prediction probability and the second prediction probability; and determining a prediction result of the sample object based on the difference value and the similarity.
In addition, the embodiment of the application also provides a computer readable storage medium, which stores a plurality of instructions, wherein the instructions are suitable for being loaded by a processor to execute the steps in any information prediction method provided by the embodiment of the application.
In addition, the embodiment of the application also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps in any information prediction method provided by the embodiment of the application when executing the program.
According to one aspect of the present application there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium, the computer instructions being read from the computer readable storage medium by a processor of a computer device, the computer instructions being executed by the processor to cause the computer device to perform the methods provided in the various alternative implementations of the above-described information prediction aspects.
The embodiment can acquire the object characteristics of the object to be predicted and the item characteristics of each item in the item sequence of the current session of the object to be predicted; then, determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the object to be predicted under the current session based on the fusion characteristics; then, calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic; and then, predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted. The scheme can effectively improve the accuracy of information prediction.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1a is a schematic view of a scenario of an information prediction method according to an embodiment of the present application;
FIG. 1b is a first flowchart of an information prediction method according to an embodiment of the present application;
FIG. 1c is a diagram illustrating a diagram embedding provided by an embodiment of the present application;
FIG. 1d is a schematic diagram of a training model provided by an embodiment of the present application;
FIG. 2a is an overall schematic diagram of a graph-embedded database and model training provided by an embodiment of the present application;
FIG. 2b is a schematic diagram of probability distribution provided by an embodiment of the present application;
FIG. 2c is a second flowchart of an information prediction method according to an embodiment of the present application;
FIG. 2d is a schematic diagram of session information provided by an embodiment of the present application;
fig. 3 is a schematic structural diagram of an information prediction apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The principles of the present application are illustrated as being implemented in a suitable computing environment. In the following description, specific embodiments of the application are described with reference to steps and symbols performed by one or more computers, unless otherwise indicated. Accordingly, these steps and operations will in several instances be referred to as being performed by a computer; the computer-performed operation referred to herein includes the manipulation, by a computer processing unit, of electronic signals that represent data in a structured form. This operation transforms the data or maintains it at a location in the computer's memory system, which may reconfigure or otherwise alter the operation of the computer in a manner well known to those skilled in the art. The data structure in which the data is maintained is a physical location in memory that has specific properties defined by the data format. However, while the principles of the present application are described in the foregoing context, this is not meant to be limiting, and those skilled in the art will recognize that the various steps and operations described below may also be implemented in hardware.
The term "unit" as used herein may be regarded as a software object executing on the computing system. The various components, units, engines, and services described herein may be viewed as implementing objects on the computing system. The apparatus and method may be implemented in software, but may also be implemented in hardware, which is within the scope of the present application.
The terms "first," "second," and "third," etc. in this disclosure are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the list of steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The embodiment of the application provides an information prediction method, an information prediction device, electronic equipment and a storage medium. The information prediction device may be integrated in an electronic device, which may be a server or a terminal.
The information prediction method provided by the embodiment of the application relates to a machine learning technology in the field of artificial intelligence, and can utilize machine learning to learn according to the characteristics of an object to be predicted, thereby realizing the prediction of the object to be predicted.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Artificial intelligence software technology mainly includes directions such as computer vision technology and machine learning/deep learning.
Machine Learning (ML) is a multi-domain interdiscipline involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specifically studies how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
For example, as shown in fig. 1a, first, the electronic device integrated with the information prediction apparatus may acquire the object features of the object to be predicted and the item features of each item in the item sequence of the current session of the object to be predicted; then, determine the fusion feature of each item based on the item features of each item and the position of the item in the item sequence, and determine the session interest feature of the object to be predicted under the current session based on the fusion features; then, calculate the attention feature of each item in the item sequence based on the object features corresponding to the object to be predicted, the item features of each item in the item sequence, and the association relations between the items in the item sequence, and determine the item conversion feature of the object to be predicted under the current session based on the attention features; and then, predict the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted. This scheme can learn object features and item features from historical sessions and the social network by constructing a heterogeneous graph. Then, based on the item features of each item and the position of the item in the item sequence, the session interest feature of the object to be predicted under the current session can be obtained through an item-order channel; and based on the object features corresponding to the object to be predicted, the item features of each item in the item sequence, and the association relations between the items in the item sequence, the item conversion feature of the object to be predicted under the current session can be obtained through an item conversion channel. Finally, the next item of the object to be predicted is predicted by combining the features obtained from the two channels. Because this dual-channel approach can better learn both the sequential pattern and the item conversion pattern, it effectively improves the accuracy of information prediction and thus provides more accurate recommendation suggestions.
The following will describe in detail. The following description of the embodiments is not intended to limit the preferred embodiments.
The present embodiment will be described from the viewpoint of an information prediction apparatus, which may be integrated in an electronic device, which may be a server or a terminal, or other devices; the terminal may include a mobile phone, a tablet computer, a notebook computer, a personal computer (Personal Computer, PC), and the like.
An information prediction method, comprising: acquiring object characteristics of an object to be predicted and item characteristics of each item in an item sequence of a current session of the object to be predicted; then, determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the object to be predicted under the current session based on the fusion characteristics; then, calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic; and then, predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
As shown in fig. 1b, the specific flow of the information prediction method may be as follows:
101. and obtaining object characteristics of the object to be predicted and item characteristics of each item in the item sequence of the current session of the object to be predicted.
The object to be predicted may refer to an object that needs to be predicted, for example, an object that needs to be predicted for a next click item.
Where a session may refer to a set of items (e.g., any object, such as a product, song, or movie) collected or used in an event (e.g., a transaction) or over a period of time, or a set of actions or events (e.g., listening to a song) that occur over a period of time (e.g., an hour).
For example, a group of items purchased in one transaction and a list of songs listened to by the user within an hour may both be considered a session; in addition, web pages that the user continuously clicks on within an hour may also be considered conversations, and so on.
Where the sequence of items of the current session may refer to a collection of items that the object operates in the current session. For example, the user continuously clicks on the web page within one hour, where the web page may be the item of the current session, and all the web pages clicked within one hour are the sequence of items of the current session.
For example, specifically, when information prediction needs to be performed on the object to be predicted, a prediction request may be generated and sent to the information prediction device, so that the information prediction device obtains the object to be predicted and related data of the object to be predicted, for example, obtains current session information of the object to be predicted.
The object features and the item features of each item can be obtained by querying a graph-embedding database. For example, the current session information of the object to be predicted can be acquired, the current session information including the item sequence of the object to be predicted in the current session; the object features of the object to be predicted and the item features of each item in the item sequence of the current session are then searched from a graph-embedding database, where the graph-embedding database is obtained by performing graph embedding on a heterogeneous graph, and the heterogeneous graph is constructed based on the social relations of objects and the item sequences in the historical sessions of objects.
The current session information may refer to information generated by a series of operations of the object in the current session, and may include, for example, the item sequence of the object in the current session. Graph embedding (Graph Embedding, also called Network Embedding) is the process of mapping graph data (usually a high-dimensional dense matrix) into low-dimensional dense vectors, which can well solve the problem that graph data is difficult to feed efficiently into machine learning algorithms.
In order to improve the efficiency of information prediction, a graph-embedding database can be established first, and the item features corresponding to the items can then be queried from the established graph-embedding database. Alternatively, the graph-embedding database may be established by another device and then provided to the information prediction apparatus, or it may be established by the information prediction apparatus itself; that is, before "searching the object features of the object to be predicted, and the item features of each item in the item sequence of the current session of the object to be predicted, from the graph-embedding database", the information prediction method may further include:
Acquiring a sample object set, historical session information of each sample object in the sample object set, and social relationship information among the sample objects; constructing a heterogeneous graph of the sample object set based on the social relation information and the historical session information, wherein nodes in the heterogeneous graph represent objects or projects, and edges between the nodes represent association relations between projects, between the objects and between the projects and the objects; determining object features and item features of the sample object set based on the heterogeneous map; object features and item features of the sample object set are stored in a graph-embedded database.
For example, in the step of "determining the object features and item features of the sample object set based on the heterogeneous graph", each node in the heterogeneous graph may specifically be mapped to obtain the node feature of each node; for each node, the attention weights of the neighbor nodes of the node are calculated through an attention mechanism based on the initial node features of the node and of its neighbor nodes; feature fusion is performed based on the attention weights and the node features of each neighbor node to obtain the influence features of all neighbor nodes of the node on the node; and the node features of the node are fused with the influence features corresponding to the node, and the fused features are processed based on a multi-layer perceptron mechanism to obtain the final node features of the node, where the final node features of nodes corresponding to items are the item features, and the final node features of nodes corresponding to objects are the object features.
The history session information may refer to information generated by a series of operations of the object in the history session, for example, may include a sequence of items of the object in the history session. The social relationship information may refer to relationship information of the object in social interaction, for example, may be data of friend relationship, social group relationship, and the like of the object.
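As an illustration of this construction step, the sketch below assembles such a heterogeneous graph as plain adjacency lists from hypothetical social-relationship pairs and historical item sequences; the data layout and variable names are assumptions for illustration, not part of the patent.

```python
# Minimal sketch (assumed data layout): build the heterogeneous graph as adjacency lists.
# Nodes are users or items; edges link user-user (social relations), user-item
# (historical interactions) and item-item (consecutive clicks inside a session).
from collections import defaultdict

social_links = [("u1", "u2"), ("u2", "u3")]                        # hypothetical social relations
history_sessions = {"u1": ["i1", "i2", "i3"], "u2": ["i2", "i4"]}  # hypothetical sessions

adjacency = defaultdict(set)

# user-user edges from the social network
for a, b in social_links:
    adjacency[a].add(b)
    adjacency[b].add(a)

for user, seq in history_sessions.items():
    for item in seq:
        # user-item edges from historical interactions
        adjacency[user].add(item)
        adjacency[item].add(user)
    for prev, nxt in zip(seq, seq[1:]):
        # item-item edges from transitions inside a session
        adjacency[prev].add(nxt)

print(dict(adjacency))
```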
Because the task requires both social relations and historical session information, the social relations can be combined with the historical sessions to construct the heterogeneous graph. For example, as shown in FIG. 1c, the embedding of objects and items can be achieved with one and the same heterogeneous graph, where the edge set represents the social relations between objects, the transitions between items, and the historical interactions between objects and items, such as (u_1, u_2), (i_1, i_2) and (u_1, i_1). Here HGNN is a heterogeneous graph neural network, i.e., a hypergraph neural network (Hypergraph Neural Networks, HGNN). Direct node aggregation may lead to an information imbalance between social knowledge and historical interactions, in which the influence of social relations is weak. Therefore, to better obtain the representations of objects and items, an attention mechanism can be adopted in this component to mitigate the imbalance by learning hidden weights for different nodes. Since a single heterogeneous graph is maintained, information from the heterogeneous neighbor nodes of objects and items is needed in order to update the representation of an object or item.
For example, the principle of node aggregation can be briefly described using the process of generating an object representation (i.e., object features). The knowledge of objects and items can be represented as social influence and object preference respectively, so as to distinguish the heterogeneous neighbors of an object. To generate a new representation at the current layer, the importance of the neighbors is computed and an aggregation operation is performed. First, the importance of each connection needs to be calculated:

e_uv = a^T · [W_1 · h_u^(l-1) || W_2 · h_v^(l-1)]

where the concatenation operation is denoted by ||, a, W_1 and W_2 are learnable parameters, and h_u^(l-1) is the representation of node u at layer l-1; the attention mechanism is then used to identify the importance of the neighbors of node u.
The importance scores e_uv of different nodes are normalized into attention weights α_uv through a softmax function, where v denotes a neighbor of node u:

α_uv = exp(e_uv) / Σ_{k∈N(u)} exp(e_uk)

The node aggregation process is then as follows:

h_u^(l) = ReLU(MLP(W_3 · [h_u^(l-1) || Σ_{v∈N(u)} α_uv · h_v^(l-1)] + b))

where W_3 and b are learnable parameters, and MLP is a multilayer perceptron (Multilayer Perceptron). In this way, the node representation of each node captured by a heterogeneous graph neural network layer encodes the social influence of adjacent users and the preference of the object for adjacent items. The final node representation can then be obtained by stacking multiple heterogeneous graph neural network layers. Finally, the final representation of each node is the last-layer representation G_emb of the graph, which can be provided to subsequent modules for initializing the object and item feature representations.
Wherein, reLU is a linear rectifying function (Linear rectification function), also called a modified linear unit, which is a commonly used activation function (activation function) in artificial neural networks, and generally refers to a nonlinear function represented by a ramp function and its variants.
102. Based on the item characteristics of each item and the position of the item in the item sequence, determining the fusion characteristics of each item, and determining the session interest characteristics of the object to be predicted under the current session based on the fusion characteristics.
For example, the step of "determining a fusion feature for each item based on the item feature of each item and the position of the item in the sequence of items" may specifically determine the item position feature for each item based on the position of each item in the sequence of items; and fusing the item characteristics of each item in the item sequence of the current session and the item position characteristics corresponding to each item to obtain the fusion characteristics of each item.
Wherein determining item location characteristics of an item may be accomplished by incorporating a learnable location embedding module for mapping location indices onto dense vectors to capture temporal effects of the input. For example, the step of "determining the item position feature of each item based on the position of each item in the item sequence", specifically, the position index of each item in the item sequence of the current session, which is generated based on the operation time of the item, may be obtained; and vector mapping is carried out on the position indexes to obtain item position features corresponding to each item.
To capture order information in a session, interest preferences of a user are captured over a time-series, session interest features of an object may be acquired through a project order channel. For example, the step of determining a session interest feature of the object to be predicted under the current session based on the fusion feature may specifically determine a first enhancement feature of each item in the sequence of items based on the fusion feature of each item in the sequence of items and the correlation between items in the sequence of items; determining a second enhancement feature for each item in the sequence of items based on the first enhancement feature for the last item in the sequence of items, the converged feature for each item in the sequence of items, and the correlation between items in the sequence of items; and fusing the first enhancement feature of the last item in the item sequence with the second enhancement feature of the last item in the item sequence to obtain the session interest feature of the object to be predicted under the current session.
The step of determining a first enhancement feature of each item in the item sequence based on the fusion feature of each item in the item sequence and the correlation between items in the item sequence may specifically calculate a first correlation between items in the item sequence based on the fusion feature of each item in the item sequence to obtain a first correlation feature corresponding to each item; and enhancing the first relevance feature of each item to obtain a first enhancement feature of each item in the sequence of items.
The step of determining a second enhancement feature of each item in the item sequence based on a first enhancement feature of the last item in the item sequence, a fusion feature of each item in the item sequence, and a correlation between items in the item sequence, specifically, the step of calculating a second correlation between items in the item sequence based on the first enhancement feature of the last item in the item sequence and the fusion feature of each item in the item sequence to obtain a second correlation feature corresponding to each item; and enhancing the second relatedness characteristic of each item to obtain a second enhancement characteristic of each item in the sequence of items.
For example, a session interest feature of an object may be acquired using a project-based order channel. For example, an embedding layer is proposed that can translate an incoming current session into item embedding (i.e., item features) and location embedding (i.e., item location features). First, a learnable location embedding module can be introduced for mapping the location index onto dense vectors to capture the temporal impact of the input. For example, a hidden representation of a given input current session s= [ i 0i1…in-1 ] is
xi=([ei||pi])
Where e i is the embedded vector of the item (i.e., item feature) obtained by looking up from G emb, and p i is the item location embedded vector (i.e., item location feature), after obtaining the hidden representation x= { X 1,x2,…,xn-1 } of session s, the sequential sequence information is modeled using a multi-headed attention network.
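A minimal sketch of this embedding layer is given below, assuming the item features are looked up from a pre-computed table (standing in for G_emb) and the position indices are mapped through a learnable nn.Embedding; the concatenation corresponds to x_i = [e_i || p_i] above.

```python
import torch
import torch.nn as nn

class SessionEmbedding(nn.Module):
    """Sketch: concatenate item embeddings with learnable position embeddings."""
    def __init__(self, num_items, d, max_len=50):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, d)   # stands in for the G_emb lookup
        self.pos_emb = nn.Embedding(max_len, d)      # maps a position index to a dense vector

    def forward(self, session):
        # session: (n,) tensor of item indices ordered by operation time
        positions = torch.arange(session.size(0))
        e = self.item_emb(session)                   # (n, d) item features
        p = self.pos_emb(positions)                  # (n, d) item position features
        return torch.cat([e, p], dim=-1)             # (n, 2d) fused features x_i = [e_i || p_i]

emb = SessionEmbedding(num_items=1000, d=32)
x = emb(torch.tensor([5, 17, 42, 5]))
print(x.shape)  # torch.Size([4, 64])
```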
For example, a multi-head self-attention network can be used to capture the relationships between items and obtain an enhanced representation E = {e_1, e_2, …, e_{n-1}} (i.e., the first enhancement features) of each item in session s, where E is the session embedding information extracted by the network, and the new item e_{n-1} is a learning-target embedding that contains a special item index and fuses information from the entire session to represent the current real preference of the object. The multi-head attention network can be formulated as follows:
head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)

MultiHead(Q, K, V) = Concat(head_1, …, head_h)

X′ = Dropout(X)

E = MultiHead(X′, X, X)

where W_i^Q, W_i^K and W_i^V are projection matrices and d_k is the dimension of head_i. The attention mechanism can be formalized as:

Attention(Q, K, V) = softmax(Q·K^T / √d) · V

where Q is the query vector, K^T is the transposed key vector, V is the value vector, and d is the dimension of the key vector.
The last click is helpful for predicting the next item in session-based recommendation. Therefore, to extract the historical interest of the object, the last enhanced representation e_{n-1} of E and the hidden representations X_1 = {x_1, x_2, …, x_{n-2}} are used as the inputs of another multi-head attention network:
E′ = MultiHead(e_{n-1}, X_1, X_1)

where E′ = {e′_1, e′_2, …, e′_{n-2}} (i.e., the second enhancement features).

H_s = [e′_{n-2} || e_{n-1}]

Finally, the last state of E′ and the last state of E are concatenated to obtain the representation H_s of the final interest of the object (i.e., the session interest feature).
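The sketch below reproduces the shape of this item-order channel with torch.nn.MultiheadAttention: a first attention pass over the whole session produces E, a second pass queries the historical part with the last enhanced item, and the two last states are concatenated into H_s. Head counts, dimensions, and the dropout placement are assumptions.

```python
import torch
import torch.nn as nn

class SessionInterestChannel(nn.Module):
    """Sketch of the item-order channel: two multi-head attention passes over the session."""
    def __init__(self, d, heads=4, p_drop=0.1):
        super().__init__()
        self.attn1 = nn.MultiheadAttention(d, heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(d, heads, batch_first=True)
        self.dropout = nn.Dropout(p_drop)

    def forward(self, x):
        # x: (1, n, d) hidden representations of the session (batch of one for clarity)
        e, _ = self.attn1(self.dropout(x), x, x)           # first enhanced features E
        e_last = e[:, -1:, :]                              # e_{n-1}: learning-target embedding
        x_hist = x[:, :-1, :]                              # historical part of the session
        e_prime, _ = self.attn2(e_last, x_hist, x_hist)    # second enhanced features E'
        # concatenate the last states of E' and E -> session interest feature H_s
        return torch.cat([e_prime[:, -1, :], e[:, -1, :]], dim=-1)

channel = SessionInterestChannel(d=64)
h_s = channel(torch.randn(1, 6, 64))
print(h_s.shape)  # torch.Size([1, 128])
```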
103. And calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic.
For example, a directed graph may be specifically configured based on an association relationship between items in the item sequence of the current session, where each node in the directed graph corresponds to an item, and two nodes connected by each edge represent a pair of items that are clicked by an object to be predicted in the current session; calculating attention weights among nodes with directed edges in the directed graph, and determining the attention characteristics of each item in the sequence of items according to the attention weights; and aggregating the attention characteristics of each item in the item sequence to obtain the item conversion characteristics of the object to be predicted under the current session.
The step of calculating attention weights among nodes with directed edges in the directed graph, and determining attention characteristics of each item in an item sequence according to the attention weights, wherein each node in the directed graph node can be initialized based on the object characteristics and the item characteristics of the object to be predicted to obtain initialized item characteristics of each node; and calculating the attention weight among the nodes with directed edges in the directed graph based on the attention mechanism, and updating the initialized project characteristics of the nodes according to the attention weight to obtain the attention characteristics of each project in the project sequence.
For example, to capture complex item conversion information, a graph-based item conversion channel can be used to obtain the item conversion features of the object. The goal of graph-based item conversion is to extract the dynamic and personalized preferences of the current object in the current session. The item-based sequential channel above makes effective use of the object's sequence information, but does not capture the high-order structure information of users. Therefore, to better utilize the existing graph structure information, a lightweight model based on a graph neural network is proposed to learn a specific embedded representation of a personalized session. First, each session can be modeled as a directed graph. In this graph, each node represents an item, and each edge represents a pair of items that the user clicks on successively in the session.
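As a small illustration of this modeling step, the sketch below turns a hypothetical click sequence into the nodes and directed edges of such a session graph.

```python
# Sketch: turn the clicked-item sequence of the current session into a directed graph,
# where each consecutive pair of clicks contributes one directed edge.
session = ["i3", "i7", "i2", "i7"]            # hypothetical item sequence, ordered by click time

nodes = list(dict.fromkeys(session))          # each distinct item becomes a node
edges = list(zip(session[:-1], session[1:]))  # (clicked item, next clicked item) pairs

print(nodes)   # ['i3', 'i7', 'i2']
print(edges)   # [('i3', 'i7'), ('i7', 'i2'), ('i2', 'i7')]
```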
First, the embedding H_u of each object (i.e., the object feature) and the embedding f_i of each item (i.e., the item feature) can be looked up from G_emb in the graph-embedding database built earlier, and then passed through BatchNorm and Dropout respectively, as follows:
H′_u = BatchNorm(H_u)

f′_i = BatchNorm(f_i)

H″_u = Dropout(H′_u, γ)

f″_i = Dropout(f′_i, γ)
where BatchNorm (batch normalization) can be used to alleviate the internal covariate shift of the intermediate layers of the MLP. BatchNorm keeps the inputs of each layer of the neural network identically distributed during deep neural network training. The Dropout method is a powerful and widely used technique for regularizing deep neural network training; it randomly drops each neuron with a probability p during network training. γ means that, during each training iteration, the neurons in this layer are randomly discarded with probability γ and do not participate in training. H″_u and f″_i are then passed through two linear transformation networks as follows:
H*_u = MLP(H″_u)

f*_i = MLP(f″_i)
Because each item is of different importance to the other items, an attention mechanism can be used to update the states of the nodes. For example, the object embedding H*_u can be concatenated with each item embedding f*_i to initialize the node h_i (i.e., the initialized item feature) as follows:

h_i = [H*_u || f*_i]
The node representations are then updated using a graph attention layer. For each pair of nodes (i, j) connected by a directed edge, an attention coefficient is calculated, which can be computed as follows:

e_ij = a^T · [W·h_i || W·h_j]

α_ij = exp(e_ij) / Σ_{k∈N_i} exp(e_ik)

where a^T and W are learnable parameters, j ∈ N_i indicates that j is a neighbor node of i, and S represents the length of the sequence. Finally, the high-level information H_g (i.e., the item conversion feature) is obtained through the graph network. Based on the updated node vectors, each session can be represented by an embedding vector H_g, which is composed of the node vectors used in the graph.
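A rough sketch of this graph-based channel is given below: BatchNorm and Dropout on the looked-up embeddings, a linear transformation, node initialization from the object embedding, a masked attention update over the directed session graph, and a simple mean aggregation to obtain H_g. The aggregation choice and all shapes are assumptions rather than the patent's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ItemConversionChannel(nn.Module):
    """Sketch of the graph-based channel: normalize, drop out, attend over session-graph edges."""
    def __init__(self, d, gamma=0.2):
        super().__init__()
        self.bn_u = nn.BatchNorm1d(d)
        self.bn_i = nn.BatchNorm1d(d)
        self.drop = nn.Dropout(gamma)
        self.mlp_u = nn.Linear(d, d)
        self.mlp_i = nn.Linear(d, d)
        self.w = nn.Linear(2 * d, d, bias=False)   # transforms an initialized node [H_u* || f_i*]
        self.a = nn.Linear(2 * d, 1, bias=False)   # scores a pair of connected nodes

    def forward(self, h_u, f_items, adj):
        # h_u: (1, d) object feature; f_items: (n, d) item features of the session nodes
        # adj: (n, n) 0/1 matrix, adj[i, j] = 1 if there is a directed edge i -> j
        h_u = self.mlp_u(self.drop(self.bn_u(h_u)))              # H_u*
        f = self.mlp_i(self.drop(self.bn_i(f_items)))            # f_i*
        h = self.w(torch.cat([h_u.expand_as(f), f], dim=-1))     # initialized nodes h_i
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.a(pair).squeeze(-1)                        # (n, n) raw coefficients
        scores = scores.masked_fill(adj == 0, float("-inf"))     # keep only directed edges
        alpha = F.softmax(scores, dim=-1).nan_to_num(0.0)        # attention weights per edge
        h_new = alpha @ h                                        # updated node features
        return h_new.mean(dim=0)                                 # aggregated H_g for the session

channel = ItemConversionChannel(d=32)
channel.eval()  # use running BatchNorm stats: a single session is a batch of one
adj = torch.tensor([[0, 1, 0], [0, 0, 1], [0, 1, 0]])
h_g = channel(torch.randn(1, 32), torch.randn(3, 32), adj)
print(h_g.shape)  # torch.Size([32])
```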
104. And predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted to obtain a prediction result of the object to be predicted.
For example, the item characteristics of the last item of the object to be predicted in the item sequence of the current session can be obtained; fusing the object characteristics, the session interest characteristics, the project conversion characteristics and the project characteristics of the last project of the object to be predicted to obtain the current session characteristics of the object to be predicted; and predicting the next item of the object to be predicted by utilizing a trained prediction model based on the current session characteristics of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
In order to improve the efficiency of information prediction, a preset prediction model can be trained first, and the object to be predicted is then predicted by using the trained prediction model. Alternatively, the prediction model may be trained from a plurality of training samples. Specifically, the prediction model may be trained by another device and then provided to the information prediction apparatus, or the training may be performed by the information prediction apparatus itself; that is, before "predicting the next item of the object to be predicted by using the trained prediction model", the information prediction method may further include:
acquiring object characteristics of a sample object and item characteristics of each item in an item sequence of a current session of the sample object; determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the sample object under the current session based on the fusion characteristics of each item in the item sequence; calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the sample object under the current session based on the attention characteristic of each item; predicting the next item of the sample object based on the session interest feature and the item conversion feature of the sample object to obtain a prediction result of the sample object; and training a preset prediction model by using the prediction result and the real result of the sample object to obtain a trained prediction model. For example, a specific training process may be as shown in FIG. 1 d.
For example, in some embodiments, the session interest features include a first session interest feature and a second session interest feature, and the item conversion features include a first item conversion feature and a second item conversion feature;
In the process of determining the fusion characteristic of each item based on the item characteristic of each item and the position of the item in the item sequence and determining the session interest characteristic of the sample object under the current session based on the fusion characteristic of each item in the item sequence, the method further comprises: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first session interest feature and a second session interest feature;
the method comprises the steps of calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation among the items in the item sequence, and determining the item conversion characteristic of the sample object in the current session based on the attention characteristic of each item, and further comprises the following steps: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first project conversion feature and a second project conversion feature;
The step of predicting the next item of the sample object based on the session interest feature and the item conversion feature of the sample object to obtain a prediction result of the sample object may specifically include: predicting the next item of the sample object based on the object feature, the item feature, the first session interest feature and the first item conversion feature of the sample object to obtain a first prediction probability of the sample object; predicting the next item of the sample object based on the object feature, the item feature, the second session interest feature and the second item conversion feature of the sample object to obtain a second prediction probability of the sample object; calculating a difference between the first prediction probability and the second prediction probability, and calculating a similarity between the first prediction probability and the second prediction probability; and determining a prediction result of the sample object based on the difference and the similarity.
For example, the item conversion feature H_g, the session interest feature H_s, the object feature H_u, and the last item feature H_l may first be concatenated to obtain the final representation of the model, and a feedforward neural network is then used to obtain the final output of the model:
Hfinal=[Hg||Hs||Hu||Hl]
z=f(WzHfinal+bz)
where z represents the final output of the model, and W_z and b_z are learnable parameters. The output is then processed through a softmax function to obtain the predicted distribution, which represents the probability of each item being clicked next in the session. Finally, the loss function is defined as the cross entropy between the prediction result and the true label, which can be as follows:
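As a minimal sketch (PyTorch, with assumed names and shapes), the prediction head and cross-entropy loss described above could look as follows:

```python
# Minimal sketch of the prediction head: concatenate H_g, H_s, H_u, H_l,
# apply a feed-forward layer (playing the role of W_z, b_z), score all
# candidate items with softmax, and train with cross entropy.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PredictionHead(nn.Module):
    def __init__(self, dim, n_items):
        super().__init__()
        self.fc = nn.Linear(4 * dim, n_items)

    def forward(self, h_g, h_s, h_u, h_l):
        h_final = torch.cat([h_g, h_s, h_u, h_l], dim=-1)  # H_final
        z = self.fc(h_final)                                # model output z
        return F.softmax(z, dim=-1)                         # next-click probabilities

def session_loss(y_hat, target):
    # cross entropy between the predicted distribution and the true next item
    return F.nll_loss(torch.log(y_hat + 1e-12), target)
```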
Since previous models use the entire session for modeling to obtain the final result, considering all items in the session, including the influence of unrelated items, may be detrimental to the final result. Therefore, some features may be randomly deleted during the model input process. Because the model should learn to discard these useless features, two randomly discarded versions of the features should have little effect on the final result, which teaches the model which items should be excluded from the final representation. To this end, the dropout method, a powerful and widely used technique for regularizing the training of deep neural networks, can be employed. Dropout is efficient and performs well when training networks, but the randomness it introduces leads to a non-negligible inconsistency between training and inference. Therefore, this method can be used to randomly drop features, after which two prediction distributions can be obtained from the model, and the output distributions of the model predictions should be as consistent as possible. To measure the two distributions, the KL and JS divergences may generally be used. The KL divergence is an asymmetric measure of the difference between two probability distributions.
The JS divergence measures the similarity of two probability distributions. As a variant of the KL divergence, it solves the asymmetry problem of the KL divergence.
Through two dropout passes, the model obtains two distributions P_1 and P_2, and the divergence between P_1 and P_2 can be calculated in two ways, namely the KL divergence and the JS divergence. β is a scale factor that controls the trade-off between the cross-entropy loss and the self-supervised learning loss. In the inference phase, the average of the probabilities P_1 and P_2 can be used as the final predicted probability distribution.
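The following minimal sketch (PyTorch, with assumed function and variable names) illustrates this R-Drop-style training objective: the same session is passed through the model twice with independent dropout masks, and a symmetric KL term between the two predicted distributions is added to the cross-entropy loss with weight β.

```python
# Minimal sketch of the dual-dropout self-supervised loss (assumed setup:
# the model applies dropout internally, so two forward passes in training
# mode yield two different distributions P_1 and P_2).
import torch
import torch.nn.functional as F

def dual_dropout_loss(model, batch, target, beta=0.5):
    p1 = model(batch)   # first pass  -> P_1
    p2 = model(batch)   # second pass -> P_2 (different dropout mask)
    ce = F.nll_loss(torch.log(p1 + 1e-12), target) \
       + F.nll_loss(torch.log(p2 + 1e-12), target)
    # symmetric KL between the two predicted distributions
    kl = F.kl_div(torch.log(p1 + 1e-12), p2, reduction='batchmean') \
       + F.kl_div(torch.log(p2 + 1e-12), p1, reduction='batchmean')
    return ce + beta * kl

# At inference time, the average of the two distributions can be used:
# p_final = 0.5 * (p1 + p2)
```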
The KL divergence (Kullback-Leibler divergence), also known as relative entropy or information divergence, is an asymmetric measure of the difference between two probability distributions. The JS divergence (Jensen-Shannon divergence) describes the degree of similarity between two probability distributions.
In the above method, the data involved may be stored in a blockchain to improve the security of information prediction. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. The blockchain (Blockchain) is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods, where each data block contains information of a batch of network transactions and is used for verifying the validity (anti-counterfeiting) of its information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer.
The blockchain underlying platform may include processing modules for user management, basic services, smart contracts, and operation detection. The user management module is responsible for identity information management of all blockchain participants, including maintenance of public and private key generation (account management), key management, maintenance of corresponding relation between the real identity of the user and the blockchain address (authority management) and the like, and under the condition of authorization, supervision and audit of transaction conditions of certain real identities, and provision of rule configuration (wind control audit) of risk control; the basic service module is deployed on all block chain node devices, is used for verifying the validity of a service request, recording the service request on a storage after the effective request is identified, for a new service request, the basic service firstly analyzes interface adaptation and authenticates the interface adaptation, encrypts service information (identification management) through an identification algorithm, and transmits the encrypted service information to a shared account book (network communication) in a complete and consistent manner, and records and stores the service information; the intelligent contract module is responsible for registering and issuing contracts, triggering contracts and executing contracts, a developer can define contract logic through a certain programming language, issue the contract logic to a blockchain (contract registering), invoke keys or other event triggering execution according to the logic of contract clauses to complete the contract logic, and simultaneously provide a function of registering contract upgrading; the operation module is mainly responsible for deployment in the product release process, modification of configuration, contract setting, cloud adaptation and visual output of real-time states in product operation, for example: alarms, detecting network conditions, detecting node device health status, etc.
The platform product service layer provides basic capabilities and implementation frameworks of typical applications, and developers can complete the blockchain implementation of business logic based on the basic capabilities and the characteristics of the superposition business. The application service layer provides the application service based on the block chain scheme to the business participants for use.
As can be seen from the above, the present embodiment may obtain the object feature of the object to be predicted and the item feature of each item in the item sequence of the current session of the object to be predicted; then determine the fusion feature of each item based on the item feature of each item and the position of the item in the item sequence, and determine the session interest feature of the object to be predicted under the current session based on the fusion features; then calculate the attention feature of each item in the item sequence based on the object feature corresponding to the object to be predicted, the item feature of each item in the item sequence and the association relationship between the items in the item sequence, and determine the item conversion feature of the object to be predicted under the current session based on the attention features; and then predict the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted. The scheme can learn object features and item features from historical sessions and the social network by constructing a heterogeneous graph. Then, based on the item feature of each item and the position of the item in the item sequence, the session interest feature of the object to be predicted under the current session can be obtained through an item sequence channel; and based on the object feature corresponding to the object to be predicted, the item feature of each item in the item sequence and the association relationship between the items in the item sequence, the item conversion feature of the object to be predicted under the current session can be obtained through an item conversion channel. Finally, the next item of the object to be predicted is predicted by combining the features obtained from the two channels. Because this dual-channel approach can better learn the sequential pattern and the item conversion pattern, the accuracy of information prediction is effectively improved, thereby providing more accurate recommendation suggestions.
The method described in the previous embodiment is described in further detail below by way of example.
In this embodiment, the information prediction apparatus is specifically integrated in an electronic device, and the information prediction may specifically be predicting an item, such as an article, that is clicked next by a user, and the object to be predicted is specifically described by taking the user to be predicted as an example. For example, as shown in fig. 2 a.
First, a graph embedding database can be established, which may specifically be as follows:
In order to improve the efficiency of information prediction, a graph embedding database can be built first, and the built graph embedding database can then be used to query the item features corresponding to the items. Alternatively, the graph embedding database may be established by another device and then provided to the information prediction apparatus, or may be established by the information prediction apparatus itself.
For example, the electronic device may specifically obtain a sample user set, historical session information of each sample user in the sample user set, and social relationship information between sample users; constructing a heterogeneous graph of the sample user set based on the social relation information and the historical session information, wherein nodes in the heterogeneous graph represent users or items, and edges between the nodes represent association relations between the items, between the users and between the items and the users; determining object features and project features of the sample user set based on the heterogeneous map; storing the object features and the item features of the sample user set in a graph-embedded database.
For example, the electronic device may specifically map each node in the heterogeneous graph to obtain the node feature of each node; for each node, calculate the attention weights of the node's neighbor nodes through an attention mechanism based on the initial node features of the node and its neighbor nodes; perform feature fusion based on the attention weights and the node features of each neighbor node to obtain the influence feature of all neighbor nodes of the node on that node; and fuse the node feature of the node with its corresponding influence feature and process the fused feature based on a multi-layer perceptron to obtain the final node feature of the node, where the final node features of nodes corresponding to items are the item features, and the final node features of nodes corresponding to users are the object features.
The history session information may refer to information generated by a series of operations of the user in the history session, for example, may include a sequence of items of the user in the history session. The social relationship information may refer to relationship information of the user in social interaction, for example, may be data of friend relationship, social group relationship, and the like of the user.
Since the interest preferences of a user are affected not only by the click history but also by social relationships, graph embedding is the most common way to utilize the social network and all historical user operations. Because the task needs both social relations and historical session information, the social relations and the historical sessions can be combined to construct the heterogeneous graph. For example, the same heterogeneous graph may be used to obtain the embeddings of users and items, where the edge set represents the transitions between users, between items, and between users and items, as in (u_1, u_2), (i_1, i_2) and (u_1, i_1). Generally, the total number of clicks of a user is much larger than the number of social relations, so direct node aggregation may lead to an information imbalance between social knowledge and historical interactions, and the influence of social relations would be weak. Therefore, to better obtain the representations of users and items, an attention mechanism can be employed in this component to mitigate the imbalance by learning hidden weights for different nodes. Since a single heterogeneous graph is maintained, information from the heterogeneous neighbor nodes of a user or an item is required in order to update the representation of that user or item.
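As a minimal sketch (using networkx, with an assumed input layout), the heterogeneous graph described above could be assembled from the social relations and historical sessions as follows:

```python
# Minimal sketch of heterogeneous-graph construction: user-user edges from
# social relations, user-item edges from clicks, and item-item edges from
# consecutive clicks within each historical session. The input layout
# (edge list and per-user session dict) is an assumption for illustration.
import networkx as nx

def build_hetero_graph(social_edges, history_sessions):
    g = nx.DiGraph()
    for u, v in social_edges:                      # (u1, u2) social relations
        g.add_edge(('user', u), ('user', v))
        g.add_edge(('user', v), ('user', u))
    for user, items in history_sessions.items():   # user -> ordered item list
        for a, b in zip(items, items[1:]):         # (i1, i2) item transitions
            g.add_edge(('item', a), ('item', b))
        for item in items:                         # (u1, i1) user-item clicks
            g.add_edge(('user', user), ('item', item))
    return g
```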
For example, the principle of node aggregation can be briefly described using the process of generating user representations (i.e., object features). The knowledge of users and items can be regarded as social influence and user preference, respectively, in order to distinguish the heterogeneous neighbors of a user. To generate a new representation for the current layer, the importance of the neighbors may be computed and an aggregation operation performed. First, the importance of each connection in K needs to be calculated:
where the concatenation operation is denoted by ||, W_2 is a learnable parameter, and the representation of node u at layer l-1 is used as input; the attention mechanism is then applied to identify the importance of node u's neighbors.
The importance score α_uv is normalized across the different neighbor nodes to obtain the attention weight e_uv, where v denotes a neighbor of node u. The node aggregation process is as follows:
where W_3 and b are learnable parameters and MLP is a multi-layer perceptron. In this way, the node representation captured by each heterogeneous graph neural network layer reflects the social influence of adjacent users and the user's preference for adjacent items. By stacking multiple heterogeneous graph neural network layers, the final node representation can be obtained. Finally, the final representation of each node is the last-layer representation G_emb of the graph, which can be provided to subsequent modules for the initialization of the object and item feature representations.
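A minimal sketch (PyTorch, assumed shapes) of one such aggregation step for a user node is given below: attention scores over the node's heterogeneous neighbors are normalized, the weighted neighbor sum is fused with the node's previous-layer representation, and an MLP produces the new representation.

```python
# Minimal sketch of heterogeneous node aggregation with attention; the exact
# parameterization (W_2, W_3, b) in the text is not reproduced literally.
import torch
import torch.nn as nn

class HeteroNodeAggregation(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.att = nn.Linear(2 * dim, 1)  # scores the pair [h_u || h_v]
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, h_u, neighbor_h):
        # h_u: (dim,) user node; neighbor_h: (n_neighbors, dim) friends + items
        scores = self.att(torch.cat(
            [h_u.expand_as(neighbor_h), neighbor_h], dim=-1)).squeeze(-1)
        alpha = torch.softmax(scores, dim=0)                 # attention weights
        influence = (alpha.unsqueeze(-1) * neighbor_h).sum(dim=0)
        # fuse previous representation with neighbor influence, then MLP
        return self.mlp(torch.cat([h_u, influence], dim=-1))
```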
Second, a preset prediction model can be trained, which may specifically be as follows:
In order to improve the efficiency of information prediction, a preset prediction model can be trained first, and then the user to be predicted is predicted by using the trained prediction model. Alternatively, the predictive model may be trained from a plurality of training samples. Specifically, the training may be performed by other devices and then provided to the information prediction apparatus, or the training may be performed by the information prediction apparatus itself.
For example, the electronic device may specifically obtain an object feature of the sample user, and a project feature of each project in a project sequence of the current session of the sample user; determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of a sample user under the current session based on the fusion characteristics of each item in the item sequence; calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample user, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the sample user under the current session based on the attention characteristic of each item; predicting the next item of the sample user based on the session interest feature and the item conversion feature of the sample user to obtain a prediction result of the sample user; and training a preset prediction model by using the prediction result and the real result of the sample user to obtain a trained prediction model.
For example, the session interest feature includes a first session interest feature and a second session interest feature, and the item conversion feature includes a first item conversion feature and a second item conversion feature;
In the process of determining the fusion characteristic of each item based on the item characteristic of each item and the position of the item in the item sequence and determining the session interest characteristic of the sample user under the current session based on the fusion characteristic of each item in the item sequence, the method further comprises: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first session interest feature and a second session interest feature; for example, as shown in fig. 2a, Q (query vector) may be discarded in the process of determining the first enhancement feature of each item in the sequence of items and determining the second enhancement feature of each item in the sequence of items, where the first round of discarding at random results in a first session interest feature and the second round of discarding at random results in a second session interest feature.
Calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample user, the item characteristic of each item in the item sequence and the association relation among the items in the item sequence, and determining the item conversion characteristic of the sample user in the current session based on the attention characteristic of each item, wherein the method further comprises the following steps: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first project conversion feature and a second project conversion feature; for example, as shown in fig. 2a, after BatchNorm is performed on the object feature and the item feature, random discarding may be performed, where a first item conversion feature is obtained after a first round of random discarding, and a second item conversion feature is obtained after a second round of random discarding.
The electronic device may specifically predict a next item of the sample user based on the object feature, the item feature, the first session interest feature, and the first item conversion feature of the sample user, to obtain a first prediction probability of the sample user; predicting the next item of the sample user based on the object feature, the item feature, the second session interest feature and the second item conversion feature of the sample user to obtain a second prediction probability of the sample user; calculating a difference between the first prediction probability and the second prediction probability, and calculating a similarity of the first prediction probability and the second prediction probability; and determining a prediction result of the sample user based on the difference value and the similarity.
For example, the fusion feature of each item is determined based on the item feature of each item and the position of the item in the item sequence, and the session interest feature of the sample user under the current session is determined based on the fusion feature of each item in the item sequence; in other words, the session interest feature of the user can be obtained by using the item-sequence-based channel. For example, an embedding layer can convert the incoming current session into item embeddings and position embeddings (i.e., item features and item position features). First, a learnable position embedding module can be introduced to map the position index onto dense vectors so as to capture the temporal impact of the input. For example, the hidden representation of a given input current session s = [i_0, i_1, …, i_{n-1}] is
xi=([ei||pi])
where e_i is the embedding vector of the item (i.e., the item feature) obtained by looking it up in G_emb, and p_i is the item position embedding vector (i.e., the item position feature). After obtaining the hidden representation X = {x_1, x_2, …, x_{n-1}} of session s, the sequential information is modeled using a multi-head attention network.
For example, a multi-head self-attention network may be used to capture the relationships between items and obtain an enhanced representation E = {e_1, e_2, …, e_{n-1}} for each item in session s after extracting the information embedded in the session, where e_{n-1} is the learned target embedding; it corresponds to a special item index and fuses information from the entire session to represent the user's current real preference. The formulas of the multi-head attention network can be as follows:
head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)
MultiHead(Q, K, V) = Concat(head_1, …, head_h)
X′ = Dropout(X)
E = MultiHead(X′, X, X)
where W_i^Q, W_i^K and W_i^V are projection matrices and d_k is the dimension of head_i. The attention mechanism may be formalized as:
In session-based recommendation, the last click helps predict the next item. Thus, to extract the historical interests of the user, the last enhanced representation e_{n-1} of E and the hidden representation X_1 = {x_1, x_2, …, x_{n-2}} are used as the inputs to another multi-head attention network:
E′ = MultiHead(e_{n-1}, X_1, X_1)
where E′ = {e′_1, e′_2, …, e′_{n-2}}.
H_s = [e′_{n-2} || e_{n-1}]
Finally, the last state of E′ and the last state of E are concatenated to obtain the representation H_s of the user's final interest (i.e., the session interest feature of the sample user under the current session).
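The item-sequence channel described above can be sketched as follows (PyTorch's nn.MultiheadAttention in batch-first layout; the head count and dropout rate are assumed hyperparameters):

```python
# Minimal sketch of the session-interest channel: one attention pass enhances
# the fused item representations, a second pass queries the history with the
# last enhanced item, and the two final states are concatenated into H_s.
import torch
import torch.nn as nn

class SessionInterestChannel(nn.Module):
    def __init__(self, dim, n_heads=2, p_drop=0.1):
        super().__init__()
        self.attn1 = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.dropout = nn.Dropout(p_drop)

    def forward(self, x):
        # x: (batch, n, dim) fused item + position embeddings of the session
        e, _ = self.attn1(self.dropout(x), x, x)   # enhanced representations E
        e_last = e[:, -1:, :]                      # e_{n-1}, the target embedding
        history = x[:, :-1, :]                     # earlier hidden states X_1
        e_prime, _ = self.attn2(e_last, history, history)
        # H_s = [last state of E' || last state of E]
        return torch.cat([e_prime[:, -1, :], e_last[:, -1, :]], dim=-1)
```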
For example, the attention feature of each item in the item sequence is calculated based on the object feature corresponding to the sample user, the item feature of each item in the item sequence and the association relationship between the items in the item sequence, and the item conversion feature of the sample user under the current session is determined based on the attention feature of each item; in other words, the item conversion feature of the user can be obtained by using the graph-based item conversion channel. The goal of the graph-based item conversion channel is to extract the current user's dynamic and personalized preferences in the current session. The previous item-based sequential channel effectively uses the user's sequence information but does not capture the user's high-level structural information. Thus, to better utilize the existing graph structure information, a lightweight model based on a graph neural network is proposed to learn a specific embedded representation of the personalized session. First, each session can be modeled as a directed graph, in which each node represents an item and each edge connects a pair of items that the user clicked in the session.
First, each user's embedding H_u (i.e., the object feature) and each item's embedding f_i (i.e., the item feature) can be looked up in G_emb from the previously built graph embedding database, and then passed through BatchNorm and Dropout, respectively, as follows:
H′u=BatchNorm(Hu)
fi′=BatchNorm(fi)
H″u=Dropout(H′u,γ)
fi″=Dropout(fi′,γ)
where BatchNorm can be used to address the internal covariate shift in the intermediate layers of the MLP, and Dropout is a powerful and widely used technique for regularizing deep neural network training; γ means that during each training iteration, neurons in that layer are randomly discarded with probability γ and do not participate in training. H″_u and f″_i are then passed through two linear transformation networks as follows:
f_i* = MLP(f″_i)
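A minimal sketch (PyTorch, with an assumed class name and hyperparameters) of this normalization pipeline is shown below; the same transform can be applied to the user embedding H_u and to each item embedding f_i.

```python
# Minimal sketch: looked-up embeddings pass through BatchNorm, Dropout with
# rate gamma, and a small MLP before initializing the session graph nodes.
import torch.nn as nn

class EmbeddingTransform(nn.Module):
    def __init__(self, dim, gamma=0.2):
        super().__init__()
        self.bn = nn.BatchNorm1d(dim)
        self.drop = nn.Dropout(gamma)   # gamma: per-neuron drop probability
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())

    def forward(self, emb):
        # emb: (batch, dim) user embeddings H_u or item embeddings f_i
        return self.mlp(self.drop(self.bn(emb)))
```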
Because each item is of different importance to the other items, an attention mechanism may be used to update the states of the nodes. For example, the user embedding may be concatenated with each item embedding f_i* to initialize the node h_i, as follows:
The node representations are then updated using a graph attention layer. For each pair of nodes (i, j), the attention coefficient can be calculated as follows:
where a^T and W are learnable parameters, j ∈ N_i indicates that j is a neighbor node of i, and S represents the length of the sequence. Finally, the high-level information H_g (i.e., the item conversion feature of the sample user under the current session) is obtained through the graph network. Based on the updated node vectors, each session can be represented by the embedding vector H_g, which is composed of the node vectors used in the graph.
For example, when predicting the next item of the sample user based on the session interest feature and the item conversion feature of the sample user to obtain the prediction result of the sample user, the item conversion feature H_g, the session interest feature H_s, the object feature H_u and the last item feature H_l of the sample user can be concatenated to obtain the final representation of the model, and a feedforward neural network is then used to obtain the final output of the model:
Hfinal=[Hg||Hs||Hu||Hl]
z=f(WzHfinal+bz)
where z represents the final output of the model, and W_z and b_z are learnable parameters. The output is then processed through a softmax function to obtain the predicted distribution, which represents the probability of each item being clicked next in the session.
For example, when training the preset prediction model by using the prediction result and the real result of the sample user to obtain the trained prediction model, the loss function can be defined as the cross entropy between the prediction result and the real label, which can be as follows:
Since previous models use the entire session for modeling to obtain the final result, considering all items in the session, including the influence of unrelated items, may be detrimental to the final result. Therefore, some features may be randomly deleted during the model input process. Because the model should learn to discard these useless features, two randomly discarded versions of the features should have little effect on the final result, which teaches the model which items should be excluded from the final representation. To this end, the dropout method, a powerful and widely used technique for regularizing the training of deep neural networks, can be employed. Dropout is efficient and performs well when training networks, but the randomness it introduces leads to a non-negligible inconsistency between training and inference. Therefore, this method can be used to randomly drop features, after which two prediction distributions can be obtained from the model, and the output distributions of the model predictions should be as consistent as possible. To measure the two distributions, the KL and JS divergences may generally be used. The KL divergence is an asymmetric measure of the difference between two probability distributions.
The JS divergence measures the similarity of two probability distributions. As a variant of the KL divergence, it solves the asymmetry problem of the KL divergence.
For example, as shown in FIG. 2b, the model obtains two distributions P_1 and P_2 through two dropout passes, and the divergence between P_1 and P_2 can be calculated in two ways, namely the KL divergence and the JS divergence. β is a scale factor that controls the trade-off between the cross-entropy loss and the self-supervised learning loss. In the inference phase, the average of the probabilities P_1 and P_2 can be used as the final predicted probability distribution.
Third, with the built graph embedding database and the trained prediction model, the next click item of the user to be predicted can be predicted, as particularly shown in FIG. 2c.
As shown in FIG. 2c, a specific process of the information prediction method may be as follows:
201. The electronic equipment acquires object characteristics of the user to be predicted and item characteristics of each item in the item sequence of the current conversation of the user to be predicted.
The user to be predicted may refer to a user needing to be predicted, for example, a user needing to predict a next click item.
Wherein the item sequence of the current session may refer to a collection of items operated by the user in the current session. For example, if the user continuously clicks on web pages within one hour, each web page may be an item of the current session, and all the web pages clicked within that hour form the item sequence of the current session.
For example, the electronic device may specifically obtain current session information of a user to be predicted, where the current session information includes a sequence of items of the user to be predicted in a current session; searching object characteristics of the user to be predicted and item characteristics of each item in an item sequence of a current session of the user to be predicted from a graph embedding database, wherein the graph embedding database is obtained by graph embedding of heterogeneous graphs, and the heterogeneous graphs are constructed based on social relations of the user and the item sequences in historical sessions of the user. For example, as shown in fig. 2d, three users, each including two session information, and their social relationship may be acquired.
The current session information may refer to information generated by a series of operations of the user in the current session, for example, may include a sequence of items of the user in the current session.
202. The electronic device determines a converged feature for each item based on the item features for each item and the location of the item in the sequence of items.
For example, the electronic device may specifically determine an item location characteristic for each item based on the location of each item in the sequence of items; and fusing the item characteristics of each item in the item sequence of the current session and the item position characteristics corresponding to each item to obtain the fusion characteristics of each item.
Wherein determining item location characteristics of an item may be accomplished by incorporating a learnable location embedding module for mapping location indices onto dense vectors to capture temporal effects of the input. For example, the electronic device may specifically obtain a location index for each item in a sequence of items of a current session, the location index being generated based on an operation time of the item; and vector mapping is carried out on the position indexes to obtain item position features corresponding to each item.
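The position-feature mapping described above can be sketched as follows (PyTorch; the maximum sequence length and the output layout are assumptions):

```python
# Minimal sketch of the learnable position embedding: each item's position
# index (ordered by operation time) is mapped to a dense vector and
# concatenated with the item feature to give the fused feature of that item.
import torch
import torch.nn as nn

class PositionFusion(nn.Module):
    def __init__(self, max_len, dim):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, dim)  # learnable position table

    def forward(self, item_emb):
        # item_emb: (n, dim) item features in click order within the session
        idx = torch.arange(item_emb.size(0))       # position indices 0..n-1
        return torch.cat([item_emb, self.pos_emb(idx)], dim=-1)
```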
203. And the electronic equipment determines the session interest characteristics of the user to be predicted in the current session based on the fusion characteristics.
For example, the electronic device may specifically determine a first enhancement feature for each item in the sequence of items based on the converged features for each item in the sequence of items and the relevance between the items in the sequence of items; determining a second enhancement feature for each item in the sequence of items based on the first enhancement feature for the last item in the sequence of items, the converged feature for each item in the sequence of items, and the correlation between items in the sequence of items; and fusing the first enhancement feature of the last item in the item sequence with the second enhancement feature of the last item in the item sequence to obtain the session interest feature of the user to be predicted under the current session.
For example, the electronic device may specifically calculate, based on the fusion feature of each item in the sequence of items, a first correlation between the items in the sequence of items, to obtain a first correlation feature corresponding to each item; and enhancing the first relevance feature of each item to obtain a first enhancement feature of each item in the sequence of items.
For example, the electronic device may specifically calculate a second correlation between the items in the sequence of items based on the first enhancement feature of the last item in the sequence of items and the fusion feature of each item in the sequence of items, to obtain a second correlation feature corresponding to each item; and enhancing the second relatedness characteristic of each item to obtain a second enhancement characteristic of each item in the sequence of items.
204. The electronic equipment calculates the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the user to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence.
For example, the electronic device may specifically construct a directed graph based on an association relationship between items in the item sequence of the current session, where each node in the directed graph corresponds to an item, and two nodes connected by each edge represent a pair of items that are clicked by the user to be predicted in the current session; calculating attention weights among nodes with directed edges in the directed graph, and determining the attention characteristics of each item in the sequence of items according to the attention weights; and aggregating the attention characteristics of each item in the item sequence to obtain the item conversion characteristics of the user to be predicted under the current session.
For example, the electronic device may initialize each node in the directed graph nodes based on the object feature and the item feature of the user to be predicted, to obtain an initialized item feature of each node; and calculating the attention weight among the nodes with directed edges in the directed graph based on the attention mechanism, and updating the initialized project characteristics of the nodes according to the attention weight to obtain the attention characteristics of each project in the project sequence.
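As a minimal sketch (assuming the input is the ordered list of clicked item ids), the directed session graph used in steps 204 and 205 could be built as follows, with the resulting adjacency feeding the attention update sketched earlier:

```python
# Minimal sketch: each consecutive pair of clicked items contributes a
# directed edge, and self-loops keep every node's neighborhood non-empty.
import torch

def session_to_adjacency(item_sequence):
    nodes = sorted(set(item_sequence))              # unique items in the session
    index = {item: k for k, item in enumerate(nodes)}
    n = len(nodes)
    adj = torch.eye(n)                              # self-loops
    for a, b in zip(item_sequence, item_sequence[1:]):
        adj[index[a], index[b]] = 1.0               # edge: a clicked before b
    return nodes, adj
```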
205. The electronic device determines project conversion characteristics of the user under the current session to be predicted based on the attention characteristics.
For example, the electronic device may specifically aggregate the attention features of each item in the sequence of items to obtain the item conversion feature of the user to be predicted in the current session.
206. And the electronic equipment predicts the next item of the user to be predicted based on the conversation interest feature and the item conversion feature of the user to be predicted, and obtains a prediction result of the user to be predicted.
For example, the electronic device may specifically obtain the item feature of the last item in the item sequence of the current session of the user to be predicted; fusing the object features, the session interest features, the project conversion features and the project features of the last project of the user to be predicted to obtain the current session features of the user to be predicted; and predicting the next item of the user to be predicted by utilizing a trained prediction model based on the current session characteristics of the user to be predicted, so as to obtain a prediction result of the user to be predicted. And recommending items, such as recommending articles, to the user to be predicted based on the prediction result of the user to be predicted.
Alternatively, in some embodiments, in addition to implementing item sequence modeling with a multi-head attention mechanism, other sequential modeling methods may be used.
Alternatively, in some embodiments, in addition to the lightweight graph attention mechanism, other graph networks may be used to implement more complex item transition modeling.
Alternatively, in some embodiments, the two different probability distributions need not be obtained through two dropout passes, and their consistency during training need not be measured by the KL and JS divergences; other means may also be used. This embodiment is only one implementation and is not limited herein.
As can be seen from the above, the present embodiment may obtain the object feature of the object to be predicted and the item feature of each item in the item sequence of the current session of the object to be predicted; then determine the fusion feature of each item based on the item feature of each item and the position of the item in the item sequence, and determine the session interest feature of the object to be predicted under the current session based on the fusion features; then calculate the attention feature of each item in the item sequence based on the object feature corresponding to the object to be predicted, the item feature of each item in the item sequence, and the association relationship between the items in the item sequence, and determine the item conversion feature of the object to be predicted under the current session based on the attention features; and then predict the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted to obtain the prediction result of the object to be predicted. The scheme builds a dual-channel model (i.e., a dual-channel attention network) from a graph attention layer and a multi-head attention mechanism to capture the user's preferences. First, by building a heterogeneous graph, the embeddings of users and items are learned from historical sessions and the social network. The attention-based channel then aims to capture the sequential pattern and interest preferences of the session. Also, each session can be converted into a graph, and the graph attention channel captures its item conversion patterns together with the embedding of the corresponding user. Finally, the features obtained from the two channels are combined to obtain the representation of the session. In order to study the influence of self-supervised learning (Self-Supervised Learning, SSL) under the session-based social recommendation (Session-based Social Recommendation, SSR) scenario, a new self-supervised learning method is introduced that does not require negative sampling for SSR and is applied in the dual-channel attention network. Specifically, since the dual-channel model contains dropout, it can be run twice on the same session. In this way, two different item probability distributions are obtained, which serve as positive samples of each other, and their difference is minimized as the SSL loss. Therefore, the dual-channel model can better learn the sequential pattern and the item conversion pattern, effectively improving the accuracy of information prediction and thereby providing more accurate suggestions. The scheme captures sequence information and item transitions through the item sequence channel and the item conversion channel, respectively, and takes advantage of both aspects to better represent user preferences. In addition, a new self-supervised training framework is provided that can greatly improve the performance of the model without negative sampling. The scheme can be used in sequence recommendation scenarios with a social network and has good generality.
In order to better implement the method, correspondingly, the embodiment of the application also provides an information prediction device which can be integrated in electronic equipment, wherein the electronic equipment can be a server or a terminal and other equipment.
For example, as shown in fig. 3, the information prediction apparatus may include an acquisition unit 301, a sequence unit 302, a conversion unit 303, and a prediction unit 304, as follows:
An obtaining unit 301, configured to obtain an object feature of an object to be predicted, and an item feature of each item in an item sequence of a current session of the object to be predicted;
A sequence unit 302, configured to determine a fusion feature of each item based on the item feature of each item and the position of the item in the item sequence, and determine a session interest feature of the object to be predicted under the current session based on the fusion feature;
A conversion unit 303, configured to calculate an attention characteristic of each item in the item sequence based on an object characteristic corresponding to the object to be predicted, an item characteristic of each item in the item sequence, and an association relationship between items in the item sequence, and determine an item conversion characteristic of the object to be predicted under the current session based on the attention characteristic;
And the prediction unit 304 is configured to predict a next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
Alternatively, in some embodiments, the sequence unit 302 may include a location subunit and a fusion subunit, as follows:
The position subunit is used for determining the position characteristics of each item based on the position of each item in the item sequence;
and the fusion subunit is used for fusing the item characteristics of each item in the item sequence of the current session and the item position characteristics corresponding to each item to obtain the fusion characteristics of each item.
Optionally, in some embodiments, the location subunit may be specifically configured to obtain a location index of each item in the sequence of items of the current session, where the location index is generated based on an operation time of the item; and vector mapping is carried out on the position indexes to obtain item position features corresponding to each item.
Optionally, in some embodiments, the sequential unit 302 may include a first enhancement subunit, a second enhancement subunit, and a sequential subunit, as follows:
The first enhancement subunit is configured to determine a first enhancement feature of each item in the sequence of items based on the fusion feature of each item in the sequence of items and a correlation between items in the sequence of items;
The second enhancement subunit is configured to determine a second enhancement feature of each item in the sequence of items based on the first enhancement feature of the last item in the sequence of items, the fusion feature of each item in the sequence of items, and the correlation between items in the sequence of items;
And the sequence subunit is used for fusing the first enhancement characteristic of the last item in the item sequence with the second enhancement characteristic of the last item in the item sequence to obtain the session interest characteristic of the object to be predicted under the current session.
Optionally, in some embodiments, the first enhancer unit may be specifically configured to calculate, based on the fusion feature of each item in the sequence of items, a first correlation between the items in the sequence of items, to obtain a first correlation feature corresponding to each item; and enhancing the first relevance feature of each item to obtain a first enhancement feature of each item in the sequence of items.
Optionally, in some embodiments, the second enhancer unit may be specifically configured to calculate, based on a fusion feature of a first enhancement feature of a last item in the sequence of items and each item in the sequence of items, a second correlation between items in the sequence of items, to obtain a second correlation feature corresponding to each item; and enhancing the second relatedness characteristic of each item to obtain a second enhancement characteristic of each item in the sequence of items.
Alternatively, in some embodiments, the conversion unit 303 may include a construction subunit, a calculation subunit, and an aggregation subunit, as follows:
The construction subunit is configured to construct a directed graph based on an association relationship between items in the item sequence of the current session, where each node in the directed graph corresponds to an item, and two nodes connected by each edge represent a pair of items that are clicked by an object to be predicted in the current session;
the computing subunit is used for computing attention weights among nodes with directed edges in the directed graph, and determining the attention characteristics of each item in the item sequence according to the attention weights;
And the aggregation subunit is used for aggregating the attention characteristics of each item in the item sequence to obtain the item conversion characteristics of the object to be predicted under the current session.
Optionally, in some embodiments, the computing subunit may be specifically configured to initialize each node in the directed graph node based on the object feature and the item feature of the object to be predicted, to obtain an initialized item feature of each node; and calculating the attention weight among the nodes with directed edges in the directed graph based on the attention mechanism, and updating the initialized project characteristics of the nodes according to the attention weight to obtain the attention characteristics of each project in the project sequence.
Optionally, in some embodiments, the obtaining unit 301 may include an obtaining subunit and a searching subunit as follows:
The obtaining subunit is configured to obtain current session information of an object to be predicted, where the current session information includes a sequence of items of the object to be predicted in a current session;
The searching subunit is configured to search the object feature of the object to be predicted and the item feature of each item in the item sequence of the current session of the object to be predicted from a graph embedding database, where the graph embedding database is obtained by graph embedding a heterogeneous graph, and the heterogeneous graph is constructed based on the social relationship of the object and the item sequence in the historical session of the object.
Optionally, in some embodiments, the information prediction apparatus may further include a building unit, where the building unit may specifically be configured to obtain a sample object set, historical session information of each sample object in the sample object set, and social relationship information between sample objects; constructing a heterogeneous graph of the sample object set based on the social relation information and the historical session information, wherein nodes in the heterogeneous graph represent objects or projects, and edges between the nodes represent association relations between projects, between the objects and between the projects and the objects; determining object features and item features of the sample object set based on the heterogeneous map; object features and item features of the sample object set are stored in a graph-embedded database.
Optionally, in some embodiments, the establishing unit may be specifically configured to map each node in the heterogeneous graph to obtain a node feature of each node; for each node, calculate the attention weights of the node's neighbor nodes through an attention mechanism based on the initial node features of the node and its neighbor nodes; perform feature fusion based on the attention weights and the node features of each neighbor node to obtain the influence feature of all neighbor nodes of the node on that node; and fuse the node feature of the node with its corresponding influence feature and process the fused feature based on a multi-layer perceptron to obtain the final node feature of the node, where the final node features of nodes corresponding to items are the item features, and the final node features of nodes corresponding to objects are the object features.
Optionally, in some embodiments, the prediction unit 304 may be specifically configured to obtain a project characteristic of a last project of the object to be predicted in the project sequence of the current session; fusing the object characteristics, the session interest characteristics, the project conversion characteristics and the project characteristics of the last project of the object to be predicted to obtain the current session characteristics of the object to be predicted; and predicting the next item of the object to be predicted by utilizing a trained prediction model based on the current session characteristics of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
Optionally, in some embodiments, the information prediction apparatus may further include a training unit, where the training unit may specifically be configured to obtain an object feature of the sample object, and an item feature of each item in the sequence of items of the current session of the sample object; determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the sample object under the current session based on the fusion characteristics of each item in the item sequence; calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the sample object under the current session based on the attention characteristic of each item; predicting the next item of the sample object based on the session interest feature and the item conversion feature of the sample object to obtain a prediction result of the sample object; and training a preset prediction model by using the prediction result and the real result of the sample object to obtain a trained prediction model.
Optionally, in some embodiments, the session interest feature includes a first session interest feature and a second session interest feature, and the item conversion feature includes a first item conversion feature and a second item conversion feature;
In the process of determining the fusion characteristic of each item based on the item characteristic of each item and the position of the item in the item sequence and determining the session interest characteristic of the sample object under the current session based on the fusion characteristic of each item in the item sequence, the method further comprises: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first session interest feature and a second session interest feature;
the method comprises the steps of calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation among the items in the item sequence, and determining the item conversion characteristic of the sample object in the current session based on the attention characteristic of each item, and further comprises the following steps: discarding hidden layer features appearing in the process for two rounds to respectively obtain a first project conversion feature and a second project conversion feature;
The training unit may be specifically configured to predict a next item of the sample object based on an object feature, an item feature, a first session interest feature, and a first item conversion feature of the sample object, to obtain a first prediction probability of the sample object; predicting the next item of the sample object based on the object feature, item feature, second session interest feature and second item conversion feature of the sample object to obtain a second prediction probability of the sample object; calculating a difference between the first prediction probability and the second prediction probability, and calculating a similarity of the first prediction probability and the second prediction probability; and determining a prediction result of the sample object based on the difference value and the similarity.
In the implementation, each unit may be implemented as an independent entity, or may be implemented as the same entity or several entities in any combination, and the implementation of each unit may be referred to the foregoing method embodiment, which is not described herein again.
As can be seen from the above, in this embodiment, the obtaining unit 301 may obtain the object feature of the object to be predicted and the item feature of each item in the item sequence of the current session of the object to be predicted; the sequence unit 302 then determines a fusion feature for each item based on the item feature of each item and the position of the item in the item sequence, and determines a session interest feature of the object to be predicted under the current session based on the fusion features; next, the conversion unit 303 calculates an attention feature of each item in the item sequence based on the object feature corresponding to the object to be predicted, the item feature of each item in the item sequence, and the association relationship between the items in the item sequence, and determines an item conversion feature of the object to be predicted under the current session based on the attention features; then, the prediction unit 304 predicts the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, and obtains the prediction result of the object to be predicted. The scheme can learn object features and item features from historical sessions and the social network by constructing a heterogeneous graph. Then, based on the item feature of each item and the position of the item in the item sequence, the session interest feature of the object to be predicted under the current session can be obtained through an item sequence channel; and based on the object feature corresponding to the object to be predicted, the item feature of each item in the item sequence and the association relationship between the items in the item sequence, the item conversion feature of the object to be predicted under the current session can be obtained through an item conversion channel. Finally, the next item of the object to be predicted is predicted by combining the features obtained from the two channels. Because this dual-channel approach can better learn the sequential pattern and the item conversion pattern, the accuracy of information prediction is effectively improved, thereby providing more accurate recommendation suggestions.
In addition, the embodiment of the application further provides an electronic device, as shown in fig. 4, which shows a schematic structural diagram of the electronic device according to the embodiment of the application, specifically:
The electronic device may include a processor 401 having one or more processing cores, a memory 402 including one or more computer-readable storage media, a power supply 403, an input unit 404, and other components. Those skilled in the art will appreciate that the electronic device structure shown in fig. 4 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components. Wherein:
The processor 401 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 402, and calling data stored in the memory 402. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, an application program, etc., and the modem processor mainly processes wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created according to the use of the electronic device, and the like. In addition, the memory 402 may include high-speed random access memory, and may further include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 402 may further include a memory controller to provide the processor 401 with access to the memory 402.
The electronic device further comprises a power supply 403 for supplying power to the various components, preferably the power supply 403 may be logically connected to the processor 401 by a power management system, so that functions of managing charging, discharging, and power consumption are performed by the power management system. The power supply 403 may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The electronic device may further include an input unit 404, which may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
Although not shown, the electronic device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 401 in the electronic device loads executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and the processor 401 executes the application programs stored in the memory 402, so as to implement various functions as follows:
Acquiring object characteristics of an object to be predicted and item characteristics of each item in an item sequence of a current session of the object to be predicted; then, determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the object to be predicted under the current session based on the fusion characteristics; then, calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic; and then, predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
As can be seen from the above, the present embodiment may obtain the object feature of the object to be predicted and the item feature of each item in the item sequence of the current session of the object to be predicted; then determine a fusion feature for each item based on the item feature of each item and the position of the item in the item sequence, and determine the session interest feature of the object to be predicted under the current session based on the fusion features; then calculate the attention feature of each item in the item sequence based on the object feature corresponding to the object to be predicted, the item feature of each item in the item sequence, and the association relationship between the items in the item sequence, and determine the item conversion feature of the object to be predicted under the current session based on the attention features; and then predict the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain the prediction result of the object to be predicted. Because the scheme learns object features and item features from historical sessions and the social network by constructing a heterogeneous graph, the session interest feature of the object to be predicted under the current session can be obtained through an item sequence channel based on the item feature of each item and its position in the item sequence, and the item conversion feature of the object to be predicted under the current session can be obtained through an item conversion channel based on the object feature corresponding to the object to be predicted, the item feature of each item in the item sequence, and the association relationship between the items in the item sequence. Finally, the next item of the object to be predicted is predicted by combining the features obtained from the two channels. Since this two-channel design learns both the sequential pattern and the item conversion pattern more effectively, the accuracy of information prediction is improved, and more accurate recommendations can therefore be provided.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application further provides a storage medium storing a plurality of instructions capable of being loaded by a processor to perform the steps of any one of the information prediction methods provided by the embodiments of the present application. For example, the instructions may perform the steps of:
Acquiring object characteristics of an object to be predicted and item characteristics of each item in an item sequence of a current session of the object to be predicted; then, determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the object to be predicted under the current session based on the fusion characteristics; then, calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic; and then, predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
Wherein the storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.
The instructions stored in the storage medium can perform the steps in any information prediction method provided by the embodiments of the present application, and therefore can achieve the beneficial effects that can be achieved by any information prediction method provided by the embodiments of the present application; for details, refer to the previous embodiments, which are not described herein again.
The information prediction method, apparatus, electronic device, and storage medium provided by the embodiments of the present application are described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope in light of the ideas of the present application. In view of the above, the content of this specification should not be construed as limiting the present application.

Claims (17)

1. An information prediction method, comprising:
acquiring current session information of an object to be predicted, wherein the current session information comprises a project sequence of the object to be predicted in a current session;
searching object characteristics of the object to be predicted and item characteristics of each item in an item sequence of a current session of the object to be predicted from a graph embedding database, wherein the graph embedding database is obtained by graph embedding of heterogeneous graphs, and the heterogeneous graphs are constructed based on social relations of the object and the item sequence in a historical session of the object;
Determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of an object to be predicted under the current session based on the fusion characteristics;
calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic;
And predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted to obtain a prediction result of the object to be predicted.
2. The method of claim 1, wherein determining the fusion characteristic for each item based on the item characteristic for each item and the location of the item in the sequence of items comprises:
Determining an item location feature for each item based on the location of each item in the sequence of items;
and fusing the item characteristics of each item in the item sequence of the current session and the item position characteristics corresponding to each item to obtain the fusion characteristics of each item.
3. The method of claim 2, wherein determining the item location characteristics for each item based on the location of each item in the sequence of items comprises:
Acquiring a position index of each item in an item sequence of a current session, wherein the position index is generated based on the operation time of the item;
And vector mapping is carried out on the position indexes to obtain item position features corresponding to each item.
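As a concrete illustration of claims 2 and 3, the sketch below assumes a learned embedding table for the position indices and element-wise addition as the fusion of item features and item position features; both are assumptions, since the claims leave the mapping and the fusion operator open.

```python
import torch
import torch.nn as nn

class PositionFusion(nn.Module):
    def __init__(self, max_len: int, dim: int):
        super().__init__()
        self.pos_table = nn.Embedding(max_len, dim)  # vector mapping of position indices

    def forward(self, item_feats: torch.Tensor) -> torch.Tensor:
        # item_feats: (L, d), ordered by the operation time of the items
        positions = torch.arange(item_feats.size(0), device=item_feats.device)
        return item_feats + self.pos_table(positions)  # fused features, (L, d)
```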
4. The method of claim 1, wherein the determining a session interest feature of the object to be predicted under the current session based on the fused feature comprises:
determining a first enhancement feature for each item in the sequence of items based on the fusion features of each item in the sequence of items and the correlation between items in the sequence of items;
Determining a second enhancement feature for each item in the sequence of items based on the first enhancement feature for the last item in the sequence of items, the converged feature for each item in the sequence of items, and the correlation between items in the sequence of items;
and fusing the first enhancement feature of the last item in the item sequence with the second enhancement feature of the last item in the item sequence to obtain the session interest feature of the object to be predicted under the current session.
5. The method of claim 4, wherein determining the first enhancement feature for each item in the sequence of items based on the fused feature for each item in the sequence of items and the correlation between items in the sequence of items comprises:
Based on the fusion characteristics of each item in the item sequence, calculating first correlation among the items in the item sequence to obtain first correlation characteristics corresponding to each item;
And enhancing the first relevance feature of each item to obtain a first enhancement feature of each item in the sequence of items.
6. The method of claim 4, wherein determining the second enhancement feature for each item in the sequence of items based on the first enhancement feature for the last item in the sequence of items, the fusion feature for each item in the sequence of items, and the correlation between items in the sequence of items, comprises:
Calculating second correlation among the items in the item sequence based on the first enhancement feature of the last item in the item sequence and the fusion feature of each item in the item sequence to obtain second correlation features corresponding to each item;
and enhancing the second relatedness characteristic of each item to obtain a second enhancement characteristic of each item in the sequence of items.
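To illustrate claims 4 to 6, the sketch below assumes that the two correlations are scaled dot-product attention, that the enhancement is a caller-supplied residual feed-forward step, and that the session interest feature is the sum of the last item's two enhancement features; each of these is an illustrative assumption rather than the claimed formulation.

```python
import torch

def session_interest(fused, ffn):
    """fused: (L, d) fusion features of the current session, in click order.
    ffn: caller-supplied enhancement network mapping (L, d) -> (L, d)."""
    d = fused.size(-1)
    # First correlation: every item attends to every item in the sequence
    attn1 = torch.softmax(fused @ fused.T / d ** 0.5, dim=-1)    # (L, L)
    first = ffn(attn1 @ fused) + fused                           # first enhancement features

    # Second correlation: the last item's first enhancement acts as the query
    attn2 = torch.softmax(fused @ first[-1] / d ** 0.5, dim=-1)  # (L,)
    second = ffn(attn2.unsqueeze(-1) * fused) + fused            # second enhancement features

    # Fuse the two enhancements of the last item into the session interest feature
    return first[-1] + second[-1]                                # (d,)
```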
7. The method according to claim 1, wherein the calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence, and the association relationship between the items in the item sequence includes:
Constructing a directed graph based on the association relation between items in the item sequence of the current session, wherein each node in the directed graph corresponds to one item, and the two nodes connected by each edge represent an item pair clicked by the object to be predicted in the current session;
calculating attention weights among nodes with directed edges in the directed graph, and determining the attention characteristics of each item in the sequence of items according to the attention weights;
The determining, based on the attention feature, the item conversion feature of the object to be predicted under the current session includes: aggregating the attention characteristics of each item in the item sequence to obtain the item conversion characteristics of the object to be predicted under the current session.
8. The method of claim 7, wherein calculating attention weights between nodes in the directed graph where directed edges exist, determining attention characteristics for each item in the sequence of items based on the attention weights, comprises:
initializing each node in the directed graph based on the object characteristics of the object to be predicted and the item characteristics, to obtain initialized item characteristics of each node;
and calculating the attention weight among the nodes with directed edges in the directed graph based on the attention mechanism, and updating the initialized project characteristics of the nodes according to the attention weight to obtain the attention characteristics of each project in the project sequence.
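A minimal sketch of claims 7 and 8 follows, assuming a single graph-attention step with a LeakyReLU score over concatenated projected features (in the spirit of a graph attention network) and mean pooling as the aggregation; the projection W and the attention vector a are hypothetical parameters.

```python
import torch
import torch.nn.functional as F

def item_conversion_feature(item_seq, node_feats, W, a):
    """item_seq: node indices in click order, e.g. [0, 1, 0, 2].
    node_feats: (V, d) initialized node features (item features conditioned on
    the object feature of the object to be predicted).
    W: (d, d) projection matrix, a: (2d,) attention vector."""
    edges = list(dict.fromkeys(zip(item_seq[:-1], item_seq[1:])))  # directed click pairs
    h = node_feats @ W
    out = node_feats.clone()
    for tgt in {t for _, t in edges}:
        srcs = [s for s, t in edges if t == tgt]
        # attention weight of each in-neighbour connected to tgt by a directed edge
        scores = torch.stack([F.leaky_relu(torch.cat([h[tgt], h[s]]) @ a) for s in srcs])
        alpha = torch.softmax(scores, dim=0)
        out[tgt] = node_feats[tgt] + (alpha.unsqueeze(-1) * h[srcs]).sum(dim=0)
    # aggregate the attention features of the items appearing in the session
    return out[list(dict.fromkeys(item_seq))].mean(dim=0)
```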
9. The method of claim 1, further comprising, prior to looking up the object features of the object to be predicted from the graph-embedded database and the item features of each item in the sequence of items of the current session of the object to be predicted:
acquiring a sample object set, historical session information of each sample object in the sample object set, and social relationship information among the sample objects;
constructing a heterogeneous graph of the sample object set based on the social relation information and the historical session information, wherein nodes in the heterogeneous graph represent objects or projects, and edges between the nodes represent association relations between projects, between the objects and between the projects and the objects;
Determining object features and item features of the sample object set based on the heterogeneous graph;
Object features and item features of the sample object set are stored in a graph-embedded database.
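For claim 9, the following sketch assembles the heterogeneous graph as a simple adjacency structure from historical sessions and social relationship pairs; the dictionary-of-sets representation and the co-click rule for item-item edges are illustrative assumptions.

```python
from collections import defaultdict

def build_hetero_graph(sessions, social_edges):
    """sessions: {user_id: [item_id, ...]} historical click sequences per object.
    social_edges: iterable of (user_id, user_id) social relationship pairs."""
    adj = defaultdict(set)
    for u, v in social_edges:                    # object-object edges
        adj[("user", u)].add(("user", v))
        adj[("user", v)].add(("user", u))
    for user, items in sessions.items():
        for item in items:                       # object-item edges
            adj[("user", user)].add(("item", item))
            adj[("item", item)].add(("user", user))
        for a, b in zip(items[:-1], items[1:]):  # item-item (co-click) edges
            adj[("item", a)].add(("item", b))
            adj[("item", b)].add(("item", a))
    return adj
```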
10. The method of claim 9, wherein the determining object features and item features of the sample object set based on the heterogeneous graph comprises:
Mapping each node in the heterogeneous graph to obtain the node characteristics of each node;
for each node, calculating the attention weights of the neighbor nodes of the node through an attention mechanism based on the initial node characteristics of the node and of its neighbor nodes;
performing feature fusion based on the attention weight and the node features of each neighbor node to obtain influence features of all neighbor nodes of the node on the node;
And fusing the node characteristics of the node and the influence characteristics corresponding to the node, and processing the fused characteristics based on a multi-layer perceptron to obtain the final node characteristics of the node, wherein the final node characteristics of the nodes corresponding to items are the item characteristics, and the final node characteristics of the nodes corresponding to objects are the object characteristics.
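For claim 10, the sketch below aggregates a node's neighbours with a learned attention score and fuses the result with the node's own feature through a small multi-layer perceptron; the attention form and the layer sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class HeteroNodeAggregator(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.att = nn.Linear(2 * dim, 1)          # scores a (node, neighbour) pair
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, node_feat, neighbor_feats):
        # node_feat: (d,), neighbor_feats: (K, d) for the K neighbours of this node
        pairs = torch.cat([node_feat.expand_as(neighbor_feats), neighbor_feats], dim=-1)
        alpha = torch.softmax(self.att(pairs).squeeze(-1), dim=0)   # attention weights
        influence = (alpha.unsqueeze(-1) * neighbor_feats).sum(0)   # influence feature
        # fuse the node's own feature with its neighbours' influence, then apply the MLP
        return self.mlp(torch.cat([node_feat, influence], dim=-1))  # final node feature
```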
11. The method according to claim 1, wherein predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted, to obtain the prediction result of the object to be predicted, includes:
acquiring the item characteristics of the last item of the object to be predicted in the item sequence of the current session;
fusing the object characteristics, the session interest characteristics, the item conversion characteristics and the item characteristics of the last item of the object to be predicted to obtain the current session characteristics of the object to be predicted;
and predicting the next item of the object to be predicted by utilizing a trained prediction model based on the current session characteristics of the object to be predicted, so as to obtain a prediction result of the object to be predicted.
12. The method of claim 11, wherein prior to predicting a next item of the object to be predicted using the trained predictive model, further comprising:
Acquiring object characteristics of a sample object and item characteristics of each item in an item sequence of a current session of the sample object;
determining fusion characteristics of each item based on the item characteristics of each item and the position of the item in the item sequence, and determining session interest characteristics of the sample object under the current session based on the fusion characteristics of each item in the item sequence;
Calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the sample object under the current session based on the attention characteristic of each item;
Predicting the next item of the sample object based on the session interest feature and the item conversion feature of the sample object to obtain a prediction result of the sample object;
and training a preset prediction model by using the prediction result and the real result of the sample object to obtain a trained prediction model.
13. The method of claim 12, wherein the session interest feature comprises a first session interest feature and a second session interest feature, the item conversion feature comprises a first item conversion feature and a second item conversion feature;
In the process of determining the fusion characteristic of each item based on the item characteristic of each item and the position of the item in the item sequence, and determining the session interest characteristic of the sample object under the current session based on the fusion characteristic of each item in the item sequence, the method further comprises: performing two rounds of discarding on the hidden-layer features appearing in the process, so as to obtain a first session interest feature and a second session interest feature respectively;
in the process of calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the sample object, the item characteristic of each item in the item sequence and the association relation among the items in the item sequence, and determining the item conversion characteristic of the sample object under the current session based on the attention characteristic of each item, the method further comprises: performing two rounds of discarding on the hidden-layer features appearing in the process, so as to obtain a first item conversion feature and a second item conversion feature respectively;
The predicting the next item of the sample object based on the session interest feature and the item conversion feature of the sample object to obtain a prediction result of the sample object includes:
predicting the next item of the sample object based on the object feature, item feature, first session interest feature and first item conversion feature of the sample object to obtain a first prediction probability of the sample object;
predicting the next item of the sample object based on the object feature, item feature, second session interest feature and second item conversion feature of the sample object to obtain a second prediction probability of the sample object;
calculating a difference between the first prediction probability and the second prediction probability, and calculating a similarity of the first prediction probability and the second prediction probability;
and determining a prediction result of the sample object based on the difference value and the similarity.
14. An information prediction apparatus, comprising:
an acquisition unit, configured to acquire current session information of an object to be predicted, wherein the current session information comprises an item sequence of the object to be predicted in the current session, and to search a graph embedding database for object features of the object to be predicted and item features of each item in the item sequence of the current session of the object to be predicted, wherein the graph embedding database is obtained by graph embedding of a heterogeneous graph, and the heterogeneous graph is constructed based on social relations of the object and the item sequence in a historical session of the object;
A sequence unit, configured to determine a fusion feature of each item based on the item feature of each item and the position of the item in the item sequence, and determine a session interest feature of the object to be predicted under the current session based on the fusion feature;
The conversion unit is used for calculating the attention characteristic of each item in the item sequence based on the object characteristic corresponding to the object to be predicted, the item characteristic of each item in the item sequence and the association relation between the items in the item sequence, and determining the item conversion characteristic of the object to be predicted under the current session based on the attention characteristic;
And the prediction unit is used for predicting the next item of the object to be predicted based on the session interest feature and the item conversion feature of the object to be predicted to obtain a prediction result of the object to be predicted.
15. A computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps in the information prediction method of any one of claims 1 to 13.
16. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the information prediction method of any one of claims 1 to 13 when the program is executed.
17. A computer program product comprising a computer program or instructions which, when executed by a processor, carries out the steps in the information prediction method of any one of claims 1 to 13.
CN202210062089.9A 2022-01-19 2022-01-19 Information prediction method, device, electronic equipment and storage medium Active CN116521972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210062089.9A CN116521972B (en) 2022-01-19 2022-01-19 Information prediction method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210062089.9A CN116521972B (en) 2022-01-19 2022-01-19 Information prediction method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116521972A CN116521972A (en) 2023-08-01
CN116521972B true CN116521972B (en) 2024-07-12

Family

ID=87389064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210062089.9A Active CN116521972B (en) 2022-01-19 2022-01-19 Information prediction method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116521972B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119467A (en) * 2019-05-14 2019-08-13 苏州大学 A kind of dialogue-based item recommendation method, device, equipment and storage medium
CN111259243A (en) * 2020-01-14 2020-06-09 中山大学 Parallel recommendation method and system based on session

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581519B (en) * 2020-05-25 2022-10-18 中国人民解放军国防科技大学 Item recommendation method and system based on user intention in conversation
CN112948681B (en) * 2021-03-12 2024-02-27 北京交通大学 Multi-dimensional feature fused time series data recommendation method

Also Published As

Publication number Publication date
CN116521972A (en) 2023-08-01

Similar Documents

Publication Publication Date Title
Zhu et al. Personalized image aesthetics assessment via meta-learning with bilevel gradient optimization
Lin et al. A survey on reinforcement learning for recommender systems
Mortlock et al. Graph learning for cognitive digital twins in manufacturing systems
JP2022529863A (en) Identity verification methods, identity verification devices, computer equipment, and computer programs
US20060167689A1 (en) System and method for predictive analysis and predictive analysis markup language
CN114386436B (en) Text data analysis method, model training method, device and computer equipment
Gui et al. Depression detection on social media with reinforcement learning
Biswas et al. Hybrid expert system using case based reasoning and neural network for classification
Chen et al. CNFRD: A Few‐Shot Rumor Detection Framework via Capsule Network for COVID‐19
Wu et al. Self-learning and explainable deep learning network toward the security of artificial intelligence of things
CN115953215B (en) Search type recommendation method based on time and graph structure
CN116521972B (en) Information prediction method, device, electronic equipment and storage medium
CN113822412A (en) Graph node marking method, device, equipment and storage medium
CN116932878A (en) Content recommendation method, device, electronic equipment, storage medium and program product
CN116452225A (en) Object classification method, device, computer equipment and storage medium
CN114119997A (en) Training method and device for image feature extraction model, server and storage medium
CN114596612A (en) Configuration method of face recognition model, recognition system, computer equipment and medium
Shen et al. Long-term multivariate time series forecasting in data centers based on multi-factor separation evolutionary spatial–temporal graph neural networks
CN115578100A (en) Payment verification mode identification method and device, electronic equipment and storage medium
CN114358364A (en) Attention mechanism-based short video frequency click rate big data estimation method
CN116522006B (en) Method and system for recommending lessons based on view self-supervision training
Ji et al. A task recommendation model in mobile crowdsourcing
CN117556150B (en) Multi-target prediction method, device, equipment and storage medium
Ni et al. Dynamic Heterogeneous Link Prediction Based on Hierarchical Attention Model
Liu et al. A Graph Neural Network Recommendation Method Integrating Multi Head Attention Mechanism and Improved Gated Recurrent Unit Algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40092118

Country of ref document: HK

GR01 Patent grant