CN113761272A - Data processing method, data processing equipment and computer readable storage medium - Google Patents

Info

Publication number
CN113761272A
Authority
CN
China
Prior art keywords
video
identification
heterogeneous
node
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110420502.XA
Other languages
Chinese (zh)
Inventor
张晗
马连洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110420502.XA
Publication of CN113761272A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/735 Filtering based on additional data, e.g. user or group profiles
    • G06F16/74 Browsing; Visualisation therefor
    • G06F16/75 Clustering; Classification
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The embodiments of the application disclose a data processing method, a data processing device, and a computer-readable storage medium related to artificial intelligence. The method includes: acquiring a video identifier of a video and associated heterogeneous information associated with the video, where the data attribute type of the video differs from that of the associated heterogeneous information; determining the video identifier and the heterogeneous information identifier corresponding to the associated heterogeneous information as identification nodes, and generating a heterogeneous graph containing the identification nodes; sampling identification nodes from the heterogeneous graph to obtain a heterogeneous sampling sequence and a homogeneous sampling sequence, where the heterogeneous sampling sequence includes at least two identification nodes belonging to different data attribute types and the homogeneous sampling sequence includes at least two identification nodes belonging to the same data attribute type; and generating a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence. With this method, the video feature vector can contain rich, diverse information, which can improve the application accuracy of the video in practical application scenarios.

Description

Data processing method, data processing equipment and computer readable storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a data processing method, device, and computer-readable storage medium.
Background
In scenarios such as video recommendation, video recall and video clustering, a vectorized representation of video features is crucial, for example for clustering video feature vectors to mine new video topics, computing the similarity between video feature vectors to recommend related videos, or feeding the video feature vectors into a video recommendation model.
Most existing methods for constructing video feature vectors perform supervised model training on prior information about the video content and take the intermediate-layer features of the model as the representation vector (also called the feature vector) of the video. For example, a video classification model is built for the classification information of videos and trained on that information; at prediction time, the high-dimensional output vector of an intermediate layer of the model is taken as the video feature vector. A video feature vector obtained in this way may include text, visual, audio and other information, but it is limited to the video content itself. A vector that covers only the text or visual information of the video itself works well mainly in video classification scenarios. In other practical scenarios, such as video recommendation, video recall or video clustering, its narrow scope (it captures only the video itself or the relations between videos) makes it difficult to accurately represent the association between the video and the application scenario, which can reduce the application accuracy of the video in those scenarios.
Disclosure of Invention
Embodiments of the present application provide a data processing method, a device, and a computer-readable storage medium, which can enable a video feature vector to contain rich multivariate information, and thus can improve the application accuracy of a video in an actual application scene.
An embodiment of the present application provides a data processing method, including:
acquiring a video identifier of a video and associated heterogeneous information associated with the video; the data attribute type of the video is different from the data attribute type of the associated heterogeneous information;
determining the video identifier and the heterogeneous information identifier associated with the heterogeneous information as identifier nodes, and generating a heterogeneous graph containing the identifier nodes;
carrying out identification node sampling on the heterogeneous graph to obtain a heterogeneous sampling sequence and a homogeneous sampling sequence; the heterogeneous sampling sequence comprises at least two identification nodes belonging to different data attribute types, and the homogeneous sampling sequence comprises at least two identification nodes belonging to the same data attribute type;
and generating a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence.
An embodiment of the present application provides a data processing apparatus, including:
the data acquisition module is used for acquiring a video identifier of a video and associated heterogeneous information associated with the video; the data attribute type of the video is different from the data attribute type of the associated heterogeneous information;
the first generation module is used for determining the video identifier and the heterogeneous information identifier associated with the heterogeneous information as identifier nodes and generating a heterogeneous graph containing the identifier nodes;
the sampling node module is used for carrying out identification node sampling on the heterogeneous graph to obtain a heterogeneous sampling sequence and a homogeneous sampling sequence; the heterogeneous sampling sequence comprises at least two identification nodes belonging to different data attribute types, and the homogeneous sampling sequence comprises at least two identification nodes belonging to the same data attribute type;
and the second generation module is used for generating the video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence.
The number of the videos is at least two, and the number of the associated heterogeneous information is at least two;
a first generation module comprising:
the first determining unit is used for determining the associated edges between the identification nodes and the edge weights of the associated edges according to the association relationship between every two videos, the association relationship between every two associated heterogeneous information and the association relationship between the videos and the associated heterogeneous information;
and the first generation unit is used for generating the heterogeneous graph according to the identification nodes, the associated edges and the edge weights of the associated edges.
The identification nodes comprise video identification nodes belonging to the video identification and heterogeneous identification nodes belonging to the heterogeneous information identification; the associated edges comprise a first associated edge, a second associated edge and a third associated edge;
a first determination unit comprising:
the first determining subunit is used for determining a first associated edge between the video identification nodes and an edge weight of the first associated edge according to the association relationship between every two videos;
the second determining subunit is configured to determine, according to an association relationship between every two pieces of association heterogeneous information, a second association edge between the heterogeneous identifier nodes and an edge weight of the second association edge;
and the third determining subunit is configured to determine, according to the association relationship between the video and the associated heterogeneous information, a third associated edge between the video identifier node and the heterogeneous identifier node, and an edge weight of the third associated edge.
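By way of illustration only, the typed identification nodes and weighted associated edges described above could be held in plain dictionaries. The class name `HeteroGraph`, its methods and the example identifiers below are hypothetical and are not taken from the application; edges are assumed to be undirected.

```python
from collections import defaultdict


class HeteroGraph:
    """Undirected graph whose nodes each carry a data attribute type."""

    def __init__(self):
        self.node_type = {}            # node id -> type, e.g. "video" or "tag"
        self.adj = defaultdict(dict)   # node id -> {neighbor id: edge weight}

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, u, v, weight):
        # Assumption: associated edges are symmetric in this sketch.
        if u not in self.node_type or v not in self.node_type:
            raise KeyError("both endpoints must be added as nodes first")
        self.adj[u][v] = weight
        self.adj[v][u] = weight

    def neighbors(self, node_id, node_type=None):
        """Return {neighbor: weight}, optionally restricted to one type."""
        return {
            n: w for n, w in self.adj[node_id].items()
            if node_type is None or self.node_type[n] == node_type
        }


g = HeteroGraph()
g.add_node("v1", "video")
g.add_node("v2", "video")
g.add_node("t1", "tag")
g.add_edge("v1", "v2", 3)   # first associated edge (video to video)
g.add_edge("v1", "t1", 1)   # third associated edge (video to heterogeneous node)
```

Keeping the type on the node rather than on the edge lets a sampler filter neighbors by data attribute type, which the sampling units described later rely on.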
The video comprises a first video and a second video; the video identification nodes comprise a first video identification node corresponding to the first video and a second video identification node corresponding to the second video;
a first determining subunit comprising:
the acquisition sequence subunit is used for acquiring valid video sequences respectively associated with N video browsing users, where N is a positive integer; the N valid video sequences include a valid video sequence L_x, where x is a positive integer less than or equal to N, the total number of valid video sequences; the valid videos in the valid video sequence L_x are ordered by the time at which the associated video browsing user browsed them; for each valid video, the ratio of its effective browsing duration to its total video duration is greater than a browsing ratio threshold;
the position determining subunit is used for determining, if the first video and the second video occupy adjacent positions in the valid video sequence L_x, that the first video and the second video have an adjacent position relationship in the valid video sequence L_x;
the sequence determining subunit is used for determining, among the N valid video sequences, each valid video sequence having the adjacent position relationship as an associated valid video sequence; an associated valid video sequence indicates that a first associated edge exists between the first video identification node and the second video identification node;
and the statistical sequence subunit is used for counting the number of associated valid video sequences and determining that number as the edge weight of the first associated edge.
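As a minimal sketch of the first associated edge, the following filters each user's browsing log into a valid video sequence and then counts, for every pair of videos, how many sequences place the pair in adjacent positions. The record layout `(timestamp, video_id, watched_seconds, total_seconds)`, the function names and the sample data are assumptions made for the example.

```python
from collections import Counter


def valid_sequence(browse_log, ratio_threshold):
    """Keep videos whose watched fraction exceeds the threshold, in time order.

    browse_log: list of (timestamp, video_id, watched_seconds, total_seconds).
    """
    ordered = sorted(browse_log, key=lambda rec: rec[0])
    return [vid for _, vid, watched, total in ordered
            if total > 0 and watched / total > ratio_threshold]


def video_edge_weights(valid_sequences):
    """Count, per unordered video pair, the sequences where the pair is adjacent."""
    weights = Counter()
    for seq in valid_sequences:
        # A set makes each sequence contribute at most once per pair.
        pairs = {frozenset((a, b)) for a, b in zip(seq, seq[1:]) if a != b}
        weights.update(pairs)
    return weights


logs = [
    [(1, "v1", 50, 60), (2, "v2", 10, 100), (3, "v3", 90, 100)],  # user A
    [(1, "v3", 30, 30), (2, "v1", 55, 60)],                       # user B
]
sequences = [valid_sequence(log, 0.8) for log in logs]
weights = video_edge_weights(sequences)
```

In this example user A's view of `v2` falls below the threshold and is dropped, so `v1` and `v3` become adjacent in both valid sequences and their edge weight is 2.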
The associated heterogeneous information comprises first associated heterogeneous information and second associated heterogeneous information; the heterogeneous identification nodes comprise a first heterogeneous identification node corresponding to the first associated heterogeneous information and a second heterogeneous identification node corresponding to the second associated heterogeneous information;
a second determining subunit comprising:
a video determining subunit, configured to determine, if the same video exists between the video associated with the first associated heterogeneous information and the video associated with the second associated heterogeneous information, the same video as the associated video; the associated video is used for representing that a second associated edge exists between the first heterogeneous identification node and the second heterogeneous identification node;
and the statistical video subunit is used for counting the video quantity of the associated video and determining the video quantity as the edge weight of the second associated edge.
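The second associated edge can be sketched in the same spirit: two heterogeneous identification nodes are connected when they share at least one associated video, and the number of shared videos becomes the edge weight. The function name and the `tag:` identifiers are hypothetical.

```python
from itertools import combinations


def hetero_edge_weights(videos_by_item):
    """videos_by_item: heterogeneous information id -> set of associated video ids."""
    weights = {}
    for a, b in combinations(sorted(videos_by_item), 2):
        shared = videos_by_item[a] & videos_by_item[b]
        if shared:                      # an edge exists only if a video is shared
            weights[(a, b)] = len(shared)
    return weights


second_edges = hetero_edge_weights({
    "tag:cats": {"v1", "v2", "v3"},
    "tag:pets": {"v2", "v3"},
    "tag:cars": {"v9"},
})
```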
The associated heterogeneous information comprises a video browsing user group; the heterogeneous identification nodes comprise user identification nodes corresponding to the video browsing user group;
a third determining subunit comprising:
the acquisition frequency subunit is used for acquiring the effective browsing frequency of the video browsing user group aiming at the video in the video browsing period; the effective browsing times refer to the times of effective browsing of the video by video browsing users in the video browsing user group;
the association determining subunit is used for determining that a third association edge exists between the video identifier node and the user identifier node if the effective browsing times are greater than the effective browsing times threshold;
and the weight determining subunit is used for determining, within the video browsing user group, the video browsing users who effectively browsed the video as associated video browsing users, and determining the number of associated video browsing users as the edge weight of the third associated edge.
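For the user-group case, a possible reading is sketched below: the edge is created only when the group's total number of effective views exceeds the threshold, and the weight is the number of distinct users in the group who effectively browsed the video. The function name and the input layout are assumptions.

```python
def user_group_edge(effective_views, count_threshold):
    """Decide whether a video is linked to a user group, and with what weight.

    effective_views: user id -> number of effective views of one video by that
    user during the browsing period (users of a single browsing user group).
    Returns the edge weight, or None when no third associated edge is created.
    """
    total_views = sum(effective_views.values())
    if total_views <= count_threshold:
        return None
    # Weight = number of distinct users in the group who effectively browsed.
    return sum(1 for count in effective_views.values() if count > 0)
```

For example, `user_group_edge({"u1": 3, "u2": 0, "u3": 2}, 4)` gives a weight of 2, because five effective views exceed the threshold of 4 and two users contributed them.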
The associated heterogeneous information comprises at least two video accounts; the association relationship comprises an account association relationship; the heterogeneous identification nodes comprise account identification nodes corresponding to the at least two video accounts respectively;
a third determining subunit comprising:
the acquiring account subunit is used for acquiring a related video account which has an account association relationship with the video in at least two video accounts; the account association relation is used for representing that a video publishing user publishes a video through an associated video account;
the account identification determining subunit is used for determining an account identification node corresponding to the associated video account as an associated account identification node; and a third associated edge exists between the video identification node and the associated account identification node, and the edge weight of the third associated edge is a constant parameter.
Wherein the associated heterogeneous information comprises at least two video tags; the association relationship comprises a tag association relationship; the heterogeneous identification nodes comprise tag identification nodes corresponding to the at least two video tags respectively;
a third determining subunit comprising:
the acquiring tag subunit is used for acquiring, from the at least two video tags, an associated video tag which has a tag association relationship with the video; the tag association relationship indicates that the video is marked with the associated video tag;
the tag determining subunit is used for determining the tag identification node corresponding to the associated video tag as an associated tag identification node; a third associated edge exists between the video identification node and the associated tag identification node, and the edge weight of the third associated edge is a constant parameter.
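The account and tag cases differ from the user-group case only in that the edge weight is fixed. A short sketch follows; the value 1 for the constant and the `("account", ...)` / `("tag", ...)` node keys are assumptions, since the text only calls the weight "a constant parameter".

```python
EDGE_WEIGHT = 1  # assumed value; the text only says "a constant parameter"


def constant_weight_edges(video_id, account_id, tags):
    """Return third associated edges from one video to its account and tags."""
    edges = {(video_id, ("account", account_id)): EDGE_WEIGHT}
    for tag in tags:
        edges[(video_id, ("tag", tag))] = EDGE_WEIGHT
    return edges


third_edges = constant_weight_edges("v1", "acc9", ["cats", "pets"])
```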
Wherein, the sampling node module includes:
the first acquisition unit is used for acquiring a random sampling heterogeneous path and a random sampling homogeneous path; the random sampling heterogeneous path indicates the type sampling order, that is, the order in which data attribute types are sampled when identification nodes of different data attribute types are sampled; the random sampling homogeneous path indicates the data attribute type that is sampled when identification nodes of the same data attribute type are sampled;
the first sampling unit is used for randomly sampling the identification nodes in the heterogeneous graph according to the type sampling order indicated by the random sampling heterogeneous path to obtain a heterogeneous sampling sequence;
and the second sampling unit is used for determining the data attribute type indicated by the random sampling homogeneous path as a target data attribute type, and randomly sampling the identification nodes belonging to the target data attribute type in the heterogeneous graph to obtain a homogeneous sampling sequence.
Wherein, the first sampling unit includes:
the fourth determining subunit is configured to determine, according to the type sampling order, the data attribute type required for the j-th sample in the random sampling heterogeneous path as the data attribute type to be sampled; j is a positive integer less than or equal to S, and S is the total number of identification nodes to be sampled based on the random sampling heterogeneous path;
the sampling target subunit is used for sampling a target identification node from the heterogeneous graph according to the sampled node set and the data attribute type to be sampled; the data attribute type to which the target identification node belongs is the data attribute type to be sampled; the sampled node set includes the identification nodes already sampled;
the node adding subunit is used for adding the target identification node into the sampled node set if j is smaller than S;
and the sequence generation subunit is configured to generate a heterogeneous sampling sequence according to the sampled node set and the target identification node if j is equal to S, where the target identification node is a last identification node in the heterogeneous sampling sequence.
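The loop over j = 1..S described above might look as follows. This is a sketch under stated assumptions: the graph is given as two dictionaries, the first node is drawn uniformly from all nodes of the required type (the text does not say how the first node is chosen), and later nodes are drawn from the neighbors of the last sampled node in proportion to edge weight. The function name `metapath_walk` is hypothetical.

```python
import random


def metapath_walk(node_type, adj, type_order, rng=random):
    """Sample one heterogeneous sequence following `type_order`.

    node_type:  node id -> data attribute type
    adj:        node id -> {neighbor id: edge weight}
    type_order: e.g. ["video", "tag", "video"]; its length plays the role of S.
    Returns None if the walk reaches a node with no neighbor of the required type.
    """
    sampled = []                                  # the "sampled node set"
    for wanted in type_order:                     # j = 1 .. S
        if not sampled:
            # First step: any node of the required type (uniform, an assumption).
            pool = sorted(n for n, t in node_type.items() if t == wanted)
            weights = [1] * len(pool)
        else:
            nbrs = adj.get(sampled[-1], {})
            pool = sorted(n for n in nbrs if node_type[n] == wanted)
            weights = [nbrs[n] for n in pool]
        if not pool:
            return None
        sampled.append(rng.choices(pool, weights=weights, k=1)[0])
    return sampled


node_type = {"v1": "video", "v2": "video", "t1": "tag"}
adj = {
    "v1": {"t1": 1, "v2": 3},
    "v2": {"t1": 1, "v1": 3},
    "t1": {"v1": 1, "v2": 1},
}
walk = metapath_walk(node_type, adj, ["video", "tag", "video"], random.Random(7))
```

Whatever nodes are drawn, the types along `walk` follow the requested order video, tag, video.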
The sampled node set comprises a preceding adjacent identification node, which is the last identification node in the sampled node set;
a sampling target subunit comprising:
the acquisition node subunit is used for acquiring, from the heterogeneous graph and according to the data attribute type to be sampled, w target preselected identification nodes that each have an associated edge with the preceding adjacent identification node, where w is a positive integer; the data attribute type to which each of the w target preselected identification nodes belongs is the data attribute type to be sampled;
the summation weight subunit is used for acquiring the edge weight of the associated edge between the preceding adjacent identification node and each target preselected identification node, and summing the acquired edge weights to obtain a total edge weight; the w target preselected identification nodes include a target preselected identification node Y_m, where m is a positive integer less than or equal to w;
the probability determining subunit is used for determining the ratio of the edge weight Z_m of the associated edge between the preceding adjacent identification node and the target preselected identification node Y_m to the total edge weight as the random sampling probability of the target preselected identification node Y_m;
the sampling node subunit is used for randomly sampling the w target preselected identification nodes according to the random sampling probability corresponding to each target preselected identification node to obtain the target identification node; in the heterogeneous sampling sequence, the preceding adjacent identification node immediately precedes the target identification node.
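The probability rule above, in which each candidate's probability is its edge weight divided by the total edge weight of all w candidates, can be written out directly. The function names are hypothetical, and the candidate dictionary is assumed to be non-empty, matching the requirement that w is a positive integer.

```python
import random


def sampling_probabilities(candidates):
    """candidates: {node id: edge weight to the preceding node} -> {node id: prob}."""
    total = sum(candidates.values())          # total edge weight
    return {node: weight / total for node, weight in candidates.items()}


def sample_next(candidates, rng=random):
    """Draw one target identification node according to those probabilities."""
    nodes = list(candidates)
    probs = sampling_probabilities(candidates)
    return rng.choices(nodes, weights=[probs[n] for n in nodes], k=1)[0]


probs = sampling_probabilities({"t1": 1, "t2": 3})   # {"t1": 0.25, "t2": 0.75}
```

With edge weights 1 and 3, the total edge weight is 4, so the heavier edge is followed three times out of four on average.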
The homogeneous sampling sequence comprises a first homogeneous sampling sequence and a second homogeneous sampling sequence;
a second sampling unit comprising:
the first generation subunit is used for randomly sampling the video identification nodes in the heterogeneous graph if the target data attribute type is the data attribute type corresponding to the video identification nodes, and generating a first homogeneous sampling sequence containing at least two video identification nodes; the first homogeneous sampling sequence is used for representing the topological relation among the video identification nodes in the heterogeneous graph;
the second generation subunit is used for randomly sampling the heterogeneous identification nodes in the heterogeneous graph if the target data attribute type is the data attribute type corresponding to the heterogeneous identification nodes, and generating a second homogeneous sampling sequence containing at least two heterogeneous identification nodes; the second homogeneous sampling sequence is used for representing the topological relation among the heterogeneous identification nodes in the heterogeneous graph.
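A homogeneous sampling sequence can be sketched as a random walk that only ever steps to neighbors of the target data attribute type. Weighting each step by edge weight and stopping early at a dead end are assumptions of this example; the text above only requires that the resulting sequence contain at least two nodes of the same type, so a caller would discard shorter walks.

```python
import random


def homogeneous_walk(node_type, adj, target_type, length, rng=random):
    """Random walk that only visits nodes of `target_type`."""
    starts = sorted(n for n, t in node_type.items() if t == target_type)
    if not starts:
        return []
    walk = [rng.choice(starts)]
    while len(walk) < length:
        nbrs = adj.get(walk[-1], {})
        pool = sorted(n for n in nbrs if node_type[n] == target_type)
        if not pool:
            break                                 # dead end: stop early
        walk.append(rng.choices(pool, weights=[nbrs[n] for n in pool], k=1)[0])
    return walk


node_type = {"v1": "video", "v2": "video", "t1": "tag"}
adj = {
    "v1": {"t1": 1, "v2": 3},
    "v2": {"t1": 1, "v1": 3},
    "t1": {"v1": 1, "v2": 1},
}
video_walk = homogeneous_walk(node_type, adj, "video", 4, random.Random(1))
```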
Wherein, the second generation module comprises:
a second determining unit, configured to determine the heterogeneous sampling sequence and the homogeneous sampling sequence as at least two random sampling sequences; the at least two random sampling sequences include a random sampling sequence S_a, where a is a positive integer less than or equal to the total number of the at least two random sampling sequences;
a second obtaining unit, configured to obtain the real encoding label of each identification node in the random sampling sequence S_a; the random sampling sequence S_a includes an identification node D_b and adjacent identification nodes having a position association relationship with the identification node D_b, where b is a positive integer less than or equal to the total number of identification nodes in the random sampling sequence S_a; the vector dimension of each real encoding label is equal to the total number of identification nodes in the heterogeneous graph;
the second obtaining unit is further configured to input the real encoding label C_b of the identification node D_b into an initial word encoding model to obtain predicted encoding labels of the adjacent identification nodes;
the adjusting model unit is used for adjusting model parameters in the initial word encoding model according to the real encoding labels of the adjacent identification nodes and the predicted encoding labels of the adjacent identification nodes to obtain a target word encoding model;
and the second generation unit is used for inputting the video identifier into the target word encoding model to obtain the video feature vector corresponding to the video identifier.
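The training data implied by these units can be illustrated without committing to a particular model. The sketch below builds one-hot real encoding labels whose dimension equals the number of identification nodes in the graph, then pairs each node D_b with its adjacent nodes in a sampling sequence. The setup resembles skip-gram style word embedding training, but the text does not name the model, so that resemblance, the window size of 1 and the function names are all assumptions.

```python
def one_hot(index, size):
    """Real encoding label: a vector with a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def training_pairs(sequences, all_nodes, window=1):
    """Build (input label, target label) pairs for nodes adjacent in a sequence.

    all_nodes: every identification node in the heterogeneous graph, so that
    each label has one dimension per node.
    """
    index = {node: i for i, node in enumerate(sorted(all_nodes))}
    size = len(index)
    pairs = []
    for seq in sequences:
        for b, center in enumerate(seq):
            lo, hi = max(0, b - window), min(len(seq), b + window + 1)
            for k in range(lo, hi):
                if k != b:
                    pairs.append((one_hot(index[center], size),
                                  one_hot(index[seq[k]], size)))
    return pairs


pairs = training_pairs([["v1", "t1", "v2"]], ["t1", "v1", "v2"])
```

A model trained on such pairs would be adjusted so that its prediction for the input label approaches the target label; the learned representation of a video identifier then serves as its video feature vector.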
One aspect of the present application provides a computer device, comprising: a processor, a memory, a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is used for providing a data communication function, the memory is used for storing a computer program, and the processor is used for calling the computer program to enable the computer device to execute the method in the embodiment of the application.
An aspect of the present embodiment provides a computer-readable storage medium, in which a computer program is stored, where the computer program is adapted to be loaded by a processor and to execute the method in the present embodiment.
An aspect of an embodiment of the present application provides a computer program product or a computer program, where the computer program product or the computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium; the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method in the embodiment of the present application.
In the embodiments of the present application, the video feature vector is generated according to a heterogeneous sampling sequence and a homogeneous sampling sequence. Because the heterogeneous sampling sequence includes at least two identification nodes belonging to different data attribute types, the video feature vector can capture the association relationships between identification nodes of different data attribute types. Similarly, because the homogeneous sampling sequence includes at least two identification nodes belonging to the same data attribute type, the video feature vector can capture the association relationships between identification nodes determined by the associated heterogeneous information, whose data attribute type differs from that of the video. The video feature vector can therefore cover both the features of the associated heterogeneous information and the features of the association relationships between the video and that information; that is, it contains diverse information features. When the video feature vector is applied in a practical scenario, such as video recommendation, video recall or video clustering, these diverse features allow it to represent the association between the video and the application scenario accurately, which can improve the application accuracy of the video in that scenario.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1a is a schematic diagram of a system architecture according to an embodiment of the present application;
FIG. 1b is a schematic view of a data processing method provided in an embodiment of the present application;
FIG. 2 is an overall framework diagram of learning a video feature vector based on heterogeneous information according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 5 is a schematic view of a data processing scenario provided in an embodiment of the present application;
FIG. 6 is a schematic view of a data processing scenario provided in an embodiment of the present application;
FIG. 7 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 8 is a schematic view of a data processing scenario provided in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
For ease of understanding, some terms are first briefly explained:
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines have the functions of perception, reasoning and decision making. Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Therefore, the research in this field will involve natural language, i.e. the language that people use everyday, so it is closely related to the research of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic question and answer, knowledge mapping, and the like.
Heterogeneous Graph (Heterogeneous Graph), a topological Graph, includes at least one type of node, and at least one type of edge. In the application, the at least one type of node may include a video identifier node and a heterogeneous identifier node, where the video identifier node is determined according to a video identifier, the heterogeneous identifier node is determined according to a heterogeneous information identifier, and a data attribute type to which the video identifier belongs is different from a data attribute type to which the associated heterogeneous identifier belongs. The at least one type of edge may include a first associated edge between the video identification nodes, a second associated edge between the heterogeneous identification nodes, and a third associated edge between the video identification nodes and the heterogeneous identification nodes.
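The node and edge types defined above can be pictured with a small data structure. The following is only a minimal sketch: the class name `HeteroGraph`, its methods, the type strings and the default edge weight are hypothetical names chosen for illustration, not part of the application.

```python
from collections import defaultdict


class HeteroGraph:
    """Minimal heterogeneous graph: typed nodes and weighted, undirected edges."""

    def __init__(self):
        self.node_type = {}           # node id -> data attribute type
        self.adj = defaultdict(dict)  # node id -> {neighbor id: edge weight}

    def add_node(self, node_id, attr_type):
        self.node_type[node_id] = attr_type

    def add_edge(self, u, v, weight=1.0):
        # Both endpoints must already be registered as nodes.
        if u not in self.node_type or v not in self.node_type:
            raise KeyError("both endpoints must be added as nodes first")
        self.adj[u][v] = weight
        self.adj[v][u] = weight

    def edge_kind(self, u, v):
        """Classify an edge as a first, second or third associated edge."""
        tu, tv = self.node_type[u], self.node_type[v]
        if tu == "video" and tv == "video":
            return "first"    # video node <-> video node
        if tu != "video" and tv != "video":
            return "second"   # heterogeneous node <-> heterogeneous node
        return "third"        # video node <-> heterogeneous node


g = HeteroGraph()
for node_id, attr_type in [
    ("vid1", "video"), ("vid2", "video"), ("vid3", "video"),
    ("pid6", "account"), ("tid8", "tag"), ("tid9", "tag"),
]:
    g.add_node(node_id, attr_type)

# Placeholder weights of 1.0; the application derives weights elsewhere.
g.add_edge("vid2", "vid3")
g.add_edge("tid8", "tid9")
g.add_edge("vid1", "pid6")

pairs = [("vid2", "vid3"), ("tid8", "tid9"), ("vid1", "pid6")]
kinds = {pair: g.edge_kind(*pair) for pair in pairs}
```

Here `kinds` maps each example edge to the edge type it would fall under in the definition above.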
The scheme provided by the embodiment of the application relates to technologies such as artificial intelligence natural language processing and deep learning, and is specifically explained by the following embodiment.
Referring to fig. 1a, fig. 1a is a schematic diagram of a system architecture according to an embodiment of the present disclosure. As shown in fig. 1a, the system may include a server 10a and a user terminal cluster, and the user terminal cluster may include a user terminal 10b, a user terminal 10c, and a user terminal 10d. It should be understood that the system may include one or more user terminals, and the number of user terminals is not limited herein.
There may be communication connections among the user terminals in the cluster; for example, there may be a communication connection between the user terminal 10b and the user terminal 10c, and a communication connection between the user terminal 10b and the user terminal 10d. Meanwhile, any user terminal in the user terminal cluster may have a communication connection with the server 10a; for example, a communication connection exists between the user terminal 10b and the server 10a, and a communication connection exists between the user terminal 10c and the server 10a. The manner of the communication connection is not limited: the connection may be made directly or indirectly through wired communication, directly or indirectly through wireless communication, or in another manner, which is not limited herein.
It should be understood that each user terminal in the user terminal cluster shown in fig. 1a may be installed with an application client, and when the application client runs in a user terminal, it may exchange data with the server 10a shown in fig. 1a through the above-mentioned communication connection. The application client can be an application client with video loading and playing functions, such as a social client, a multimedia client (e.g., a video client), an entertainment client (e.g., a game client), an education client, a live broadcast client, and the like. The application client may be an independent client, or may be an embedded sub-client integrated in another client (for example, a social client, an education client, a multimedia client, and the like), which is not limited herein.
The server 10a provides services for the user terminal cluster through the communication connection. When a user terminal (which may be the user terminal 10b, the user terminal 10c, or the user terminal 10d) acquires a video A and needs to process it, for example to query a video C similar to the video A or to acquire classification information (e.g., sports, movies, etc.) of the video A, the user terminal may send the video A to the server 10a through the application client. Referring to fig. 1b, fig. 1b is a schematic view of a scene of a data processing method according to an embodiment of the present application, and fig. 1b takes a scene of recommending videos associated with a video 10f as an example. When a video browsing user browses the video 10f through the user terminal 10b and wants to view videos associated with the video 10f, as shown in fig. 1b, the video browsing user may click on the search control 10g in the user terminal 10b, and the user terminal 10b then sends the video 10f to the server 10a through the communication connection. After the server 10a obtains the video 10f, it obtains the video identifier of the video 10f, such as the video identifier vid1 illustrated in fig. 1b. It is understood that the video identifier vid1 is the identifier of the video 10f in the video database 10h.
The video database 10h includes a large number of videos (including the video 10f) and a target word coding model trained in advance. The server 10a inputs the video identifier vid1 into the target word coding model and obtains a video feature vector 10e of the video 10f. It is noted that the video feature vector 10e is not generated from the content of the video 10f itself. Instead, the vector may include information features of the associated heterogeneous information associated with the video 10f, such as the user features (which may be understood as browsing behavior features of video browsing users), account features (features of the account that published the video 10f), and tag features (i.e., features of the video tags carried by the video 10f) shown in fig. 1b. This enables the video feature vector 10e to accurately represent the association between the video 10f and other videos.
The server 10a obtains the video feature vector of each video in the video database 10h, such as the video feature vectors 10i, … and 10j illustrated in fig. 1b. Like the video feature vector 10e, the video feature vectors 10i, … and 10j may include diversified information features. The server 10a may then calculate the similarity between the video feature vector 10e and each of the video feature vectors 10i, … and 10j, and use the video feature vectors with high similarity to the video feature vector 10e as target video feature vectors. The server obtains the video identifiers corresponding to the target video feature vectors, such as the video identifier vid5, the video identifier vid10, and the video identifier vid100 illustrated in fig. 1b, and obtains the videos corresponding to these video identifiers, such as the video 10m, the video 10n, and the video 10p illustrated in fig. 1b. It may be understood that the videos 10m, 10n, and 10p are determined not only according to the video 10f, but also according to the video tags carried by the video 10f, the video publishing account that published the video 10f, or the video browsing users who browsed the video 10f. That is, with the video feature vector generated by the present application, hot video topics associated with information such as video tags or video publishing accounts can be recommended to video browsing users.
Subsequently, the server 10a sends the video 10m, the video 10n, and the video 10p to the application client of the user terminal 10b. After the application client of the user terminal 10b receives the video 10m, the video 10n, and the video 10p sent by the server 10a, it may display them on its screen. The server 10a may store the video 10f, the video identifier vid1, the video 10m, the video 10n, and the video 10p in the video database 10h in an associated manner, so that when the video 10f sent by a user terminal is obtained again, the server may directly return the video 10m, the video 10n, and the video 10p to the user terminal that sent the video 10f. The video database 10h can be regarded as an electronic file cabinet, i.e., a place for storing electronic files (which in this application may refer to the video 10f, the video identifier vid1, the video 10m, the video 10n, and the video 10p), and the server 10a can perform operations such as adding, querying, updating, and deleting on these files. A "database" is a collection of data that is stored together in a manner that can be shared by multiple users, has as little redundancy as possible, and is independent of the application.
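The similarity step can be sketched as follows. The application does not name a similarity measure at this point, so cosine similarity is assumed here; the function names `cosine` and `top_similar`, the toy vectors, and the extra identifier `vid7` are all hypothetical and only illustrate ranking candidate videos against a query video.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_similar(query_id, vectors, n=3):
    """Return the n video identifiers whose vectors are closest to the query."""
    q = vectors[query_id]
    scored = [(vid, cosine(q, vec)) for vid, vec in vectors.items() if vid != query_id]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]


# Made-up 3-dimensional feature vectors keyed by video identifier.
vectors = {
    "vid1": [1.0, 0.0, 1.0],
    "vid5": [0.9, 0.1, 1.1],
    "vid10": [1.0, 0.2, 0.8],
    "vid7": [-1.0, 1.0, 0.0],
}

result = top_similar("vid1", vectors, n=2)
target_ids = [vid for vid, _ in result]
```

With these toy vectors, `target_ids` contains the two identifiers most similar to `vid1`; the corresponding videos would then be looked up in the video database.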
Optionally, if the target word coding model trained in advance is locally stored in the user terminal, the user terminal may locally input the video identifier to the target word coding model to obtain a video feature vector of the video, and then perform a downstream task according to the video feature vector of the video. Since the training of the target word coding model involves a large amount of off-line computation, the local target word coding model of the user terminal may be sent to the user terminal after being trained by the server 10 a.
Further, please refer to fig. 2, which is an overall framework diagram of learning a video feature vector based on heterogeneous information according to an embodiment of the present application. As shown in fig. 2, the framework may comprise three parts:
1) Graph construction, i.e., constructing a heterogeneous graph. A large number of videos and a large amount of associated heterogeneous information are first acquired, where the associated heterogeneous information is information associated with the videos, but the data attribute types of the videos and of the associated heterogeneous information are different. It can be understood that the associated heterogeneous information may include heterogeneous information of one data attribute type or of multiple data attribute types; the number of data attribute types of the associated heterogeneous information is not limited in the present application and may be set according to the actual application scenario.
For convenience of understanding, throughout this text the associated heterogeneous information is exemplified by video browsing users, video publishing accounts and video tags, that is, the associated heterogeneous information in the application includes video browsing users, video publishing accounts and video tags. The data attribute type corresponding to a video browsing user is a user attribute, the data attribute type corresponding to a video publishing account is an account attribute, and the data attribute type corresponding to a video tag is a tag attribute. Obviously, the data attribute types corresponding to the various kinds of associated heterogeneous information also differ from one another.
The application obtains the video identifier Vid of each video, the user identifier Gid of each video browsing user, the account identifier Pid of each video publishing account, and the tag identifier Tid of each video tag. It should be noted that, when the heterogeneous graph is constructed, a user identifier Gid refers to at least one video browsing user with the same attributes; for example, all 20-year-old female users in Shenzhen share one identifier. The user identifier Gid can therefore be understood as the identifier of a video browsing user group.
The identification nodes are generated from the video identifiers Vid, the user identifiers Gid, the account identifiers Pid, and the tag identifiers Tid. They may include video identification nodes generated according to the video identifiers Vid (such as the video identification node 1, the video identification node 2, and the video identification node 3 illustrated in fig. 2), user identification nodes generated according to the user identifiers Gid (such as the user identification node 4 and the user identification node 5 illustrated in fig. 2), account identification nodes generated according to the account identifiers Pid (such as the account identification node 6 and the account identification node 7 illustrated in fig. 2), and tag identification nodes generated according to the tag identifiers Tid (such as the tag identification node 8 and the tag identification node 9 illustrated in fig. 2).
The application determines the associated edges by using the association relationships among the identification nodes, and constructs the heterogeneous graph from the identification nodes and the associated edges. Each associated edge carries an edge weight, and the edge weight is determined according to the association relationship between the two identification nodes connected by the associated edge. For the specific process of determining the associated edges and their edge weights according to the association relationships between the identification nodes, reference is made to the embodiment corresponding to fig. 4 below, which is not described here for the moment.
2) Random sampling. The identification nodes in the heterogeneous graph are randomly sampled according to a preset random sampling heterogeneous path. For example, according to the random sampling heterogeneous path Gid-Vid-Tid-Vid-Gid illustrated in fig. 2, the identification nodes in the heterogeneous graph are randomly sampled to generate a heterogeneous sampling sequence, such as user identification node 4 - video identification node 1 - tag identification node 8 - video identification node 2 - user identification node 5 (which may be abbreviated as Gid4-Vid1-Tid8-Vid2-Gid5). This sequence may reflect the topological relation among video browsing users, videos and video tags in the heterogeneous graph. Fig. 2 further illustrates a random sampling heterogeneous path Gid-Vid-Pid-Vid-Gid; a heterogeneous sampling sequence generated by randomly sampling identification nodes according to this path may reflect the topological relation among video browsing users, videos, and video publishing accounts in the heterogeneous graph.
In addition, the identification nodes belonging to a target data attribute type in the heterogeneous graph are randomly sampled according to a preset random sampling isomorphic path, where the target data attribute type is the data attribute type indicated by the random sampling isomorphic path. For example, for the random sampling isomorphic path Vid-Vid-Vid-Vid-Vid illustrated in fig. 2, the target data attribute type is the video attribute, and the video identification nodes in the heterogeneous graph are randomly sampled according to this path to generate an isomorphic sampling sequence, such as Vid1-Vid2-Vid3-Vid2-Vid1. This sequence can reflect the topological relations among videos in the heterogeneous graph. Fig. 2 also illustrates a random sampling isomorphic path Tid-Tid; an isomorphic sampling sequence generated by randomly sampling tag identification nodes according to this path can reflect the topological relation between tags in the heterogeneous graph.
The application takes 2 random sampling heterogeneous paths and 2 random sampling isomorphic paths as examples, namely the random sampling heterogeneous path Gid-Vid-Tid-Vid-Gid, the random sampling heterogeneous path Gid-Vid-Pid-Vid-Gid, the random sampling isomorphic path Vid-Vid, and the random sampling isomorphic path Tid-Tid shown in fig. 2. The server can obtain a large number of random sampling sequences (including isomorphic sampling sequences and heterogeneous sampling sequences) according to these 4 sampling paths.
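The path-guided sampling described above can be sketched as a random walk whose next node must have the type required by the path. This is only an illustrative sketch: the toy edge list, the placeholder weights, the choice to draw neighbors in proportion to edge weight, and the helper names `node_type`, `build_adj` and `sample_path` are all assumptions, not details given in the application.

```python
import random

# Toy graph loosely following fig. 2; the edges and weights are made up.
EDGES = [
    ("gid4", "vid1", 1.0), ("gid5", "vid2", 1.0),
    ("vid1", "tid8", 1.0), ("vid2", "tid8", 1.0),
    ("vid1", "pid6", 1.0), ("vid2", "pid6", 1.0),
    ("vid1", "vid2", 2.0), ("vid2", "vid3", 1.0),
    ("tid8", "tid9", 1.0),
]


def node_type(node):
    """Read the data attribute type off the identifier prefix (gid/vid/pid/tid)."""
    return node[:3]


def build_adj(edges):
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def sample_path(adj, type_path, rng):
    """Random walk in which the i-th node has type type_path[i].

    Neighbors are drawn in proportion to edge weight (an assumption).
    Returns None if no neighbor of the required type exists.
    """
    starts = sorted(n for n in adj if node_type(n) == type_path[0])
    if not starts:
        return None
    walk = [rng.choice(starts)]
    for wanted in type_path[1:]:
        candidates = [(n, w) for n, w in adj[walk[-1]].items() if node_type(n) == wanted]
        if not candidates:
            return None
        nodes, weights = zip(*candidates)
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


adj = build_adj(EDGES)
rng = random.Random(0)
hetero_path = ["gid", "vid", "tid", "vid", "gid"]
homo_path = ["vid"] * 5
hetero_seq = sample_path(adj, hetero_path, rng)  # a Gid-Vid-Tid-Vid-Gid style walk
homo_seq = sample_path(adj, homo_path, rng)      # a walk over video nodes only
```

Calling `sample_path` repeatedly with the same few paths yields many sequences, which matches the idea of producing a large number of sampling sequences from a small number of paths.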
3) Training an initial word coding model. The heterogeneous sampling sequences and the isomorphic sampling sequences generated by the random sampling are used as the input of an initial word coding model, the context relation of the identification nodes in each sequence is used as the constraint condition of the initial word coding model, and the representation vectors of the identification nodes in the heterogeneous graph are learned. The initial word coding model illustrated in fig. 2 may be a skip-gram model (a word vector learning model that predicts the context from a central word), i.e., the current identification node may be used to predict its context identification nodes.
Referring to fig. 2 again, in a training process, the input layer of the initial word coding model obtains the one-hot coding vector of an identification node, and the dimension of the one-hot coding vector is equal to the total number V of identification nodes in the heterogeneous graph. Through one hidden-layer mapping, the representation features of the identification node are reduced to h dimensions, and after output layer normalization (softmax), a V-dimensional output probability distribution is finally obtained. Assuming that the window size is k, the k identification nodes before and after the current identification node are predicted according to the current identification node, and finally k V-dimensional probability distributions are output, as shown in fig. 2.
Through the training, the target word coding model can be obtained. The model includes a mapping matrix $W \in \mathbb{R}^{V \times h}$ for the videos. Using the mapping matrix $W \in \mathbb{R}^{V \times h}$ and a video identifier, the video feature vector of each video in training can be obtained.
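The layer structure described above (a V-dimensional one-hot input, an h-dimensional hidden layer, and a softmax over V nodes) can be sketched in plain Python. This is a minimal forward pass only, assuming random untrained weights; the names `W_in`, `W_out`, `one_hot`, `vec_times_matrix` and `softmax`, and the sizes V = 9 and h = 4, are illustrative assumptions.

```python
import math
import random


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_times_matrix(vec, matrix):
    """Row vector (length rows) times matrix (rows x cols) -> vector of length cols."""
    cols = len(matrix[0])
    return [sum(vec[r] * matrix[r][c] for r in range(len(matrix))) for c in range(cols)]


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


V, h = 9, 4  # 9 identification nodes, 4 hidden dimensions (toy sizes)
rng = random.Random(0)
W_in = [[rng.uniform(-0.5, 0.5) for _ in range(h)] for _ in range(V)]   # V x h mapping matrix
W_out = [[rng.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(h)]  # h x V output weights

node_index = 2
x = one_hot(node_index, V)
hidden = vec_times_matrix(x, W_in)               # h-dimensional representation
probs = softmax(vec_times_matrix(hidden, W_out))  # V-dimensional distribution

# Multiplying a one-hot vector by W_in just selects one row, so after
# training the feature vector of a node can be read directly from the matrix.
feature_vector = W_in[node_index]
```

The last line shows why a mapping matrix of shape V × h together with an identifier is enough to recover a feature vector: the one-hot product reduces to a row lookup.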
In summary, the video feature vector generated by the present application may include the topological relations among multiple kinds of heterogeneous information, so that the vector may be applied to a variety of downstream service scenarios, such as video clustering, recommended video recall, and associated video recommendation. It may assist operators in mining new video topics, and may also be fed into a video recommendation system to improve consumption metrics such as clicks, viewing duration, and retention.
The server 10a, the user terminal 10b, the user terminal 10c, ..., and the user terminal 10d in fig. 1a may each include a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a smart speaker, a mobile internet device (MID), a POS (Point Of Sales) machine, a wearable device (e.g., a smart watch, a smart bracelet, etc.), and the like.
It is understood that the data processing method provided by the embodiment of the present application can be executed by a computer device, and the computer device includes, but is not limited to, a terminal or a server. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
Further, please refer to fig. 3, where fig. 3 is a schematic flow chart of a data processing method according to an embodiment of the present application. The data processing method may be executed by the server or the user terminal described in fig. 1a, or may be executed by both the server and the user terminal. As shown in fig. 3, the data processing procedure may include the following steps.
Step S101, acquiring a video identifier of a video and associated heterogeneous information associated with the video; the data attribute type of the video is different from the data attribute type of the associated heterogeneous information.
Specifically, the data used in the present application may be from an information stream product, where the data includes videos and associated heterogeneous information associated with the videos, where the data may include multiple videos, and the number of the videos is not limited in the present application; the associated heterogeneous information may include heterogeneous information of one data attribute type or multiple data attribute types, and the specific data attribute type of the associated heterogeneous information is not limited in the present application, and may be set according to an actual application scenario, for example, the associated heterogeneous information is a user browsing a video, an account issuing the video, a tag carried by the video, a brief introduction of the video, and a comment of the video.
Step S102, determining the video identification and the heterogeneous information identification of the associated heterogeneous information as identification nodes, and generating a heterogeneous graph containing the identification nodes.
Specifically, the server obtains the video identifier of the video and the heterogeneous information identifier of the associated heterogeneous information. The video identifier may be the video name, the video website address, or the like, and is not limited herein as long as it is unique. Similarly, the heterogeneous information identifier may be any information that can be used to identify the associated heterogeneous information, and is not limited in the present application as long as it is unique. The server determines the video identifier as a video identification node and determines the heterogeneous information identifier as a heterogeneous identification node. Obviously, the data attribute type corresponding to the video identification node is different from the data attribute type corresponding to the heterogeneous identification node.
The number of videos is at least two, and the number of pieces of associated heterogeneous information is at least two. The associated edges between the identification nodes and the edge weights of the associated edges are determined according to the association relationship between every two videos, the association relationship between every two pieces of associated heterogeneous information, and the association relationship between the videos and the associated heterogeneous information. The heterogeneous graph is then generated according to the identification nodes, the associated edges, and the edge weights of the associated edges. The associated edges include a first associated edge, a second associated edge, and a third associated edge. Notably, the heterogeneous graph includes identification nodes belonging to different data attribute types, as well as multiple types of associated edges.
The specific process of determining the associated edges between the identification nodes and the edge weights of the associated edges may include the following. The server determines a first associated edge between video identification nodes and the edge weight of the first associated edge according to the association relationship between every two videos; for example, a first associated edge exists between the video identification node 2 and the video identification node 3 in fig. 2. The server determines a second associated edge between heterogeneous identification nodes and the edge weight of the second associated edge according to the association relationship between every two pieces of associated heterogeneous information. It is to be understood that the two heterogeneous identification nodes connected by a second associated edge correspond to the same data attribute type; for example, both are tag identification nodes, as with the second associated edge between the tag identification node 8 and the tag identification node 9 shown in fig. 2. The server determines a third associated edge between a video identification node and a heterogeneous identification node and the edge weight of the third associated edge according to the association relationship between the video and the associated heterogeneous information; as shown in fig. 2, a third associated edge exists between the video identification node 1 and the account identification node 6. For how the edge weights of the first, second, and third associated edges are determined, reference is made to the description of the embodiment corresponding to fig. 4 below, which is not expanded here.
It can be understood that, in actual application, the associated heterogeneous information and the association relationship between the video and the associated heterogeneous information may be set according to the application scene, so that different types of second associated edges and third associated edges may be obtained.
Step S103, carrying out identification node sampling on the heterogeneous graph to obtain a heterogeneous sampling sequence and an isomorphic sampling sequence; the heterogeneous sampling sequence comprises at least two identification nodes belonging to different data attribute types, and the homogeneous sampling sequence comprises at least two identification nodes belonging to the same data attribute type.
Specifically, a random sampling heterogeneous path and a random sampling homogeneous path are obtained; the random sampling heterogeneous path is used for indicating the type sampling sequence of the sampled data attribute types when the identification nodes of different data attribute types are sampled; the randomly sampled isomorphic path is used to indicate the data attribute type sampled when the identified nodes of the same data attribute type are sampled.
The identification nodes in the heterogeneous graph are randomly sampled according to the type sampling sequence indicated by the random sampling heterogeneous path to obtain a heterogeneous sampling sequence. The data attribute type indicated by the random sampling isomorphic path is determined as a target data attribute type, and the identification nodes belonging to the target data attribute type in the heterogeneous graph are randomly sampled to obtain an isomorphic sampling sequence.
The server acquires a preset random sampling heterogeneous path and a preset random sampling isomorphic path; it can be understood that both paths can be set according to the actual application scene. The random sampling heterogeneous path includes different data attribute types, and when the identification nodes in the heterogeneous graph are randomly sampled, they are sampled in the type sampling sequence indicated by the random sampling heterogeneous path. In the present application, the heterogeneous graph can be sampled multiple times according to a small number of random sampling heterogeneous paths and random sampling isomorphic paths to obtain a large number of heterogeneous sampling sequences and isomorphic sampling sequences, thereby preparing a large amount of training data for the unsupervised training in step S104.
Step S104, generating a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence.
Specifically, the heterogeneous sampling sequence and the homogeneous sampling sequence are determined as at least two random sampling sequences. The at least two random sampling sequences include a random sampling sequence $S_a$, where $a$ is a positive integer less than or equal to the total number of sequences in the at least two random sampling sequences. The real coding label of each identification node in the random sampling sequence $S_a$ is obtained. The random sampling sequence $S_a$ includes an identification node $D_b$ and adjacent identification nodes that have a positional association with the identification node $D_b$, where $b$ is a positive integer less than or equal to the total number of identification nodes in the random sampling sequence $S_a$. The vector dimension of each real coding label is equal to the total number of identification nodes in the heterogeneous graph. The real coding label $C_b$ of the identification node $D_b$ is input into an initial word coding model to obtain the predicted coding labels of the adjacent identification nodes. Model parameters in the initial word coding model are adjusted according to the real coding labels and the predicted coding labels of the adjacent identification nodes to obtain a target word coding model. The video identifier is input into the target word coding model to obtain the video feature vector corresponding to the video identifier.
The isomorphic sampling sequences and the heterogeneous sampling sequences obtained in step S103 are all determined as random sampling sequences, and unsupervised training is performed on a word vector learning model (i.e., the initial word coding model) by using the random sampling sequences. It can be understood that the embodiment of the present application does not limit the model type of the initial word coding model, which may be any word vector learning model, such as a skip-gram model or a CBOW model (a word vector learning model that predicts a central word from its context). The server may obtain the one-hot coding vector of each identification node in each random sampling sequence outside the initial word coding model, or may obtain it through the initial word coding model.
For ease of understanding and description, the embodiments of the present application are described in terms of training a skip-gram model. Assume that the random sampling sequence is {gid1, vid4, tid6, vid3, gid2} and k is 2. In one training pass, if the tag identification node 6 is used as the input of the skip-gram model (equivalent to using the one-hot coding vector of the tag identification node 6 as the input), then the 4 identification nodes adjacent to it in the sequence, namely the user identification node 1, the video identification node 4, the video identification node 3 and the user identification node 2, serve as supervision signals. That is, the one-hot coding vectors of these 4 identification nodes are used as real coding labels for model training, the model outputs the predicted coding vectors (i.e., predicted coding labels) of the 4 identification nodes, and the model parameters in the skip-gram model are then adjusted based on the 4 real coding labels and the 4 predicted coding labels to obtain the target word coding model.
After the server obtains the target word coding model, a video identifier used in training can be input into the target word coding model to obtain the video feature vector corresponding to the video identifier.
In this embodiment of the present application, the video feature vector is generated according to a heterogeneous sampling sequence and a homogeneous sampling sequence. Since the heterogeneous sampling sequence includes at least two identification nodes belonging to different data attribute types, the video feature vector may include the association relationship between identification nodes belonging to different data attribute types. Similarly, since the homogeneous sampling sequence includes at least two identification nodes belonging to the same data attribute type, the video feature vector may include the association relationship between the identification nodes determined by the associated heterogeneous information, where the data attribute type of the associated heterogeneous information is different from the data attribute type of the video. Therefore, the video feature vector in the application can cover both the features of the associated heterogeneous information associated with the video and the features of the association relationship between the video and the associated heterogeneous information, i.e., the video feature vector contains diversified information features. If the video feature vector is applied to an actual scene, such as video recommendation, video recall or video clustering, it can accurately represent the association relationship between the video and the actual application scene because it contains diversified information features, and the accuracy of applying the video in that scene can therefore be improved. In addition, the application constructs a heterogeneous graph to learn the topological relation between videos and associated heterogeneous information, which avoids relying on existing supervised training methods and gives the video feature vector a wider application range.
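The worked example above can be reproduced with a small helper that lists, for each center node, the neighbors within k positions on either side. The function name `skipgram_pairs` is hypothetical; the sequence and window size are taken from the example in the text.

```python
def skipgram_pairs(sequence, k):
    """Return (center, context) pairs using up to k neighbors on each side."""
    pairs = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - k), min(len(sequence), i + k + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


seq = ["gid1", "vid4", "tid6", "vid3", "gid2"]
pairs = skipgram_pairs(seq, k=2)

# Context nodes that act as supervision signals when tid6 is the input.
context_of_tid6 = [ctx for center, ctx in pairs if center == "tid6"]
# context_of_tid6 == ["gid1", "vid4", "vid3", "gid2"]
```

Nodes near the ends of a sequence simply get fewer context nodes in this sketch; how the application treats such boundary positions is not stated.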
Further, please refer to fig. 4, where fig. 4 is a schematic flowchart of a data processing method according to an embodiment of the present application. The data processing method may be executed by the server or the user terminal described in fig. 1a, or may be executed by both the server and the user terminal. As shown in fig. 4, the data processing procedure may include the following steps.
Step S201, acquiring a video and associated heterogeneous information associated with the video; the associated heterogeneous information comprises a video browsing user, a video publishing account and a video tag; the video, the video browsing user, the video publishing account and the video tag respectively correspond to different data attribute types.
Specifically, in the embodiment of the present application, the users who browse a video (video browsing users), the account that publishes the video (also referred to as the video publishing account), and the tags carried by the video (also referred to as video tags) serve as the associated heterogeneous information. Because user behavior is sparse, the video browsing users are aggregated by age-gender-city triple, that is, a single user identification node covers all video browsing users with the same age, gender, and city. For example, 20-year-old females in Shenzhen form one user identification node, and 30-year-old males in Dongguan form another user identification node, so that a large number of user identification nodes are obtained. Unless otherwise specified, a video browsing user herein therefore refers to a user group including at least one video browsing user.
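The aggregation of video browsing users into user groups can be sketched as follows. This is a minimal illustration, not the implementation of the application; the record fields (`user_id`, `age`, `gender`, `city`) and the function name `group_users` are hypothetical.

```python
from collections import defaultdict


def group_users(users):
    """Group user ids by their (age, gender, city) triple."""
    groups = defaultdict(list)
    for user in users:
        key = (user["age"], user["gender"], user["city"])
        groups[key].append(user["user_id"])
    return dict(groups)


# Hypothetical browsing users; only the triple decides the group.
users = [
    {"user_id": "u1", "age": 20, "gender": "F", "city": "Shenzhen"},
    {"user_id": "u2", "age": 20, "gender": "F", "city": "Shenzhen"},
    {"user_id": "u3", "age": 30, "gender": "M", "city": "Dongguan"},
]
groups = group_users(users)
```

Each key of `groups` would then correspond to one user identification node.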
Referring to fig. 5, fig. 5 is a schematic view of a data processing scenario according to an embodiment of the present disclosure. As shown in fig. 5, the server 40a obtains 3 videos, that is, a video 401b, a video 401c, and a video 401d shown in fig. 5. The video 401b is published by an account 404b (e.g., the video publishing account 100000 in fig. 5) and carries a video tag 403b (e.g., sports in fig. 5), and the user group 402b browses the video 401b. The video 401c is published by an account 404c (e.g., the video publishing account 200000 in fig. 5) and carries the video tag 403b (e.g., sports in fig. 5) and a video tag 403c (e.g., movie in fig. 5), and the user group 402b and the user group 402c each browse the video 401c. The video 401d is published by the account 404c and carries the video tag 403c (e.g., movie in fig. 5), and the user group 402c browses the video 401d.
It is understood that the numbers illustrated in fig. 5 (e.g., 3 videos, 2 user groups, 2 video publishing accounts, and 2 video tags) are assumed for ease of understanding and description; in practical applications, the server 40a needs to obtain a larger number of videos and more associated heterogeneous information in order to extract the various topological relationships in the heterogeneous graph.
Step S202, determining the video identifier of the video, the user identifier of the video browsing user, the account identifier of the video publishing account, and the tag identifier of the video tag as identification nodes; the number of videos is at least two, and the number of video tags is at least two.
Specifically, the identification nodes include a video identification node corresponding to the video identifier, a user identification node corresponding to the user identifier, an account identification node corresponding to the account identifier, and a tag identification node corresponding to the tag identifier.
In the embodiment of the present application, the video identifier, the user identifier, the account identifier, and the tag identifier are named with sequence numbers. Referring to fig. 5 again, the server 40a identifies the video 401b as the video identifier vid1, the video 401c as the video identifier vid2, and the video 401d as the video identifier vid3; the server 40a identifies the account 404b (i.e., the video publishing account 100000 in fig. 5) as the account identifier pid6 and the account 404c (i.e., the video publishing account 200000 in fig. 5) as the account identifier pid7; the server 40a identifies the user group 402b as the user identifier gid4 and the user group 402c as the user identifier gid5; the server 40a identifies the video tag 403b (i.e., sports in fig. 5) as the tag identifier tid8 and the video tag 403c (i.e., movie in fig. 5) as the tag identifier tid9.
The server 40a determines the video identifiers, the user identifiers, the account identifiers, and the tag identifiers as identification nodes, that is, the video identifiers vid1, vid2, and vid3, the account identifiers pid6 and pid7, the user identifiers gid4 and gid5, and the tag identifiers tid8 and tid9 in fig. 5 are determined as identification nodes. The server 40a generates video identification nodes from the video identifiers vid1, vid2, and vid3, respectively; the video identification nodes are drawn as circular nodes in fig. 5, namely the video identification node 1, the video identification node 2, and the video identification node 3. The server 40a generates account identification nodes from the account identifiers pid6 and pid7, respectively; the account identification nodes are drawn as diamond-shaped nodes in fig. 5, namely the account identification node 6 and the account identification node 7. The server 40a generates user identification nodes from the user identifiers gid4 and gid5, respectively; the user identification nodes are drawn as rectangular nodes in fig. 5, namely the user identification node 4 and the user identification node 5. The server 40a generates tag identification nodes from the tag identifiers tid8 and tid9, respectively; the tag identification nodes are drawn as triangular nodes in fig. 5, namely the tag identification node 8 and the tag identification node 9.
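The sequence-number naming used in fig. 5 can be sketched as below. The function `assign_identifiers` and the entity strings are hypothetical; the sketch only assumes that entities are numbered consecutively across types and prefixed with a type abbreviation, which is what the identifiers vid1 to tid9 suggest.

```python
def assign_identifiers(entities_by_type):
    """Number entities sequentially across types and prefix each with its type."""
    node_ids = {}    # entity -> identifier, e.g. "video 401b" -> "vid1"
    node_types = {}  # identifier -> data attribute type, e.g. "vid1" -> "vid"
    counter = 1
    for prefix, entities in entities_by_type.items():
        for entity in entities:
            ident = f"{prefix}{counter}"
            node_ids[entity] = ident
            node_types[ident] = prefix
            counter += 1
    return node_ids, node_types


# Entities of fig. 5, listed in the order that reproduces its numbering.
entities = {
    "vid": ["video 401b", "video 401c", "video 401d"],
    "gid": ["user group 402b", "user group 402c"],
    "pid": ["account 404b", "account 404c"],
    "tid": ["tag 403b", "tag 403c"],
}
node_ids, node_types = assign_identifiers(entities)
```

The `node_types` mapping is what later sampling steps would consult to decide whether a node belongs to the data attribute type to be sampled.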
Step S203, determining the associated edges between the identification nodes and the edge weights of the associated edges according to the association relationship between every two videos, the association relationship between every two video tags, and the association relationship between the videos and the associated heterogeneous information, and generating the heterogeneous graph according to the identification nodes, the associated edges, and the edge weights of the associated edges.
Specifically, the associated edges include a first associated edge, a second associated edge, and a third associated edge.
The server 40a determines, according to the association relationship between every two videos, a first associated edge between the two corresponding video identification nodes and the edge weight of the first associated edge, for example, the first associated edge between the video identification node 2 and the video identification node 3 illustrated in fig. 5. The server 40a determines, according to the association relationship between every two video tags, a second associated edge between the two corresponding tag identification nodes and the edge weight of the second associated edge; as can be seen from fig. 5, a second associated edge exists between the tag identification node 8 and the tag identification node 9. The server 40a determines, according to the association relationship between the video and the associated heterogeneous information, a third associated edge between the video identification node and the heterogeneous identification node and the edge weight of the third associated edge; for example, as illustrated in fig. 5, a third associated edge exists between the video identification node 1 and the user identification node 4, between the video identification node 1 and the account identification node 6, and between the video identification node 1 and the tag identification node 8. It can be understood that, in practical applications, the associated heterogeneous information may be set according to the application scenario, so that third associated edges of different types are obtained.
Referring to fig. 5 again, the server 40a generates the heterogeneous graph according to the video identification nodes, the user identification nodes, the account identification nodes, the tag identification nodes, the first associated edges, the second associated edges, and the third associated edges. For the specific process of determining the edge weights of the associated edges, please refer to the embodiment corresponding to fig. 7 below, which is not described here for the moment.
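One possible in-memory form of such a heterogeneous graph is a weighted adjacency map plus a node-type map, sketched below for the nodes of fig. 5. The function `build_graph` is hypothetical. The weights 3 and 4 on the gid4 edges follow the values given for fig. 6; every other weight of 1 that is not stated in the text is a placeholder assumption.

```python
from collections import defaultdict


def build_graph(node_types, weighted_edges):
    """Return an undirected weighted adjacency map: adj[u][v] = edge weight."""
    adj = defaultdict(dict)
    for u, v, weight in weighted_edges:
        if u not in node_types or v not in node_types:
            raise KeyError(f"unknown node in edge ({u}, {v})")
        adj[u][v] = weight
        adj[v][u] = weight
    return dict(adj)


node_types = {
    "vid1": "vid", "vid2": "vid", "vid3": "vid",
    "gid4": "gid", "gid5": "gid",
    "pid6": "pid", "pid7": "pid",
    "tid8": "tid", "tid9": "tid",
}
edges = [
    # Third associated edges: video <-> associated heterogeneous information.
    ("vid1", "gid4", 3), ("vid2", "gid4", 4), ("vid2", "gid5", 1), ("vid3", "gid5", 1),
    ("vid1", "pid6", 1), ("vid2", "pid7", 1), ("vid3", "pid7", 1),
    ("vid1", "tid8", 1), ("vid2", "tid8", 1), ("vid2", "tid9", 1), ("vid3", "tid9", 1),
    # First associated edge: video <-> video.
    ("vid2", "vid3", 1),
    # Second associated edge: tag <-> tag.
    ("tid8", "tid9", 1),
]
adj = build_graph(node_types, edges)
```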
Step S204, acquiring a random sampling heterogeneous path and a random sampling homogeneous path; the random sampling heterogeneous path is used for indicating the type sampling order of the data attribute types to be sampled when identification nodes of different data attribute types are sampled; the random sampling homogeneous path is used for indicating the data attribute type to be sampled when identification nodes of the same data attribute type are sampled.
Specifically, the server obtains a preset random sampling heterogeneous path and a preset random sampling homogeneous path. It can be understood that both paths can be set according to the actual application scenario, and the present application does not limit them. The random sampling heterogeneous path includes at least two data attribute types, and when the identification nodes in the heterogeneous graph are randomly sampled, the sampling follows the type sampling order indicated by the random sampling heterogeneous path.
Step S205, randomly sampling the identification nodes in the heterogeneous graph according to the type sampling sequence indicated by the random sampling heterogeneous path to obtain a heterogeneous sampling sequence; the heterogeneous sampling sequence comprises at least two identification nodes belonging to different data attribute types.
Specifically, according to the type sampling order, the data attribute type required for the j-th sampling in the random sampling heterogeneous path is determined as the data attribute type to be sampled; j is a positive integer less than or equal to S, and S is the total number of identification nodes to be sampled based on the random sampling heterogeneous path. A target identification node is sampled from the heterogeneous graph according to the sampled node set and the data attribute type to be sampled; the data attribute type to which the target identification node belongs is the data attribute type to be sampled, and the sampled node set includes the identification nodes already sampled. If j is smaller than S, the target identification node is added to the sampled node set; if j is equal to S, a heterogeneous sampling sequence is generated according to the sampled node set and the target identification node, where the target identification node is the last identification node in the heterogeneous sampling sequence.
The sampled node set includes a preceding adjacent identification node, which is the last identification node in the sampled node set. The specific process of sampling the target identification node from the heterogeneous graph may include: acquiring, from the heterogeneous graph and according to the data attribute type to be sampled, w target preselected identification nodes that have associated edges with the preceding adjacent identification node, where w is a positive integer and the data attribute type to which the w target preselected identification nodes belong is the data attribute type to be sampled; acquiring the edge weight of the associated edge between the preceding adjacent identification node and each target preselected identification node, and summing the acquired edge weights to obtain a total edge weight; the w target preselected identification nodes include a target preselected identification node $Y_m$, where m is a positive integer and m is less than or equal to w; determining the ratio of the edge weight $Z_m$ of the associated edge between the preceding adjacent identification node and the target preselected identification node $Y_m$ to the total edge weight as the random sampling probability of the target preselected identification node $Y_m$; and randomly sampling the w target preselected identification nodes according to the random sampling probability corresponding to each target preselected identification node to obtain the target identification node. In the heterogeneous sampling sequence, the preceding adjacent identification node is the identification node immediately before the target identification node.
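The single sampling step described above can be sketched as follows. The functions `transition_probabilities` and `sample_next`, the node names, and the weights are hypothetical; the sketch only assumes the rule stated in the text, namely that each preselected node is drawn with probability equal to its edge weight divided by the total edge weight.

```python
import random


def transition_probabilities(adj, node_types, prev, wanted_type):
    """Probability of each preselected node: its edge weight over the total edge weight."""
    candidates = {n: w for n, w in adj[prev].items() if node_types[n] == wanted_type}
    total = sum(candidates.values())
    return {n: w / total for n, w in candidates.items()}


def sample_next(adj, node_types, prev, wanted_type, rng):
    """Draw the target identification node, or return None if no candidate exists."""
    probs = transition_probabilities(adj, node_types, prev, wanted_type)
    if not probs:
        return None
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[n] for n in nodes], k=1)[0]


# Hypothetical neighbourhood: three preselected video nodes and one account node.
adj = {"prev": {"y1": 2, "y2": 1, "y3": 1, "other": 5}}
node_types = {"prev": "gid", "y1": "vid", "y2": "vid", "y3": "vid", "other": "pid"}

probs = transition_probabilities(adj, node_types, "prev", "vid")
target = sample_next(adj, node_types, "prev", "vid", random.Random(0))
```

The account node is ignored even though its edge weight is the largest, because its data attribute type is not the one to be sampled.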
Referring to fig. 6, fig. 6 is a schematic view of a data processing scenario according to an embodiment of the present disclosure. As shown in fig. 6, in the embodiment of the present application, a random sampling heterogeneous path is assumed to be Gid-Vid-Tid-Vid-Gid, so that the server performs random sampling on an identification node (i.e., a user identification node) belonging to a user attribute, then performs random sampling on an identification node (i.e., a video identification node) belonging to a video attribute, then performs random sampling on an identification node (i.e., a label identification node) belonging to a label attribute, then performs random sampling on an identification node (i.e., a video identification node) belonging to a video attribute, and finally performs random sampling on an identification node (i.e., a user identification node) belonging to a user attribute, and an obtained heterogeneous sampling sequence includes 5 identification nodes and 3 data attribute types.
Referring to fig. 6 again, each associated edge in fig. 6 carries an edge weight. For example, a third associated edge exists between the user identification node 4 and the video identification node 1, and its edge weight $E_{(gid4,vid1)}$ is equal to 3; a third associated edge exists between the user identification node 4 and the video identification node 2, and its edge weight $E_{(gid4,vid2)}$ is equal to 4; a third associated edge exists between the account identification node 6 and the video identification node 1, and its edge weight $E_{(vid1,pid6)}$ is equal to 1; a third associated edge exists between the tag identification node 8 and the video identification node 1, and its edge weight $E_{(vid1,tid8)}$ is equal to 1.
In one round of identification node sampling, as shown in fig. 6, the identification nodes in the heterogeneous graph are randomly sampled according to the random sampling heterogeneous path Gid-Vid-Tid-Vid-Gid. According to the type sampling order, the data attribute type required for the 1st sampling in the path Gid-Vid-Tid-Vid-Gid is first determined as the data attribute type to be sampled, that is, the data attribute type to be sampled is the user attribute. Because the identification node currently being sampled is the first identification node, the sampled node set is an empty set and there is no preceding adjacent identification node. In this case, the user identification nodes belonging to the user attribute may be sampled uniformly at random, which can also be understood as traversing the user identification nodes one by one. For example, if the identification nodes in the heterogeneous graph are randomly sampled 100 times according to the path Gid-Vid-Tid-Vid-Gid, the user identification node 4 in fig. 6 is sampled as the first node 50 times and the user identification node 5 in fig. 6 is sampled as the first node 50 times.
Assume that the target identification node sampled from the heterogeneous graph this time is the user identification node 4, as shown in fig. 6. Since j is 1 and is smaller than S (S is 5), the user identification node 4 is added to the sampled node set. At this point the first identification node has been sampled according to the type sampling order indicated by the path Gid-Vid-Tid-Vid-Gid, so the data attribute type required for the 2nd sampling in the path Gid-Vid-Tid-Vid-Gid is determined as the data attribute type to be sampled, that is, the data attribute type to be sampled is the video attribute. The identification node currently being sampled is the second identification node, and the sampled node set includes the user identification node 4, so the preceding adjacent identification node is the user identification node 4. A random walk (deepwalk) algorithm may then be adopted to randomly sample the video identification nodes belonging to the video attribute. It can be understood that the present application does not limit the random walk algorithm, and any random walk algorithm may be adopted; the embodiment of the present application adopts the metapath2vec algorithm (a node embedding method for heterogeneous information networks), and the specific sampling process is described below.
Referring to fig. 6 again, given that the preceding adjacent identification node is the user identification node 4 and the data attribute type to be sampled is the video attribute, the server may obtain the target preselected identification nodes from the heterogeneous graph, that is, the video identification node 1 and the video identification node 2 in fig. 6, both of which belong to the video attribute. Further, the server obtains the edge weights of the associated edges between the two target preselected identification nodes and the preceding adjacent identification node (i.e., the user identification node 4). As shown in fig. 6, the edge weight $E_{(gid4,vid1)}$ of the third associated edge between the user identification node 4 and the video identification node 1 is equal to 3, and the edge weight $E_{(gid4,vid2)}$ of the third associated edge between the user identification node 4 and the video identification node 2 is equal to 4, so the total edge weight is equal to 7. The random sampling probability p(vid1 | gid4) of the video identification node 1 is therefore 3/7, and the random sampling probability p(vid2 | gid4) of the video identification node 2 is 4/7. In the embodiment of the present application, if 100 random samplings according to the path Gid-Vid-Tid-Vid-Gid all have the user identification node 4 as the preceding adjacent identification node, the second identification node in the heterogeneous sampling sequence is expected to be the video identification node 1 about 43 times and the video identification node 2 about 57 times.
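The arithmetic of this example can be checked with exact fractions. The snippet below is only a worked calculation using the two edge weights quoted for fig. 6; the variable names are illustrative.

```python
from fractions import Fraction

# Edge weights from the user identification node 4 (gid4), as quoted for fig. 6.
edge_weights = {"vid1": 3, "vid2": 4}

total = sum(edge_weights.values())
probs = {node: Fraction(weight, total) for node, weight in edge_weights.items()}

# Expected number of times each video node follows gid4 in 100 such steps.
expected = {node: round(float(p * 100)) for node, p in probs.items()}
```

Here `probs` holds 3/7 and 4/7, and `expected` rounds 42.86 and 57.14 to 43 and 57.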
If the process of randomly sampling the identification nodes in the heterogeneous graph according to the random sampling heterogeneous path is understood as a random walk, the process can be expressed by formula (1), where formula (1) is as follows:

$$p\left(v^{i+1} \mid v_t^{i}, P\right) = \begin{cases} \dfrac{E_{i,i+1}}{\sum_{u \in N_{t+1}\left(v_t^{i}\right)} E_{i,u}}, & \left(v^{i+1}, v_t^{i}\right) \in \mathcal{E},\ \phi\left(v^{i+1}\right) = t+1 \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

wherein P represents a random sampling heterogeneous path, such as the random sampling heterogeneous path Gid-Vid-Tid-Vid-Gid illustrated in fig. 6; i represents the number of the random walk step, for example, i = 1 represents that the first sampled identification node randomly walks to the second sampled identification node, such as the user identification node 4 randomly walking to the video identification node 2; $p(v^{i+1} \mid v_t^{i}, P)$ represents the random walk probability of $v_t^{i}$ randomly walking to $v^{i+1}$, corresponding to the random sampling probability described above; v denotes an identification node, and $v_t^{i}$ represents an identification node of type t; $N_{t+1}(v_t^{i})$ represents the identification nodes of type t+1 among the neighbor identification nodes of $v_t^{i}$; $v^{i+1}$ is the target identification node, which belongs to type t+1, corresponding to the data attribute type to be sampled; $E_{i,i+1}$ represents the edge weight between $v_t^{i}$ and $v^{i+1}$, and $E_{i,u}$ represents the edge weight between $v_t^{i}$ and a neighbor identification node u; $\phi(v^{i+1})$ denotes the node type to which $v^{i+1}$ belongs; $(v^{i+1}, v_t^{i}) \in \mathcal{E}$ denotes that there is an associated edge between $v^{i+1}$ and $v_t^{i}$.
According to formula (1), when the i-th step of the random walk is performed, only the node types predefined in the random sampling heterogeneous path, i.e., the data attribute types, are considered, and the greater the weight of the edge connected to $v_t^{i}$, the greater the random walk probability.
Referring to fig. 6 again, assume that the target identification node sampled from the heterogeneous graph in the 2nd sampling is the video identification node 2. Since j is 2 and is smaller than S (S is 5), the video identification node 2 is added to the sampled node set, which now includes the user identification node 4 and the video identification node 2. It can be understood that the subsequent random samplings follow the same process as the sampling described above and are therefore not repeated here one by one; when j is equal to S, a heterogeneous sampling sequence is generated according to the sampled node set and the target identification node.
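Putting the steps together, a complete walk along the path Gid-Vid-Tid-Vid-Gid can be sketched as below. The function `metapath_walk` is hypothetical, the graph mirrors fig. 5, and all edge weights other than the two gid4 weights quoted for fig. 6 are placeholder assumptions. The first node is drawn uniformly from its type and every later node is drawn in proportion to edge weight.

```python
import random
from collections import defaultdict

# Fig. 5 style graph; weights other than gid4-vid1 (3) and gid4-vid2 (4) are assumed.
EDGES = [
    ("vid1", "gid4", 3), ("vid2", "gid4", 4), ("vid2", "gid5", 1), ("vid3", "gid5", 1),
    ("vid1", "pid6", 1), ("vid2", "pid7", 1), ("vid3", "pid7", 1),
    ("vid1", "tid8", 1), ("vid2", "tid8", 1), ("vid2", "tid9", 1), ("vid3", "tid9", 1),
    ("vid2", "vid3", 1), ("tid8", "tid9", 1),
]
adj = defaultdict(dict)
for u, v, w in EDGES:
    adj[u][v] = w
    adj[v][u] = w
node_types = {node: node[:3] for node in adj}  # "vid1" -> "vid"


def metapath_walk(adj, node_types, metapath, rng):
    """Sample one node per type in `metapath`; return None on a dead end."""
    starts = sorted(n for n, t in node_types.items() if t == metapath[0])
    walk = [rng.choice(starts)]  # first node: uniform over its type
    for wanted in metapath[1:]:
        prev = walk[-1]  # preceding adjacent identification node
        cands = [(n, w) for n, w in adj.get(prev, {}).items() if node_types[n] == wanted]
        if not cands:
            return None
        nodes, weights = zip(*cands)
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


metapath = ["gid", "vid", "tid", "vid", "gid"]
walk = metapath_walk(adj, node_types, metapath, random.Random(0))
```

Returning `None` on a dead end is a design choice of this sketch; the text does not say how a walk with no eligible neighbor is handled.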
Step S206, determining the data attribute type indicated by the random sampling homogeneous path as a target data attribute type, and randomly sampling the identification nodes belonging to the target data attribute type in the heterogeneous graph to obtain a homogeneous sampling sequence; the homogeneous sampling sequence contains at least two identification nodes that belong to the same data attribute type.
Specifically, the homogeneous sampling sequence includes a first homogeneous sampling sequence and a second homogeneous sampling sequence. The specific process of obtaining the homogeneous sampling sequence may include: if the target data attribute type is the data attribute type corresponding to the video identification nodes, randomly sampling the video identification nodes in the heterogeneous graph to generate a first homogeneous sampling sequence containing at least two video identification nodes, where the first homogeneous sampling sequence is used for representing the topological relationship among the video identification nodes in the heterogeneous graph; if the target data attribute type is the data attribute type corresponding to the heterogeneous identification nodes, randomly sampling the heterogeneous identification nodes in the heterogeneous graph to generate a second homogeneous sampling sequence containing at least two heterogeneous identification nodes, where the second homogeneous sampling sequence is used for representing the topological relationship among the heterogeneous identification nodes in the heterogeneous graph.
The target data attribute type indicated in the random sampling homogeneous path is determined first; the embodiment of the present application takes the video attribute and the tag attribute as examples. The server can adopt a random walk algorithm to randomly sample the video identification nodes belonging to the video attribute and generate a first homogeneous sampling sequence containing at least two video identification nodes; the first homogeneous sampling sequence is used for representing the topological relationship among the video identification nodes in the heterogeneous graph. The server can likewise adopt a random walk algorithm to randomly sample the tag identification nodes belonging to the tag attribute and generate a second homogeneous sampling sequence containing at least two tag identification nodes; the second homogeneous sampling sequence is used for representing the topological relationship among the tag identification nodes in the heterogeneous graph.
It can be understood that the present application does not limit the random walk algorithm used here, and any random walk algorithm may be adopted, for example, Node2Vec (an algorithm that learns vector representations of nodes in a network graph), the second-order PageRank algorithm (a link analysis algorithm), the second-order SimRank algorithm (a collaborative filtering recommendation algorithm), or the second-order RWR algorithm (random walk with restart).
Step S207, generating a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence.
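A homogeneous walk can be sketched as a random walk that only ever moves to neighbors of the same data attribute type as its start node. The sketch below deliberately uses a plain first-order weighted walk rather than Node2Vec or any of the second-order algorithms named above; the function `homogeneous_walk`, the graph, and its weights are hypothetical.

```python
import random


def homogeneous_walk(adj, node_types, start, length, rng):
    """Walk over nodes that share the start node's type, weighted by edge weight."""
    target_type = node_types[start]
    walk = [start]
    while len(walk) < length:
        cands = [(n, w) for n, w in adj.get(walk[-1], {}).items()
                 if node_types[n] == target_type]
        if not cands:
            break  # no same-type neighbor: stop early
        nodes, weights = zip(*cands)
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


# Hypothetical graph: video-video edges plus one video-tag edge that must be ignored.
adj = {
    "vid1": {"vid2": 2, "tid8": 1},
    "vid2": {"vid1": 2, "vid3": 1},
    "vid3": {"vid2": 1},
    "tid8": {"vid1": 1},
}
node_types = {"vid1": "vid", "vid2": "vid", "vid3": "vid", "tid8": "tid"}

video_walk = homogeneous_walk(adj, node_types, "vid1", 4, random.Random(0))
```

Starting from a tag identification node instead would yield a sequence of tag identification nodes in the same way.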
Specifically, please refer to the description of step S104 in the embodiment corresponding to fig. 3 for a specific implementation process of step S207, which is not described herein again.
In summary, the embodiment of the present application discloses a video vectorization method based on graph representation learning. The method constructs a heterogeneous graph from multiple kinds of heterogeneous information, such as video browsing users, video publishing accounts, video tags, and videos, together with the association relationships among them, and then performs unsupervised graph representation learning on the heterogeneous graph. For example, the two random walk algorithms metapath2vec and node2vec are combined: a deepwalk-style random walk is performed on the heterogeneous graph to generate random sampling sequences, and a skip-gram model is trained on the resulting sequences to learn the topological structure of the heterogeneous graph, so as to obtain a vectorized representation of the video. The video feature vector obtained in this way contains the topological relationships among the multiple kinds of heterogeneous information, so it can be applied to various downstream service scenarios, such as video clustering, recommended-video recall, and associated-video recommendation; it can help operators mine new video topics, and it can be connected to a recommendation system to improve consumption metrics such as clicks, watch duration, and retention.
In this embodiment of the present application, the video feature vector is generated from a heterogeneous sampling sequence and a homogeneous sampling sequence. Because the heterogeneous sampling sequence includes at least two identification nodes belonging to different data attribute types, the video feature vector can capture the association relationships between identification nodes of different data attribute types. Similarly, because the homogeneous sampling sequence includes at least two identification nodes belonging to the same data attribute type, the video feature vector can capture the association relationships between identification nodes determined by the associated heterogeneous information; the data attribute type of the associated heterogeneous information is different from the data attribute type of the video. The video feature vector in the present application therefore covers both the features of the associated heterogeneous information associated with the video and the features of the association relationship between the video and that information, that is, the video feature vector contains multi-dimensional information features. When the video feature vector is applied in an actual scenario, such as video recommendation, video recall, or video clustering, these diversified information features allow the association relationship between the video and the application scenario to be represented accurately, which can improve the accuracy with which the video is applied in that scenario. In addition, the present application constructs a heterogeneous graph to learn the topological relationship between the video and the associated heterogeneous information, which avoids the existing supervised training approach and gives the video feature vector a wider application range.
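As an illustration of how a sampling sequence feeds a skip-gram model, the sketch below turns one sequence into (center, context) training pairs. It covers only the pair generation, not the model training; the function `skipgram_pairs` and the window size of 1 are assumptions made for the example.

```python
def skipgram_pairs(sequence, window):
    """List (center, context) pairs for every node and its neighbors within `window`."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


# One possible heterogeneous sampling sequence for the path Gid-Vid-Tid-Vid-Gid.
walk = ["gid4", "vid2", "tid8", "vid1", "gid4"]
pairs = skipgram_pairs(walk, window=1)
```

Because vid2 and vid1 both co-occur with tid8 and gid4 in such pairs, a skip-gram model trained on many sequences would tend to place their vectors close together.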
Further, please refer to fig. 7, and fig. 7 is a schematic flowchart of a data processing method according to an embodiment of the present application. As shown in fig. 7, the data processing procedure may include the following steps S2031 to S2033, and the steps S2031 to S2033 are a specific embodiment of the step S203 in the embodiment corresponding to fig. 4.
Step S2031, determining, according to the association relationship between every two videos, a first associated edge between the video identification nodes and the edge weight of the first associated edge.
Specifically, the videos include a first video and a second video, and the video identification nodes include a first video identification node corresponding to the first video and a second video identification node corresponding to the second video. Valid video sequences respectively associated with N video browsing users are obtained, where N is a positive integer. The N valid video sequences include a valid video sequence $L_x$, where x is a positive integer less than or equal to N. The valid videos in $L_x$ are sorted in the chronological order in which the associated video browsing user browsed them; for each valid video, the ratio of the effective browsing duration to the total video duration is greater than a browsing ratio threshold. If the first video and the second video occupy adjacent positions in $L_x$, it is determined that the first video and the second video have an adjacent position relationship in $L_x$. Among the N valid video sequences, the valid video sequences in which the adjacent position relationship exists are determined as associated valid video sequences; an associated valid video sequence indicates that a first associated edge exists between the first video identification node and the second video identification node. The number of associated valid video sequences is counted, and that number is determined as the edge weight of the first associated edge.
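The weight of a first associated edge can be sketched as a count of the valid video sequences in which the two videos are adjacent. In the sketch below the functions `valid_sequence` and `first_edge_weights`, the event fields, the browsing ratio threshold of 0.4, and all data are hypothetical.

```python
from collections import Counter


def valid_sequence(events, ratio_threshold):
    """Keep videos watched for more than the threshold share, ordered by browse time."""
    kept = [e for e in events if e["watched_s"] / e["duration_s"] > ratio_threshold]
    return [e["vid"] for e in sorted(kept, key=lambda e: e["time"])]


def first_edge_weights(sequences):
    """Weight of a video pair = number of sequences in which the pair is adjacent."""
    weights = Counter()
    for seq in sequences:
        # A set ensures each sequence contributes at most once per pair.
        adjacent = {frozenset(p) for p in zip(seq, seq[1:]) if p[0] != p[1]}
        weights.update(adjacent)
    return weights


# Hypothetical browsing log of one user; vid9 is dropped by the ratio filter.
events = [
    {"vid": "vid1", "time": 1, "watched_s": 40, "duration_s": 40},
    {"vid": "vid2", "time": 2, "watched_s": 30, "duration_s": 60},
    {"vid": "vid9", "time": 3, "watched_s": 2, "duration_s": 60},
    {"vid": "vid3", "time": 4, "watched_s": 25, "duration_s": 30},
]
seq_a = valid_sequence(events, ratio_threshold=0.4)
seq_b = ["vid2", "vid1", "vid4"]  # valid sequence of a second hypothetical user

weights = first_edge_weights([seq_a, seq_b])
```

With this data, vid1 and vid2 are adjacent in both sequences, so their first associated edge would carry the weight 2.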
Referring to fig. 8, fig. 8 is a schematic view of a data processing scenario according to an embodiment of the present disclosure. As shown in fig. 8, the server obtains valid video sequences respectively associated with N video browsing users. In fig. 8, N is 2, that is, there are 2 valid video sequences, namely the valid video sequence 1 and the valid video sequence 2 in fig. 8. The valid video sequence 1 belongs to a video browsing user 70a, and the valid video sequence 2 belongs to a video browsing user 70c. The video browsing user 70a corresponds to a video browsing user group 70d (including one or more video browsing users who share the relevant attributes of the video browsing user 70a, such as 20-year-old females in Shenzhen), and the video browsing user 70c corresponds to a video browsing user group 70e (including one or more video browsing users who share the relevant attributes of the video browsing user 70c, such as 30-year-old females in Nanjing).
The valid video sequence 1 includes 3 valid videos, as shown in fig. 8, namely a video 701b, a video 702b, and a video 703b. The video browsing user 70a browsed the video 701b at 19:15 and watched the whole video; the video 701b is published by the video publishing account 200000, and the video tags it carries include fun and movie. The video browsing user 70a browsed the video 702b at 19:00 and watched only the first half of the video content; the video 702b is published by the video publishing account 200000, and the video tags it carries include fun and movie. The video browsing user 70a browsed the video 703b at 18:36, and the effective viewing duration of the video 703b is 25 seconds; the video 703b is published by the video publishing account 100000, and the video tag it carries includes sports. The valid video sequence 2 also includes 3 valid videos, as shown in fig. 8, namely the video 701b, a video 704b, and the video 702b. The basic information of each of these videos is analogous to that of the videos in the valid video sequence 1, so it is not repeated here one by one; reference may be made to the description of the videos in the valid video sequence 1 and to fig. 8.
It is understood that the embodiment of the present application uses only two valid video sequences as an example; the total number of valid video sequences actually used to construct the heterogeneous graph may be any number, which is not limited herein. After the server acquires the two valid video sequences, a heterogeneous graph can be constructed from them. The server first determines the multiple kinds of heterogeneous information used to construct the identification nodes and then performs identification processing on that information. As shown in fig. 8, the server identifies the video 701b as the video identifier vid1 and generates the video identification node 1 from vid1, identifies the video 702b as the video identifier vid2 and generates the video identification node 2 from vid2, identifies the video 703b as the video identifier vid3 and generates the video identification node 3 from vid3, and identifies the video 704b as the video identifier vid4 and generates the video identification node 4 from vid4. The server identifies the video browsing user group 70d as the user identifier gid5 and generates the user identification node 5 from gid5, and identifies the video browsing user group 70e as the user identifier gid6 and generates the user identification node 6 from gid6. The video publishing accounts and the video tags are processed in the same way, so the details are not repeated one by one; please refer to fig. 8 or to the processing of the videos described above.
After the server obtains the identification nodes of the heterogeneous information, the associated edges between the identification nodes and the edge weight of the associated edges are determined according to the association relation between the heterogeneous information.
In the embodiment of the present application, video 701b in fig. 8 is taken as the first video and video 702b as the second video by way of example; the association relationships between other videos can be understood by reference to the following description of videos 701b and 702b. The video identification nodes thus include video identification node 1 corresponding to video 701b and video identification node 2 corresponding to video 702b. Traversing the two effective video sequences in fig. 8, video 701b and video 702b occupy adjacent positions in effective video sequence 1, so the server determines that effective video sequence 1 is an associated effective video sequence of videos 701b and 702b. In effective video sequence 2, video 701b is not adjacent to video 702b, so effective video sequence 2 is not an associated effective video sequence of videos 701b and 702b. The server counts the number of associated effective video sequences, namely 1, and determines 1 as the edge weight of the first associated edge between video identification node 1 and video identification node 2.
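The counting rule described above can be illustrated with a minimal Python sketch. The function name `first_edge_weights` and the use of unordered pairs as dictionary keys are assumptions made for illustration only; the sequences are effective video sequences 1 and 2 of fig. 8, written with their video identifications.

```python
from collections import Counter

def first_edge_weights(sequences):
    """For each unordered video pair, count the sequences in which the pair is adjacent."""
    weights = Counter()
    for seq in sequences:
        # A set is used so that one sequence contributes at most 1 to any pair.
        adjacent_pairs = {frozenset((a, b)) for a, b in zip(seq, seq[1:]) if a != b}
        weights.update(adjacent_pairs)
    return weights

# Effective video sequences 1 and 2 from fig. 8.
sequences = [
    ["vid1", "vid2", "vid3"],
    ["vid1", "vid4", "vid2"],
]
weights = first_edge_weights(sequences)
print(weights[frozenset(("vid1", "vid2"))])  # adjacent only in sequence 1
```

Under these assumptions, the pair vid1 and vid2 is counted once, matching the edge weight of 1 stated above.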
In the embodiment of the present application, for visual clarity, an edge weight of 1 is not drawn in the heterogeneous graph constructed in fig. 8; only edge weights greater than 1 are drawn.
Step S2032, according to the association relationship between every two video tags, determining a second associated edge between the tag identification nodes and the edge weight of the second associated edge.
Specifically, the associated heterogeneous information includes at least two video tags; the heterogeneous identification nodes comprise a first label identification node corresponding to the first video label and a second label identification node corresponding to the second video label; if the same video exists between the video associated with the first video tag and the video associated with the second video tag, determining the same video as the associated video; the associated video is used for representing that a second associated edge exists between the first label identification node and the second label identification node; and counting the video quantity of the associated video, and determining the video quantity as the edge weight of the second associated edge.
It is to be understood that the associated heterogeneous information may be any information associated with the video, and the embodiment of the present application is not limited thereto. For convenience of description and understanding, the embodiments of the present application take video tags as an example, in which case the heterogeneous identification nodes include tag identification nodes. Referring again to fig. 8, fig. 8 illustrates 3 video tags, namely sports, movie, and fun. From the four videos in fig. 8, that is, video 701b, video 702b, video 703b, and video 704b, it can be seen that the two video tags fun and movie share the same videos, namely video 701b and video 702b. Video 701b and video 702b can therefore be determined as the associated videos of fun and movie, and it can further be determined that a second associated edge exists between tag identification node 12 corresponding to fun and tag identification node 11 corresponding to movie, with an edge weight of 2, which is the number of associated videos.
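As a hedged illustration of the second associated edge, the following Python sketch counts the videos shared by each pair of tags. The function name `second_edge_weights` and the dictionary layout are assumptions for illustration; the tag assignments are those of fig. 8.

```python
from collections import defaultdict
from itertools import combinations

def second_edge_weights(video_tags):
    """Return {(tag_a, tag_b): number of videos carrying both tags} for pairs sharing a video."""
    tag_videos = defaultdict(set)
    for vid, tags in video_tags.items():
        for tag in tags:
            tag_videos[tag].add(vid)
    weights = {}
    for tag_a, tag_b in combinations(sorted(tag_videos), 2):
        shared = tag_videos[tag_a] & tag_videos[tag_b]
        if shared:  # an edge exists only when at least one video is shared
            weights[(tag_a, tag_b)] = len(shared)
    return weights

# Tags carried by the four videos in fig. 8.
video_tags = {
    "vid1": {"fun", "movie"},
    "vid2": {"fun", "movie"},
    "vid3": {"sports"},
    "vid4": {"sports"},
}
tag_weights = second_edge_weights(video_tags)
print(tag_weights)
```

With these inputs, the only tag pair that shares videos is fun and movie, and the shared videos are vid1 and vid2.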
Step S2033, determining a third associated edge between the video identifier node and the heterogeneous identifier node and an edge weight of the third associated edge according to the association relationship between the video and the associated heterogeneous information.
Specifically, the associated heterogeneous information includes a video browsing user group; the heterogeneous identification nodes comprise user identification nodes corresponding to the video browsing user group; in a video browsing period, acquiring the effective browsing times of a video browsing user group aiming at a video; the effective browsing times refer to the times of effective browsing of the video by video browsing users in the video browsing user group; if the effective browsing times are larger than the effective browsing times threshold value, determining that a third correlation edge exists between the video identification node and the user identification node; in the video browsing user group, video browsing users who effectively browse videos are determined as video browsing users associated with the videos, and the number of the users associated with the video browsing users is determined as the edge weight of the third associated edge.
In this step, assume that the video browsing user group is the 20-year-old Shenzhen female group, that is, the group may include one or more 20-year-old female video browsing users located in the Shenzhen region, and that the video browsing period is 1 week. The effective browsing times of these video browsing users within one week are then counted. Suppose that, within one week, video browsing user a effectively browses video q 2 times, video browsing user b effectively browses video q 1 time, video browsing user c effectively browses video q 1 time, and the other video browsing users in the group either do not browse video q or do not browse it effectively. Then, within one week, the effective browsing times of the video browsing user group for video q is 4.
If the threshold of the effective browsing times is equal to or greater than 4 (for example, 100), determining that a third associated edge does not exist between the user identification node corresponding to the video browsing user group and the video identification node corresponding to the video q; if the threshold of the effective browsing times is less than 4 (for example, 2), it is determined that a third associated edge exists between the user identifier node corresponding to the video browsing user group and the video identifier node corresponding to the video q, at this time, the video browsing user a, the video browsing user b, and the video browsing user c are determined as associated video browsing users of the video q, and further, the edge weight of the third associated edge may be determined to be 3.
It is to be understood that the numbers referred to above are only assumed for the sake of understanding and description, and have no practical effect.
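The threshold test and the weight rule for the user group can be sketched as follows. This is only an illustrative sketch: the function name `third_edge_for_group`, the user names, and the choice of returning `None` when no edge exists are assumptions, and the counts are the assumed numbers from the example above.

```python
def third_edge_for_group(browse_counts, threshold):
    """Return the edge weight of the third associated edge, or None if no edge exists.

    browse_counts maps each user in the group to that user's number of
    effective browses of one video within the video browsing period.
    """
    total_effective_browses = sum(browse_counts.values())
    if total_effective_browses <= threshold:
        return None  # the edge requires the total to be greater than the threshold
    # Edge weight: number of users who effectively browsed the video at least once.
    return sum(1 for count in browse_counts.values() if count > 0)

# Users a, b, c browse video q effectively; user d does not.
browse_counts = {"a": 2, "b": 1, "c": 1, "d": 0}
print(third_edge_for_group(browse_counts, threshold=2))
print(third_edge_for_group(browse_counts, threshold=100))
```

Note that the edge test uses the total number of effective browses (4 here), while the edge weight uses the number of distinct users (3 here).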
Optionally, the associated heterogeneous information includes at least two video accounts; the association relationship includes an account association relationship; the heterogeneous identification nodes include account identification nodes respectively corresponding to the at least two video accounts. An associated video account having an account association relationship with the video is acquired from the at least two video accounts; the account association relationship is used for representing that a video publishing user publishes the video through the associated video account. The account identification node corresponding to the associated video account is determined as an associated account identification node; a third associated edge exists between the video identification node and the associated account identification node, and the edge weight of the third associated edge is a constant parameter.
The video account may be a video publishing account, that is, an account used to publish videos. The server acquires the video publishing account of each video. Referring to fig. 8 again, since video 701b and video 702b are published by video publishing account 200000, a third associated edge exists between account identification node 8 corresponding to video publishing account 200000 and video identification node 1 corresponding to video 701b, and the edge weight of this edge may be a constant parameter, for example 1; a third associated edge also exists between account identification node 8 corresponding to video publishing account 200000 and video identification node 2 corresponding to video 702b. Since video 703b is published by video publishing account 100000, a third associated edge exists between account identification node 7 corresponding to video publishing account 100000 and video identification node 3 corresponding to video 703b, and the edge weight of this edge is 1. Since video 704b is published by video publishing account 300000, a third associated edge exists between account identification node 9 corresponding to video publishing account 300000 and video identification node 4 corresponding to video 704b, and the edge weight of this edge is 1. The topological relationship between the video identification nodes and the account identification nodes is shown in the heterogeneous graph in fig. 8.
Optionally, the associated heterogeneous information includes at least two video tags; the association relationship includes a tag association relationship; the heterogeneous identification nodes include tag identification nodes respectively corresponding to the at least two video tags. An associated video tag having a tag association relationship with the video is acquired from the at least two video tags; the tag association relationship is used for representing that the video is marked with the associated video tag. The tag identification node corresponding to the associated video tag is determined as an associated tag identification node; a third associated edge exists between the video identification node and the associated tag identification node, and the edge weight of the third associated edge is a constant parameter.
Referring to fig. 8 again, the server obtains the video tags carried by each video. Video 701b carries two video tags, namely fun and movie, so a third associated edge exists between tag identification node 12 corresponding to fun and video identification node 1 corresponding to video 701b, and the edge weight of this edge may be a constant parameter, for example 1; a third associated edge also exists between tag identification node 11 corresponding to movie and video identification node 1 corresponding to video 701b, and the edge weight of this edge is 1. Video 702b carries two video tags, namely fun and movie, so a third associated edge exists between tag identification node 12 corresponding to fun and video identification node 2 corresponding to video 702b, and the edge weight of this edge is 1; a third associated edge also exists between tag identification node 11 corresponding to movie and video identification node 2 corresponding to video 702b, and the edge weight of this edge is 1. Since video 703b carries the video tag sports, a third associated edge exists between tag identification node 10 corresponding to sports and video identification node 3 corresponding to video 703b, and the edge weight of this edge is 1. Since video 704b carries the video tag sports, a third associated edge exists between tag identification node 10 corresponding to sports and video identification node 4 corresponding to video 704b, and the edge weight of this edge is 1. The topological relationship between the video identification nodes and the tag identification nodes is shown in the heterogeneous graph in fig. 8.
In this embodiment of the present application, the video feature vector is generated from a heterogeneous sampling sequence and a homogeneous sampling sequence. Because the heterogeneous sampling sequence includes at least two identification nodes belonging to different data attribute types, the video feature vector can capture the association relationships between identification nodes of different data attribute types. Similarly, because the homogeneous sampling sequence includes at least two identification nodes belonging to the same data attribute type, the video feature vector can also capture the association relationships between identification nodes determined from the associated heterogeneous information, where the data attribute type of the associated heterogeneous information differs from that of the video. The video feature vector in the present application can therefore cover both the features of the associated heterogeneous information associated with the video and the features of the association relationship between the video and the associated heterogeneous information; that is, the video feature vector contains multi-source information features. If the video feature vector is applied to an actual scenario, such as video recommendation, video recall, or video clustering, the diversified information features it contains allow the relationship between the video and the actual application scenario to be represented accurately, which can improve the accuracy with which the video is applied in that scenario. In addition, a heterogeneous graph is constructed to learn the topological relationship between the video and the associated heterogeneous information, which avoids existing supervised training methods and gives the video feature vector a wider range of application.
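The constant-weight third associated edges for video publishing accounts and video tags can be sketched together. The helper name `constant_weight_edges`, the `account_...` node labels, and the tuple-keyed edge dictionary are assumptions for illustration; the account and tag assignments follow fig. 8, and the constant parameter is taken as 1 as in the example.

```python
CONSTANT_WEIGHT = 1  # the constant parameter used in the fig. 8 example

def constant_weight_edges(video_attributes, weight=CONSTANT_WEIGHT):
    """Create one (video, attribute) edge with a constant weight per association."""
    edges = {}
    for vid, attributes in video_attributes.items():
        for attribute in attributes:
            edges[(vid, attribute)] = weight
    return edges

# Publishing account and tags of each video in fig. 8.
video_accounts = {
    "vid1": ["account_200000"],
    "vid2": ["account_200000"],
    "vid3": ["account_100000"],
    "vid4": ["account_300000"],
}
video_tags = {
    "vid1": ["fun", "movie"],
    "vid2": ["fun", "movie"],
    "vid3": ["sports"],
    "vid4": ["sports"],
}
edges = {**constant_weight_edges(video_accounts), **constant_weight_edges(video_tags)}
print(len(edges))
```

In this sketch every video-to-account and video-to-tag association yields exactly one edge, giving 4 account edges and 6 tag edges for fig. 8.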
Further, please refer to fig. 9, where fig. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing apparatus may be a computer program (including program code) running on a computer device, for example, application software; the apparatus may be used to perform the corresponding steps in the methods provided by the embodiments of the present application. As shown in fig. 9, the data processing apparatus 1 may include: a data acquisition module 11, a first generation module 12, a sampling node module 13, and a second generation module 14.
The data acquisition module 11 is configured to acquire a video identifier of a video and associated heterogeneous information associated with the video; the data attribute type of the video is different from the data attribute type of the associated heterogeneous information;
the first generation module 12 is configured to determine both the video identifier and the heterogeneous information identifier of the associated heterogeneous information as identification nodes, and generate a heterogeneous graph including the identification nodes;
the sampling node module 13 is configured to perform identification node sampling on the heterogeneous graph to obtain a heterogeneous sampling sequence and a homogeneous sampling sequence; the heterogeneous sampling sequence comprises at least two identification nodes belonging to different data attribute types, and the homogeneous sampling sequence comprises at least two identification nodes belonging to the same data attribute type;
and a second generating module 14, configured to generate a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence.
For specific functional implementation manners of the data acquisition module 11, the first generation module 12, the sampling node module 13, and the second generation module 14, reference may be made to steps S101 to S104 in the embodiment corresponding to fig. 3, which is not described herein again.
Referring to fig. 9 again, the number of videos is at least two, and the number of associated heterogeneous information is at least two;
the first generation module 12 may include: a first determination unit 121 and a first generation unit 122.
A first determining unit 121, configured to determine, according to an association relationship between every two videos, an association relationship between every two pieces of association heterogeneous information, and an association relationship between a video and association heterogeneous information, an association edge between the identification nodes, and an edge weight of the association edge;
the first generating unit 122 is configured to generate a heterogeneous graph according to the identification nodes, the associated edges, and the edge weights of the associated edges.
For specific functional implementation of the first determining unit 121 and the first generating unit 122, reference may be made to step S102 in the embodiment corresponding to fig. 3, which is not described herein again.
Referring to fig. 9 again, the identification nodes include video identification nodes belonging to the video identification and heterogeneous identification nodes belonging to the heterogeneous information identification; the associated edges comprise a first associated edge, a second associated edge and a third associated edge;
the first determination unit 121 may include: a first determination subunit 1211, a second determination subunit 1212, and a third determination subunit 1213.
A first determining subunit 1211, configured to determine, according to an association relationship between every two videos, a first association edge between the video identification nodes and an edge weight of the first association edge;
a second determining subunit 1212, configured to determine, according to an association relationship between every two pieces of association heterogeneous information, a second association edge between the heterogeneous identifier nodes and an edge weight of the second association edge;
a third determining subunit 1213, configured to determine, according to the association relationship between the video and the associated heterogeneous information, a third associated edge between the video identifier node and the heterogeneous identifier node, and an edge weight of the third associated edge.
For specific functional implementation manners of the first determining subunit 1211, the second determining subunit 1212, and the third determining subunit 1213, reference may be made to steps S1021 to S1023 in the embodiment corresponding to fig. 7, which is not described herein again.
Referring to fig. 9 again, the video includes a first video and a second video; the video identification nodes comprise a first video identification node corresponding to the first video and a second video identification node corresponding to the second video;
the first determining subunit 1211 may include: an acquisition sequence sub-unit 12111, a determine position sub-unit 12112, a determine sequence sub-unit 12113, and a statistical sequence sub-unit 12114.
An acquiring sequence subunit 12111, configured to acquire effective video sequences respectively associated with N video browsing users; N is a positive integer; the N effective video sequences include an effective video sequence L_x, where x is a positive integer and x is less than or equal to the total number of the N effective video sequences; the effective videos in the effective video sequence L_x are sorted according to the time order in which the associated video browsing user browsed the videos; the ratio of the effective browsing duration of the video corresponding to an effective video to the total video duration of the effective video is greater than a browsing ratio threshold;
a determining position subunit 12112, configured to determine, if the positions of the first video and the second video in the effective video sequence L_x are adjacent positions, that the first video and the second video have an adjacent position relationship in the effective video sequence L_x;
a determining sequence subunit 12113, configured to determine, among the N effective video sequences, an effective video sequence having the adjacent position relationship as an associated effective video sequence; the associated effective video sequence is used for representing that a first associated edge exists between the first video identification node and the second video identification node;
a statistics sequence subunit 12114, configured to count the number of associated effective video sequences, and determine that number as the edge weight of the first associated edge.
The specific functional implementation manners of the acquiring sequence sub-unit 12111, the determining position sub-unit 12112, the determining sequence sub-unit 12113, and the counting sequence sub-unit 12114 may refer to step S1021 in the embodiment corresponding to fig. 7, which is not described herein again.
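How an effective video sequence might be assembled from raw browsing records can be sketched as follows. This is a sketch under stated assumptions: the `BrowseRecord` fields, the function name `effective_sequence`, the durations, the extra record `vidX`, the threshold of 0.4, and the ascending time order are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass

@dataclass
class BrowseRecord:
    video_id: str
    browse_time: str        # zero-padded "HH:MM" on the same day (assumed format)
    watched_seconds: float  # effective browsing duration
    total_seconds: float    # total video duration

def effective_sequence(records, ratio_threshold):
    """Keep videos whose watched ratio exceeds the threshold, ordered by browse time."""
    effective = [
        r for r in records
        if r.total_seconds > 0 and r.watched_seconds / r.total_seconds > ratio_threshold
    ]
    # Ascending time order is an assumption; adjacency is the same in either direction.
    effective.sort(key=lambda r: r.browse_time)
    return [r.video_id for r in effective]

# Hypothetical durations; vidX is skipped after 2 seconds and is filtered out.
records = [
    BrowseRecord("vid1", "19:15", 60, 60),
    BrowseRecord("vid2", "19:00", 30, 60),
    BrowseRecord("vidX", "18:50", 2, 60),
    BrowseRecord("vid3", "18:36", 25, 40),
]
sequence = effective_sequence(records, ratio_threshold=0.4)
print(sequence)
```

Because vidX is removed before ordering, vid3 and vid2 become adjacent in the resulting sequence, which is what the adjacency test for the first associated edge then operates on.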
Referring to fig. 9 again, the associated heterogeneous information includes first associated heterogeneous information and second associated heterogeneous information; the heterogeneous identification nodes comprise a first heterogeneous identification node corresponding to the first associated heterogeneous information and a second heterogeneous identification node corresponding to the second associated heterogeneous information;
the second determining subunit 1212 may include: a determine video sub-unit 12121 and a statistic video sub-unit 12122.
A video determining subunit 12121, configured to determine, if the same video exists between the video associated with the first associated heterogeneous information and the video associated with the second associated heterogeneous information, the same video as the associated video; the associated video is used for representing that a second associated edge exists between the first heterogeneous identification node and the second heterogeneous identification node;
a statistics video subunit 12122, configured to count the number of videos of the associated video, and determine the number of videos as the edge weight of the second associated edge.
The specific functional implementation manners of the video determining subunit 12121 and the video statistics subunit 12122 may refer to step S1022 in the embodiment corresponding to fig. 7, which is not described herein again.
Referring to fig. 9 again, the associated heterogeneous information includes a video browsing user group; the heterogeneous identification nodes comprise user identification nodes corresponding to the video browsing user group;
the third determining subunit 1213 may include: an acquisition times sub-unit 12131, a determine association sub-unit 12132, and a determine weights sub-unit 12133.
An obtaining times subunit 12131, configured to obtain, in a video browsing period, an effective browsing time of the video browsing user group for the video; the effective browsing times refer to the times of effective browsing of the video by video browsing users in the video browsing user group;
a determining association subunit 12132, configured to determine that a third association edge exists between the video identifier node and the user identifier node if the effective browsing times are greater than the effective browsing times threshold;
a determining weight subunit 12133, configured to determine, in the video browsing user group, video browsing users who effectively browse videos as associated video browsing users of the videos, and determine the number of users of the associated video browsing users as edge weights of the third associated edge.
The specific functional implementation manners of the obtaining times sub-unit 12131, the association determining sub-unit 12132, and the weight determining sub-unit 12133 may refer to step S1023 in the embodiment corresponding to fig. 7, which is not described herein again.
Referring to fig. 9 again, the associated heterogeneous information includes at least two video accounts; the association relationship includes an account association relationship; the heterogeneous identification nodes include account identification nodes respectively corresponding to the at least two video accounts;
the third determining subunit 1213 may include: an account acquiring subunit 12134 and an account determining subunit 12135.
An account acquiring subunit 12134, configured to acquire, from among the at least two video accounts, an associated video account having an account association relationship with the video; the account association relationship is used for representing that a video publishing user publishes the video through the associated video account;
an account determining subunit 12135, configured to determine the account identification node corresponding to the associated video account as an associated account identification node; a third associated edge exists between the video identification node and the associated account identification node, and the edge weight of the third associated edge is a constant parameter.
The specific functional implementation manners of the account acquiring subunit 12134 and the account determining subunit 12135 may refer to step S1023 in the embodiment corresponding to fig. 7, which is not described herein again.
Referring to fig. 9 again, the associated heterogeneous information includes at least two video tags; the association relationship includes a tag association relationship; the heterogeneous identification nodes include tag identification nodes respectively corresponding to the at least two video tags;
the third determining subunit 1213 may include: a tag acquiring subunit 12136 and a tag determining subunit 12137.
A tag acquiring subunit 12136, configured to acquire, from the at least two video tags, an associated video tag having a tag association relationship with the video; the tag association relationship is used for representing that the video is marked with the associated video tag;
a tag determining subunit 12137, configured to determine the tag identification node corresponding to the associated video tag as an associated tag identification node; a third associated edge exists between the video identification node and the associated tag identification node, and the edge weight of the third associated edge is a constant parameter.
The specific functional implementation manners of the tag acquiring subunit 12136 and the tag determining subunit 12137 may refer to step S1023 in the embodiment corresponding to fig. 7, which is not described herein again.
Referring again to fig. 9, the sampling node module 13 may include: a first acquisition unit 131, a first sampling unit 132, and a second sampling unit 133.
A first obtaining unit 131, configured to obtain a random sampling heterogeneous path and a random sampling homogeneous path; the random sampling heterogeneous path is used for indicating the type sampling order of the data attribute types to be sampled when identification nodes of different data attribute types are sampled; the random sampling homogeneous path is used for indicating the data attribute type to be sampled when identification nodes of the same data attribute type are sampled;
the first sampling unit 132 is configured to randomly sample the identification nodes in the heterogeneous graph according to the type sampling order indicated by the random sampling heterogeneous path to obtain a heterogeneous sampling sequence;
the second sampling unit 133 is configured to determine the data attribute type indicated by the random sampling homogeneous path as a target data attribute type, and randomly sample identification nodes belonging to the target data attribute type in the heterogeneous graph to obtain a homogeneous sampling sequence.
For specific functional implementation manners of the first obtaining unit 131, the first sampling unit 132, and the second sampling unit 133, reference may be made to step S204 to step S206 in the embodiment corresponding to fig. 4, which is not described herein again.
Referring to fig. 9 again, the first sampling unit 132 may include: a fourth determination subunit 1321, a sampling target subunit 1322, an add node subunit 1323, and a generate sequence subunit 1324.
A fourth determining subunit 1321, configured to determine, according to the type sampling order, the data attribute type of the jth required sampling in the random sampling heterogeneous path as the data attribute type to be sampled; j is a positive integer less than or equal to S, and S is the total number of nodes of the identification nodes required to be sampled based on the random sampling heterogeneous path;
a sampling target subunit 1322, configured to sample a target identification node from the heterogeneous graph according to the sampled node set and the attribute type of the data to be sampled; the data attribute type to which the target identification node belongs is a data attribute type to be sampled; the sampled node set includes sampled identification nodes;
an add node subunit 1323, configured to add the target identification node to the sampled node set if j is smaller than S;
a generating sequence subunit 1324, configured to, if j is equal to S, generate a heterogeneous sampling sequence according to the sampled node set and the target identification node, where the target identification node is a last identification node in the heterogeneous sampling sequence.
For specific functional implementation manners of the fourth determining subunit 1321, the sampling target subunit 1322, the adding node subunit 1323, and the generating sequence subunit 1324, reference may be made to step S205 in the embodiment corresponding to fig. 4, which is not described herein again.
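The loop over the type sampling order can be sketched in Python as follows. This is a minimal sketch, not the embodiment's implementation: the adjacency-dictionary graph layout, the function name `sample_heterogeneous_sequence`, the early stop when no candidate of the required type exists, and in particular the choice to supply the first node (j = 1) as a given start node are all assumptions. The toy graph is a small fragment around video 703b and the sports tag.

```python
import random

def sample_heterogeneous_sequence(graph, node_type, start, type_order, rng):
    """Walk the graph so that the j-th sampled node has the j-th type in type_order."""
    sampled = [start]               # sampled node set; start is assumed to match type_order[0]
    total_nodes = len(type_order)   # S in the description
    for j in range(2, total_nodes + 1):
        wanted_type = type_order[j - 1]
        previous = sampled[-1]
        # Candidate neighbors of the previous node that have the required type.
        candidates = {n: w for n, w in graph[previous].items() if node_type[n] == wanted_type}
        if not candidates:
            break  # assumption: stop early when the walk cannot continue
        nodes = list(candidates)
        edge_weights = [candidates[n] for n in nodes]
        target = rng.choices(nodes, weights=edge_weights, k=1)[0]
        sampled.append(target)
    return sampled

# Fragment of the fig. 8 graph: adjacency with edge weights.
graph = {
    "vid3": {"sports": 1, "account_100000": 1},
    "vid4": {"sports": 1},
    "sports": {"vid3": 1, "vid4": 1},
    "account_100000": {"vid3": 1},
}
node_type = {"vid3": "video", "vid4": "video", "sports": "tag", "account_100000": "account"}
walk = sample_heterogeneous_sequence(
    graph, node_type, "vid3", ["video", "tag", "video"], random.Random(0)
)
print(walk)
```

The last sampled node closes the sequence when j reaches S, so the returned list plays the role of the heterogeneous sampling sequence in this sketch.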
Referring to fig. 9 again, the sampled node set includes a previous adjacent identification node, which is the last identification node in the sampled node set;
the sampling target subunit 1322 may include: an acquisition node sub-unit 13221, a summation weight sub-unit 13222, a determination probability sub-unit 13223, and a sampling node sub-unit 13224.
The obtaining node subunit 13221 is configured to obtain, from the heterogeneous graph and according to the data attribute type to be sampled, w target preselected identification nodes that have associated edges with the previous adjacent identification node; w is a positive integer; the data attribute type to which the w target preselected identification nodes belong is the data attribute type to be sampled;
the summation weight subunit 13222 is configured to obtain the edge weight of the associated edge between the previous adjacent identification node and each target preselected identification node, and to sum the obtained edge weights to obtain a total edge weight; the w target preselected identification nodes include a target preselected identification node Y_m, where m is a positive integer and m is less than or equal to w;
a probability determining subunit 13223, configured to determine the ratio of the edge weight Z_m of the associated edge between the previous adjacent identification node and the target preselected identification node Y_m to the total edge weight as the random sampling probability of the target preselected identification node Y_m;
the sampling node subunit 13224 is configured to randomly sample the w target preselected identification nodes according to the random sampling probability corresponding to each target preselected identification node, so as to obtain the target identification node; in the heterogeneous sampling sequence, the previous adjacent identification node immediately precedes the target identification node.
The specific functional implementation manners of the obtaining node subunit 13221, the summing weight subunit 13222, the probability determining subunit 13223, and the sampling node subunit 13224 may refer to step S205 in the embodiment corresponding to fig. 4, which is not described herein again.
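The probability rule above (the random sampling probability of Y_m is Z_m divided by the total edge weight) can be illustrated with a short sketch. The function names, the dictionary layout of the graph, and the toy weights are hypothetical and chosen only for the example.

```python
import random

def sampling_probabilities(graph, node_type, previous, wanted_type):
    """Return {Y_m: Z_m / total edge weight} for neighbours of the wanted type."""
    neighbours = graph.get(previous, {})
    # The w target preselected identification nodes and their edge weights Z_m.
    weights = {n: z for n, z in neighbours.items() if node_type[n] == wanted_type}
    total = sum(weights.values())          # total edge weight
    if total == 0:
        return {}                          # no candidate of the wanted type
    return {n: z / total for n, z in weights.items()}

def sample_target(probabilities, rng):
    """Draw one target identification node according to its sampling probability."""
    nodes = list(probabilities)
    return rng.choices(nodes, weights=[probabilities[n] for n in nodes], k=1)[0]

# Previous adjacent node "v1" has two user neighbours and one video neighbour.
graph = {"v1": {"u1": 3.0, "u2": 1.0, "v2": 5.0}}
node_type = {"v1": "video", "v2": "video", "u1": "user", "u2": "user"}

probabilities = sampling_probabilities(graph, node_type, "v1", "user")
target = sample_target(probabilities, random.Random(0))
```

With edge weights 3 and 1 towards the two user nodes, the total edge weight is 4, so `probabilities` is `{"u1": 0.75, "u2": 0.25}`; the video neighbour `v2` is excluded because its type is not the data attribute type to be sampled.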
Referring to fig. 9 again, the homogeneous sampling sequence includes a first homogeneous sampling sequence and a second homogeneous sampling sequence;
the second sampling unit 133 may include: a first generation sub-unit 1331 and a second generation sub-unit 1332.
A first generating subunit 1331, configured to, if the target data attribute type is the data attribute type corresponding to the video identification nodes, randomly sample the video identification nodes in the heterogeneous graph, and generate a first homogeneous sampling sequence including at least two video identification nodes; the first homogeneous sampling sequence is used for representing the topological relation among the video identification nodes in the heterogeneous graph;
a second generating subunit 1332, configured to, if the target data attribute type is the data attribute type corresponding to the heterogeneous identification nodes, randomly sample the heterogeneous identification nodes in the heterogeneous graph, and generate a second homogeneous sampling sequence including at least two heterogeneous identification nodes; the second homogeneous sampling sequence is used for representing the topological relation among the heterogeneous identification nodes in the heterogeneous graph.
For specific functional implementation manners of the first generating sub-unit 1331 and the second generating sub-unit 1332, reference may be made to step S206 in the embodiment corresponding to fig. 4, which is not described herein again.
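A random walk restricted to a single data attribute type can be sketched as below. The text above does not state how the next node is chosen or how long a sequence is, so the weight-proportional choice, the `length` parameter, and the rule of discarding walks shorter than two nodes are assumptions for illustration only.

```python
import random

def sample_homogeneous_sequence(graph, node_type, target_type, length, rng):
    """Walk only over nodes whose data attribute type equals target_type."""
    candidates = sorted(n for n, t in node_type.items() if t == target_type)
    if not candidates:
        return None
    sequence = [rng.choice(candidates)]            # assumed uniform start node
    while len(sequence) < length:
        neighbours = graph.get(sequence[-1], {})
        same_type = [n for n in neighbours if node_type[n] == target_type]
        if not same_type:
            break                                  # no same-type neighbour left
        weights = [neighbours[n] for n in same_type]
        sequence.append(rng.choices(same_type, weights=weights, k=1)[0])
    # A homogeneous sampling sequence needs at least two identification nodes.
    return sequence if len(sequence) >= 2 else None

# Toy graph: three videos chained by first associated edges, plus one tag.
graph = {
    "v1": {"v2": 4.0, "t1": 1.0},
    "v2": {"v1": 4.0, "v3": 2.0, "t1": 1.0},
    "v3": {"v2": 2.0},
    "t1": {"v1": 1.0, "v2": 1.0},
}
node_type = {"v1": "video", "v2": "video", "v3": "video", "t1": "tag"}

video_sequence = sample_homogeneous_sequence(
    graph, node_type, "video", 4, random.Random(1)
)
tag_sequence = sample_homogeneous_sequence(
    graph, node_type, "tag", 4, random.Random(1)
)
```

In this toy graph `video_sequence` contains only video identification nodes, while `tag_sequence` is `None` because the single tag node has no tag neighbour and so cannot form a sequence of two nodes.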
Referring again to fig. 9, the second generating module 14 may include: a second determining unit 141, a second obtaining unit 142, an adjustment model unit 143, and a second generating unit 144.
A second determining unit 141, configured to determine the heterogeneous sampling sequence and the homogeneous sampling sequence as at least two random sampling sequences; the at least two random sampling sequences include a random sampling sequence S_a, where a is a positive integer and a is less than or equal to the total number of sequences in the at least two random sampling sequences;
a second obtaining unit 142, configured to obtain the real encoding label of each identification node in the random sampling sequence S_a; the random sampling sequence S_a includes an identification node D_b and adjacent identification nodes having a positional association relationship with the identification node D_b, where b is a positive integer and b is less than or equal to the total number of identification nodes in the random sampling sequence S_a; the vector dimension of each real encoding label is equal to the total number of identification nodes in the heterogeneous graph;
the second obtaining unit 142 is further configured to input the real encoding label C_b of the identification node D_b into an initial word coding model to obtain predicted encoding labels of the adjacent identification nodes;
the model adjusting unit 143 is configured to adjust model parameters in the initial word coding model according to the real coding labels of the adjacent identification nodes and the predictive coding labels of the adjacent identification nodes, so as to obtain a target word coding model;
the second generating unit 144 is configured to input the video identifier to the target word coding model, so as to obtain a video feature vector corresponding to the video identifier.
For specific functional implementation manners of the second determining unit 141, the second obtaining unit 142, the model adjusting unit 143, and the second generating unit 144, reference may be made to step S104 in the embodiment corresponding to fig. 3, which is not described herein again.
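The training data consumed by the word coding model can be sketched as follows. The sketch assumes that a real encoding label is a one-hot vector (which is consistent with its dimension equalling the number of identification nodes, but is not stated explicitly here) and that "adjacent" means within a window of one position; both are assumptions, as are all function names. The model itself and its parameter updates are omitted.

```python
def one_hot(index, size):
    """Real encoding label: a vector with a single 1 at the node's index."""
    label = [0] * size
    label[index] = 1
    return label

def training_pairs(sequences, node_index, window=1):
    """Pair the label C_b of each node D_b with the labels of its adjacent nodes."""
    size = len(node_index)            # dimension = number of nodes in the graph
    pairs = []
    for sequence in sequences:
        for b, centre in enumerate(sequence):
            lo = max(0, b - window)
            hi = min(len(sequence), b + window + 1)
            for k in range(lo, hi):
                if k == b:
                    continue          # a node is not its own neighbour
                pairs.append((
                    one_hot(node_index[centre], size),        # model input
                    one_hot(node_index[sequence[k]], size),   # prediction target
                ))
    return pairs

# All identification nodes of a toy heterogeneous graph, indexed once.
all_nodes = ["u1", "v1", "v2"]
node_index = {node: i for i, node in enumerate(all_nodes)}

# One heterogeneous sequence (video, user, video) used as a random sampling sequence.
pairs = training_pairs([["v1", "u1", "v2"]], node_index)
```

For the single sequence v1, u1, v2 this yields four pairs: (v1, u1), (u1, v1), (u1, v2) and (v2, u1). A model trained to predict the second label of each pair from the first would then provide, for each video identifier, a learned vector that can serve as the video feature vector.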
In this embodiment of the present application, the video feature vector is generated according to a heterogeneous sampling sequence and a homogeneous sampling sequence. Since the heterogeneous sampling sequence includes at least two identification nodes belonging to different data attribute types, the video feature vector can capture the association relationships between identification nodes of different data attribute types; similarly, since the homogeneous sampling sequence includes at least two identification nodes belonging to the same data attribute type, the video feature vector can also capture the association relationships between identification nodes that are determined through the associated heterogeneous information; the data attribute type of the associated heterogeneous information is different from the data attribute type of the video. Therefore, the video feature vector in the present application can cover both the features of the associated heterogeneous information associated with the video and the features of the association relationship between the video and the associated heterogeneous information, that is, the video feature vector contains multi-faceted information features. If the video feature vector is applied to an actual scenario, such as video recommendation, video recall, or video clustering, these multi-faceted information features allow the relationship between the video and the application scenario to be represented accurately, which can improve the accuracy with which the video is used in that scenario. In addition, a heterogeneous graph is constructed to learn the topological relationship between the video and the associated heterogeneous information, which avoids the existing supervised training methods and gives the video feature vector a wider range of application.
Further, please refer to fig. 10, where fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 10, the computer device 1000 may be the server in the embodiment corresponding to fig. 3, and the computer device 1000 may include: at least one processor 1001, such as a CPU, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002. Wherein a communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a Display (Display) and a Keyboard (Keyboard), and the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the aforementioned processor 1001. As shown in fig. 10, the memory 1005, which is one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the computer device 1000 shown in fig. 10, the network interface 1004 may provide a network communication function; the user interface 1003 is an interface for providing a user with input; and the processor 1001 may be used to invoke a device control application stored in the memory 1005 to implement:
acquiring a video identifier of a video and associated heterogeneous information associated with the video; the data attribute type of the video is different from the data attribute type of the associated heterogeneous information;
determining the video identifier and the heterogeneous information identifier of the associated heterogeneous information as identification nodes, and generating a heterogeneous graph containing the identification nodes;
carrying out identification node sampling on the heterogeneous graph to obtain a heterogeneous sampling sequence and a homogeneous sampling sequence; the heterogeneous sampling sequence comprises at least two identification nodes belonging to different data attribute types, and the homogeneous sampling sequence comprises at least two identification nodes belonging to the same data attribute type;
and generating a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence.
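One possible in-memory representation of the heterogeneous graph produced by the second step is sketched below. The class name, its methods, and the choice of undirected edges stored as nested dictionaries are assumptions made for illustration and are not taken from the embodiment.

```python
from collections import defaultdict

class HeterogeneousGraph:
    """Identification nodes tagged with a data attribute type, plus weighted edges."""

    def __init__(self):
        self.node_type = {}                  # node id -> data attribute type
        self.edges = defaultdict(dict)       # node id -> {neighbour id: edge weight}

    def add_node(self, node_id, data_attribute_type):
        self.node_type[node_id] = data_attribute_type

    def add_edge(self, a, b, weight):
        if a not in self.node_type or b not in self.node_type:
            raise KeyError("both endpoints must be added as nodes first")
        # Assumption: associated edges are undirected, so store both directions.
        self.edges[a][b] = weight
        self.edges[b][a] = weight

    def neighbours(self, node_id, data_attribute_type=None):
        """Neighbours of node_id, optionally restricted to one data attribute type."""
        found = self.edges.get(node_id, {})
        if data_attribute_type is None:
            return dict(found)
        return {n: w for n, w in found.items()
                if self.node_type[n] == data_attribute_type}

g = HeterogeneousGraph()
g.add_node("video:1", "video")
g.add_node("video:2", "video")
g.add_node("tag:sports", "tag")
g.add_edge("video:1", "video:2", 3)      # video-to-video associated edge
g.add_edge("video:1", "tag:sports", 1)   # video-to-tag edge with a constant weight

tag_neighbours = g.neighbours("video:1", "tag")
```

Storing the data attribute type next to each node makes the type-restricted neighbour lookup used during sampling a simple filter.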
It should be understood that the computer device 1000 described in this embodiment of the present application may perform the data processing method described in the embodiments corresponding to fig. 3, fig. 4, and fig. 7, and may also implement the data processing apparatus 1 described in the embodiment corresponding to fig. 9, which is not described herein again. In addition, the beneficial effects of the same method are not described again.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a processor, the data processing method provided in each step in fig. 3, fig. 4, and fig. 7 is implemented, which may specifically refer to the implementation manner provided in each step in fig. 3, fig. 4, and fig. 7, and is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
The computer-readable storage medium may be an internal storage unit of the data processing apparatus provided in any of the foregoing embodiments or of the computer device, such as a hard disk or a memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the computer device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the computer device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instruction from the computer-readable storage medium, and executes the computer instruction, so that the computer device can execute the description of the data processing method in the embodiments corresponding to fig. 3, fig. 4, and fig. 7, which is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
The terms "first," "second," and the like in the description, claims, and drawings of the embodiments of the present application are used for distinguishing between different objects and not for describing a particular order. Furthermore, the term "comprises" and any variations thereof are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or device that comprises a list of steps or modules is not limited to the listed steps or modules, but may optionally include other steps or modules that are not listed or that are inherent to such process, method, apparatus, product, or device.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both, and that, to clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functions. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The method and the related apparatus provided by the embodiments of the present application are described with reference to the flowchart and/or the structural diagram of the method provided by the embodiments of the present application, and each flow and/or block of the flowchart and/or the structural diagram of the method, and the combination of the flow and/or block in the flowchart and/or the block diagram can be specifically implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block or blocks.
The above disclosure describes only preferred embodiments of the present application and is not to be construed as limiting its scope; equivalent variations and modifications made in accordance with the claims of the present application therefore still fall within the scope of the present application.

Claims (15)

1. A data processing method, comprising:
acquiring a video identifier of a video and associated heterogeneous information associated with the video; the data attribute type of the video is different from the data attribute type of the associated heterogeneous information;
determining the video identifier and the heterogeneous information identifier of the associated heterogeneous information as identifier nodes, and generating a heterogeneous graph containing the identifier nodes;
carrying out identification node sampling on the heterogeneous graph to obtain a heterogeneous sampling sequence and a homogeneous sampling sequence; the heterogeneous sampling sequence comprises at least two identification nodes belonging to different data attribute types, and the homogeneous sampling sequence comprises at least two identification nodes belonging to the same data attribute type;
and generating a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence.
2. The method of claim 1, wherein the number of videos is at least two, and the number of associated heterogeneous information is at least two;
the generating of the heterogeneous graph containing the identification nodes comprises:
determining an associated edge between the identification nodes and an edge weight of the associated edge according to an association relationship between every two videos, an association relationship between every two associated heterogeneous information and an association relationship between the videos and the associated heterogeneous information;
and generating the heterogeneous graph according to the identification nodes, the associated edges and the edge weights of the associated edges.
3. The method according to claim 2, wherein the identification nodes comprise a video identification node belonging to the video identification and a heterogeneous identification node belonging to the heterogeneous information identification; the associated edges comprise a first associated edge, a second associated edge and a third associated edge;
determining the associated edge between the identification nodes and the edge weight of the associated edge according to the association relationship between every two videos, the association relationship between every two associated heterogeneous information and the association relationship between the videos and the associated heterogeneous information, including:
determining the first associated edge between the video identification nodes and the edge weight of the first associated edge according to the association relationship between every two videos;
determining the second associated edge between the heterogeneous identification nodes and the edge weight of the second associated edge according to the association relationship between every two pieces of associated heterogeneous information;
and determining the third associated edge between the video identification node and the heterogeneous identification node and the edge weight of the third associated edge according to the association relationship between the video and the associated heterogeneous information.
4. The method of claim 3, wherein the video comprises a first video and a second video; the video identification nodes comprise a first video identification node corresponding to the first video and a second video identification node corresponding to the second video;
determining the first associated edge between the video identification nodes and the edge weight of the first associated edge according to the association relationship between every two videos, including:
obtaining valid video sequences respectively associated with N video browsing users, where N is a positive integer; the N valid video sequences include a valid video sequence L_x, where x is a positive integer and x is less than or equal to the total number of the N valid video sequences; the valid videos in the valid video sequence L_x are sorted in the chronological order in which the associated video browsing user browsed them; for each valid video, the ratio of its valid browsing duration to its total video duration is greater than a browsing ratio threshold;
if the first video and the second video occupy adjacent positions in the valid video sequence L_x, determining that the valid video sequence L_x has an adjacent position relationship for the first video and the second video;
determining the valid video sequences having the adjacent position relationship, among the N valid video sequences, as associated valid video sequences; the associated valid video sequences are used for representing that the first associated edge exists between the first video identification node and the second video identification node;
and counting the number of the associated valid video sequences, and determining that number as the edge weight of the first associated edge.
5. The method of claim 3, wherein the associated heterogeneous information comprises first associated heterogeneous information and second associated heterogeneous information; the heterogeneous identification nodes comprise a first heterogeneous identification node corresponding to the first associated heterogeneous information and a second heterogeneous identification node corresponding to the second associated heterogeneous information;
determining the second associated edge between the heterogeneous identification nodes and the edge weight of the second associated edge according to the association relationship between every two pieces of associated heterogeneous information, including:
if the same video exists between the video associated with the first associated heterogeneous information and the video associated with the second associated heterogeneous information, determining the same video as an associated video; the associated video is used for representing that the second associated edge exists between the first heterogeneous identification node and the second heterogeneous identification node;
and counting the video quantity of the associated video, and determining the video quantity as the edge weight of the second associated edge.
6. The method of claim 3, wherein the associated heterogeneous information comprises a video browsing user group; the heterogeneous identification nodes comprise user identification nodes corresponding to the video browsing user group;
determining the third associated edge between the video identification node and the heterogeneous identification node and the edge weight of the third associated edge according to the association relationship between the video and the associated heterogeneous information, including:
in a video browsing period, acquiring the effective browsing times of the video browsing user group aiming at the video; the effective browsing times refer to the times of effective browsing of the video by video browsing users in the video browsing user group;
if the effective browsing times are larger than an effective browsing time threshold value, determining that a third associated edge exists between the video identification node and the user identification node;
and in the video browsing user group, determining the video browsing users who effectively browsed the video as associated video browsing users of the video, and determining the number of the associated video browsing users as the edge weight of the third associated edge.
7. The method of claim 3, wherein the associated heterogeneous information comprises at least two video accounts; the association relationship comprises an account association relationship; the heterogeneous identification nodes comprise account identification nodes respectively corresponding to the at least two video accounts;
determining the third associated edge between the video identification node and the heterogeneous identification node and the edge weight of the third associated edge according to the association relationship between the video and the associated heterogeneous information, including:
acquiring, from the at least two video accounts, an associated video account that has the account association relationship with the video; the account association relationship is used for representing that a video publishing user publishes the video through the associated video account;
determining the account identification node corresponding to the associated video account as an associated account identification node; the third associated edge exists between the video identification node and the associated account identification node, and the edge weight of the third associated edge is a constant parameter.
8. The method of claim 3, wherein the associated heterogeneous information comprises at least two video tags; the association relationship comprises a tag association relationship; the heterogeneous identification nodes comprise tag identification nodes respectively corresponding to the at least two video tags;
determining the third associated edge between the video identification node and the heterogeneous identification node and the edge weight of the third associated edge according to the association relationship between the video and the associated heterogeneous information, including:
acquiring, from the at least two video tags, an associated video tag that has the tag association relationship with the video; the tag association relationship is used for representing that the video is marked with the associated video tag;
determining the tag identification node corresponding to the associated video tag as an associated tag identification node; the third associated edge exists between the video identification node and the associated tag identification node, and the edge weight of the third associated edge is a constant parameter.
9. The method of claim 3, wherein the sampling the heterogeneous graph for the identified nodes to obtain a heterogeneous sampling sequence and a homogeneous sampling sequence comprises:
acquiring a random sampling heterogeneous path and a random sampling homogeneous path; the random sampling heterogeneous path is used for indicating the type sampling order in which data attribute types are sampled when identification nodes of different data attribute types are sampled; the random sampling homogeneous path is used for indicating the data attribute type that is sampled when identification nodes of the same data attribute type are sampled;
according to the type sampling order indicated by the random sampling heterogeneous path, randomly sampling the identification nodes in the heterogeneous graph to obtain the heterogeneous sampling sequence;
and determining the data attribute type indicated by the random sampling homogeneous path as a target data attribute type, and randomly sampling the identification nodes belonging to the target data attribute type in the heterogeneous graph to obtain the homogeneous sampling sequence.
10. The method according to claim 9, wherein the randomly sampling the identified nodes in the heterogeneous graph according to the type sampling order indicated by the randomly sampling heterogeneous path to obtain the heterogeneous sampling sequence comprises:
determining, according to the type sampling order, the data attribute type of the jth required sampling in the random sampling heterogeneous path as the data attribute type to be sampled; j is a positive integer less than or equal to S, and S is the total number of identification nodes required to be sampled based on the random sampling heterogeneous path;
sampling a target identification node from the heterogeneous graph according to the sampled node set and the data attribute type to be sampled; the data attribute type to which the target identification node belongs is the data attribute type to be sampled; the sampled node set includes the identification nodes that have already been sampled;
if j is smaller than S, adding the target identification node into the sampled node set;
and if j is equal to S, generating the heterogeneous sampling sequence according to the sampled node set and the target identification node, wherein the target identification node is the last identification node in the heterogeneous sampling sequence.
11. The method of claim 10, wherein the sampled node set includes a previous adjacent identification node, the previous adjacent identification node being the last identification node in the sampled node set;
the sampling a target identification node from the heterogeneous graph according to the sampled node set and the data attribute type to be sampled comprises:
acquiring, from the heterogeneous graph and according to the data attribute type to be sampled, w target preselected identification nodes that have associated edges with the previous adjacent identification node; wherein w is a positive integer; the data attribute type to which the w target preselected identification nodes belong is the data attribute type to be sampled;
respectively acquiring the edge weight of the associated edge between the previous adjacent identification node and each target preselected identification node, and summing the acquired edge weights to obtain a total edge weight; the w target preselected identification nodes comprise a target preselected identification node Y_m, wherein m is a positive integer and m is less than or equal to w;
determining the ratio of the edge weight Z_m of the associated edge between the previous adjacent identification node and the target preselected identification node Y_m to the total edge weight as the random sampling probability of the target preselected identification node Y_m;
randomly sampling the w target preselected identification nodes according to the random sampling probability corresponding to each target preselected identification node to obtain the target identification node; in the heterogeneous sampling sequence, the previous adjacent identification node immediately precedes the target identification node.
12. The method of claim 9, wherein the homogeneous sampling sequence comprises a first homogeneous sampling sequence and a second homogeneous sampling sequence;
the randomly sampling the identification nodes belonging to the target data attribute type in the heterogeneous graph to obtain the homogeneous sampling sequence comprises:
if the target data attribute type is the data attribute type corresponding to the video identification node, randomly sampling the video identification nodes in the heterogeneous graph to generate a first homogeneous sampling sequence containing at least two video identification nodes; the first homogeneous sampling sequence is used for representing the topological relation among the video identification nodes in the heterogeneous graph;
if the target data attribute type is the data attribute type corresponding to the heterogeneous identification node, randomly sampling the heterogeneous identification nodes in the heterogeneous graph to generate a second homogeneous sampling sequence comprising at least two heterogeneous identification nodes; the second homogeneous sampling sequence is used for representing the topological relation among the heterogeneous identification nodes in the heterogeneous graph.
13. The method according to claim 1, wherein the generating a video feature vector corresponding to the video identifier according to the heterogeneous sampling sequence and the homogeneous sampling sequence comprises:
determining the heterogeneous sampling sequence and the homogeneous sampling sequence as at least two random sampling sequences; the at least two random sampling sequences comprise a random sampling sequence S_a, where a is a positive integer less than or equal to the total number of sequences in the at least two random sampling sequences;
obtaining the real coding label of each identification node in the random sampling sequence S_a; the random sampling sequence S_a comprises an identification node D_b and an adjacent identification node having a positional association relation with the identification node D_b, where b is a positive integer less than or equal to the total number of identification nodes in the random sampling sequence S_a; the vector dimension of each real coding label is equal to the total number of identification nodes in the heterogeneous graph;
inputting the real coding label C_b of the identification node D_b into an initial word coding model to obtain a predicted coding label of the adjacent identification node;
adjusting model parameters in the initial word coding model according to the real coding label of the adjacent identification node and the predicted coding label of the adjacent identification node to obtain a target word coding model;
and inputting the video identification to the target word coding model to obtain a video feature vector corresponding to the video identification.
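The training procedure of claim 13 resembles a skip-gram word embedding model, and a minimal pure-Python sketch under that assumption is given below. The claims do not name a specific model, so the full-softmax objective, the window of one position, the hyperparameters, the function names, and the example sequences are all hypothetical. One-hot real coding labels are represented implicitly by node indices, so their dimension equals the number of distinct nodes.

```python
import math
import random

def build_pairs(sequences, window=1):
    """Collect (node, adjacent node) pairs within `window` positions."""
    pairs = []
    for seq in sequences:
        for b, center in enumerate(seq):
            for c in range(max(0, b - window), min(len(seq), b + window + 1)):
                if c != b:
                    pairs.append((center, seq[c]))
    return pairs

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(sequences, dim=4, epochs=50, lr=0.1, seed=0):
    rng = random.Random(seed)
    nodes = sorted({n for seq in sequences for n in seq})
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)  # dimension of each one-hot label
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    pairs = build_pairs(sequences)
    losses = []
    for _ in range(epochs):
        loss = 0.0
        for center, neighbor in pairs:
            i, j = index[center], index[neighbor]
            hidden = w_in[i][:]  # a one-hot input selects row i
            # Predicted label: a distribution over all nodes.
            probs = softmax([sum(h * w for h, w in zip(hidden, row))
                             for row in w_out])
            loss -= math.log(probs[j])  # cross-entropy against the real label
            grad_hidden = [0.0] * dim
            for k, row in enumerate(w_out):
                err = probs[k] - (1.0 if k == j else 0.0)
                for d in range(dim):
                    grad_hidden[d] += err * row[d]
                    row[d] -= lr * err * hidden[d]
            for d in range(dim):
                w_in[i][d] -= lr * grad_hidden[d]
        losses.append(loss / len(pairs))
    return {n: w_in[index[n]] for n in nodes}, losses

# Hypothetical heterogeneous and homogeneous sampling sequences.
sequences = [["v1", "tag_a", "v2", "tag_a", "v1"],
             ["v1", "v2", "v3"],
             ["tag_a", "tag_b"]]
vectors, losses = train(sequences)
video_vector = vectors["v1"]  # feature vector looked up for video id "v1"
```

In this sketch the video feature vector is simply the trained input-side row for the video identification; a production system would more likely use an established library and negative sampling rather than a full softmax.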
14. A computer device, comprising: a processor, a memory, and a network interface;
the processor is connected with the memory and the network interface, wherein the network interface is used for providing data communication functions, the memory is used for storing a computer program, and the processor is used for calling the computer program to enable the computer device to execute the method of any one of claims 1 to 13.
15. A computer-readable storage medium, in which a computer program is stored which is adapted to be loaded and executed by a processor to cause a computer device having said processor to carry out the method of any one of claims 1 to 13.
CN202110420502.XA 2021-04-19 2021-04-19 Data processing method, data processing equipment and computer readable storage medium Pending CN113761272A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110420502.XA CN113761272A (en) 2021-04-19 2021-04-19 Data processing method, data processing equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110420502.XA CN113761272A (en) 2021-04-19 2021-04-19 Data processing method, data processing equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN113761272A (en) 2021-12-07

Family

ID=78787021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110420502.XA Pending CN113761272A (en) 2021-04-19 2021-04-19 Data processing method, data processing equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113761272A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089722A (en) * 2023-02-15 2023-05-09 北京欧拉认知智能科技有限公司 Implementation method, device, computing equipment and storage medium based on graph yield label
CN116089722B (en) * 2023-02-15 2023-11-21 北京欧拉认知智能科技有限公司 Implementation method, device, computing equipment and storage medium based on graph yield label

Similar Documents

Publication Publication Date Title
CN111368210B (en) Information recommendation method and device based on artificial intelligence and electronic equipment
US9934515B1 (en) Content recommendation system using a neural network language model
CN110287412B (en) Content recommendation method, recommendation model generation method, device, and storage medium
CN111382190B (en) Object recommendation method and device based on intelligence and storage medium
CN109471978B (en) Electronic resource recommendation method and device
CN112989209B (en) Content recommendation method, device and storage medium
CN111885399A (en) Content distribution method, content distribution device, electronic equipment and storage medium
CN112052387A (en) Content recommendation method and device and computer readable storage medium
CN112231563A (en) Content recommendation method and device and storage medium
CN117836765A (en) Click prediction based on multimodal hypergraph
CN115221396A (en) Information recommendation method and device based on artificial intelligence and electronic equipment
CN114201516A (en) User portrait construction method, information recommendation method and related device
CN112269943B (en) Information recommendation system and method
CN111597361B (en) Multimedia data processing method, device, storage medium and equipment
CN113761272A (en) Data processing method, data processing equipment and computer readable storage medium
CN115131052A (en) Data processing method, computer equipment and storage medium
CN116956183A (en) Multimedia resource recommendation method, model training method, device and storage medium
CN115795156A (en) Material recall and neural network training method, device, equipment and storage medium
CN115168609A (en) Text matching method and device, computer equipment and storage medium
CN116484085A (en) Information delivery method, device, equipment, storage medium and program product
CN114357242A (en) Training evaluation method and device based on recall model, equipment and storage medium
CN114357301A (en) Data processing method, device and readable storage medium
CN115203516A (en) Information recommendation method, device, equipment and storage medium based on artificial intelligence
CN113792163B (en) Multimedia recommendation method and device, electronic equipment and storage medium
CN114996561B (en) Information recommendation method and device based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination