CN112861009A

CN112861009A - Artificial intelligence based media account recommendation method and device and electronic equipment

Info

Publication number: CN112861009A
Application number: CN202110227065.XA
Authority: CN
Inventors: 黄梓琪
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-03-01
Filing date: 2021-03-01
Publication date: 2021-05-28

Abstract

The application provides an artificial intelligence based media account recommendation method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: obtaining a plurality of content features of at least one interactive media account of a user account, and obtaining a plurality of content features corresponding to each of a plurality of candidate media accounts; performing clustering processing and residual error processing on the plurality of content features of each interactive media account to obtain a content vector of the interactive media account; performing clustering processing and residual error processing on the plurality of content features of each candidate media account to obtain a content vector of each candidate media account; determining candidate media accounts to be recommended based on the content similarity between the content vector of each interactive media account and the content vectors of the plurality of candidate media accounts; and performing a recommendation operation for the user account based on the candidate media accounts to be recommended. The application can improve the accuracy of account recommendation.

Description

Artificial intelligence based media account recommendation method and device and electronic equipment

Technical Field

The present application relates to artificial intelligence and blockchain technologies, and in particular to an artificial intelligence based media account recommendation method and apparatus, an electronic device, and a computer-readable storage medium.

Background

Artificial Intelligence (AI) is the theory, methods, techniques, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.

Media account recommendation is an important application of artificial intelligence. A media account recommendation system in the related art can learn users' follow behavior to obtain similarity relations between accounts. However, for accounts without posterior information (namely, cold-start accounts), the similarity relations between accounts are difficult to characterize, so accounts recommended in this way cannot effectively match the user's interests and give the user a poor experience.

Disclosure of Invention

The embodiment of the application provides a media account recommending method and device based on artificial intelligence, electronic equipment and a computer readable storage medium, and content vectors of accounts can be mined to improve recommending accuracy.

The technical scheme of the embodiment of the application is realized as follows:

the embodiment of the application provides a media account recommending method based on artificial intelligence, which comprises the following steps:

obtaining a plurality of content features of at least one interactive media account of a user account, and obtaining a plurality of content features corresponding to each of a plurality of candidate media accounts;

clustering and residual error processing are carried out on a plurality of content features of each interactive media account to obtain a content vector of the interactive media account;

clustering and residual error processing are respectively carried out on the plurality of content features of each candidate media account to obtain a content vector of each candidate media account;

determining candidate media accounts to be recommended based on content similarity between the content vector of each interactive media account and the content vectors of the candidate media accounts;

and executing recommendation operation corresponding to the user account based on the candidate media account to be recommended.

An embodiment of the application provides an artificial intelligence based media account recommendation apparatus, comprising:

a feature module, configured to acquire a plurality of content features of at least one interactive media account of a user account, and to acquire a plurality of content features corresponding to each of a plurality of candidate media accounts;

the vector module is used for carrying out clustering processing and residual error processing on a plurality of content features of each interactive media account to obtain a content vector of the interactive media account;

the vector module is further configured to perform clustering processing and residual error processing on the plurality of content features of each candidate media account to obtain a content vector of each candidate media account;

the similarity module is used for determining candidate media accounts to be recommended based on the content similarity between the content vector of each interactive media account and the content vectors of the candidate media accounts;

and the recommending module is used for executing recommending operation corresponding to the user account based on the candidate media account to be recommended.

In the foregoing solution, the feature module is further configured to: perform the following processing for each interactive media account: extracting content features from a plurality of pieces of information published by the interactive media account to serve as the content features of the interactive media account; and perform the following processing for each candidate media account: extracting content features from a plurality of pieces of information published by the candidate media account to serve as the content features of the candidate media account; wherein the type of the information comprises at least one of: video information, text information, and image information.

In the foregoing solution, the vector module is further configured to: executing the following processing aiming at each interactive media account: clustering a plurality of content characteristics of the interactive media account to obtain at least one first clustering center; performing residual error processing on a plurality of content features of the interactive media account based on the at least one first clustering center to obtain a content vector of the interactive media account; performing the following for each of the candidate media accounts: clustering a plurality of content features of the candidate media accounts to obtain at least one second clustering center; and residual errors of a plurality of content features of the candidate media accounts are processed based on the at least one second clustering center to obtain content vectors of the candidate media accounts.

In the above scheme, the content vector of the interactive media account is determined by a media account learning model, where the media account learning model includes a local clustering core network and a normalization network; the vector module is further configured to: performing the following for each of the first cluster centers: determining, by the local clustering core network, a first residual distribution of a plurality of content features of the interactive media account corresponding to the first clustering center; performing the following for each of the first cluster centers: normalizing the first residual distribution corresponding to the first clustering center through the normalization network to obtain a normalization result corresponding to the first clustering center; and carrying out integral normalization processing on the normalization result of each first clustering center through the normalization network to obtain the content vector of the interactive media account.

In the above scheme, the media account learning model further includes a squeezing activation network; after the normalization result of each first clustering center is subjected to integral normalization processing through the normalization network to obtain the content vector of the interactive media account, the vector module is further configured to: perform channel-based average pooling on the content vector of the interactive media account through the squeezing activation network to obtain a global content feature of the content vector for each channel; perform full connection processing on the global content feature of each channel through the squeezing activation network to obtain an activation value of the content vector for each channel; and multiply, element-wise, the activation value of each channel by the original content feature of that channel in the content vector of the interactive media account, and update the content vector of the interactive media account based on the multiplication result.
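
The pooling, full connection, and channel re-weighting steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text specifies "full connection processing" without further detail, so the two-layer bottleneck with ReLU and sigmoid (the common squeeze-and-excitation design) is an assumption, as are the function names, the weight parameters `w1`, `b1`, `w2`, `b2`, and the representation of the content vector as a list of channels.

```python
import math
from typing import List

Vector = List[float]


def dense(x: Vector, w: List[Vector], b: Vector) -> Vector:
    """Fully connected layer: one output per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def squeeze_excite(channels: List[Vector], w1: List[Vector], b1: Vector,
                   w2: List[Vector], b2: Vector) -> List[Vector]:
    # Channel-based average pooling: one global value per channel.
    squeezed = [sum(c) / len(c) for c in channels]
    # Full connection processing (assumed: bottleneck + ReLU, then sigmoid gate).
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    gates = [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]
    # Multiply each channel's original values by that channel's activation value.
    return [[g * v for v in c] for g, c in zip(gates, channels)]
```

With all weights set to zero, every gate equals 0.5, so each channel is simply halved; trained weights would instead emphasize some channels over others.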

In the foregoing solution, the vector module is further configured to: performing convolution processing on each content characteristic of the interactive media account through the local clustering core network to obtain a corresponding convolution result; performing maximum likelihood function processing on the convolution result of each content feature of the interactive media account through the local clustering core network to obtain a corresponding maximum likelihood processing result; determining, by the local clustering core network, a first residual between each content feature of each interactive media account and the first clustering center; and taking the maximum likelihood processing result corresponding to each content feature of the interactive media account as a weight, and performing weighted summation processing on a first residual error corresponding to each content feature of the interactive media account to obtain a first residual error distribution of a plurality of content features of the interactive media account corresponding to the first clustering center.
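
A hedged Python sketch of the residual distribution described above, followed by the per-center and integral normalization steps of the normalization network. Several points are assumptions rather than statements of the source: the convolution is modeled as a per-feature linear scoring (one score per clustering center), the "maximum likelihood function processing" is modeled as a softmax over those scores, both normalizations are taken to be L2 normalization, and the clustering centers are passed in as given parameters. All names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(v: Vector) -> Vector:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def content_vector(features: List[Vector], centers: List[Vector],
                   weights: List[Vector], biases: Vector) -> Vector:
    k, dim = len(centers), len(features[0])
    agg = [[0.0] * dim for _ in range(k)]
    for x in features:
        # Convolution result (assumed linear): one score per clustering center.
        scores = [sum(w * xi for w, xi in zip(weights[j], x)) + biases[j]
                  for j in range(k)]
        assign = softmax(scores)  # assumed form of the weighting function
        for j in range(k):
            for d in range(dim):
                # Weighted sum of residuals between the feature and center j.
                agg[j][d] += assign[j] * (x[d] - centers[j][d])
    per_center = [l2_normalize(row) for row in agg]   # per-center normalization
    flat = [v for row in per_center for v in row]
    return l2_normalize(flat)                         # integral normalization
```

The output has one block of values per clustering center, so its length is the number of centers times the feature dimension, and its overall L2 norm is 1 whenever at least one residual block is non-zero.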

In the above scheme, the content vector of the interactive media account is determined by a media account learning model, where the media account learning model includes a local clustering core network and a normalization network; the vector module is further configured to: performing the following for each of the second cluster centers: determining, by the local clustering core network, a second residual distribution of a plurality of content features of the candidate media accounts corresponding to the second clustering center; performing the following for each of the second cluster centers: normalizing second residual distribution of the plurality of content features of the candidate media account corresponding to the second clustering center through the normalization network to obtain a corresponding normalization result; and carrying out integral normalization processing on normalization results corresponding to the plurality of content characteristics of the candidate media account through the normalization network to obtain a content vector of the candidate media account.

In the foregoing solution, the vector module is further configured to: performing convolution processing on each content feature of the candidate media account through the local clustering core network to obtain a corresponding convolution result; performing maximum likelihood function processing on the convolution result of each content feature of the candidate media account through the local clustering core network to obtain a corresponding maximum likelihood processing result; determining, by the local clustering core network, a second residual between each content feature of the candidate media account and the second clustering center; and taking the maximum likelihood processing result corresponding to each content feature of the candidate media account as a weight, and performing weighted summation processing on a second residual error corresponding to each content feature of the candidate media account to obtain a second residual error distribution of the plurality of content features corresponding to the second clustering center.

In the above scheme, the media account learning model further includes a squeezing activation network; the vector module is further configured to: after the normalization result of each second clustering center is subjected to integral normalization processing through the normalization network to obtain the content vector of the candidate media account, perform channel-based average pooling on the content vector of the candidate media account through the squeezing activation network to obtain a global content feature of the content vector for each channel; perform full connection processing on the global content feature of each channel through the squeezing activation network to obtain an activation value of the content vector for each channel; and multiply, element-wise, the activation value of each channel by the original content feature of that channel in the content vector of the candidate media account, and update the content vector of the candidate media account based on the multiplication result.

In the foregoing solution, the similarity module is further configured to: determine account similarity between the account vector of each interactive media account and the account vectors of the plurality of candidate media accounts; fuse the content similarity and the account similarity between each interactive media account and the plurality of candidate media accounts to obtain the similarity between each interactive media account and the plurality of candidate media accounts; and sort the similarities between each interactive media account and the plurality of candidate media accounts in descending order, and select at least one top-ranked candidate media account from the sorted result as the candidate media account to be recommended.

In the foregoing solution, the similarity module is further configured to: performing the following for each of the candidate media accounts: extracting a plurality of account features from the account information of the candidate media accounts, and compressing the account features of the candidate media accounts to obtain account vectors of the candidate media accounts; executing the following processing aiming at each interactive media account: extracting a plurality of account characteristics from the account information of the interactive media account, and compressing the account characteristics of the interactive media account to obtain an account vector of the interactive media account; determining account number similarity between the account number vector of each interactive media account number and the account number vectors of the candidate media account numbers; for each interactive media account, executing the following processing: determining account number similarity between the account number vector of the interactive media account number and the account number vector of each candidate media account number, and determining content similarity between the content vector of the interactive media account number and the content vector of each candidate media account number; and carrying out average processing on the account number similarity and the content similarity.
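
The fusion and ranking described above can be illustrated with a short Python sketch for a single interactive media account. The averaging of account similarity and content similarity follows the text; the use of cosine similarity as the similarity measure, the dictionary layout with `"content"` and `"account"` keys, and the function names are assumptions made only for this example.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recommend(interactive: Dict[str, Vector],
              candidates: Dict[str, Dict[str, Vector]],
              top_n: int) -> List[str]:
    scored = []
    for name, cand in candidates.items():
        content_sim = cosine(interactive["content"], cand["content"])
        account_sim = cosine(interactive["account"], cand["account"])
        # Fuse the two similarities by averaging them.
        scored.append((name, (content_sim + account_sim) / 2.0))
    # Descending sort, then keep the top-ranked candidates.
    scored.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in scored[:top_n]]
```

A candidate that matches on content but not on account information receives a fused score of 0.5 here, so it ranks below a candidate that matches on both.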

In the foregoing solution, the similarity module is further configured to: embedding the plurality of account characteristics of the candidate media accounts to obtain a plurality of account embedding characteristics of the candidate media accounts; based on the weights of the account characteristics of the candidate media accounts, carrying out weighted summation processing on the account embedding characteristics of the candidate media accounts to obtain account vectors corresponding to the candidate media accounts; embedding the plurality of account characteristics of the interactive media account to obtain a plurality of account embedding characteristics of the interactive media account; and carrying out weighted summation processing on the plurality of account embedding characteristics of the interactive media account based on the weights of the plurality of account characteristics of the interactive media account to obtain an account vector corresponding to the interactive media account.
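
As a rough illustration of the embedding and weighted summation above, the following Python sketch compresses a set of account features into one account vector. The embedding table, the feature names, and the per-feature weights are hypothetical placeholders; in practice they would presumably be learned, which the text does not specify.

```python
from typing import Dict, List

Vector = List[float]


def account_vector(features: List[str],
                   embedding_table: Dict[str, Vector],
                   feature_weights: Dict[str, float]) -> Vector:
    dim = len(next(iter(embedding_table.values())))
    out = [0.0] * dim
    for f in features:
        w = feature_weights[f]
        # Add the weighted embedding of this account feature.
        for d, v in enumerate(embedding_table[f]):
            out[d] += w * v
    return out
```

The same function applies to candidate media accounts and interactive media accounts, since both are compressed in the same way.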

In the foregoing solution, the similarity module is further configured to: embedding the plurality of account characteristics and the plurality of content characteristics of the candidate media account to obtain a plurality of account characteristic embedding characteristics and a plurality of content embedding characteristics of the candidate media account; based on the weights of the multiple account number embedding characteristics and the multiple content embedding characteristics of the candidate media account numbers, carrying out weighted summation processing on the multiple account number embedding characteristics and the multiple content embedding characteristics of the candidate media account numbers to obtain account number vectors corresponding to the candidate media account numbers; embedding the plurality of account characteristics and the plurality of content characteristics of the interactive media account to obtain a plurality of account embedding characteristics and a plurality of content embedding characteristics of the interactive media account; and carrying out weighted summation processing on the plurality of account embedding characteristics and the plurality of content embedding characteristics of the interactive media account based on the plurality of account characteristics and the weight of the plurality of content characteristics of the interactive media account to obtain an account vector corresponding to the interactive media account.

In the above scheme, the content vector of the interactive media account is determined by a media account learning model, where the media account learning model includes a local clustering core network and a normalization network; the apparatus further comprises a training module configured to: before the plurality of content features of the at least one interactive media account of the user account and the plurality of content features corresponding to each of the plurality of candidate media accounts are acquired, train the media account learning model in the following way: acquiring a plurality of media account samples, and constructing a plurality of first-type triple samples based on the number of associated users of the plurality of media account samples; decomposing the first-type triple samples into a plurality of media account samples, and predicting a sample content vector of each media account sample through the media account learning model; determining second-type triple samples that meet the training condition according to the sample content vectors; and substituting the sample content vector of each media account sample in the second-type triple samples into a triple loss function, and determining the parameters of the media account learning model at which the triple loss function reaches its minimum.

In the foregoing solution, the training module is further configured to: taking any one media account sample as a first media account sample, obtaining a second media account sample with the same label as the first media account sample from a plurality of media account samples, and determining a first content distance between a sample content vector of the second media account sample and a sample content vector of the first media account sample; obtaining a third media account sample with a label different from that of the first media account sample from the plurality of media account samples, and determining a second content distance between a sample content vector of the third media account sample and a sample content vector of the first media account sample; extracting a second media account sample and a third media account sample which meet the following training conditions from the plurality of second media account samples and the plurality of third media account samples: a second content distance corresponding to the third media account sample is greater than a first content distance corresponding to the second media account sample; a difference between a second content distance corresponding to the third media account sample and a first content distance corresponding to the second media account sample is less than a first threshold; and combining the first media account sample, a second media account sample meeting the training condition and a third media account sample into the second triple sample.
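
The two training conditions above (the negative is farther than the positive, but by less than the first threshold) can be sketched in Python as follows. Euclidean distance is assumed as the content distance, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_triples(anchor: Vector, positives: List[Vector],
                   negatives: List[Vector],
                   first_threshold: float) -> List[Tuple[Vector, Vector, Vector]]:
    triples = []
    for p in positives:
        d_pos = distance(anchor, p)        # first content distance
        for n in negatives:
            d_neg = distance(anchor, n)    # second content distance
            # Keep negatives that are farther than the positive,
            # but only by less than the first threshold.
            if d_neg > d_pos and d_neg - d_pos < first_threshold:
                triples.append((anchor, p, n))
    return triples
```

Negatives that are closer than the positive, or much farther away than it, are discarded, so only triples inside the threshold band are used for training.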

In the foregoing solution, the training module is further configured to: acquire the associated users of each remaining media account sample, wherein the remaining media account samples are the media account samples, among the plurality of media account samples, other than the first media account sample; determine a user intersection between the associated users of each remaining media account sample and the associated users of the first media account sample; determine a remaining media account sample whose user intersection contains more elements than a second threshold as a second media account sample having the same label as the first media account sample; and determine a remaining media account sample whose user intersection contains fewer elements than a third threshold as a third media account sample having a label different from that of the first media account sample.
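
A small Python sketch of this labeling rule, assuming that the associated users of each media account sample are available as a set of user identifiers and that the third threshold does not exceed the second threshold. The function and parameter names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def label_samples(first_users: Set[int],
                  remaining: Dict[str, Set[int]],
                  second_threshold: int,
                  third_threshold: int) -> Tuple[List[str], List[str]]:
    same_label, different_label = [], []
    for name, users in remaining.items():
        overlap = len(first_users & users)  # size of the user intersection
        if overlap > second_threshold:
            same_label.append(name)         # second media account sample
        elif overlap < third_threshold:
            different_label.append(name)    # third media account sample
    return same_label, different_label
```

Samples whose overlap falls between the two thresholds receive neither label and are left out of triple construction in this sketch.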

An embodiment of the present application provides an electronic device, including:

a memory for storing executable instructions;

and the processor is used for realizing the artificial intelligence based media account recommendation method provided by the embodiment of the application when the executable instructions stored in the memory are executed.

The embodiment of the application provides a computer-readable storage medium, which stores executable instructions and is used for realizing the artificial intelligence-based media account recommendation method provided by the embodiment of the application when being executed by a processor.

The embodiment of the application has the following beneficial effects:

By acquiring the content features of media accounts and performing clustering processing and residual error processing on them, the differences in the feature distribution of the content features themselves can be masked, so that only the distribution differences between the content features and the clustering centers are retained. The feature distribution of media accounts in the content dimension is thereby learned efficiently, and relations between accounts are established in the content dimension.

Drawings

FIGS. 1A-1D are schematic diagrams illustrating a vector learning method in the related art;

FIG. 2A is a schematic structural diagram of an artificial intelligence based media account recommendation system according to an embodiment of the present application;

FIG. 2B is an application schematic diagram of a blockchain-based media account recommendation method according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a media account learning model of an artificial intelligence based media account recommendation method according to an embodiment of the present application;

FIGS. 5A-5C are schematic flowcharts of an artificial intelligence based media account recommendation method according to an embodiment of the present application;

FIG. 6 is a schematic application architecture diagram of a blockchain network according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a blockchain in the blockchain network 600 according to an embodiment of the present application;

FIG. 8 is a functional architecture diagram of the blockchain network 600 according to an embodiment of the present application;

FIG. 9 is a schematic diagram illustrating the principle of artificial intelligence based media account recommendation provided by an embodiment of the present application;

FIG. 10 is a schematic diagram of a training sample provided in an embodiment of the present application.

Detailed Description

In order to make the objectives, technical solutions and advantages of the present application clearer, the present application will be described in further detail with reference to the attached drawings, the described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.

In the following description, the terms "first/second/third" merely distinguish similar objects and do not denote a particular order. It is understood that, where permitted, "first/second/third" may be interchanged in a specific order or sequence, so that the embodiments of the application described herein can be practiced in an order other than that shown or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.

Before the embodiments of the present application are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to these terms and expressions.

1) Inception V3: the Inception V3 network is a very deep convolutional network developed by Google; it is pre-trained on the ImageNet dataset and can be used to extract frame features.

2) One-to-three scene: the video floating layer. In an information flow product, after a user clicks a video on the main information flow, the user enters a video playing page and can keep swiping on that page to be served other videos; the video floating layer refers to this video recommendation page.

3) Triplet loss (triple loss): the triplet loss was first used in the face recognition task, where the goal is for faces belonging to the same person to be as close as possible in feature space and as far as possible from other faces, and a new vector representation of faces is trained by means of online triplet mining. It defines the concept of a triplet <reference sample, positive sample, negative sample>; the goal is for the distance between the reference sample and the positive sample to be less than the distance between the reference sample and the negative sample, or for the absolute value of the difference between the two distances to be greater than a certain threshold.

4) Media account: an account used to publish articles or videos (for example, in an information stream). Some accounts are official accounts that publish news and similar information, such as the public accounts of news outlets and newspapers; others are self-media accounts that publish articles in specific fields, such as public accounts about entertainment figures or artificial intelligence.

5) Collaborative filtering: recommending information of interest to a user by using the preferences of a group with shared interests and common experience.

6) One-to-three scene: the video floating layer in an information flow product. In response to a user's click on a video on the main information flow, the product enters a video playing page on which the user can keep swiping to be served other videos; the video floating layer is this video playing page.

7) Blockchain network (Blockchain Network): a set of nodes that incorporates new blocks into a blockchain by consensus.

8) Smart contracts (Smart Contracts), also known as chaincode or application code: programs deployed in the nodes of a blockchain network; the nodes execute the smart contracts invoked in received transactions to update or query the key-value data of a state database.

9) Consensus (Consensus), a process in a blockchain network, is used to agree on a transaction in a block between the nodes involved, the agreed block to be appended to the end of the blockchain and used to update the state database.
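
The goal stated in term 3) above, that the reference sample should be closer to the positive sample than to the negative sample, is commonly written as a margin-based loss. The Python sketch below shows that common form; the margin formulation, the Euclidean distance, and the function names are assumptions for illustration and are not taken from the text.

```python
import math
from typing import List

Vector = List[float]


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(reference: Vector, positive: Vector, negative: Vector,
                 margin: float) -> float:
    # Zero once the negative is farther than the positive by at least `margin`;
    # otherwise the loss grows with the size of the violation.
    return max(0.0, distance(reference, positive)
               - distance(reference, negative) + margin)
```

For example, with a reference at the origin, a positive at distance 5, and a margin of 1, a negative at distance 10 gives a loss of 0, while a negative at distance 5 gives a loss of 1.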

In the related art, accounts can be recommended through user-based collaborative filtering. User-based collaborative filtering does not consider user information or account information; it mainly discovers the similarity between different users from their preference information for accounts, and uses that user similarity to make personalized recommendations. Taking the user as the center, it finds a group of users whose interests are similar to the user's, and recommends to the user other accounts that this group is interested in.
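
User-based collaborative filtering as described above can be sketched in Python as follows. This is an illustrative sketch of the related-art approach only: the use of Jaccard similarity over followed-account sets, the neighbor count, and all names are assumptions.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target: str, follows: Dict[str, Set[str]],
              n_neighbors: int) -> List[str]:
    mine = follows[target]
    # Find the users whose followed accounts are most similar to the target's.
    neighbors = sorted(
        (u for u in follows if u != target),
        key=lambda u: jaccard(mine, follows[u]),
        reverse=True,
    )[:n_neighbors]
    scores: Dict[str, float] = {}
    for u in neighbors:
        sim = jaccard(mine, follows[u])
        # Score accounts the neighbor follows that the target does not.
        for account in follows[u] - mine:
            scores[account] = scores.get(account, 0.0) + sim
    return sorted(scores, key=lambda a: (-scores[a], a))
```

Because scores come only from what other users already follow, an account that nobody follows yet can never be recommended, which is the cold-start limitation discussed in this document.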

In the related art, a word vector learning model can also be used for account vector learning: a behavior information graph of users is constructed from the sequences of accounts that the users follow, and vector feature information of each account is learned by introducing a deep walk algorithm. Referring to fig. 1A, which is a schematic diagram of a vector learning method in the related art, the account sequences followed by users are first obtained: the account sequence followed by user 1 is A, B, C, D, E; that followed by user 2 is A, C, D, F; and that followed by user 3 is E, C, B, F, A. Referring to fig. 1B, which is a schematic diagram of a vector learning method in the related art, a walk graph is constructed based on the account sequences in fig. 1A. For example, according to the sequence of user 1, edges from A to B, from B to C, from C to D, and from D to E are constructed, and the graph continues to be built from the account sequences of user 2 and user 3 in the same manner. Referring to fig. 1C, which is a schematic diagram of a vector learning method in the related art, fig. 1C shows deep walk sequences obtained from fig. 1B: taking any account node as a starting point, edges are explored for an arbitrary length to obtain a plurality of random deep walk sequences, for example a sequence from A to B. Referring to fig. 1D, which is a schematic diagram of a vector learning method in the related art and shows a word vector learning model, the deep walk sequences generated in fig. 1C are finally input into the word vector learning model for account vector learning. The first layer of the word vector learning model is an input layer, the second layer is a hidden layer (which outputs the learned account vector), and the third layer is an output layer (used for training). In the output of the third layer, P denotes a positive result and N denotes a negative result, and the word vector learning model is trained on these output results.
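
The graph construction and random walks described above can be sketched as follows, using the three account sequences of fig. 1A. Treating the edges as directed, the walk length of 5, and the names `build_graph` and `random_walk` are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Account sequences followed by each user, as given for fig. 1A.
sequences = {
    "user1": ["A", "B", "C", "D", "E"],
    "user2": ["A", "C", "D", "F"],
    "user3": ["E", "C", "B", "F", "A"],
}

def build_graph(seqs):
    """Add an edge between each pair of consecutive accounts (assumed directed)."""
    graph = defaultdict(set)
    for seq in seqs.values():
        for src, dst in zip(seq, seq[1:]):
            graph[src].add(dst)
    return graph

def random_walk(graph, start, max_len, rng):
    """Follow randomly chosen outgoing edges until max_len nodes or a dead end."""
    walk = [start]
    while len(walk) < max_len:
        successors = sorted(graph.get(walk[-1], ()))
        if not successors:
            break
        walk.append(rng.choice(successors))
    return walk

graph = build_graph(sequences)
rng = random.Random(0)  # fixed seed so the walks are reproducible
walks = [random_walk(graph, node, 5, rng) for node in sorted(graph) for _ in range(2)]
for walk in walks:
    print(" -> ".join(walk))
```

Each walk plays the role of a sentence and each account the role of a word when the walks are passed to a word vector learning model.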

In the embodiment of the application, it is found that when account vectors are obtained by user-based collaborative filtering, the similarity between accounts is learned mainly from users' follow behavior, so posterior data of the accounts is required. For an account without posterior data, this approach cannot effectively learn a vector representation, and it is biased toward popular accounts. When account vectors are learned using the word vector learning model, although the similarity between accounts can be learned, it is still difficult to effectively learn the vector of an account that appears rarely or has never interacted with any user. Because of these drawbacks, account vectors obtained by collaborative filtering or by the word vector model cannot be used in an account cold-start scenario, since the accounts in that scenario are all new accounts that have neither sufficient posterior data nor popularity.

In view of the foregoing problems, embodiments of the present application provide an artificial intelligence based media account recommendation method and apparatus, an electronic device, and a computer-readable storage medium, which can establish account vectors of multiple accounts from the content dimension, so as to effectively improve recommendation accuracy for all accounts. An exemplary application in which the electronic device is implemented as a server is described below.

Referring to fig. 2A, fig. 2A is a schematic structural diagram of an artificial intelligence based media account recommendation system according to an embodiment of the present application. The media account recommendation system may be used to support recommendation scenarios for various media accounts, for example, an application scenario of recommending media accounts that publish documents and/or videos. In the media account recommendation system, a terminal 400 is connected to a server 200 through a network, and the network may be a wide area network, a local area network, or a combination of the two.

In some embodiments, the functions of the media account recommendation system are implemented by modules in the server 200. While a user uses a client, the terminal 400 takes collected media accounts as training samples, a media account learning model is trained on the obtained training samples, and the trained media account learning model is integrated into the server. In response to receiving a play operation of the user for a certain video, the terminal 400 sends a recommendation request instruction to the server 200, the recommendation request instruction carrying an interactive media account followed by the user. The server 200 determines an account vector of the interactive media account through the media account learning model, obtains candidate media accounts from the database 500, determines an account vector of each candidate media account through the media account learning model, and determines the similarity between the two account vectors. A candidate media account that satisfies the similarity condition is determined as a candidate media account to be recommended, and information on it, such as a follow link for the candidate media account or a link to content published by the candidate media account, is sent to the terminal 400, so that the terminal 400 adds the candidate media account to a follow list or directly presents the content published by the candidate media account.
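
The similarity step on the server side can be sketched as follows. The text above only refers to a similarity condition, so the cosine similarity, the top-k rule, the use of the best match per candidate, and the name `top_k_candidates` are all assumptions made for illustration.

```python
import numpy as np

def top_k_candidates(interactive_vecs, candidate_vecs, k=2):
    """Return the indices and scores of the k candidates closest to any interactive account."""
    a = interactive_vecs / np.linalg.norm(interactive_vecs, axis=1, keepdims=True)
    b = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = a @ b.T               # cosine similarities, shape (n_interactive, n_candidates)
    best = sims.max(axis=0)      # best match of each candidate against the interactive accounts
    order = np.argsort(-best)[:k]
    return order.tolist(), best[order].tolist()

# Toy account vectors; real vectors would come from the media account learning model.
interactive = np.array([[1.0, 0.0], [0.0, 1.0]])
candidates = np.array([[1.0, 0.1], [-1.0, 0.0], [0.5, 0.5]])
indices, scores = top_k_candidates(interactive, candidates, k=2)
print(indices, scores)
```

A fixed similarity threshold could replace the top-k rule without changing the rest of the sketch.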

In other embodiments, the account vector of the interactive media account may instead be determined by the terminal through the media account learning model. The server 200 determines the account vectors of the candidate media accounts through the media account learning model and determines the similarity between the two account vectors, and a candidate media account that satisfies the similarity condition is added to the follow list, or the content published by it is presented directly.

In some embodiments, the server 200 may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present application is not limited.

Referring to fig. 2B, fig. 2B is a schematic application diagram of a blockchain-based media account recommendation method according to an embodiment of the present application. An exemplary application of a blockchain network according to the embodiment of the present application is described below. Fig. 2B includes a blockchain network 600 (exemplarily shown as including a node 610-1 and a node 610-2), a server 200, a database 500, and a terminal 400, which are described below.

The server 200 (mapped as node 610-2) and the terminal 400 (mapped as node 610-1) may each join the blockchain network 600 as a node therein, and the mapping of the terminal 400 as node 610-1 of the blockchain network 600 is exemplarily shown in fig. 2B, where each node (e.g., node 610-1, node 610-2) has a consensus function and an accounting (i.e., maintaining a state database, such as a key-value database) function.

The interactive media accounts of the terminal 400 are recorded in the state database of each node (e.g., node 610-1), so that the terminal 400 can query the interactive media accounts recorded in the state database.

In some embodiments, in response to a play operation for a certain video, a plurality of servers 200 (each mapped to a node in the blockchain network) determine candidate media accounts to be recommended. For a given candidate media account to be recommended, when the number of nodes that reach consensus on it exceeds a node-number threshold, the consensus is determined to have passed. The server 200 (mapped as node 610-2) sends the candidate media accounts to be recommended that have passed consensus to the terminal 400 (mapped as node 610-1). A video playing page corresponding to the video is presented, in which the video content and the media account to be recommended are shown. In response to an interactive operation of the user on the media account to be recommended, the media account to be recommended is displayed as marked as an interactive media account, and the interactive media account is stored on the chain.

Next, a structure of an electronic device for implementing an artificial intelligence based media account recommendation method according to an embodiment of the present application is described, and as described above, the electronic device according to an embodiment of the present application may be the server 200 in fig. 2A. Referring to fig. 3, fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application, and the server 200 shown in fig. 3 includes: at least one processor 210, memory 250, at least one network interface 220. The various components in server 200 are coupled together by a bus system 240. It is understood that the bus system 240 is used to enable communications among the components. The bus system 240 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 240 in fig. 3.

The Processor 210 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.

The memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 250 optionally includes one or more storage devices physically located remotely from processor 210.

The memory 250 includes volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 250 described in embodiments herein is intended to comprise any suitable type of memory.

In some embodiments, memory 250 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.

An operating system 251, including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, and a driver layer, for implementing various basic services and processing hardware-based tasks; a network communication module 252 for reaching other computing devices via one or more (wired or wireless) network interfaces 220, exemplary network interfaces 220 including Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB).

In some embodiments, the artificial intelligence based media account recommendation device provided by the embodiments of the present application may be implemented in software, and fig. 3 illustrates an artificial intelligence based media account recommendation device 255 stored in a memory 250, which may be software in the form of programs, plug-ins, and the like, and includes the following software modules: a feature module 2551, a vector module 2552, a similarity module 2553, a recommendation module 2554 and a training module 2555, which are logical and thus can be arbitrarily combined or further split according to the implemented functions, which will be described below.

The artificial intelligence based media account recommendation method provided by the embodiment of the present application is described below with reference to an exemplary application and implementation of the server 200. Referring to fig. 4, fig. 4 is a schematic structural diagram of an artificial intelligence based media account learning model, which may be applied to an official account recommendation system. The media account learning model includes a local clustering core network, a normalization network, and a squeeze activation network. The local clustering core network includes a convolution layer, a maximum likelihood function layer, and a local clustering core layer; the normalization network includes a first normalization layer and a second normalization layer; and the squeeze activation network includes a squeeze layer and an activation layer. The squeeze activation network improves the quality of the representations generated by the network by explicitly modeling the interdependencies between the channels of its convolutional features.

The account learning process performed by the media account learning model is as follows. A plurality of content features are input into the local clustering core layer to obtain a plurality of clustering centers. The content features are input into the convolution layer, which performs convolution processing on each content feature of the media account to obtain a corresponding convolution result. The maximum likelihood function layer performs maximum likelihood function processing on the convolution result of each content feature to obtain a corresponding maximum likelihood processing result. For a given clustering center, the local clustering core layer determines the residual error between each content feature of the media account and that clustering center, and performs a weighted summation of the residual errors, using the maximum likelihood processing result of each content feature as its weight, to obtain the residual error distribution of the plurality of content features of the media account with respect to that clustering center. The first normalization layer of the normalization network normalizes the residual error distribution corresponding to each clustering center to obtain the normalization result for that clustering center, and the second normalization layer of the normalization network performs overall normalization on the normalization results of all clustering centers to obtain the content vector of the media account. The squeeze layer of the squeeze activation network then performs channel-wise average pooling on the content vector of the media account to obtain the global content feature of each channel of the content vector. The activation layer of the squeeze activation network performs full connection processing on the global content feature of each channel to obtain an activation value for each channel, and the activation value of each channel is multiplied point-wise with the original content features of that channel in the content vector, so that the content vector of the media account is updated based on the point-wise multiplication result.
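
The final squeeze activation step of this process can be sketched as follows. This is a minimal sketch under stated assumptions: the content vector is arranged as clusters by channels, the full connection processing is taken to be two fully connected layers with ReLU and sigmoid activations, and the weights `w1` and `w2` are random placeholders rather than trained parameters. None of these details is specified in the text.

```python
import numpy as np

def squeeze_excite(content, w1, w2):
    """Re-weight the channels of `content` (assumed shape: clusters x channels)."""
    squeezed = content.mean(axis=0)                # squeeze: one global value per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)        # fully connected layer + ReLU (assumed)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # fully connected layer + sigmoid (assumed)
    return content * gate, gate                    # scale every channel by its activation value

rng = np.random.default_rng(0)
content = rng.normal(size=(4, 8))   # 4 clustering centers, 8 channels (illustrative sizes)
w1 = rng.normal(size=(2, 8))        # bottleneck of 2 units (assumed reduction)
w2 = rng.normal(size=(8, 2))
updated, gate = squeeze_excite(content, w1, w2)
print(updated.shape, gate.round(3))
```

Because each activation value lies between 0 and 1, the gate can only attenuate a channel relative to the others; it never changes the sign of a feature.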

The artificial intelligence based media account recommendation method provided by the embodiment of the present application is described below, taking as an example its execution by the server 200 in fig. 2A, where the media account recommendation system includes a training phase and an application phase. The training of the model used in the method is described first.

In some embodiments, the content vector of the interactive media account is determined by a media account learning model that includes at least a local clustering core network and a normalization network. The media account learning model is trained as follows: acquiring a plurality of media account samples, and constructing a plurality of first-type triplet samples based on the numbers of associated users of the plurality of media account samples; decomposing the plurality of first-type triplet samples to obtain a plurality of media account samples, and predicting a sample content vector of each media account sample through the media account learning model; determining, according to the sample content vectors, second-type triplet samples that satisfy a training condition; and substituting the sample content vector of each media account sample in the second-type triplet samples into a triplet loss function, so as to determine the parameters of the media account learning model at which the triplet loss function attains its minimum.
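
The text does not give the form of the triplet loss function. A commonly used margin-based form is sketched below as an assumption; the Euclidean distance, the `margin` parameter, and the name `triplet_loss` are illustrative and not taken from the application.

```python
import numpy as np

def triplet_loss(reference, positive, negative, margin=1.0):
    """Margin-based triplet loss on Euclidean distances (assumed form)."""
    d_pos = np.linalg.norm(reference - positive)   # distance from reference to positive sample
    d_neg = np.linalg.norm(reference - negative)   # distance from reference to negative sample
    # The loss is zero once the negative is farther than the positive by at least the margin.
    return max(d_pos - d_neg + margin, 0.0)

reference = np.array([0.0, 0.0])
positive = np.array([1.0, 0.0])    # distance 1 from the reference
negative = np.array([0.0, 3.0])    # distance 3 from the reference
loss_small_margin = triplet_loss(reference, positive, negative, margin=1.0)
loss_large_margin = triplet_loss(reference, positive, negative, margin=4.5)
print(loss_small_margin, loss_large_margin)
```

Minimizing this quantity over many triplets pulls same-label accounts together and pushes different-label accounts apart in the content vector space.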

As an example, training is performed by online triplet mining. Suppose the database stores hundreds of thousands of media account samples. If triplet samples were constructed directly from all of them, the amount of computation would be too large and it would be difficult to traverse all triplets. Therefore, a plurality of media account samples are obtained from the hundreds of thousands of media account samples to construct a plurality of first-type triplet samples, each of which includes three media account samples. For example, to construct 10 first-type triplet samples, 10 media account samples are first obtained at random as the reference samples of the respective first-type triplet samples, and for the reference sample of each first-type triplet sample, one positive sample and one negative sample are obtained. The positive sample has the same label as the reference sample, and the negative sample has a different label from the reference sample. Whether the labels are the same is determined by the number of associated users (for example, the number of fans): if the number of associated users shared by two media account samples exceeds a second threshold, the two media account samples are determined to have the same label; if the number of associated users shared by the two media account samples is less than a third threshold, the two media account samples are determined to have different labels, where the third threshold is less than the second threshold. When obtaining the positive sample for a reference sample, one sample may be selected at random from the media account samples that have the same label as the reference sample, or the media account sample that shares the most associated users with the reference sample may be selected from them. The negative sample is obtained in the same way. The positive sample and the negative sample are then combined with the reference sample to form the first-type triplet sample corresponding to that reference sample, and the 10 reference samples are processed in this manner, so that 10 first-type triplet samples are constructed.
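
The construction of first-type triplet samples from shared associated users can be sketched as follows. The toy fan sets, the threshold values, and the names `shared` and `build_first_type_triplets` are hypothetical. The sketch picks the positive sample with the most shared associated users and draws the negative sample at random, which are two of the options described above.

```python
import random

def shared(fans, a, b):
    """Number of associated users (e.g., fans) common to accounts a and b."""
    return len(fans[a] & fans[b])

def build_first_type_triplets(fans, n_triplets, second_threshold, third_threshold, rng):
    """Sample reference accounts and attach one positive and one negative sample to each."""
    triplets = []
    accounts = sorted(fans)
    for ref in rng.sample(accounts, n_triplets):
        others = [a for a in accounts if a != ref]
        # Same label: shared users exceed the second threshold.
        positives = [a for a in others if shared(fans, ref, a) > second_threshold]
        # Different label: shared users fall below the third threshold.
        negatives = [a for a in others if shared(fans, ref, a) < third_threshold]
        if positives and negatives:
            pos = max(positives, key=lambda a: shared(fans, ref, a))
            triplets.append((ref, pos, rng.choice(negatives)))
    return triplets

fans = {
    "acc1": {1, 2, 3, 4},
    "acc2": {2, 3, 4, 5},
    "acc3": {9},
    "acc4": {3, 4, 5, 6},
}
triplets = build_first_type_triplets(fans, 4, second_threshold=2, third_threshold=1,
                                     rng=random.Random(0))
print(triplets)
```

A reference account with no qualifying positive or negative sample (here `acc3`) is skipped, which is one possible way to handle that case.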

Continuing the above example, the 10 first-type triplet samples are decomposed to obtain 30 media account samples, namely the reference samples, positive samples, and negative samples of the 10 first-type triplet samples. A sample content vector of each media account sample is then predicted through the media account learning model (an initialized model, or a model that has already undergone several rounds of training), yielding 30 sample content vectors corresponding to the 30 media account samples.

In some embodiments, determining, according to the sample content vectors, the second-type triplet samples that satisfy the training condition may be implemented as follows: taking any one media account sample as a first media account sample; obtaining, from the plurality of media account samples, second media account samples having the same label as the first media account sample, and determining a first content distance between the sample content vector of each second media account sample and the sample content vector of the first media account sample; obtaining, from the plurality of media account samples, third media account samples having a different label from the first media account sample, and determining a second content distance between the sample content vector of each third media account sample and the sample content vector of the first media account sample; extracting, from the plurality of second media account samples and the plurality of third media account samples, a second media account sample and a third media account sample that satisfy the following training condition: the second content distance corresponding to the third media account sample is greater than the first content distance corresponding to the second media account sample, and the difference between the second content distance corresponding to the third media account sample and the first content distance corresponding to the second media account sample is less than a first threshold; and forming a second-type triplet sample from the first media account sample and the second media account sample and third media account sample that satisfy the training condition.

For example, any one of the 30 media account samples is taken as a first media account sample A (which plays the role of the reference sample in a triplet). At least one second media account sample having the same label as the first media account sample A (playing the role of the positive sample) is obtained from the other 29 media account samples; if, for example, there are 10 second media account samples, the first content distance between the sample content vector of each of them and the sample content vector of the first media account sample A is determined, giving 10 first content distances. At least one third media account sample having a different label from the first media account sample A (playing the role of the negative sample) is likewise obtained from the other 29 media account samples; if, for example, there are 5 third media account samples, the second content distance between the sample content vector of each of them and the sample content vector of the first media account sample A is determined, giving 5 second content distances. From the 10 second media account samples and the 5 third media account samples, a second media account sample and a third media account sample that satisfy the following training condition are extracted: the second content distance corresponding to the third media account sample is greater than the first content distance corresponding to the second media account sample, and the difference between the two is less than the first threshold. For example, suppose the first content distance corresponding to a second media account sample B is 1, the second content distance corresponding to a third media account sample C is 5, and the first threshold is 4.5. Since the first content distance is less than the second content distance, and the difference between them is 4, which is less than the first threshold of 4.5, the second media account sample B and the third media account sample C are determined to satisfy the training condition, and the first media account sample A, the second media account sample B, and the third media account sample C form a second-type triplet sample. The above takes one of the 30 media account samples as the first media account sample; each of the 30 media account samples needs to be traversed as the first media account sample, so that a plurality of second-type triplet samples satisfying the training condition are obtained.
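
The training condition can be checked with a short sketch that reproduces the numbers of this example (first content distance 1, second content distance 5, first threshold 4.5). The Euclidean distance, the extra sample `D`, and the name `mine_second_type_triplets` are assumptions added for illustration.

```python
import numpy as np

def mine_second_type_triplets(vectors, same_label, diff_label, first_threshold):
    """Keep (first, second, third) samples with 0 < d_second - d_first < first_threshold."""
    triplets = []
    for first in vectors:
        for second in same_label.get(first, ()):
            d_first = np.linalg.norm(vectors[first] - vectors[second])
            for third in diff_label.get(first, ()):
                d_second = np.linalg.norm(vectors[first] - vectors[third])
                # The third sample must be farther than the second, but by less than the threshold.
                if d_second > d_first and d_second - d_first < first_threshold:
                    triplets.append((first, second, third))
    return triplets

vectors = {
    "A": np.array([0.0, 0.0]),
    "B": np.array([1.0, 0.0]),   # same label as A, content distance 1
    "C": np.array([5.0, 0.0]),   # different label, content distance 5
    "D": np.array([0.0, 9.0]),   # different label, content distance 9 (hypothetical extra sample)
}
same_label = {"A": ["B"]}
diff_label = {"A": ["C", "D"]}
mined = mine_second_type_triplets(vectors, same_label, diff_label, first_threshold=4.5)
print(mined)
```

Sample `D` is rejected because its distance exceeds that of `B` by 8, which is not below the first threshold; such a triplet is already well separated and contributes little to training.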

As an example, a first-type triplet sample satisfies a condition in the label dimension, whereas a second-type triplet sample satisfies conditions in both the label dimension and the content vector dimension. From the 10 first-type triplet samples (decomposed into 30 media account samples), 9000 (30 × 30) triplet samples can be obtained in theory, which effectively expands the set of triplets. The second-type triplet samples that satisfy the training condition are then determined from these 9000 triplet samples using the labels, so that the second-type triplet samples used for parameter updating are obtained with a small amount of computation.

In some embodiments, obtaining, from the plurality of media account samples, the second media account sample having the same label as the first media account sample and the third media account sample having a different label from the first media account sample may be implemented as follows: acquiring the associated users of each remaining media account sample, the remaining media account samples being the media account samples other than the first media account sample among the plurality of media account samples; determining the user intersection between the associated users of each remaining media account sample and the associated users of the first media account sample; determining a remaining media account sample for which the number of elements in the user intersection exceeds a second threshold as a second media account sample having the same label as the first media account sample; and determining a remaining media account sample for which the number of elements in the user intersection is less than a third threshold as a third media account sample having a different label from the first media account sample.

As an example, any one of the 30 media account samples is taken as a first media account sample A (playing the role of the reference sample in a triplet), and at least one second media account sample (playing the role of the positive sample) is obtained from the other 29 media account samples. The second media account sample has the same label as the first media account sample, and the third media account sample has a different label from the first media account sample. Whether the labels are the same is determined by the number of associated users (which may be the number of fans): if the number of associated users shared by two media account samples exceeds the second threshold, the two media account samples are determined to have the same label; if the number of associated users shared by the two media account samples is less than the third threshold, they are determined to have different labels, where the third threshold is less than the second threshold. When obtaining a second media account sample for the first media account sample, one sample may be selected at random from the media account samples having the same label as the first media account sample, or the media account sample sharing the most associated users with the first media account sample may be selected from them. The third media account sample is obtained in the same manner.

The artificial intelligence based media account recommendation method provided by the embodiment of the present application continues to be described below, taking as an example its execution by the server 200 in fig. 2A; the application of the model in the method is described next. Referring to fig. 5A, fig. 5A is a schematic flowchart of an artificial intelligence based media account recommendation method according to an embodiment of the present application, which is described with reference to steps 101 to 105 shown in fig. 5A.

In step 101, a plurality of content features of at least one interactive media account of the user account are obtained, and a plurality of content features corresponding to a plurality of candidate media accounts are obtained.

In some embodiments, obtaining the plurality of content features of the at least one interactive media account of the user account in step 101 may be implemented as follows: for each interactive media account, extracting a corresponding content feature from each of a plurality of pieces of information published by the interactive media account, as the content features of the interactive media account. Obtaining the plurality of content features respectively corresponding to the plurality of candidate media accounts in step 101 may be implemented as follows: for each candidate media account, extracting a corresponding content feature from each of a plurality of pieces of information published by the candidate media account, as the content features of the candidate media account. The type of the information includes at least one of: video information, text information, and image information.

As an example, at least one interactive media account is associated with the user account. An interactive media account is a media account followed by the user account, or a media account with which the user account has interacted. For each interactive media account, the corresponding content features are obtained from the information published by the interactive media account. For example, the interactive media account has published 10 pieces of information, which include at least one of video information, text information, and image information; a content feature can be extracted from each piece of information, giving 10 content features for the interactive media account. The candidate media accounts are media accounts that the user account has not followed or interacted with, and they include cold-start accounts. A cold-start account is a media account with no posterior data or little posterior data, and can be defined by account creation time or by the number of interacting users: for example, the time since account creation does not exceed a time threshold (e.g., 7 days), or the number of users who have interacted with the account does not exceed a user-number threshold (e.g., 10 users). The content features of the candidate media accounts can be obtained in the same way as those of the interactive media accounts.
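
The cold-start definition above can be expressed as a small predicate. The 7-day and 10-user limits are the example values from the text; the function name `is_cold_start`, its parameters, and the choice to treat either criterion as sufficient are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def is_cold_start(created_at, interacting_users, now,
                  max_age=timedelta(days=7), max_users=10):
    """Treat an account as cold-start if it is newly created or has few interacting users."""
    is_new = (now - created_at) <= max_age
    has_few_users = interacting_users <= max_users
    return is_new or has_few_users

now = datetime(2021, 3, 1)
new_account = is_cold_start(datetime(2021, 2, 26), 500, now)    # created 3 days ago
quiet_account = is_cold_start(datetime(2020, 1, 1), 5, now)     # old, but only 5 interacting users
established = is_cold_start(datetime(2020, 1, 1), 500, now)     # old and widely followed
print(new_account, quiet_account, established)
```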

In step 102, clustering and residual error processing are performed on a plurality of content features of each interactive media account to obtain a content vector of the interactive media account.

In some embodiments, referring to fig. 5B, fig. 5B is a schematic flowchart of an artificial intelligence based media account recommendation method according to an embodiment of the present application. In step 102, performing clustering and residual error processing on the plurality of content features of each interactive media account to obtain the content vector of the interactive media account may be implemented by executing steps 1021 to 1022 shown in fig. 5B for each interactive media account.

In step 1021, a plurality of content features of the interactive media account are clustered to obtain at least one first clustering center.

As an example, the plurality of content features of a given interactive media account are clustered by the local clustering core network in the media account learning model. The operator that performs the clustering is updated during training of the media account learning model. In the inference stage, the plurality of content features of the interactive media account are taken as input and clustered by the local clustering core network to obtain at least one first clustering center. Because the media account learning model processes a single media account at a time, the above process in step 1021 needs to be performed for each interactive media account.

In step 1022, residual error processing is performed on the plurality of content features of the interactive media account based on the at least one first clustering center, so as to obtain a content vector of the interactive media account.

In some embodiments, the content vector of the interactive media account is determined by a media account learning model, the media account learning model comprising a local clustering core network and a normalization network; in step 1022, residual error processing is performed on a plurality of content features of the interactive media account based on at least one first clustering center to obtain a content vector of the interactive media account, which may be implemented by the following technical scheme: performing the following processing for each first cluster center: determining a first residual distribution of a plurality of content characteristics of the interactive media account corresponding to a first clustering center through a local clustering core network; performing the following processing for each first cluster center: normalizing the first residual distribution corresponding to the first clustering center through a normalization network to obtain a normalization result corresponding to the first clustering center; and carrying out integral normalization processing on the normalization result of each first clustering center through a normalization network to obtain the content vector of the interactive media account.

In some embodiments, the determining, by the local clustering core network, the first residual distribution of the plurality of content features of the interactive media account corresponding to the first clustering center may be implemented by the following technical solutions: performing convolution processing on each content characteristic of the interactive media account through a local clustering core network to obtain a corresponding convolution result; performing maximum likelihood function processing on the convolution result of each content feature of the interactive media account through a local clustering core network to obtain a corresponding maximum likelihood processing result; determining a first residual error between each content feature of each interactive media account and a first clustering center through a local clustering core network; and taking the maximum likelihood processing result corresponding to each content feature of the interactive media account as a weight, and performing weighted summation processing on a first residual error corresponding to each content feature of the interactive media account to obtain a first residual error distribution of a plurality of content features of the interactive media account corresponding to a first clustering center.

As an example, the plurality of content features are input to the convolution layer of the local clustering core network, and each content feature of the interactive media account is convolved by the convolution layer to obtain a corresponding convolution result. The convolution result of each content feature is then processed by the maximum likelihood function layer of the local clustering core network to obtain a corresponding maximum likelihood processing result, where the maximum likelihood function is a softmax function. The local clustering core layer of the local clustering core network determines the residual error between each content feature of the interactive media account and a given clustering center A, the residual error being the difference between the content feature and the clustering center A; if there are 10 content features, there are correspondingly 10 residual errors. Taking the maximum likelihood processing result of each content feature as a weight, the residual errors of the content features are weighted and summed to obtain the residual distribution of the plurality of content features with respect to clustering center A. The same processing is performed for every clustering center to obtain the residual distribution corresponding to each clustering center. The first normalization layer of the normalization network then normalizes the residual distribution of each clustering center (for example, clustering centers A to C) to obtain the normalization result corresponding to clustering center A, the normalization result corresponding to clustering center B, and the normalization result corresponding to clustering center C. By obtaining the residual distribution of each cluster, the globally significant features of the plurality of clusters are learned, and the distribution of the features within the range of each cluster is expressed. This distribution eliminates the distribution differences among the content features themselves and retains only the effective differences between the content features and the clustering centers. Finally, the second normalization layer of the normalization network performs overall normalization on the normalization results of all clustering centers (for example, clustering centers A to C) to obtain the content vector of the interactive media account.
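The residual aggregation and the two normalization stages can be sketched in plain Python as follows. This is a hedged illustration rather than the application's implementation: it assumes that the convolution is a per-feature linear map producing one score per clustering center, that the softmax is taken across clustering centers, and that both normalization layers are L2 normalizations. The names `content_vector`, `weights`, and `biases` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def content_vector(features, centers, weights, biases):
    """Aggregate N content features (dim D) against K clustering centers (dim D).

    weights (K x D) and biases (K) stand in for the convolution parameters (assumed).
    Returns a flat vector of length K * D.
    """
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in centers]
    for x in features:
        # Convolution result: one score per clustering center for this feature.
        scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
                  for w, b in zip(weights, biases)]
        assign = softmax(scores)  # weights used for the residual sum
        for k, c in enumerate(centers):
            for d in range(dim):
                # Weighted residual error between the feature and center k.
                residual_sums[k][d] += assign[k] * (x[d] - c[d])
    # First normalization layer: normalize each center's residual distribution.
    per_center = [l2_normalize(r) for r in residual_sums]
    # Second normalization layer: normalize the concatenation of all results.
    flat = [v for r in per_center for v in r]
    return l2_normalize(flat)
```

With 3 two-dimensional content features and 2 clustering centers, the function returns a 4-dimensional unit-length content vector.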

In some embodiments, the media account learning model further comprises a squeeze activation network. After the normalization network performs overall normalization on the normalization result of each first clustering center to obtain the content vector of the interactive media account, the following processing is performed: channel-based average pooling is performed on the content vector of the interactive media account through the squeeze activation network to obtain the global content feature of each channel of the content vector; full connection processing is performed on the global content feature of each channel through the squeeze activation network to obtain the activation value of each channel; and the activation value of each channel is multiplied element-wise with the original content feature of that channel in the content vector, and the content vector of the interactive media account is updated based on the multiplication result.

As an example, the content vector output by the normalization network has already learned the content-dimension features of the interactive media account. To further improve how well the content vector expresses these features, feature enhancement is performed on the content vector output by the normalization network: the squeeze layer of the squeeze activation network performs channel-based average pooling on the content vector of the interactive media account to obtain the global content feature of each channel; the activation layer of the squeeze activation network performs full connection processing on the global content feature of each channel to obtain the activation value of each channel; and the activation value of each channel is multiplied element-wise with the original content feature of that channel in the content vector, so that the content vector of the interactive media account is updated based on the multiplication result.

The squeeze activation network mainly comprises a squeeze layer and an activation layer. During convolution, the feature relationships between channels are entangled with the spatial relationships learned by the convolution kernels; the squeeze activation network separates out this entanglement so that the relationships between channels are learned directly. Because convolution operates only within a local space, it is difficult to obtain enough information to extract the relationships between channels. The squeeze layer therefore encodes the entire spatial feature on a channel into one global feature (channel-based average pooling); in principle, a more complex aggregation strategy could also be adopted. After the squeeze layer obtains the global descriptive features, another operation is needed to capture the relationships between channels: the activation layer learns the nonlinear relationships between channels, and the learned relationships are not mutually exclusive. For this reason, a gating mechanism in the form of a sigmoid activation function is adopted. To reduce complexity and improve generalization capability, a structure comprising two fully connected layers is used: the first fully connected layer reduces the dimension, with the reduction ratio being a hyperparameter, and is followed by a rectified linear unit (ReLU) activation; the last fully connected layer restores the original dimension. Finally, the learned activation value of each channel (between 0 and 1) is multiplied by the original content feature input to the squeeze layer. This process in effect learns a weight coefficient for each channel, giving the media account learning model a better ability to discriminate between the features of the channels.
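The squeeze and activation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the content vector is represented as a list of channels (each a list of values), and the fully connected parameters `w_reduce`, `b_reduce`, `w_restore`, and `b_restore` are hypothetical names for weights that would be learned in training.

```python
import math


def dense(vec, weight, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def relu(vec):
    return [max(0.0, x) for x in vec]


def sigmoid(vec):
    return [1.0 / (1.0 + math.exp(-x)) for x in vec]


def squeeze_activate(channels, w_reduce, b_reduce, w_restore, b_restore):
    """Reweight each channel of a content vector by a learned gate in (0, 1)."""
    # Squeeze layer: channel-based average pooling to one global feature per channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Activation layer: reduce dimension, apply ReLU, restore dimension, apply sigmoid.
    hidden = relu(dense(squeezed, w_reduce, b_reduce))
    gates = sigmoid(dense(hidden, w_restore, b_restore))
    # Multiply each channel's original features by its activation value.
    return [[g * x for x in ch] for g, ch in zip(gates, channels)]
```

With all weights and biases set to zero, every gate equals 0.5, so each channel is simply halved; trained parameters would instead emphasize some channels over others.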

In step 103, the clustering process and the residual error process are performed on the plurality of content features of each candidate media account, so as to obtain a content vector of each candidate media account.

In some embodiments, in step 103, the clustering process and the residual error process are performed on the plurality of content features of each candidate media account respectively to obtain a content vector of each candidate media account, which may be implemented by the following technical solutions: performing the following for each candidate media account: clustering a plurality of content characteristics of the candidate media accounts to obtain at least one second clustering center; and performing residual error processing on a plurality of content features of the candidate media account based on at least one second clustering center to obtain a content vector of the candidate media account.

In some embodiments, the content vector of the candidate media account is determined by the media account learning model, the media account learning model comprising a local clustering core network and a normalization network; the residual error processing performed on the plurality of content features of the candidate media account based on the at least one second clustering center to obtain the content vector of the candidate media account may be implemented by the following technical solution: performing the following processing for each second clustering center: determining, through the local clustering core network, a second residual distribution of the plurality of content features of the candidate media account corresponding to the second clustering center, and normalizing the second residual distribution through the normalization network to obtain a normalization result corresponding to the second clustering center; and performing overall normalization on the normalization result of each second clustering center through the normalization network to obtain the content vector of the candidate media account.

In some embodiments, the determining, by the local clustering core network, the second residual distribution of the second clustering center corresponding to the plurality of content features of the candidate media account may be implemented by the following technical solutions: performing convolution processing on each content characteristic of the candidate media account through a local clustering core network to obtain a corresponding convolution result; performing maximum likelihood function processing on the convolution result of each content feature of the candidate media account through a local clustering core network to obtain a corresponding maximum likelihood processing result; determining a second residual error between each content feature of the candidate media account and a second clustering center through a local clustering core network; and taking the maximum likelihood processing result corresponding to each content feature of the candidate media account as a weight, and performing weighted summation processing on a second residual error corresponding to each content feature of the candidate media account to obtain second residual error distribution of a plurality of content features corresponding to a second clustering center.

In some embodiments, the media account learning model further comprises a squeeze activation network. After the normalization network performs overall normalization on the normalization result of each second clustering center to obtain the content vector of the candidate media account, the following processing is performed: channel-based average pooling is performed on the content vector of the candidate media account through the squeeze activation network to obtain the global content feature of each channel of the content vector; full connection processing is performed on the global content feature of each channel through the squeeze activation network to obtain the activation value of each channel; and the activation value of each channel is multiplied element-wise with the original content feature of that channel in the content vector, and the content vector of the candidate media account is updated based on the multiplication result.

As an example, the specific implementation of the embodiment corresponding to the candidate media account may refer to the specific implementation described in the embodiment corresponding to the interactive media account, and the difference is only that the processing object is changed from the interactive media account to the candidate media account.

In some embodiments, when the content features are processed through the media account learning model, the content features can be subjected to dimension-increasing processing and grouping processing, so that model parameters can be effectively reduced, and reasoning efficiency is effectively improved.

In step 104, candidate media accounts to be recommended are determined based on content similarity between the content vector of each interactive media account and the content vectors of the candidate media accounts.

In some embodiments, referring to fig. 5C, which is a flowchart of the artificial intelligence based media account recommendation method provided in this embodiment, determining the candidate media accounts to be recommended in step 104 based on the content similarity between the content vector of each interactive media account and the content vectors of the plurality of candidate media accounts may be implemented by performing steps 1041 to 1043 shown in fig. 5C.

In step 1041, account similarity between the account vector of each interactive media account and the account vectors of the plurality of candidate media accounts is determined.

In some embodiments, the determining of the account number similarity between the account number vector of each interactive media account number and the account number vectors of the candidate media account numbers in step 1041 may be implemented by the following technical solutions: performing the following for each candidate media account: extracting a plurality of account characteristics from account information of the candidate media accounts, and compressing the account characteristics of the candidate media accounts to obtain account vectors of the candidate media accounts; the following processing is performed for each interactive media account: extracting a plurality of account characteristics from account information of the interactive media account, and compressing the account characteristics of the interactive media account to obtain an account vector of the interactive media account; account similarity between the account vector of each interactive media account and account vectors of a plurality of candidate media accounts is determined.

As an example, interactive media accounts and candidate media accounts may be learned from the content dimension and also from the dimension of the account itself. That is, the account vector of each interactive media account and the account vector of each candidate media account are learned separately, and the account similarity between the account vector of each interactive media account and the account vectors of the plurality of candidate media accounts is then determined. For example, if there are two interactive media accounts A and B and 10 candidate media accounts, 10 account similarities are determined between the account vector of interactive media account A and the account vectors of the 10 candidate media accounts, and another 10 account similarities are determined between the account vector of interactive media account B and the account vectors of the 10 candidate media accounts.

In some embodiments, compressing the plurality of account features of the candidate media account to obtain the account vector of the candidate media account may be implemented by the following technical solution: embedding the plurality of account features of the candidate media account to obtain a plurality of account embedding features of the candidate media account; and performing weighted summation on the plurality of account embedding features of the candidate media account based on the weights of the plurality of account features of the candidate media account to obtain the account vector corresponding to the candidate media account.

In some embodiments, the compressing the multiple account features of the candidate media account to obtain the account vector of the candidate media account may be implemented by the following technical solutions: embedding the plurality of account characteristics and the plurality of content characteristics of the candidate media accounts to obtain a plurality of account embedding characteristics and a plurality of content embedding characteristics of the candidate media accounts; and carrying out weighted summation processing on the plurality of account embedding characteristics and the plurality of content embedding characteristics of the candidate media accounts based on the weights of the plurality of account characteristics and the plurality of content characteristics of the candidate media accounts to obtain account vectors corresponding to the candidate media accounts.

As an example, when the plurality of account features of a candidate media account are compressed, those account features are taken as input. For each candidate media account, the corresponding account features are obtained from the account information of the candidate media account, where the account information includes at least one of: grade information, category information, and label information. A corresponding account feature can be extracted from each piece of information; because the account features are extracted directly from this information, they are sparse features. The account features of the interactive media accounts are obtained in the same way as those of the candidate media accounts. After the account features of the candidate media account are obtained, they are embedded through a neural network model to obtain the account embedding features (an account embedding feature is a dense representation of a sparse account feature). The account embedding features of the candidate media account are then weighted and summed based on the weights of the account features to obtain the account vector corresponding to the candidate media account. The weights involved in the embedding processing are prior data or are obtained through training, in a manner similar to the training of a word vector model.
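The embedding and weighted summation can be sketched as follows. This is a minimal illustration in which the embedding lookup table, the feature identifiers such as `"grade:3"`, and the weight values are hypothetical stand-ins for what the neural network model would learn or what prior data would supply.

```python
def account_vector(feature_ids, embedding_table, feature_weights):
    """Compress sparse account features into one dense account vector.

    embedding_table maps each sparse feature id to its dense embedding;
    feature_weights maps each feature id to its summation weight.
    """
    dim = len(next(iter(embedding_table.values())))
    vec = [0.0] * dim
    for fid in feature_ids:
        embedding = embedding_table[fid]  # dense form of the sparse feature
        weight = feature_weights[fid]
        for d in range(dim):
            vec[d] += weight * embedding[d]
    return vec


# Hypothetical grade, category, and label features of one candidate media account.
table = {
    "grade:3": [1.0, 0.0],
    "category:music": [0.0, 2.0],
    "label:live": [1.0, 1.0],
}
weights = {"grade:3": 0.5, "category:music": 0.25, "label:live": 0.25}
vector = account_vector(["grade:3", "category:music", "label:live"], table, weights)
```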

As an example, when compressing the account features of the candidate media accounts, the account features of the candidate media accounts are combined with the content features as input, so that the learned account vectors include richer account information, and the recommendation efficiency is improved.

In some embodiments, the compressing the multiple account features of the interactive media account to obtain the account vector of the interactive media account may be implemented by the following technical solutions: embedding the plurality of account characteristics of the interactive media account to obtain a plurality of account embedding characteristics of the interactive media account; based on the weights of the account characteristics of the interactive media accounts, weighting and summing the account embedding characteristics of the interactive media accounts to obtain account vectors corresponding to the interactive media accounts.

In some embodiments, the compressing the multiple account features of the interactive media account to obtain the account vector of the interactive media account may be implemented by the following technical solutions: embedding the plurality of account characteristics and the plurality of content characteristics of the interactive media account to obtain a plurality of account embedding characteristics and a plurality of content embedding characteristics of the interactive media account; based on the multiple account number characteristics and the weights of the multiple content characteristics of the interactive media account number, carrying out weighted summation processing on the multiple account number embedding characteristics and the multiple content embedding characteristics of the interactive media account number to obtain an account number vector corresponding to the interactive media account number.

As an example, the specific implementation of the embodiment of the corresponding interactive media account may refer to the specific implementation described in the embodiment of the corresponding candidate media account, and the difference is only that the processing object is changed from the candidate media account to the interactive media account.

In step 1042, the content similarity and the account similarity between each interactive media account and the candidate media accounts are fused to obtain the similarity between each interactive media account and the candidate media accounts.

In some embodiments, in step 1042, the content similarity and the account similarity between each interactive media account and a plurality of candidate media accounts are fused, which may be implemented by the following technical solutions: for each interactive media account, the following processing is executed: determining account number similarity between the account number vector of the interactive media account number and the account number vector of each candidate media account number, and determining content similarity between the content vector of the interactive media account number and the content vector of each candidate media account number; and carrying out average processing on the account number similarity and the content similarity.

For example, when the account similarity and the content similarity are merged, a method other than the averaging process may be used, for example, a weighted summation process is performed on the account similarity and the content similarity according to a pre-assigned weight.

As an example, when content similarity and account similarity between each interactive media account and a plurality of candidate media accounts are fused, for example, for an interactive media account a, account similarity between an account vector of the interactive media account a and an account vector of each candidate media account is determined, and content similarity between a content vector of the interactive media account a and a content vector of each candidate media account is determined; and carrying out average processing on the account number similarity and the content similarity to obtain the similarity between the interactive media account A and each candidate media account, and executing similar processing aiming at all the interactive media accounts.
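The fusion step can be sketched as follows. This is a hedged illustration: the text does not specify the similarity measure, so cosine similarity is an assumption, and the function names and the `content_weight` parameter are hypothetical. A `content_weight` of 0.5 gives the averaging described above, while other values give the weighted summation with pre-assigned weights.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse_similarity(content_sim, account_sim, content_weight=0.5):
    """Fuse content and account similarity; 0.5 corresponds to plain averaging."""
    return content_weight * content_sim + (1.0 - content_weight) * account_sim


def fused_similarities(inter_content, inter_account, candidates, content_weight=0.5):
    """Fused similarity between one interactive account and every candidate.

    candidates maps a candidate id to a (content_vector, account_vector) pair.
    """
    return {
        cand_id: fuse_similarity(
            cosine_similarity(inter_content, cand_content),
            cosine_similarity(inter_account, cand_account),
            content_weight,
        )
        for cand_id, (cand_content, cand_account) in candidates.items()
    }
```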

In step 1043, the similarities between each interactive media account and the plurality of candidate media accounts are sorted in descending order, and at least one top-ranked candidate media account in the descending sorting result is selected as a candidate media account to be recommended.

In some embodiments, matching pairs of interactive media accounts and candidate media accounts are constructed, each matching pair consisting of any one interactive media account and any one candidate media account. The matching pairs are sorted globally in descending order of the similarity between the interactive media account and the candidate media account in each pair, and the candidate media account of at least one top-ranked matching pair in the descending sorting result is selected as a candidate media account to be recommended.

In some embodiments, the following processing is performed for each interactive media account: the similarities between the interactive media account and the plurality of candidate media accounts are sorted in descending order, and at least one top-ranked candidate media account in the descending sorting result of that interactive media account is selected as a media account to be recommended corresponding to the interactive media account.

As an example, all similarities may be sorted together in descending order. For example, the similarities between interactive media account A and the 3 candidate media accounts a to c are 0.5, 0.6, and 0.2, respectively, and the similarities between interactive media account B and the 3 candidate media accounts a to c are 0.7, 0.3, and 0.8, respectively. Sorting 0.5, 0.6, 0.2, 0.7, 0.3, and 0.8 in descending order places 0.8 (candidate media account c) and 0.7 (candidate media account a) first, so the two top-ranked candidate media accounts a and c are taken as the media accounts to be recommended. Alternatively, the sorting may be performed separately for each interactive media account: sorting the similarities 0.5, 0.6, and 0.2 between interactive media account A and candidate media accounts a to c yields the top-ranked candidate media account b, and sorting the similarities 0.7, 0.3, and 0.8 between interactive media account B and candidate media accounts a to c yields the top-ranked candidate media account c, so candidate media accounts b and c are taken as the media accounts to be recommended.
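The two selection strategies can be sketched as follows, using the similarity values from the example. The function names are illustrative, and removing duplicate candidates in the global variant is an assumption made so that each candidate is recommended at most once.

```python
def recommend_global(similarity, top_n):
    """Sort all (interactive, candidate) pairs together and keep the top pairs."""
    pairs = [(score, cand)
             for row in similarity.values()
             for cand, score in row.items()]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    picked = []
    for _, cand in pairs[:top_n]:
        if cand not in picked:  # assumed: recommend each candidate once
            picked.append(cand)
    return picked


def recommend_per_account(similarity, top_n):
    """Sort candidates separately for each interactive media account."""
    return {
        inter: sorted(row, key=row.get, reverse=True)[:top_n]
        for inter, row in similarity.items()
    }


# Similarities between interactive accounts A, B and candidate accounts a, b, c.
similarity = {
    "A": {"a": 0.5, "b": 0.6, "c": 0.2},
    "B": {"a": 0.7, "b": 0.3, "c": 0.8},
}
global_top = recommend_global(similarity, 2)          # ['c', 'a']
per_account_top = recommend_per_account(similarity, 1)  # {'A': ['b'], 'B': ['c']}
```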

In some embodiments, the content vectors or the account vectors may be used on their own. If only the account vectors are used for recommendation, the account similarities between each interactive media account and the plurality of candidate media accounts are sorted in descending order, and at least one top-ranked candidate media account in the descending sorting result is selected as a candidate media account to be recommended. If only the content vectors are used for recommendation, the content similarities between each interactive media account and the plurality of candidate media accounts are sorted in descending order, and at least one top-ranked candidate media account in the descending sorting result is selected as a candidate media account to be recommended. For the specific sorting manner and the manner of selecting the media accounts to be recommended, reference may be made to the implementation of sorting based on the fused similarity.

In some embodiments, the way the vectors are used is selected according to the scenario. When the number of pieces of information published by a candidate media account is less than a first information number threshold, it is determined that the candidate media accounts and the interactive media accounts are characterized by account vectors, and ranking is based on account similarity. When the number of pieces of information published by the candidate media account is not less than the first information number threshold and is less than a second information number threshold, it is determined that the candidate media accounts and the interactive media accounts are characterized by both account vectors and content vectors; the account similarity and the content similarity are fused, and ranking is based on the fused similarity. When the number of pieces of information published by the candidate media account is not less than the second information number threshold, it is determined that the candidate media accounts and the interactive media accounts are characterized by content vectors, and ranking is based on content similarity.
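The threshold-based selection can be sketched as follows. The mode labels and the example threshold values are hypothetical; the application does not give concrete values for the first and second information number thresholds.

```python
def select_vector_mode(num_published, first_threshold, second_threshold):
    """Choose which similarity to rank by from the amount of published information."""
    if num_published < first_threshold:
        return "account"   # too little content: rank by account similarity only
    if num_published < second_threshold:
        return "fused"     # some content: fuse account and content similarity
    return "content"       # enough content: rank by content similarity only


# Hypothetical thresholds of 5 and 20 published pieces of information.
mode = select_vector_mode(num_published=12, first_threshold=5, second_threshold=20)
```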

In step 105, a recommendation operation of the corresponding user account is performed based on the candidate media account to be recommended.

As an example, information (e.g., videos) published by the candidate media account to be recommended is recommended to the user in a one-drag-three scenario, or the candidate media account to be recommended is directly recommended to the user as an account to follow.

The embodiments of the present application may further be combined with blockchain technology. After a terminal acquires a candidate media account to be recommended and marks it as a new interactive media account, the terminal generates a transaction for storing the new interactive media account and submits the transaction to the nodes of a blockchain network, so that the nodes store the new interactive media account to the blockchain network after reaching consensus on the transaction. Before the new interactive media account is stored to the blockchain network, the terminal may also hash the new interactive media account to obtain digest information corresponding to it, and store the digest information to the blockchain network. In this way, the new interactive media account is protected against tampering, its security is improved, and a malicious user or malicious program is prevented from tampering with the interactive media accounts of the user.
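The hashing step performed by the terminal can be sketched as follows. The text does not name a hash algorithm or a record layout, so SHA-256, the JSON serialization, and the field names `user_account` and `media_account` are assumptions made for illustration; submission of the transaction to the blockchain network is not shown.

```python
import hashlib
import json


def account_digest(record):
    """Compute a digest of a new interactive media account record.

    The record is serialized with sorted keys so that the same content
    always yields the same digest (SHA-256 is an assumed choice).
    """
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical record linking a user account to a newly marked interactive account.
digest = account_digest({"user_account": "u1", "media_account": "m9"})
```

Any later change to the stored record produces a different digest, which is what allows tampering to be detected when the digest on the chain is compared with a recomputed one.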

Referring to fig. 6, fig. 6 is a schematic diagram of an application architecture of a blockchain network provided in the embodiment of the present application, including a service entity 800, a blockchain network 600 (nodes 610-1 to 610-3 are shown by way of example), and a certificate authority 700, which are described below.

The type of the blockchain network 600 is flexible and may be, for example, any of a public chain, a private chain, or a consortium chain. Taking a public chain as an example, electronic devices such as the user terminals and servers of any service entity can access the blockchain network 600 without authorization; taking a consortium chain as an example, a computer device (e.g., a terminal/server) under the jurisdiction of a service entity may access the blockchain network 600 after obtaining authorization, and then becomes a client node in the blockchain network 600.

In some embodiments, the service entity may be a terminal. The client node 410 may act merely as an observer of the blockchain network 600, i.e., provide functionality that supports the service entity in initiating transactions (e.g., for storing data on the chain or querying on-chain data), while the functions of the nodes 610-1 to 610-3 of the blockchain network 600, such as the ordering function, the consensus service, and the ledger function, may be implemented by the client node by default or selectively (e.g., depending on the specific service requirements of the service entity). In this way, the data and service processing logic of the service entity can be migrated to the blockchain network 600 to the greatest extent, and the credibility and traceability of the data and of the service processing are achieved through the blockchain network 600.

Nodes in the blockchain network 600 receive transactions submitted by the client node 410 of the service entity 800 and execute the transactions to update or query the ledger; various intermediate or final results of executing the transactions may be returned to the client node of the service entity for display.

For example, the client node 410 may subscribe to events of interest in the blockchain network 600, such as transactions occurring in a particular organization/channel in the blockchain network 600, and the corresponding transaction notifications are pushed by the nodes 610-1 through 610-3 to the client node 410, thereby triggering the corresponding business logic in the client node 410.

An exemplary application of the blockchain is described below, taking as an example a service entity that accesses the blockchain network to implement media account recommendation.

Referring to fig. 6, a service entity 800 involved in media account recommendation registers with the certificate authority 700 to obtain a digital certificate. The digital certificate includes the public key of the service entity and a digital signature issued by the certificate authority 700 over the public key and the identity information of the service entity. The digital certificate is attached to a transaction together with the service entity's digital signature for the transaction and is sent to the blockchain network, so that the blockchain network can extract the digital certificate and the signature from the transaction, verify the authenticity of the message (i.e., that the message has not been tampered with) and the identity information of the service entity that sent it, and perform verification according to that identity, for example, whether the service entity has the right to initiate the transaction. A client running on a computer device (e.g., a terminal or a server) under the jurisdiction of the service entity may request access to the blockchain network 600 to become a client node.

The client node 410 of the service entity 800 is configured to present a media account to be recommended. For example, in response to a play operation for a video, it presents the video play page corresponding to the video and presents the video content and the media account to be recommended in that page; in response to a user's interaction operation on the media account to be recommended, it displays that the media account to be recommended has been marked as an interactive media account, and the terminal sends the interactive media account to the blockchain network 600.

The operation of sending the interactive media account to the blockchain network 600 may be implemented by setting service logic in the client node 410 in advance, so that when the terminal obtains the interactive media account, the client node 410 automatically sends it to the blockchain network 600. When sending, the client node 410 generates a transaction corresponding to the storage operation according to the interactive media account, and specifies in the transaction the smart contract that needs to be called to implement the storage operation and the parameters passed to the smart contract. The transaction also carries the digital certificate of the client node 410 and a digital signature (obtained, for example, by encrypting the digest of the transaction with the private key associated with the digital certificate of the client node 410), and is broadcast to the nodes 610-1 to 610-3 in the blockchain network 600.

When the nodes 610-1 to 610-3 in the blockchain network 600 receive the transaction, they verify the digital certificate and the digital signature carried by the transaction. After that verification succeeds, they determine, according to the identity of the service entity 800 carried in the transaction, whether the service entity 800 has the right to perform the transaction. Failure of either the digital-signature verification or the permission verification causes the transaction to fail. After successful verification, each of the nodes 610-1 to 610-3 attaches its own digital signature (e.g., obtained by encrypting the digest of the transaction with the private key of the node 610-1) and continues to broadcast the transaction in the blockchain network 600.

After the nodes 610-1 to 610-3 in the blockchain network 600 receive a successfully verified transaction, they fill the transaction into a new block and broadcast the new block. When the nodes 610-1 to 610-3 in the blockchain network 600 broadcast a new block, a consensus process is performed on the new block (the above nodes may serve as consensus nodes). If the consensus succeeds, each node appends the new block to the tail of the blockchain it stores and executes the transactions in the new block, updating the state database according to the transaction results: for a transaction that submits an updated interactive media account, the interactive media account is added to the state database.

As an example of the blockchain, referring to fig. 7, fig. 7 is a schematic structural diagram of the blockchain in the blockchain network 600 provided in the embodiment of the present application. Fig. 7 shows a genesis block, a block 2, and a block 3, each of which has a different height. The header of each block may include the hash value of all transactions in the block and also the hash value of all transactions in the previous block. A record of a newly generated transaction is filled into a block and, after consensus among the nodes in the blockchain network, is appended to the tail of the blockchain, so that the chain grows. The hash-based chain structure between blocks ensures that the transactions in the blocks are tamper-proof and forgery-proof.
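How a hash-based chain structure exposes tampering can be shown with a minimal sketch. It links each block to the hash of the previous block's header, which is a common simplification and not necessarily the exact header layout of fig. 7; the `Block` class, the helper names, and the use of SHA-256 are assumptions.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Tuple


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Block:
    height: int
    transactions: Tuple[str, ...]
    prev_hash: str  # hash of the previous block's header

    def tx_hash(self) -> str:
        # Hash over all transactions in this block.
        return sha256_hex(json.dumps(list(self.transactions)))

    def header_hash(self) -> str:
        # The header covers the height, this block's transactions, and the link.
        return sha256_hex(f"{self.height}|{self.tx_hash()}|{self.prev_hash}")


def append_block(chain: List[Block], transactions: List[str]) -> List[Block]:
    """Return a new chain with one more block appended to the tail."""
    prev_hash = chain[-1].header_hash() if chain else "0" * 64
    return chain + [Block(len(chain) + 1, tuple(transactions), prev_hash)]


def verify_chain(chain: List[Block]) -> bool:
    """Check that every block still points at its predecessor's header hash."""
    return all(cur.prev_hash == prev.header_hash()
               for prev, cur in zip(chain, chain[1:]))
```

If the transactions of an earlier block are altered, its header hash changes and no longer matches the link stored in the following block, so `verify_chain` returns `False`.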

An exemplary functional architecture of the blockchain network provided in the embodiment of the present application is described below. Referring to fig. 8, fig. 8 is a schematic diagram of the functional architecture of the blockchain network 600 provided in the embodiment of the present application, which includes an application layer 601, a consensus layer 602, a network layer 603, a data layer 604, and a resource layer 605, each described below.

The resource layer 605 encapsulates the computing, storage, and communication resources that implement the various nodes 610-1 through 610-3 in the blockchain network 600.

The data layer 604 encapsulates the various data structures that implement the ledger, including the blockchain implemented as files in a file system, a state database of the key-value type, and existence proofs (e.g., hash trees of the transactions in blocks).

The network layer 603 encapsulates the functions of a Point-to-Point (P2P) network protocol, a data propagation mechanism, a data verification mechanism, an access authentication mechanism, and service entity identity management.

The P2P network protocol implements communication between the nodes 610-1 to 610-3 in the blockchain network 600; the data propagation mechanism ensures the propagation of transactions in the blockchain network 600; and the data verification mechanism ensures reliable data transmission between the nodes 610-1 to 610-3 based on cryptographic methods (e.g., digital certificates, digital signatures, public/private key pairs). The access authentication mechanism authenticates the identity of a service entity joining the blockchain network 600 according to the actual service scenario and, when the authentication passes, grants the service entity the right to access the blockchain network 600. The service entity identity management stores the identities of the service entities that are allowed to access the blockchain network 600, together with their permissions (e.g., the types of transactions they can initiate).

The consensus layer 602 encapsulates the mechanism by which the nodes 610-1 to 610-3 in the blockchain network 600 agree on a block (i.e., the consensus mechanism), transaction management, and ledger management. The consensus mechanism includes consensus algorithms such as PoS, PoW, and DPoS, and supports pluggable consensus algorithms.

Transaction management is used to verify the digital signatures carried in the transactions received by the nodes 610-1 to 610-3, verify the identity information of the service entity, and determine according to the identity information whether the service entity has the right to perform the transaction (the relevant information is read from the service entity identity management). Each service entity authorized to access the blockchain network 600 holds a digital certificate issued by the certificate authority and signs the transactions it submits with the private key associated with its digital certificate, thereby declaring its legitimate identity.

Ledger management is used to maintain the blockchain and the state database. A block on which consensus has been reached is appended to the tail of the blockchain, and the transactions in that block are executed: when a transaction includes an update operation, the key-value pairs in the state database are updated; when a transaction includes a query operation, the key-value pairs in the state database are queried and the query result is returned to the client node of the service entity. Query operations on multiple dimensions of the state database are supported, including: querying a block by block sequence number (e.g., the hash value of the transaction); querying a block by block hash value; querying a block by transaction serial number; querying a transaction by transaction serial number; querying the account data of a service entity by the account (serial number) of the service entity; and querying the blockchain in a channel by channel name.

The application layer 601 encapsulates the various services that the blockchain network can implement, including tracing, evidence storage, and verification of transactions.

An exemplary application of the embodiment of the present application in an actual account recommendation scenario is described below; it can effectively improve the recommendation accuracy for cold-start accounts. For example, based on a user's list of followed accounts, the candidate media accounts most similar to the followed accounts are determined from the content vectors of the followed accounts (interactive accounts) and the content vectors of the candidate media accounts (e.g., cold-start accounts). Information (e.g., videos) published by the determined candidate media accounts is then recommended to the user in a one-drag-three scenario, or follow recommendation is performed directly for the determined candidate media accounts.

In some embodiments, referring to fig. 9, fig. 9 is a schematic diagram of the principle of artificial intelligence based media account recommendation provided in an embodiment of the present application. The artificial intelligence based media account recommendation method provided in the embodiment of the present application uses a media account learning model to learn the content features of the videos published by a media account, and trains the media account learning model by online triplet mining, as described below.

First, the videos published by a media account and their content features, such as video embedding features, are acquired. The content features of each acquired video are used as input to the media account learning model, which outputs the features of the media account; a squeeze activation network is then connected to perform feature enhancement on the features of the media account and output the content vector of the media account. During training, the media account learning model is trained by online triplet mining. Its counterpart is offline triplet mining, in which all training data are fed into the media account learning model at the start, the intermediate-difficulty triplets are selected from all triplets before each iteration, and the model is trained once with those intermediate-difficulty triplets as input. Because all triplets obtained from the training data must be traversed before every training iteration, mining intermediate-difficulty triplets offline is inefficient. Online triplet mining traverses only the samples within a training batch and does not need to traverse all triplets to find the intermediate-difficulty ones, which improves training efficiency.

In some embodiments, the media account learning model is a supervised model, so a supervision signal is required when learning content vectors with it. The supervision signal can be constructed from the fan data of the media accounts: for example, the more fans two media accounts have in common, the more similar the two accounts are. Triplets can thus be constructed based on the supervision signal, and the intermediate-difficulty triplets that satisfy the supervision signal are then obtained by online triplet mining.
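Mining within a single training batch can be sketched as below. The sketch makes several assumptions that are not stated in the application: the fan-based supervision signal is reduced to a hypothetical integer group label per account (accounts sharing many fans get the same label), distances are Euclidean, and the function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mine_intermediate_triplets(vectors: List[Sequence[float]],
                               labels: List[int],
                               margin: float) -> List[Tuple[int, int, int]]:
    """Return (reference, positive, negative) index triples from one batch
    that satisfy d(ref, pos) < d(ref, neg) < d(ref, pos) + margin."""
    n = len(vectors)
    dist = [[euclidean(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    triplets = []
    for ref in range(n):
        for pos in range(n):
            if pos == ref or labels[pos] != labels[ref]:
                continue
            for neg in range(n):
                if labels[neg] == labels[ref]:
                    continue
                if dist[ref][pos] < dist[ref][neg] < dist[ref][pos] + margin:
                    triplets.append((ref, pos, neg))
    return triplets


def mean_triplet_loss(vectors: List[Sequence[float]],
                      triplets: List[Tuple[int, int, int]],
                      margin: float) -> float:
    """Average hinge loss max(d(ref, pos) - d(ref, neg) + margin, 0)."""
    if not triplets:
        return 0.0
    losses = [max(euclidean(vectors[r], vectors[p])
                  - euclidean(vectors[r], vectors[g]) + margin, 0.0)
              for r, p, g in triplets]
    return sum(losses) / len(losses)
```

Only the pairwise distances inside the batch are computed, which is the reason this style of mining avoids traversing every triplet in the full training set.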

In some embodiments, intermediate-difficulty triplets are defined based on the distances from the reference sample to the negative sample and to the positive sample in a triplet. Referring to fig. 10, fig. 10 is a schematic diagram of training samples provided in an embodiment of the present application. According to their recognition difficulty, triplets fall into three categories. A simple triplet (corresponding to a simple negative sample) is a triplet whose loss is zero, formally defined as: distance from the reference sample to the negative sample > distance from the reference sample to the positive sample + a set threshold. A difficult triplet (corresponding to a difficult negative sample) is formally defined as: distance from the reference sample to the negative sample < distance from the reference sample to the positive sample. An intermediate-difficulty triplet (corresponding to an intermediate-difficulty negative sample) is one in which the negative sample is farther from the reference sample than the positive sample is, but not by enough to make the loss zero, formally defined as: distance from the reference sample to the positive sample < distance from the reference sample to the negative sample < distance from the reference sample to the positive sample + a set threshold. Simple triplets are easy to identify, so there is no need to construct many of them, since doing so would reduce training efficiency; using difficult triplets may impair the training effect. Intermediate-difficulty triplets therefore need to be selected by online triplet mining.
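The three categories can be expressed directly in terms of the two distances and the set threshold. In this sketch the function names are hypothetical, and the handling of the exact boundary cases (where two quantities are equal) is an assumption, since the definitions above use strict inequalities.

```python
def triplet_loss(d_pos: float, d_neg: float, margin: float) -> float:
    """Hinge-style triplet loss: max(d_pos - d_neg + margin, 0)."""
    return max(d_pos - d_neg + margin, 0.0)


def classify_triplet(d_pos: float, d_neg: float, margin: float) -> str:
    """Classify a triplet by difficulty.

    d_pos: distance from the reference sample to the positive sample
    d_neg: distance from the reference sample to the negative sample
    margin: the set threshold
    """
    if d_neg >= d_pos + margin:
        return "simple"        # loss is zero
    if d_neg < d_pos:
        return "difficult"     # negative is closer than the positive
    return "intermediate"      # negative is farther, but loss is still positive
```

A triplet with distances 1.0 (positive) and 1.2 (negative) under a threshold of 0.5 is of intermediate difficulty: the negative sample is already farther away, yet the loss of about 0.3 still provides a training signal.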

During training, the content features of the 100 videos most recently published by each media account are selected as the video features for learning the content vector, and 10 media accounts are randomly drawn from the training data each time as input to the media account learning model. During prediction, the media account learning model accepts a variable number of videos as input. The prediction effect of the model is measured by the click-through-rate index, which after training is 4 percentage points higher than the test result before training.

In some embodiments, an Enhanced Graph Embedding with Side Information (EGES) model is used to learn the account vectors of media accounts. The sequence of interactive media accounts followed by each user is obtained, and the content features and other features (such as account level and account verticality) of the interactive media accounts are pulled and input into the EGES model to obtain the account vectors of the interactive media accounts, so that the user's follow sequence is fitted by the EGES model. Likewise, the content features and other features (such as account level and account verticality) of the candidate media accounts are obtained and input into the EGES model to obtain the account vectors of the candidate media accounts. Because features such as account level are also input, the learned account vectors of the media accounts contain richer information.

The artificial intelligence based media account recommendation method provided by the embodiment of the present application starts from the information published by accounts and does not depend on posterior data. It applies a media account learning model that performs clustering and residual processing to the account cold-start task: on the basis of existing content features, the media account learning model learns from the vector data of the videos published by an account, and the vector representation of the account is trained by online triplet mining. The content vector output by the media account learning model obtained in this way captures the content information of the videos published by the account and establishes relationships between accounts in the content dimension, so that the cold-start problem of new accounts can be addressed from the perspective of content-vector similarity between accounts.

The following continues to describe an exemplary structure in which the artificial intelligence based media account recommendation device 255 provided by the embodiments of the present application is implemented as software modules. In some embodiments, as shown in fig. 3, the software modules of the artificial intelligence based media account recommendation device 255 stored in the storage 250 may include: a feature module 2551, configured to obtain a plurality of content features of at least one interactive media account of the user account, and obtain a plurality of content features corresponding to each of a plurality of candidate media accounts; a vector module 2552, configured to perform clustering processing and residual processing on the plurality of content features of each interactive media account to obtain a content vector of the interactive media account; the vector module 2552 is further configured to perform clustering processing and residual processing on the plurality of content features of each candidate media account to obtain a content vector of each candidate media account; a similarity module 2553, configured to determine candidate media accounts to be recommended based on the content similarity between the content vector of each interactive media account and the content vectors of the plurality of candidate media accounts; and a recommendation module 2554, configured to perform a recommendation operation corresponding to the user account based on the candidate media account to be recommended.

In some embodiments, the feature module 2551 is further configured to: perform the following processing for each interactive media account: extract content features from a plurality of pieces of information published by the interactive media account as the content features of the interactive media account; and perform the following processing for each candidate media account: extract content features from a plurality of pieces of information published by the candidate media account as the content features of the candidate media account; wherein the type of the information includes at least one of: video information, text information, image information.

In some embodiments, vector module 2552, is further configured to: the following processing is performed for each interactive media account: clustering a plurality of content characteristics of the interactive media account to obtain at least one first clustering center; performing residual error processing on a plurality of content features of the interactive media account based on at least one first clustering center to obtain a content vector of the interactive media account; performing the following for each candidate media account: clustering a plurality of content characteristics of the candidate media accounts to obtain at least one second clustering center; and performing residual error processing on a plurality of content features of the candidate media account based on at least one second clustering center to obtain a content vector of the candidate media account.

In some embodiments, the content vector of the interactive media account is determined by a media account learning model that includes a local clustering core network and a normalization network; the vector module 2552 is further configured to: perform the following processing for each first clustering center: determine, through the local clustering core network, a first residual distribution of the plurality of content features of the interactive media account with respect to the first clustering center, and normalize the first residual distribution corresponding to the first clustering center through the normalization network to obtain a normalization result corresponding to the first clustering center; and perform overall normalization on the normalization results of all first clustering centers through the normalization network to obtain the content vector of the interactive media account.

In some embodiments, the media account learning model further includes a squeeze activation network; after the normalization results of all first clustering centers are subjected to overall normalization through the normalization network to obtain the content vector of the interactive media account, the vector module 2552 is further configured to: perform channel-based average pooling on the content vector of the interactive media account through the squeeze activation network to obtain the global content feature of each channel of the content vector of the interactive media account; perform full connection processing on the global content feature of each channel through the squeeze activation network to obtain an activation value for each channel of the content vector of the interactive media account; and multiply the activation value of each channel by the original content feature of that channel in the content vector of the interactive media account, and update the content vector of the interactive media account based on the multiplication result.
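The pooling, full connection, and multiplication steps can be sketched as follows. The sketch follows the common squeeze-and-excitation pattern with two fully connected layers (ReLU, then sigmoid); the number of layers, the activation functions, and the weight matrices `w1` and `w2` are assumptions, since the text above only mentions full connection processing.

```python
import math
from typing import List


def squeeze_activate(channels: List[List[float]],
                     w1: List[List[float]],
                     w2: List[List[float]]) -> List[List[float]]:
    """Re-weight each channel of a content vector.

    channels: the content vector split into channels (one list per channel)
    w1: hidden x num_channels weights of the first fully connected layer
    w2: num_channels x hidden weights of the second fully connected layer
    """
    # Channel-based average pooling: one global value per channel.
    pooled = [sum(c) / len(c) for c in channels]
    # First fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, pooled))) for row in w1]
    # Second fully connected layer with sigmoid: one activation value per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Multiply every original channel by its activation value.
    return [[g * x for x in c] for g, c in zip(gates, channels)]
```

With all-zero weights every activation value is 0.5, so each channel is simply halved; trained weights would instead emphasize the channels that carry more useful content information.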

In some embodiments, the vector module 2552 is further configured to: perform convolution processing on each content feature of the interactive media account through the local clustering core network to obtain a corresponding convolution result; perform maximum likelihood function processing on the convolution result of each content feature of the interactive media account through the local clustering core network to obtain a corresponding maximum likelihood processing result; determine, through the local clustering core network, a first residual between each content feature of the interactive media account and the first clustering center; and, taking the maximum likelihood processing result corresponding to each content feature of the interactive media account as a weight, perform weighted summation on the first residuals corresponding to the content features of the interactive media account to obtain the first residual distribution of the plurality of content features of the interactive media account with respect to the first clustering center.
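The weighted residual aggregation and the two normalization stages can be sketched in the style of NetVLAD-like pooling. Several points are assumptions made for illustration: the convolution is modeled as a per-feature linear projection (`weights`, `biases`), the maximum likelihood function processing is modeled as a softmax over clustering centers, and both normalizations are L2 normalizations. None of these choices is confirmed by the text above.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(v: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def content_vector(features: List[Sequence[float]],
                   centers: List[Sequence[float]],
                   weights: List[Sequence[float]],
                   biases: Sequence[float]) -> List[float]:
    """Aggregate a variable number of content features into one content vector."""
    k, d = len(centers), len(centers[0])
    residual_sums = [[0.0] * d for _ in range(k)]
    for x in features:
        # "Convolution" modeled as a linear projection, one score per center.
        scores = [sum(w * xi for w, xi in zip(weights[c], x)) + biases[c]
                  for c in range(k)]
        assign = softmax(scores)  # soft assignment used as the weight
        for c in range(k):
            for j in range(d):
                # Weighted sum of residuals between the feature and the center.
                residual_sums[c][j] += assign[c] * (x[j] - centers[c][j])
    # Normalize per clustering center, then normalize the concatenation.
    per_center = [l2_normalize(r) for r in residual_sums]
    return l2_normalize([v for r in per_center for v in r])
```

Because the residuals are summed over the features, the output length depends only on the number of clustering centers and the feature dimension, so an account with any number of published items yields a vector of the same size.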

In some embodiments, the content vector of the candidate media account is determined by a media account learning model that includes a local clustering core network and a normalization network; the vector module 2552 is further configured to: perform the following processing for each second clustering center: determine, through the local clustering core network, a second residual distribution of the plurality of content features of the candidate media account with respect to the second clustering center, and normalize the second residual distribution corresponding to the second clustering center through the normalization network to obtain a corresponding normalization result; and perform overall normalization on the normalization results corresponding to the plurality of content features of the candidate media account through the normalization network to obtain the content vector of the candidate media account.

In some embodiments, vector module 2552, is further configured to: performing convolution processing on each content characteristic of the candidate media account through a local clustering core network to obtain a corresponding convolution result; performing maximum likelihood function processing on the convolution result of each content feature of the candidate media account through a local clustering core network to obtain a corresponding maximum likelihood processing result; determining a second residual error between each content feature of the candidate media account and a second clustering center through a local clustering core network; and taking the maximum likelihood processing result corresponding to each content feature of the candidate media account as a weight, and performing weighted summation processing on a second residual error corresponding to each content feature of the candidate media account to obtain second residual error distribution of a plurality of content features corresponding to a second clustering center.

In some embodiments, the media account learning model further includes a squeeze activation network; the vector module 2552 is further configured to: after the normalization results of all second clustering centers are subjected to overall normalization through the normalization network to obtain the content vector of the candidate media account, perform channel-based average pooling on the content vector of the candidate media account through the squeeze activation network to obtain the global content feature of each channel of the content vector of the candidate media account; perform full connection processing on the global content feature of each channel through the squeeze activation network to obtain an activation value for each channel of the content vector of the candidate media account; and multiply the activation value of each channel by the original content feature of that channel in the content vector of the candidate media account, and update the content vector of the candidate media account based on the multiplication result.

In some embodiments, the similarity module 2553 is further configured to: determine the account similarity between the account vector of each interactive media account and the account vectors of the plurality of candidate media accounts; fuse the content similarity and the account similarity between each interactive media account and the plurality of candidate media accounts to obtain the similarity between each interactive media account and the plurality of candidate media accounts; and sort the similarities between each interactive media account and the plurality of candidate media accounts in descending order, and select at least one top-ranked candidate media account in the descending-order result as the candidate media account to be recommended.

In some embodiments, the similarity module 2553 is further configured to: perform the following processing for each candidate media account: extract a plurality of account features from the account information of the candidate media account, and compress the plurality of account features of the candidate media account to obtain the account vector of the candidate media account; perform the following processing for each interactive media account: extract a plurality of account features from the account information of the interactive media account, and compress the plurality of account features of the interactive media account to obtain the account vector of the interactive media account; determine the account similarity between the account vector of each interactive media account and the account vectors of the plurality of candidate media accounts; and perform the following processing for each interactive media account: determine the account similarity between the account vector of the interactive media account and the account vector of each candidate media account, determine the content similarity between the content vector of the interactive media account and the content vector of each candidate media account, and average the account similarity and the content similarity.
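The averaging and descending-order selection can be sketched for a single interactive media account as follows. Cosine similarity is an assumed similarity measure, and the function names and the dictionary layout of `candidates` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def rank_candidates(content_vec: Sequence[float],
                    account_vec: Sequence[float],
                    candidates: Dict[str, Tuple[Sequence[float], Sequence[float]]],
                    top_k: int) -> List[str]:
    """Rank candidates for one interactive media account.

    candidates maps a candidate id to (content vector, account vector).
    """
    scored = []
    for cand_id, (cand_content, cand_account) in candidates.items():
        content_sim = cosine(content_vec, cand_content)
        account_sim = cosine(account_vec, cand_account)
        # Fuse by averaging the two similarities.
        scored.append((0.5 * (content_sim + account_sim), cand_id))
    # Descending sort, then keep the top-ranked candidates.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [cand_id for _, cand_id in scored[:top_k]]
```

A plain average weights the two similarities equally; a weighted average would be a natural variation if one signal proved more reliable for a given scenario.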

In some embodiments, similarity module 2553 is further configured to: embedding the plurality of account characteristics of the candidate media accounts to obtain the weights of the plurality of account characteristics of the candidate media accounts; based on the weights of the account features of the candidate media accounts, carrying out weighted summation processing on the account features of the candidate media accounts to obtain account vectors corresponding to the candidate media accounts; embedding the plurality of account characteristics of the interactive media account to obtain the weights of the plurality of account characteristics of the interactive media account; based on the weights of the account characteristics of the interactive media account, the account characteristics of the interactive media account are subjected to weighted summation processing to obtain an account vector corresponding to the interactive media account.
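The weighted summation of account features can be sketched as below. It assumes that each account feature has already been embedded into a vector of the same dimension and that each feature has a learned scalar weight, normalized with an exponential as in EGES-style aggregation; the function name and this normalization are assumptions rather than details given above.

```python
import math
from typing import List, Sequence


def account_vector(feature_embeddings: List[Sequence[float]],
                   raw_weights: Sequence[float]) -> List[float]:
    """Combine per-feature embeddings into one account vector.

    feature_embeddings: one embedding per account feature (equal dimensions)
    raw_weights: one learned scalar per account feature (hypothetical values)
    """
    if len(feature_embeddings) != len(raw_weights):
        raise ValueError("need exactly one weight per feature embedding")
    # Exponential normalization keeps every weight positive and summing to 1.
    m = max(raw_weights)
    exp_w = [math.exp(w - m) for w in raw_weights]
    total = sum(exp_w)
    dim = len(feature_embeddings[0])
    return [sum((ew / total) * emb[j] for ew, emb in zip(exp_w, feature_embeddings))
            for j in range(dim)]
```

With equal raw weights the account vector is the plain mean of the feature embeddings; a larger weight pulls the account vector toward the corresponding feature.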

In some embodiments, similarity module 2553 is further configured to: embedding the plurality of account characteristics and the plurality of content characteristics of the candidate media accounts to obtain the weights of the plurality of account characteristics and the plurality of content characteristics of the candidate media accounts; based on the weights of the plurality of account characteristics and the plurality of content characteristics of the candidate media accounts, carrying out weighted summation processing on the plurality of account characteristics and the plurality of content characteristics of the candidate media accounts to obtain account vectors corresponding to the candidate media accounts; embedding the plurality of account characteristics and the plurality of content characteristics of the interactive media account to obtain the weights of the plurality of account characteristics and the plurality of content characteristics of the interactive media account; based on the weights of the plurality of account characteristics and the plurality of content characteristics of the interactive media account, performing weighted summation processing on the plurality of account characteristics and the plurality of content characteristics of the interactive media account to obtain an account vector corresponding to the interactive media account.

In some embodiments, the content vector of the interactive media account is determined by a media account learning model, the media account learning model comprising a local clustering core network and a normalization network; the apparatus further comprises a training module 2555 for: before acquiring a plurality of content characteristics of at least one interactive media account of a user account and acquiring a plurality of content characteristics corresponding to a plurality of candidate media accounts respectively, training the media account learning model in the following way: acquiring a plurality of media account samples, and constructing a plurality of first-type triple samples based on the number of associated users of the plurality of media account samples; performing disassembly processing on the plurality of first-type triple samples to obtain a plurality of media account samples, and predicting a sample content vector of each media account sample through the media account learning model; determining second-type triple samples meeting the training condition according to the sample content vectors; and substituting the sample content vector corresponding to each media account sample in the second-type triple samples into the triple loss function to determine the parameters of the media account learning model at which the triple loss function reaches its minimum value.
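
The text does not give the form of the triple loss function. The sketch below assumes the commonly used triplet margin loss over Euclidean distances; the `margin` parameter and the function name `triplet_loss` are assumptions for illustration only. It requires Python 3.8 or later for `math.dist`.

```python
import math
from typing import Sequence


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.2,  # assumed hyperparameter, not stated in the text
) -> float:
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = math.dist(anchor, positive)  # distance to the same-label sample
    d_neg = math.dist(anchor, negative)  # distance to the different-label sample
    return max(0.0, d_pos - d_neg + margin)
```

Under this form, the loss is zero once the different-label sample is farther from the anchor than the same-label sample by at least the margin, so only triples that violate this contribute to training.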

In some embodiments, training module 2555 is further configured to: taking any one media account sample as a first media account sample, obtaining a second media account sample with the same label as the first media account sample from the plurality of media account samples, and determining a first content distance between the sample content vector of the second media account sample and the sample content vector of the first media account sample; acquiring a third media account sample with a label different from that of the first media account sample from the plurality of media account samples, and determining a second content distance between the sample content vector of the third media account sample and the sample content vector of the first media account sample; extracting, from the plurality of second media account samples and the plurality of third media account samples, a second media account sample and a third media account sample which meet the following training conditions: the second content distance corresponding to the third media account sample is greater than the first content distance corresponding to the second media account sample; and the difference between the second content distance corresponding to the third media account sample and the first content distance corresponding to the second media account sample is less than a first threshold; and forming a second-type triple sample from the first media account sample and the second media account sample and third media account sample that meet the training conditions.
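
The two training conditions above can be sketched as a filter over candidate pairs. Euclidean distance is assumed as the content distance, and the function name `select_triples` is hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def select_triples(
    first: Vector,
    seconds: List[Vector],
    thirds: List[Vector],
    first_threshold: float,
) -> List[Tuple[Vector, Vector, Vector]]:
    """Keep (first, second, third) triples that meet both training conditions."""
    selected = []
    for second in seconds:
        first_distance = math.dist(first, second)  # same-label distance
        for third in thirds:
            second_distance = math.dist(first, third)  # different-label distance
            gap = second_distance - first_distance
            # Condition 1: the different-label sample is farther away.
            # Condition 2: but farther by less than the first threshold.
            if second_distance > first_distance and gap < first_threshold:
                selected.append((first, second, third))
    return selected
```

Triples in which the different-label sample is already far away are skipped, which concentrates training on cases the model has not yet separated well.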

In some embodiments, training module 2555 is further configured to: acquiring the associated users of each remaining media account sample, where the remaining media account samples are the media account samples, among the plurality of media account samples, that are different from the first media account sample; determining a user intersection between the associated users of each remaining media account sample and the associated users of the first media account sample; determining the remaining media account samples for which the number of elements in the user intersection exceeds a second threshold as second media account samples with the same label as the first media account sample; and determining the remaining media account samples for which the number of elements in the user intersection is less than a third threshold as third media account samples with a label different from that of the first media account sample.
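
A minimal sketch of this labeling rule follows, assuming associated users are represented as sets of user identifiers. The function name `label_samples` and the identifiers in the usage note are hypothetical; the requirement that the second threshold exceed the third threshold is taken from claim 12.

```python
from typing import Dict, List, Set, Tuple


def label_samples(
    first_users: Set[str],
    remaining: Dict[str, Set[str]],
    second_threshold: int,
    third_threshold: int,
) -> Tuple[List[str], List[str]]:
    """Split remaining samples into same-label and different-label lists."""
    if second_threshold <= third_threshold:
        raise ValueError("second_threshold must be greater than third_threshold")
    same_label: List[str] = []
    different_label: List[str] = []
    for account_id, users in remaining.items():
        overlap = len(first_users & users)  # size of the user intersection
        if overlap > second_threshold:
            same_label.append(account_id)
        elif overlap < third_threshold:
            different_label.append(account_id)
        # Samples between the two thresholds are left unlabeled.
    return same_label, different_label
```

For example, with first-sample users `{"u1", "u2", "u3", "u4"}`, a second threshold of 2 and a third threshold of 1, an account sharing three users is labeled the same, an account sharing none is labeled different, and an account sharing exactly two is left out.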

Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the artificial intelligence based media account recommendation method in the embodiment of the application.

Embodiments of the present application provide a computer-readable storage medium storing executable instructions, which when executed by a processor, perform the artificial intelligence based media account recommendation method provided by embodiments of the present application, for example, the artificial intelligence based media account recommendation method shown in fig. 5A-5C.

In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.

In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.

By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).

By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.

In summary, the content features of a media account are obtained, clustered, and subjected to residual error processing. This hides the differences in the raw feature distribution of the content features and retains only the distribution differences between the content features and the clustering centers, so that the feature distribution of the media account in the content dimension is learned efficiently and relationships between accounts are established in the content dimension. Because the interaction between accounts and users in a recommendation system is carried largely by content, the account vectors of the media accounts are characterized more accurately, which improves the accuracy of media account recommendation based on the similarity of the account vectors.

The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims

1. A media account recommending method based on artificial intelligence is characterized by comprising the following steps:

2. The method of claim 1,

the acquiring of the multiple content features of at least one interactive media account of the user account includes:

executing the following processing aiming at each interactive media account: extracting corresponding content features from a plurality of information issued by the interactive media account to serve as the content features of the interactive media account;

the obtaining of the plurality of content features corresponding to the plurality of candidate media accounts respectively includes:

performing the following for each of the candidate media accounts: extracting corresponding content features from a plurality of information issued by the candidate media accounts to serve as the content features of the candidate media accounts;

wherein the type of information comprises at least one of: video information, text information, image information.

3. The method of claim 1,

the clustering and residual error processing of the multiple content features of each interactive media account to obtain the content vector of the interactive media account includes:

executing the following processing aiming at each interactive media account:

clustering a plurality of content characteristics of the interactive media account to obtain at least one first clustering center;

performing residual error processing on a plurality of content features of the interactive media account based on the at least one first clustering center to obtain a content vector of the interactive media account;

the clustering and residual error processing are respectively performed on the plurality of content features of each candidate media account to obtain a content vector of each candidate media account, and the method comprises the following steps:

performing the following for each of the candidate media accounts:

clustering a plurality of content features of the candidate media accounts to obtain at least one second clustering center;

and performing residual error processing on the plurality of content features of the candidate media account based on the at least one second clustering center to obtain the content vector of the candidate media account.

4. The method of claim 3,

the content vector of the interactive media account is determined through a media account learning model, and the media account learning model comprises a local clustering core network and a normalization network;

the performing residual error processing on the plurality of content features of the interactive media account based on the at least one first clustering center to obtain a content vector of the interactive media account includes:

performing the following for each of the first cluster centers: determining, by the local clustering core network, a first residual distribution of a plurality of content features of the interactive media account corresponding to the first clustering center;

performing the following for each of the first cluster centers: normalizing the first residual distribution corresponding to the first clustering center through the normalization network to obtain a normalization result corresponding to the first clustering center;

and carrying out integral normalization processing on the normalization result of each first clustering center through the normalization network to obtain the content vector of the interactive media account.
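
The two normalization stages recited above can be sketched as follows. The use of L2 normalization and the concatenation of the per-center results before the overall normalization are assumptions made here for illustration; the names `l2_normalize` and `content_vector` are hypothetical.

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def content_vector(residuals_per_center: List[List[float]]) -> List[float]:
    """Normalize each center's residual distribution, then the whole vector."""
    # Stage 1: per-center normalization.
    normalized = [l2_normalize(r) for r in residuals_per_center]
    # Stage 2: concatenate and normalize the result as a whole.
    flat = [x for r in normalized for x in r]
    return l2_normalize(flat)
```

Normalizing per center first keeps a single cluster with large residuals from dominating the final content vector.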

5. The method of claim 4,

the media account learning model further comprises a squeeze activation network;

after the normalization result of each first clustering center is subjected to integral normalization processing through the normalization network to obtain the content vector of the interactive media account, the method further comprises the following steps:

performing channel-based average pooling on the content vector of the interactive media account through the squeeze activation network to obtain a global content feature of each channel of the content vector of the interactive media account;

performing full connection processing on the global content feature of each channel through the squeeze activation network to obtain an activation value of each channel of the content vector of the interactive media account;

and performing dot multiplication on the activation value of each channel and the original content feature of the corresponding channel in the content vector of the interactive media account, and updating the content vector of the interactive media account based on the dot multiplication result.
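
The three steps of this claim (pooling, full connection, and channel-wise multiplication) can be sketched as below. The layout of the content vector as a list of channels, the single fully connected layer, and the sigmoid activation are all assumptions for illustration; `squeeze_activation`, `weights`, and `bias` are hypothetical names.

```python
import math
from typing import List


def squeeze_activation(
    content: List[List[float]],   # content vector laid out as channels x values
    weights: List[List[float]],   # assumed fully connected weights, channels x channels
    bias: List[float],            # assumed fully connected bias, one per channel
) -> List[List[float]]:
    """Re-weight each channel of the content vector by a learned activation."""
    # Channel-based average pooling: one global feature per channel.
    pooled = [sum(channel) / len(channel) for channel in content]
    # Full connection followed by a sigmoid gives one activation per channel.
    activations = []
    for row, b in zip(weights, bias):
        z = sum(w * p for w, p in zip(row, pooled)) + b
        activations.append(1.0 / (1.0 + math.exp(-z)))
    # Multiply each channel's original values by that channel's activation.
    return [[a * x for x in channel] for a, channel in zip(activations, content)]
```

With all-zero weights and bias, every activation is 0.5, so each channel is simply halved; trained weights would instead emphasize some channels over others.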

6. The method of claim 4, wherein determining, by the local clustering core network, a first residual distribution of the plurality of content features of the interactive media account corresponding to the first clustering center comprises:

performing convolution processing on each content characteristic of the interactive media account through the local clustering core network to obtain a corresponding convolution result;

performing maximum likelihood function processing on the convolution result of each content feature of the interactive media account through the local clustering core network to obtain a corresponding maximum likelihood processing result;

determining, by the local clustering core network, a first residual between each content feature of each interactive media account and the first clustering center;

and taking the maximum likelihood processing result corresponding to each content feature of the interactive media account as a weight, and performing weighted summation processing on a first residual error corresponding to each content feature of the interactive media account to obtain a first residual error distribution of a plurality of content features of the interactive media account corresponding to the first clustering center.
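
The weighted residual summation recited above can be sketched as follows. Two assumptions are made for illustration: the convolution is modeled as a per-feature linear projection onto one score per clustering center, and the "maximum likelihood function processing" is modeled as a softmax over those scores. The names `residual_distribution`, `proj_weights`, and `proj_bias` are hypothetical.

```python
import math
from typing import List


def residual_distribution(
    features: List[List[float]],      # N content features, each of dimension D
    centers: List[List[float]],       # K clustering centers, each of dimension D
    proj_weights: List[List[float]],  # assumed projection, K x D
    proj_bias: List[float],           # assumed projection bias, length K
) -> List[List[float]]:
    """Soft-assignment-weighted sum of residuals for each clustering center."""
    k_count, dim = len(centers), len(centers[0])
    aggregated = [[0.0] * dim for _ in range(k_count)]
    for x in features:
        # Projection of the feature to one score per clustering center.
        scores = [
            sum(w * xi for w, xi in zip(proj_weights[k], x)) + proj_bias[k]
            for k in range(k_count)
        ]
        # Numerically stable softmax turns the scores into weights.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        assign = [e / total for e in exps]
        # Accumulate the weighted residual (feature minus center).
        for k in range(k_count):
            for d in range(dim):
                aggregated[k][d] += assign[k] * (x[d] - centers[k][d])
    return aggregated
```

Each row of the result corresponds to one clustering center and summarizes how the account's content features deviate from that center.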

7. The method of claim 1, wherein determining candidate media accounts to be recommended based on content similarity between the content vector of each interactive media account and the content vectors of the candidate media accounts comprises:

determining account similarity between the account vector of each interactive media account and the account vectors of the candidate media accounts;

fusing the content similarity and the account similarity between each interactive media account and the candidate media accounts to obtain the similarity between each interactive media account and the candidate media accounts;

and sorting the similarities between each interactive media account and the plurality of candidate media accounts in descending order, and selecting at least one top-ranked candidate media account from the descending sorting result as the candidate media account to be recommended.
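
The sorting and selection step can be sketched for a single interactive media account as follows. The function name `top_candidates`, the parameter `k`, and the account identifiers in the usage note are hypothetical; how results for several interactive media accounts are combined is not specified in the text and is not shown.

```python
from typing import Dict, List


def top_candidates(similarities: Dict[str, float], k: int) -> List[str]:
    """Return the k candidate account ids with the highest fused similarity."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [account_id for account_id, _ in ranked[:k]]
```

For instance, given similarities `{"a": 0.2, "b": 0.9, "c": 0.5}` and `k = 2`, the accounts `"b"` and `"c"` would be selected, in that order.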

8. The method of claim 7,

the determining account similarity between the account vector of each interactive media account and the account vectors of the candidate media accounts includes:

performing the following for each of the candidate media accounts: extracting a plurality of account features from the account information of the candidate media accounts, and compressing the account features of the candidate media accounts to obtain account vectors of the candidate media accounts;

executing the following processing aiming at each interactive media account: extracting a plurality of account characteristics from the account information of the interactive media account, and compressing the account characteristics of the interactive media account to obtain an account vector of the interactive media account;

the fusing the content similarity and the account similarity between each interactive media account and the candidate media accounts comprises:

for each interactive media account, executing the following processing:

determining account similarity between the account vector of the interactive media account and the account vector of each candidate media account, and determining content similarity between the content vector of the interactive media account and the content vector of each candidate media account;

and averaging the account similarity and the content similarity.

9. The method of claim 8,

compressing the plurality of account features of the candidate media account to obtain an account vector of the candidate media account, including:

embedding the plurality of account characteristics of the candidate media accounts to obtain account embedding characteristics of the plurality of account characteristics of the candidate media accounts;

based on the weights of the account characteristics of the candidate media accounts, carrying out weighted summation processing on the account embedding characteristics of the candidate media accounts to obtain account vectors corresponding to the candidate media accounts;

compressing the plurality of account features of the interactive media account to obtain an account vector of the interactive media account, including:

embedding the plurality of account characteristics of the interactive media account to obtain account embedding characteristics of the plurality of account characteristics of the interactive media account;

and carrying out weighted summation processing on the plurality of account embedding characteristics of the interactive media account based on the weights of the plurality of account characteristics of the interactive media account to obtain an account vector corresponding to the interactive media account.

10. The method of claim 1,

the content vector of the interactive media account is determined through a media account learning model;

before obtaining a plurality of content features of at least one interactive media account of a user account and obtaining a plurality of content features corresponding to a plurality of candidate media accounts, the method comprises the following steps:

training the media account learning model by:

acquiring a plurality of media account samples, and constructing a plurality of first-type triple samples based on the number of associated users of the plurality of media account samples;

performing disassembly processing on the first-type triple samples to obtain a plurality of media account samples, so as to predict a sample content vector of each media account sample through the media account learning model;

determining a second type of triple sample meeting the training condition according to the sample content vector;

and substituting the sample content vector corresponding to each media account sample in the second type of triple samples into a triple loss function to determine parameters of the media account learning model when the triple loss function obtains a minimum value.

11. The method of claim 10, wherein determining a second type of triple sample that meets a training condition based on the sample content vector comprises:

taking any one media account sample as a first media account sample, obtaining a second media account sample with the same label as the first media account sample from a plurality of media account samples, and determining a first content distance between a sample content vector of the second media account sample and a sample content vector of the first media account sample;

obtaining a third media account sample with a label different from that of the first media account sample from the plurality of media account samples, and determining a second content distance between a sample content vector of the third media account sample and a sample content vector of the first media account sample;

extracting a second media account sample and a third media account sample which meet the following training conditions from the plurality of second media account samples and the plurality of third media account samples:

a second content distance corresponding to the third media account sample is greater than a first content distance corresponding to the second media account sample;

a difference between a second content distance corresponding to the third media account sample and a first content distance corresponding to the second media account sample is less than a first threshold;

and combining the first media account sample, a second media account sample meeting the training condition and a third media account sample meeting the training condition into the second-type triple sample.

12. The method of claim 11, wherein obtaining a second sample of media accounts from the plurality of samples of media accounts having a same label as the first sample of media accounts comprises:

acquiring the associated users of each remaining media account sample;

wherein the remaining media account samples are media account samples that are different from the first media account sample in the plurality of media account samples;

determining a user intersection between the associated users of each of the remaining media account samples and the associated users of the first media account sample;

determining the remaining media account samples for which the number of elements in the user intersection exceeds a second threshold as second media account samples with the same label as the first media account sample;

the obtaining a third media account sample from the plurality of media account samples with a different label than the first media account sample comprises:

determining the remaining media account samples for which the number of elements in the user intersection is less than a third threshold as third media account samples with a label different from that of the first media account sample;

wherein the second threshold is greater than the third threshold.

13. An artificial intelligence based media account recommendation device, comprising:

14. An electronic device, comprising:

a memory for storing executable instructions;

a processor configured to execute the executable instructions stored in the memory to implement the artificial intelligence based media account recommendation method of any of claims 1-12.

15. A computer-readable storage medium storing executable instructions for implementing the artificial intelligence based media account recommendation method of any one of claims 1 to 12 when executed by a processor.