CN117235371A - Video recommendation method, model training method and device

Video recommendation method, model training method and device

Info

Publication number
CN117235371A
CN117235371A (application CN202311411913.8A)
Authority
CN
China
Prior art keywords
user
video
portrait
sample
behavior data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311411913.8A
Other languages
Chinese (zh)
Inventor
张帅兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202311411913.8A priority Critical patent/CN117235371A/en
Publication of CN117235371A publication Critical patent/CN117235371A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a video recommendation method, a model training method and a device, relating to the technical field of artificial intelligence, in particular to the technical fields of deep learning and large models. The method includes: acquiring behavior data of a user for a first video; inputting the behavior data into a large model, and generating a user portrait of the user and a video portrait of the first video based on the behavior data through the large model; and determining a recommended video of the user according to the user portrait of the user and the video portrait of the first video.

Description

Video recommendation method, model training method and device
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to the technical fields of deep learning and large models, and specifically to a video recommendation method, a model training method, an apparatus, an electronic device, a storage medium and a computer program product.
Background
Short video has developed rapidly, and both the frequency and the duration of users' visits to short video platforms keep increasing, so the behavior data produced on video feeds is varied and unstructured. With the continuous development of artificial intelligence technology, large language models, which have the advantage of good generalization, have been widely applied in fields such as information extraction, text credibility evaluation and machine translation. However, video recommendation methods in the related art suffer from low accuracy in the video recommendation process.
Disclosure of Invention
The present disclosure provides a video recommendation method, a model training method, an apparatus, an electronic device, a storage medium, and a computer program product.
According to a first aspect of the present disclosure, a video recommendation method is provided, including: acquiring behavior data of a user for a first video; inputting the behavior data into a large model, and generating a user portrait of the user and a video portrait of the first video based on the behavior data through the large model; and determining a recommended video of the user according to the user portrait of the user and the video portrait of the first video.
According to a second aspect of the present disclosure, a model training method is presented, comprising: obtaining a training sample, wherein the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user and a sample video portrait of the sample video; and training the large model according to the training sample.
According to a third aspect of the present disclosure, there is provided a video recommendation apparatus, comprising: an acquisition module for acquiring behavior data of a user for a first video; a generation module for inputting the behavior data into a large model, and generating a user portrait of the user and a video portrait of the first video based on the behavior data through the large model; and a recommendation module for determining a recommended video of the user according to the user portrait of the user and the video portrait of the first video.
According to a fourth aspect of the present disclosure, a model training apparatus is provided, including: a first acquisition module configured to acquire a training sample, where the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user, and a sample video portrait of the sample video; and a training module configured to train the large model according to the training sample.
According to a fifth aspect of the present disclosure, there is provided an electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the video recommendation method set forth in the first aspect or the model training method set forth in the second aspect.
According to a sixth aspect of the present disclosure, a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the video recommendation method set forth in the first aspect or the model training method set forth in the second aspect is provided.
According to a seventh aspect of the present disclosure, a computer program product is presented, comprising a computer program which, when executed by a processor, implements the video recommendation method presented in the first aspect above, or implements the model training method presented in the second aspect above.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flowchart of a video recommendation method according to an embodiment of the disclosure;
FIG. 2 is a flowchart of a video recommendation method according to another embodiment of the disclosure;
FIG. 3 is a flowchart illustrating a video recommendation method according to another embodiment of the disclosure;
FIG. 4 is a flowchart of a video recommendation method according to another embodiment of the disclosure;
FIG. 5 is a flow chart of a model training method according to another embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a video recommendation apparatus according to an embodiment of the disclosure;
FIG. 7 is a schematic diagram of a model training apparatus according to an embodiment of the disclosure;
Fig. 8 is a schematic block diagram of an electronic device of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Artificial intelligence (AI) is a technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence. AI technology currently has the advantages of a high degree of automation, high accuracy and low cost, and is widely applied.
Deep learning (DL) is a new research direction in the field of machine learning (ML). It learns the inherent rules and representation levels of sample data, so that machines can acquire an analysis and learning ability like that of a person and can recognize data such as text, images and sounds; it is widely applied in speech and image recognition.
A large model is a machine learning model with a huge parameter scale and complexity. It requires a large amount of computing resources and storage space to train and store, often relying on distributed computing and dedicated hardware acceleration, and has strong generalization and expression capability.
Fig. 1 is a flowchart illustrating a video recommendation method according to an embodiment of the disclosure. As shown in fig. 1, the method includes:
S101, acquiring behavior data of a user for a first video.
It should be noted that the execution body of the video recommendation method according to the embodiment of the present disclosure may be a hardware device having data information processing capability and/or the software necessary for driving the hardware device to operate. Alternatively, the execution body may include a workstation, a server, a computer, a user terminal or another intelligent device. User terminals include, but are not limited to, mobile phones, computers, intelligent voice interaction devices, smart home appliances, vehicle-mounted terminals and the like.
It should be noted that, the specific manner of acquiring the behavior data of the user for the first video is not limited in this disclosure, and may be selected according to actual situations.
Optionally, an interaction log record corresponding to the user in the scene of the first video may be obtained through the network interface, and behavior data of the user for the first video may be obtained from the interaction log record.
The behavior data may be understood as interactive behavior data of the user in the scene of the first video.
Alternatively, the behavior data of the user for the first video may include positive behavior data and negative behavior data of the user for the first video.
For example, positive behavior data such as the user's requests, comments, collections and complete plays of the first video; and negative behavior data such as marking the first video as not interesting, quickly swiping past it, or fast-forwarding it.
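As an illustration only (the disclosure does not fix a data schema), such interaction records might be represented as follows; every field name in this sketch is an assumption introduced for exposition:

```python
from dataclasses import dataclass

@dataclass
class BehaviorRecord:
    """Hypothetical shape of one user-video interaction record."""
    user_id: str
    video_id: str
    action: str        # e.g. "like", "comment", "collect", "complete_play",
                       # "not_interested", "quick_swipe", "fast_forward"
    timestamp: float   # Unix time of the interaction

def split_by_polarity(records):
    """Separate positive from negative interactions, per the examples above."""
    positive = {"like", "comment", "collect", "complete_play"}
    pos = [r for r in records if r.action in positive]
    neg = [r for r in records if r.action not in positive]
    return pos, neg
```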
S102, behavior data are input into a large model, and a user portrait of a user and a video portrait of a first video are generated based on the behavior data through the large model.
The large model is trained in advance by adopting a deep learning algorithm, behavior data is input into the large model, and a user portrait of a user and a video portrait of a first video can be generated by the large model based on the behavior data.
The user portrait is a set of user tags, for example: gender, age, education level, the user's viewing preferences for different types of videos, viewing habits, and the like.
For example, the user portrait of user A may indicate that user A typically watches fitness videos at 7:00 a.m. and movie-commentary videos at 8:00 a.m.
The video portrait is a set of video tags, for example: video type, video content, and the like.
For example, the video type may be food, film and television, or comedy, and may be refined according to the video content; for food, food videos may be subdivided into food tutorials and eating-broadcast (mukbang) videos according to their content.
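The disclosure does not specify how the large model is invoked. One hedged sketch, continuing the BehaviorRecord example above and assuming the model is exposed as a text-in/text-out callable that can return JSON (the prompt wording and the output schema are likewise assumptions):

```python
import json

def generate_portraits(llm, behavior_records):
    """Sketch: ask a large model to summarize behavior data into a user
    portrait and a video portrait. `llm` is any callable mapping a prompt
    string to generated text."""
    behavior_text = "\n".join(
        f"{r.timestamp}: {r.action} on video {r.video_id}"
        for r in behavior_records
    )
    prompt = (
        "Given the following user-video interactions, produce a JSON object "
        'with a "user_portrait" (gender, age bracket, viewing preferences, '
        'viewing habits) and a "video_portrait" (video type, content tags):\n'
        + behavior_text
    )
    return json.loads(llm(prompt))
```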
S103, determining recommended video of the user according to the user portrait of the user and the video portrait of the first video.
In the embodiment of the disclosure, after the user portrait and the video portrait are acquired, the recommended video of the user can be determined according to the user portrait of the user and the video portrait of the first video.
The specific manner of determining the recommended video of the user according to the user portrait and the video portrait is not limited in the present disclosure, and may be selected according to the actual situation.
Alternatively, the preference of the user for the first video may be determined based on the user portrait of the user and the video portrait of the first video, and the recommended video may be determined based on the preference.
According to the video recommendation method of the present disclosure, behavior data of a user for a first video is acquired and input into a large model; the large model generates a user portrait of the user and a video portrait of the first video based on the behavior data; and a recommended video of the user is determined according to the user portrait and the video portrait. In this way, portraits generated by the large model from the behavior data are used for recommendation, which helps improve the accuracy of video recommendation.
Fig. 2 is a flowchart illustrating a video recommendation method according to a second embodiment of the present disclosure.
As shown in fig. 2, on the basis of the embodiment shown in fig. 1, the video recommendation method according to the embodiment of the present disclosure may specifically include the following steps:
S201, behavior data of a user for a first video is acquired.
S202, behavior data are input into a large model, and a user portrait of a user and a video portrait of a first video are generated based on the behavior data through the large model.
It should be noted that a user's behavior data continuously changes while the user watches videos, so to improve the accuracy of video recommendation, the user portrait and the video portrait need to be updated.
Optionally, if the update condition of the portraits is currently satisfied, behavior data of the user for a second video within a set time period before the current moment is acquired; the acquired behavior data is input into the large model, which regenerates the user portrait of the user and generates a video portrait of the second video based on the acquired behavior data; the original user portrait of the user is then updated to the regenerated user portrait, and the video portrait of the first video is updated to the video portrait of the second video.
The present disclosure is not limited to the setting of the update condition of the image, and may be set according to actual conditions.
Alternatively, the update condition of the portraits may be set such that the portraits are updated once a week; alternatively, the update condition may be set to trigger a custom update according to the user's behavior data for the recommended videos.
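A minimal sketch of such an update check, assuming the weekly interval mentioned above and Unix timestamps (both assumptions for illustration):

```python
import time

ONE_WEEK_SECONDS = 7 * 24 * 3600

def portraits_need_update(last_update_ts, now=None):
    """Return True when the assumed weekly update interval has elapsed."""
    now = time.time() if now is None else now
    return now - last_update_ts >= ONE_WEEK_SECONDS
```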
For the relevant content of steps S201-S202, refer to the above embodiment, and are not repeated here.
Step S103 "determining the recommended video of the user from the user portrait of the user and the video portrait of the first video" in the above embodiment may specifically include the following steps S203 and S204.
S203, determining the preference degree of the user for the first video according to the user portrait of the user and the video portrait of the first video.
The specific manner of determining the preference of the user for the first video according to the user portrait and the video portrait is not limited in this disclosure, and may be selected according to actual situations.
Alternatively, the user portrait and the video portrait of the first video may be analyzed by artificial intelligence techniques to obtain the user's preference for the first video.
The user's preference for the first video can serve as a weight for the first video: the higher the preference for the first video, the higher its weight value; the lower the preference, the lower its weight value.
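The disclosure leaves the scoring function open. As one illustrative possibility, if the two portraits are embedded as vectors, the preference (and hence the weight) could be their cosine similarity; a minimal sketch under that assumption:

```python
import numpy as np

def preference(user_vec: np.ndarray, video_vec: np.ndarray) -> float:
    """Sketch: preference as cosine similarity between embeddings of the
    user portrait and the video portrait. The embedding representation is
    an assumption, not mandated by the disclosure."""
    denom = float(np.linalg.norm(user_vec) * np.linalg.norm(video_vec))
    return float(user_vec @ video_vec) / denom if denom else 0.0
```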
S204, determining recommended videos based on the preference degree.
Optionally, if the preference degree corresponding to the first video is greater than the third set threshold, taking the first video as the target video, and screening candidate videos similar to the target video from the video library to be taken as recommended videos.
The setting of the third setting threshold is not limited in this disclosure, and may be set according to actual situations.
After the target video is obtained, the similarity between the target video and the video in the video library can be calculated, and the recommended video can be determined according to the similarity between the target video and the video in the video library.
Optionally, if the similarity between the target video and a video in the video library is greater than a similarity threshold, that candidate video is determined to be a recommended video.
Optionally, videos in the video library may be sorted in descending order of similarity, and the top M candidate videos in the sorting are determined to be recommended videos, where M is a positive integer.
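A hedged sketch combining the two options above (similarity threshold and top-M selection), again assuming videos are represented by embedding vectors, which the disclosure does not mandate:

```python
import numpy as np

def top_m_similar(target_vec, library, m, sim_threshold=None):
    """Rank library videos by similarity to the target video and keep the
    top m, optionally only those above a threshold. `library` maps
    video_id -> embedding vector; the representation is an assumption."""
    scored = []
    for vid, vec in library.items():
        denom = float(np.linalg.norm(target_vec) * np.linalg.norm(vec))
        sim = float(target_vec @ vec) / denom if denom else 0.0
        if sim_threshold is None or sim > sim_threshold:
            scored.append((sim, vid))
    scored.sort(reverse=True)          # descending similarity
    return [vid for _, vid in scored[:m]]
```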
It should be noted that, after the recommended video is obtained, the recommended video may be recommended to the user according to a preset recommendation policy.
Optionally, the weight of the recommended video can be obtained, and the recommended video is sequentially recommended to the user according to the weight; alternatively, the recommended videos may be recommended to the user in a random manner.
According to the video recommendation method of the present disclosure, behavior data of a user for a first video is acquired and input into a large model; the large model generates a user portrait of the user and a video portrait of the first video based on the behavior data; the user's preference for the first video is determined according to the user portrait and the video portrait; and the recommended video is determined based on the preference.
Fig. 3 is a flowchart illustrating a video recommendation method according to a third embodiment of the present disclosure.
As shown in fig. 3, on the basis of the embodiment shown in fig. 1, the video recommendation method according to the embodiment of the present disclosure may specifically include the following steps:
S301, behavior data of a user for a first video is acquired.
S302, behavior data are input into a large model, and a user portrait of a user and a video portrait of a first video are generated through the large model based on the behavior data.
S303, determining recommended video of the user according to the user portrait of the user and the video portrait of the first video.
For the relevant content of steps S301-S303, refer to the above embodiments, and are not repeated here.
S304, if the user is a non-churned user, obtaining a churn probability of the user according to the user portrait of the user and the video portrait of the first video.
It should be noted that the specific manner of determining whether a user has churned is not limited in this disclosure and may be selected according to the actual situation.
Optionally, the feature data of the user may be analyzed by a pre-trained model to determine whether the user has churned.
Optionally, activity metric data of the user on the video platform, for example login data and consumption data of the video platform, may be obtained, and whether the user has churned is judged accordingly.
For example, if the user has logged in to the video application within a set period of time (e.g., 7 days), the user may be determined to be a non-churned user.
The specific way of obtaining the churn probability of the user according to the user portrait and the video portrait is not limited and may be selected according to the actual situation.
Alternatively, the churn probability of the user may be obtained based on the user portrait and the video portrait through a churn probability prediction model.
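The architecture of the churn-probability prediction model is not specified by the disclosure; as one illustrative assumption, it could be a logistic layer over the concatenated portrait embeddings:

```python
import numpy as np

def churn_probability(user_vec, video_vec, w, b):
    """Sketch under stated assumptions: logistic regression over the
    concatenated user-portrait and video-portrait embeddings, with learned
    weights w and bias b (all hypothetical names)."""
    x = np.concatenate([user_vec, video_vec])
    return float(1.0 / (1.0 + np.exp(-(w @ x + b))))
```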
S305, if the churn probability is greater than or equal to a first set threshold, adjusting the operation strategy for the user.
In the embodiment of the disclosure, after the churn probability is obtained, it can be compared with the first set threshold; if the churn probability is greater than or equal to the first set threshold, the operation strategy for the user can be adjusted.
Alternatively, a retention strategy for the user may be determined based on the user portrait and the video portrait, and the user may be retained through the retention strategy.
S306, if the user is a churned user, performing root cause analysis on the user's churn according to the user portrait of the user and the video portrait of the first video.
It should be noted that the specific process of determining whether the user has churned may be referred to the above embodiment and will not be described here.
If the user has churned, the churn type of the user can be obtained according to the user portrait of the user and the video portrait of the first video, so as to analyze the root cause of the churn. Optionally, the root cause of churn may be poor video quality, advertising or marketing associated with the video, and the like.
In summary, the video recommendation method provided by the present disclosure acquires behavior data of a user for a first video, inputs the behavior data into a large model, generates a user portrait of the user and a video portrait of the first video based on the behavior data through the large model, and determines a recommended video of the user according to the user portrait and the video portrait. If the user has not churned, a churn probability of the user is obtained according to the user portrait and the video portrait, and if the churn probability is greater than or equal to a first set threshold, the operation strategy for the user is adjusted; if the user has churned, root cause analysis of the churn is performed according to the user portrait and the video portrait.
Fig. 4 is a flowchart of a video recommendation method according to a fourth embodiment of the present disclosure.
As shown in fig. 4, on the basis of the embodiment shown in fig. 1, the video recommendation method according to the embodiment of the present disclosure may specifically include the following steps:
S401, behavior data of a user for a first video is acquired.
S402, behavior data is input into a large model, and a user portrait of a user and a video portrait of a first video are generated based on the behavior data through the large model.
S403, determining recommended video of the user according to the user portrait of the user and the video portrait of the first video.
For the relevant content of steps S401 to S403, refer to the above embodiment, and are not repeated here.
S404, obtaining the similarity between the user portraits of any two users in a user group.
In the embodiment of the disclosure, when any two users in the user group are a first user and a second user, optionally, a target similarity between the user portrait of the first user and the user portrait of the second user can be obtained.
The higher the target similarity between the user portrait of the first user and that of the second user, the more feature points the first user and the second user have in common.
S405, dividing the user group based on the similarity to obtain a plurality of user subgroups.
It should be noted that users in the same user subgroup often share a common preference, for example a preference for food tutorial videos, or a preference for movie-commentary videos.
Optionally, the plurality of second users may be sorted in descending order according to the target similarity, and the first user and the first N second users are divided into the same user subgroup, where N is a positive integer.
The number of N is not limited in this disclosure, and may be set according to actual situations. For example: n can be set to 1000; also for example: n may be set to 2000.
For example, with N equal to 1000, the plurality of second users may be sorted in descending order of target similarity, the top 1000 second users are selected according to the sorting result, and the first user and these 1000 second users are divided into the same user subgroup.
Optionally, if the similarity corresponding to any two users is greater than the second set threshold, dividing any two users into the same user subgroup.
The setting of the second setting threshold is not limited in this disclosure, and may be set according to actual situations. For example: the second set threshold may be set to 90%; also for example: the second set threshold may be set to 85%.
For example, for the second set threshold of 90%, if the similarity between the third user and the fourth user is greater than 90%, the third user and the fourth user are classified into the same user subgroup.
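A minimal sketch of the threshold-based variant, assuming portrait similarity is measured on embedding vectors (an assumption, since the disclosure leaves the similarity measure open) and merging users transitively with union-find:

```python
import numpy as np

def group_users(portrait_vecs, sim_threshold=0.9):
    """Sketch: place two users in the same subgroup when their portrait
    similarity exceeds the threshold. `portrait_vecs` maps user_id ->
    embedding vector (a hypothetical representation)."""
    ids = list(portrait_vecs)
    parent = {u: u for u in ids}

    def find(u):
        while parent[u] != u:            # path-halving union-find
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            va, vb = portrait_vecs[a], portrait_vecs[b]
            denom = float(np.linalg.norm(va) * np.linalg.norm(vb))
            sim = float(va @ vb) / denom if denom else 0.0
            if sim > sim_threshold:
                parent[find(a)] = find(b)

    groups = {}
    for u in ids:
        groups.setdefault(find(u), []).append(u)
    return list(groups.values())
```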
In summary, according to the video recommendation method provided by the present disclosure, behavior data of a user for a first video is acquired and input into a large model; the large model generates a user portrait of the user and a video portrait of the first video based on the behavior data; a recommended video of the user is determined according to the user portrait and the video portrait; the similarity between the user portraits of any two users in a user group is obtained; and the user group is divided based on the similarity to obtain a plurality of user subgroups.
Fig. 5 is a flow chart of a model training method according to an embodiment of the disclosure. As shown in fig. 5, the method includes:
S501, acquiring a training sample, wherein the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user and a sample video portrait of the sample video.
It should be noted that, the execution body of the model training method according to the embodiment of the present disclosure may be a hardware device having data information processing capability and/or software necessary for driving the hardware device to operate. Alternatively, the execution body may include a workstation, a server, a computer, a user terminal, and other intelligent devices. The user terminal comprises, but is not limited to, a mobile phone, a computer, intelligent voice interaction equipment, intelligent household appliances, vehicle-mounted terminals and the like.
It should be noted that, the specific manner of obtaining the training sample is not limited in this disclosure, and may be selected according to practical situations.
Alternatively, historical interaction behavior data of the sample user for the sample video may be collected and preprocessed, for example by cleaning, normalization and feature extraction, removing abnormal data, noise and the like; the sample behavior data, the sample user portrait of the sample user and the sample video portrait of the sample video are then obtained based on the preprocessed historical interaction behavior data to form the training sample.
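A hedged sketch of that assembly step; the filtering rules and field names below are assumptions chosen for illustration, not part of the disclosure:

```python
def build_training_sample(raw_events, user_portrait, video_portrait):
    """Clean historical interaction events, then pair them with the
    labelled portraits to form one training sample."""
    cleaned = [
        e for e in raw_events
        if e.get("user_id") and e.get("video_id")         # drop malformed rows
        and 0.0 <= e.get("watch_ratio", 0.0) <= 1.0       # drop out-of-range noise
    ]
    return {
        "behavior_data": cleaned,
        "sample_user_portrait": user_portrait,
        "sample_video_portrait": video_portrait,
    }
```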
S502, training the large model according to the training sample.
In the embodiment of the disclosure, after the training sample is obtained, the large model may be trained according to the training sample.
It should be noted that, for the type of the large model, the selection may be performed according to the actual situation, which is not limited too much.
Alternatively, the sample behavior data in the training sample may be input into the large model, the large model outputs a predicted sample user portrait of the sample user and a predicted sample video portrait of the sample video, and the large model is trained based on the predicted sample user portrait and the sample user portrait, and the predicted sample video portrait and the sample video portrait.
For example, a loss function value of the large model may be obtained based on the predicted sample user portrait and the sample user portrait, and the predicted sample video portrait and the sample video portrait; model parameters of the large model are updated based on the loss function; and the process returns to train the adjusted large model with the next training sample, continuously adjusting the model parameters until a model training end condition is satisfied.
It should be noted that, the setting of the model training ending condition is not limited in the present disclosure, and the setting of the model training ending condition may be performed according to actual situations.
Optionally, the model training end condition may be set such that the loss function value is smaller than a preset loss threshold; optionally, the model training end condition may be set such that the number of adjustments of the model parameters of the large model reaches a preset number threshold.
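Putting S502 together, a minimal PyTorch-style training loop under stated assumptions: the model maps behavior data to a (predicted user portrait, predicted video portrait) pair, the loss is cross-entropy against the labelled portraits, and the optimizer is AdamW; none of these choices is fixed by the disclosure.

```python
import torch

def train(model, loader, loss_threshold=1e-3, max_steps=10_000, lr=1e-5):
    """Sketch of S502: predict portraits, compare with labels, update
    parameters, and stop on either end condition described above."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    step = 0
    while step < max_steps:
        for behavior, user_target, video_target in loader:
            user_pred, video_pred = model(behavior)
            loss = loss_fn(user_pred, user_target) + loss_fn(video_pred, video_target)
            opt.zero_grad()
            loss.backward()
            opt.step()
            step += 1
            # End conditions: loss below a preset threshold, or parameter
            # updates reaching a preset count.
            if loss.item() < loss_threshold or step >= max_steps:
                return model
    return model
```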
According to the model training method of the present disclosure, a training sample is obtained, where the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user and a sample video portrait of the sample video, and the large model is trained according to the training sample. The large model thereby learns the relation between the sample behavior data and the predicted sample user portrait and predicted sample video portrait during training, so that the trained large model can generate a user portrait and a video portrait based on behavior data to realize video recommendation.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other processing of the user's personal information comply with the provisions of relevant laws and regulations and do not violate public order and good morals.
According to an embodiment of the present disclosure, the present disclosure further provides a video recommendation apparatus, which is configured to implement the above-mentioned video recommendation method.
Fig. 6 is a block diagram of a video recommendation device according to an embodiment of the present disclosure.
As shown in fig. 6, the video recommendation apparatus 600 includes: an acquisition module 601, a generation module 602 and a recommendation module 603.
An obtaining module 601, configured to obtain behavior data of a user for a first video;
A generating module 602, configured to input the behavior data into a large model, and generate, based on the behavior data, a user portrait of the user and a video portrait of the first video through the large model;
a recommendation module 603, configured to determine a recommended video of the user according to the user portrait of the user and the video portrait of the first video.
In one embodiment of the present disclosure, the apparatus 600 is further configured to: if the user is a non-churned user, acquire a churn probability of the user according to the user portrait of the user and the video portrait of the first video.
In one embodiment of the present disclosure, the apparatus 600 is further configured to: if the churn probability is greater than or equal to a first set threshold, adjust the operation strategy for the user.
In one embodiment of the present disclosure, the apparatus 600 is further configured to: if the user is a churned user, perform root cause analysis on the user's churn according to the user portrait of the user and the video portrait of the first video.
In one embodiment of the present disclosure, the apparatus 600 is further configured to: obtain the similarity between the user portraits of any two users in a user group; and divide the user group based on the similarity to obtain a plurality of user subgroups.
In one embodiment of the present disclosure, the apparatus 600 is further configured to: if the similarity corresponding to any two users is greater than a second set threshold, divide the two users into the same user subgroup.
In one embodiment of the present disclosure, the apparatus 600 is further configured to: obtain a target similarity between a user portrait of a first user and a user portrait of a second user, where the user group comprises the first user and the second user. The dividing of the user group based on the similarity to obtain a plurality of user subgroups includes: sorting the plurality of second users in descending order of the target similarity; and dividing the first user and the top N second users into the same user subgroup, where N is a positive integer.
In one embodiment of the present disclosure, the apparatus 600 is further configured to: if the update condition of the portraits is currently satisfied, acquire behavior data of the user for a second video within a set time length before the current moment; input the re-acquired behavior data into the large model, regenerate the user portrait of the user based on the re-acquired behavior data through the large model, and generate a video portrait of the second video; and update the original user portrait of the user to the regenerated user portrait, and update the video portrait of the first video to the video portrait of the second video.
In one embodiment of the present disclosure, the recommendation module 603 is configured to: determine a preference of the user for the first video according to the user portrait of the user and the video portrait of the first video; and determine the recommended video based on the preference.
In one embodiment of the present disclosure, the recommendation module 603 is further configured to: if the preference corresponding to the first video is greater than a third set threshold, take the first video as a target video; and screen candidate videos similar to the target video from a video library as the recommended video.
According to the video recommendation apparatus of the present disclosure, behavior data of a user for a first video is acquired and input into a large model; the large model generates a user portrait of the user and a video portrait of the first video based on the behavior data; and a recommended video of the user is determined according to the user portrait and the video portrait.
According to an embodiment of the present disclosure, the present disclosure further provides a model training apparatus, which is configured to implement the above-mentioned model training method.
FIG. 7 is a block diagram of a model training apparatus according to an embodiment of the present disclosure.
As shown in fig. 7, the model training apparatus 700 includes: a first acquisition module 701 and a training module 702.
A first obtaining module 701, configured to obtain a training sample, where the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user, and a sample video portrait of the sample video;
and the training module 702 is used for training the large model according to the training samples.
According to the model training apparatus of the present disclosure, a training sample is obtained, where the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user and a sample video portrait of the sample video, and the large model is trained according to the training sample. The large model thereby learns the relation between the sample behavior data and the predicted portraits during training, so that the trained large model can generate a user portrait and a video portrait based on behavior data to realize video recommendation.
According to embodiments of the present disclosure, the present disclosure also proposes an electronic device, a readable storage medium and a computer program product.
Fig. 8 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the apparatus 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in the device 800 are connected to the I/O interface 805, including: an input unit 806 such as a keyboard, a mouse, etc.; an output unit 807 such as various types of displays, speakers, etc.; a storage unit 808 such as a magnetic disk, an optical disk, etc.; and a communication unit 809 such as a network card, a modem, a wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the Internet and/or various telecommunication networks.
The computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the respective methods and processes described above, such as a video recommendation method, a model training method. For example, in some embodiments, the video recommendation method, the model training method, may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the video recommendation method, the model training method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the video recommendation method, the model training method, by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with the user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
According to an embodiment of the present disclosure, there is also provided a computer program product, including a computer program, where the computer program, when executed by a processor, implements the steps of the video recommendation method and the model training method described in the foregoing embodiments of the present disclosure.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (25)

1. A video recommendation method, comprising:
acquiring behavior data of a user for a first video;
inputting the behavior data into a large model, and generating a user portrait of the user and a video portrait of the first video based on the behavior data through the large model;
and determining a recommended video of the user according to the user portrait of the user and the video portrait of the first video.
2. The method according to claim 1, wherein the method further comprises:
if the user is a non-churned user, acquiring a churn probability of the user according to the user portrait of the user and the video portrait of the first video.
3. The method according to claim 2, wherein the method further comprises:
if the churn probability is greater than or equal to a first set threshold, adjusting the operation strategy for the user.
4. The method according to claim 1, wherein the method further comprises:
if the user is a churned user, performing root cause analysis on the user's churn according to the user portrait of the user and the video portrait of the first video.
5. The method according to claim 1, wherein the method further comprises:
obtaining the similarity between the user portraits of any two users in a user group;
and dividing the user group based on the similarity to obtain a plurality of user subgroups.
6. The method of claim 5, wherein the dividing the user group based on the similarity to obtain a plurality of user subgroups comprises:
if the similarity corresponding to any two users is greater than a second set threshold, dividing the two users into the same user subgroup.
7. The method of claim 5, wherein the obtaining the similarity between the user portraits of any two users in the user group comprises:
obtaining target similarity between a user portrait of a first user and a user portrait of a second user, wherein the user group comprises the first user and the second user;
the step of dividing the user group based on the similarity to obtain a plurality of user subgroups includes:
sorting the plurality of second users in descending order of the target similarity;
dividing the first user and the top N second users into the same user subgroup, wherein N is a positive integer.
8. The method according to any one of claims 1-7, further comprising:
if the updating condition of the portrait is met currently, acquiring behavior data of the user for the second video within a set time length from the current moment;
inputting the re-acquired behavior data into the large model, regenerating a user portrait of the user based on the re-acquired behavior data through the large model, and generating a video portrait of the second video;
updating the original user portrait of the user to the regenerated user portrait, and updating the video portrait of the first video to the video portrait of the second video.
9. The method of any of claims 1-7, wherein the determining the recommended video for the user based on the user representation of the user and the video representation of the first video comprises:
determining a preference degree of the user for the first video according to the user portrait of the user and the video portrait of the first video;
and determining the recommended video based on the preference.
10. The method of claim 9, wherein the determining the recommended video based on the preference comprises:
if the preference degree corresponding to the first video is larger than a third set threshold value, taking the first video as a target video;
and screening candidate videos similar to the target video from a video library to serve as the recommended video.
11. A method of model training, comprising:
obtaining a training sample, wherein the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user and a sample video portrait of the sample video;
and training the large model according to the training sample.
12. A video recommendation device, comprising:
the acquisition module is used for acquiring behavior data of a user for a first video;
the generation module is used for inputting the behavior data into a large model, and generating a user portrait of the user and a video portrait of the first video based on the behavior data through the large model;
and the recommending module is used for determining recommended videos of the user according to the user portrait of the user and the video portrait of the first video.
13. The apparatus of claim 12, wherein the apparatus is further configured to:
if the user is a non-churned user, acquire a churn probability of the user according to the user portrait of the user and the video portrait of the first video.
14. The apparatus of claim 13, wherein the apparatus is further configured to:
if the churn probability is greater than or equal to a first set threshold, adjust the operation strategy for the user.
15. The apparatus of claim 12, wherein the apparatus is further configured to:
if the user is a churned user, perform root cause analysis on the user's churn according to the user portrait of the user and the video portrait of the first video.
16. The apparatus of claim 12, wherein the apparatus is further configured to:
obtaining the similarity between the user portraits of any two users in a user group;
and dividing the user group based on the similarity to obtain a plurality of user subgroups.
17. The apparatus of claim 16, wherein the apparatus is configured to:
and if the similarity corresponding to any two users is greater than a second set threshold, dividing the any two users into the same user subgroup.
18. The apparatus of claim 16, wherein the apparatus is configured to:
obtaining a target similarity between a user portrait of a first user and a user portrait of a second user, wherein the user group comprises the first user and the second user;
the step of dividing the user group based on the similarity to obtain a plurality of user subgroups includes:
sorting the plurality of second users in descending order according to the target similarity;
dividing the first user and the top N second users into the same user subgroup, wherein N is a positive integer.
19. The apparatus according to any one of claims 12-18, wherein the apparatus is further configured to:
if the update condition of the portraits is currently satisfied, acquiring behavior data of the user for the second video within a set time length from the current moment;
inputting the re-acquired behavior data into the large model, regenerating a user portrait of the user based on the re-acquired behavior data through the large model, and generating a video portrait of the second video;
updating the original user portrait of the user to the regenerated user portrait, and updating the video portrait of the first video to the video portrait of the second video.
20. The apparatus according to any one of claims 12-18, wherein the recommending module is configured to:
determining a preference degree of the user for the first video according to the user portrait of the user and the video portrait of the first video;
and determining the recommended video based on the preference.
21. The apparatus of claim 20, wherein the recommending module is configured to:
if the preference degree corresponding to the first video is larger than a third set threshold value, taking the first video as a target video;
and screening candidate videos similar to the target video from a video library to serve as the recommended video.
22. A model training device, comprising:
the first acquisition module is used for acquiring a training sample, wherein the training sample comprises sample behavior data of a sample user for a sample video, a sample user portrait of the sample user and a sample video portrait of the sample video;
and the training module is used for training the large model according to the training sample.
23. An electronic device comprising a processor and a memory;
wherein the processor, by reading executable program code stored in the memory, runs a program corresponding to the executable program code to implement the method of any one of claims 1-10 or claim 11.
24. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method of any one of claims 1-10 or claim 11.
25. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-10 or claim 11.
CN202311411913.8A 2023-10-27 2023-10-27 Video recommendation method, model training method and device Pending CN117235371A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311411913.8A CN117235371A (en) 2023-10-27 2023-10-27 Video recommendation method, model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311411913.8A CN117235371A (en) 2023-10-27 2023-10-27 Video recommendation method, model training method and device

Publications (1)

Publication Number Publication Date
CN117235371A (en) 2023-12-15

Family

ID=89094916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311411913.8A Pending CN117235371A (en) 2023-10-27 2023-10-27 Video recommendation method, model training method and device

Country Status (1)

Country Link
CN (1) CN117235371A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117829968A (en) * 2024-03-06 2024-04-05 南京数策信息科技有限公司 Service product recommendation method, device and system based on user data analysis
CN117829968B (en) * 2024-03-06 2024-05-31 南京数策信息科技有限公司 Service product recommendation method, device and system based on user data analysis

Similar Documents

Publication Publication Date Title
CN113301442B (en) Method, device, medium, and program product for determining live broadcast resource
CN111460384B (en) Policy evaluation method, device and equipment
CN114036398B (en) Content recommendation and ranking model training method, device, equipment and storage medium
CN112231584A (en) Data pushing method and device based on small sample transfer learning and computer equipment
CN114299194B (en) Training method of image generation model, image generation method and device
CN115063875A (en) Model training method, image processing method, device and electronic equipment
CN113627536B (en) Model training, video classification method, device, equipment and storage medium
CN114118287A (en) Sample generation method, sample generation device, electronic device and storage medium
CN117235371A (en) Video recommendation method, model training method and device
CN110826327A (en) Emotion analysis method and device, computer readable medium and electronic equipment
CN112905885A (en) Method, apparatus, device, medium, and program product for recommending resources to a user
CN111191242A (en) Vulnerability information determination method and device, computer readable storage medium and equipment
CN112269942B (en) Method, device and system for recommending object and electronic equipment
CN114881227A (en) Model compression method, image processing method, device and electronic equipment
CN114817476A (en) Language model training method and device, electronic equipment and storage medium
CN114564581A (en) Text classification display method, device, equipment and medium based on deep learning
CN114357242A (en) Training evaluation method and device based on recall model, equipment and storage medium
CN112651413A (en) Integrated learning classification method, device, equipment and storage medium for vulgar graphs
CN114547417B (en) Media resource ordering method and electronic equipment
CN117641004B (en) Short video recommendation method and device, electronic equipment and storage medium
CN114416937B (en) Man-machine interaction method, device, equipment, storage medium and computer program product
CN117474598A (en) Advertisement searching method, model training method and device
CN117278808A (en) Satisfaction characteristic processing method and device, electronic equipment and storage medium
CN116662652A (en) Model training method, resource recommendation method, sample generation method and device
CN117473035A (en) Advertisement recall method, model training method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination